question

Hochmanj-5166 avatar image
0 Votes"
Hochmanj-5166 asked MartinJaffer-MSFT commented

Help escaping with External File Format

I'm trying to create an external file format to create an external table off of a csv file with a simplified example row shown below consisting of three string fields (the last being html):


  foo,bar,"<div id=""myId"">\n<p>hello</p>\n</div>"\n

it seems, that all the '\n''s including those inside the last field are creating new lines.

Is there a way to escape the html \n while honoring the 'actual' \n that occurs at the end of the row?

azure-synapse-analytics
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

1 Answer

RyanAbbey-0701 avatar image
0 Votes"
RyanAbbey-0701 answered MartinJaffer-MSFT commented

In your dataset, set the Quote character to '"' (double quote)

· 3
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Isorry, i should have specified i'm using tsql with something like ..

 CREATE EXTERNAL FILE FORMAT CSV2

WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS ( FIELD_TERMINATOR = ',', USE_TYPE_DEFAULT = FALSE
);

that being said, taking your advice i did notice i can correctly parse it in a dataset if i use escape double quote. Perhaps I should copy in a pipeline to fix the parsing then create an external table?

for further context, the csv's are from Power BI data flows

0 Votes 0 ·

In that case, yes, you need to add a QUOTE to your FORMAT_OPTIONS... can't see it in the format_options list on MS help site so not sure if it should be QUOTE but potentially could be FIELD_OPTIONALLY_ENCLOSED_BY, no idea if there's been standardisation on that part

I wouldn't go for using escape (although who knows with Microsoft), the standardisation for escaping the quote character is the quote character but not necessarily for other characters so you do run the risk of generating unexpected results (.e.g normally \t represents a tab, if you start your string as "t, it will get interpreted as a tab rather than a string starting with "t"

0 Votes 0 ·

I notice one issue with the quote character. There are quotes inside, as well as surrounding the data. This would break, unless the double-double quote is a form of escaping.

I don't think the ask is possible as-is, but I have been wrong before.

0 Votes 0 ·