How to read a CSV file with Line break and Quotes through PolyBase in Synapse

Priya Jha 866 Reputation points
2020-09-11T06:51:43.757+00:00

Hi All,

I have an automated source that generates Excel file with data that can contain line breaks or double quotes or some delimiter values.

I am converting this Excel file to CSV using ADF Copy activity and to handle column delimiter in data i have used Quote char property in copy activity.

Now i am reading this CSV file through Polybase and i have mentioned String_Delimiter as Double quotes in my External file format definition.

But i am getting error for 2 scenarios:

1) If the file contains line breaks, this is the error: No Closing String Delimiter
2) If the data also has double quotes: Could not find a delimiter after String Delimiter

How to handle these two scenarios in Polybase.

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,395 questions
{count} votes

1 answer

Sort by: Most helpful
  1. MartinJaffer-MSFT 26,031 Reputation points
    2020-09-21T23:01:37.263+00:00

    I have tried, and it seems not to be possible. Polybase does not appear to be compatible with having the row terminator inside of a quoted field.

    You can either change the data, such as escaping, or use something else.

    Using DataFactory , Polybase works when staging is enabled to accomplish this. Data Factory uses the staging to make the format acceptable to Polybase. I tested, and the above sample came out correct.

    @Priya Jha @Nandan Hegde