Rahul-7321 asked:

Unable to load data into Azure Synapse from ADLS Gen 2 .csv file

Hi @ChiragMishraMSFT-1092

While trying to load data from ADLS (.csv files) into Azure DW, I am encountering an error on load. Here is the error message:
"message": "Failure happened on 'Sink' side. ErrorCode=DelimitedTextMoreColumnsThanDefined,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error found when processing 'Csv/Tsv Format Text' source 'TABLENAME.csv' with row number 187985: found more columns than expected column count 199.,Source=Microsoft.DataTransfer.Common,'",


Below settings are made in ADF for ADLS while loading the data:
Column delimiter: | (pipe)
Encoding: UTF-16 (due to the presence of special characters in the data source)
Escape character: No Escape Character
Quote character: No Quote Character

Please note that if I use Backslash (\) as the escape character, the rows containing '\' get distorted and data goes missing from a few rows. Using 'No escape character' solves the issue only when the affected rows are filtered out before loading into Azure DW; when I try loading in bulk, the above error is thrown.

Below are the troubleshooting steps I followed to fix this issue, with no luck:
1. Column mapping is done and there is a 100% match of columns; no errors in mapping.
2. Tried the auto-create table option; the same error occurred.
3. Imported the schemas to compare column counts at source and target; no additional columns were found.
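The behaviour described above can be reproduced outside ADF. A minimal sketch using Python's csv module (illustrative only; the field values are made up) shows why a field ending in a backslash shifts the column count when backslash is treated as an escape character:

```python
import csv
import io

# A pipe-delimited row whose middle field ends with a backslash.
raw = 'a|b\\|c\n'

# With escapechar='\\', the trailing backslash escapes the delimiter
# that follows it, so 'b\' and 'c' merge into one field and the row
# loses a column -- the "distorted rows" symptom.
merged = list(csv.reader(io.StringIO(raw), delimiter='|',
                         escapechar='\\', quoting=csv.QUOTE_NONE))
print(merged)   # [['a', 'b|c']]

# With no escape character, the backslash is ordinary data and the
# delimiter still splits the row into the expected three fields.
intact = list(csv.reader(io.StringIO(raw), delimiter='|',
                         quoting=csv.QUOTE_NONE))
print(intact)   # [['a', 'b\\', 'c']]
```

This is why neither setting alone works: 'No escape character' keeps column counts correct for these rows but leaves any row that genuinely needs escaping over the expected count, producing the sink-side error in bulk loads.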

Please let me know if I need to clarify my query further. It would be of great help if you could help me find a fix for such issues, now and in the future.

Looking forward to your response.

Thanks in advance, cheers



azure-synapse-analytics · azure-data-lake-storage


VaibhavChaudhari commented: Could you try opening the CSV in a text editor and inspecting line number 187985? It might have more columns than expected.


Thanks for your response @VaibhavChaudhari.

There are many rows affected by the 'more columns than expected' issue wherever the data contains a backslash (\), because of which the data is getting disturbed.

To overcome this issue, I used the settings below in ADF for ADLS, but they do not work when I do the bulk load into ADW:
Column delimiter: | (pipe)
Encoding: UTF-16 (due to the presence of special characters in the data source)
Escape character: No Escape Character
Quote character: No Quote Character

Kindly let me know if you need more information on this.

Please help me with some workaround to fix it.

Thanks a million.


1 Answer

Rahul-7321 answered:

The issue is sorted.

Below is a quick walkthrough of my issue and its resolution.

Root cause: The table has some text columns, and the data in a few rows ends with a backslash (\), due to which data is merged into other columns, resulting in inaccurate data. Here are the settings made in ADF for ADLS as source & sink to overcome the problem:

Column delimiter: | (pipe)
Encoding: UTF-8 or UTF-16 (the latter if there are special characters in the data source that must be inserted into the DW correctly)
Escape character: Backslash (\)
Quote character: No Quote Character

The above solution worked for me; I hope it helps others as well.
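The fix can be sanity-checked locally. A minimal sketch with Python's csv module (illustrative only, not part of ADF; the field values are made up) shows that a field ending in a backslash round-trips cleanly once both the writer and the reader agree on backslash as the escape character, because the writer doubles the backslash on output:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, delimiter='|', escapechar='\\',
                    quoting=csv.QUOTE_NONE, lineterminator='\n')

# The middle field ends in a backslash; the writer escapes it,
# emitting the raw line: a|b\\|c
writer.writerow(['a', 'b\\', 'c'])

# Reading back with the same escape character recovers all three
# fields, including the trailing backslash -- no merged columns.
buf.seek(0)
rows = list(csv.reader(buf, delimiter='|', escapechar='\\',
                       quoting=csv.QUOTE_NONE))
print(rows)   # [['a', 'b\\', 'c']]
```

This mirrors the resolution above: the earlier failures came from files where the backslashes were not escaped, so the reader's escape setting had to match how the source files were actually produced.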

Thanks.
