Question

Asked by AzeemK-2483; last commented on by MarkKromer-2402

Move a ||-delimited (double-pipe) file to Synapse using ADF

I want to move a ||-delimited (double-pipe) file from on-premises to Synapse using ADF V2, but I keep getting an ADF error that null values are not allowed in my source dataset. What's the best way to ingest a ||-delimited file using ADF?


Sample:
phone_id||user_id||customer_id||area_code||phone_number||extension||phone_type

100||1||1||512||5551234||23||Home



azure-data-factory
Attachment: pipedelimited.jpg

In the dataset definition, click "Edit" under the column delimiter setting and enter || as the delimiter.


Hey @MarkKromer-MSFT,
A dual-character delimiter is not supported in ADF. You can preview the data, but you can't actually use that dataset:

(screenshot: dualdeli.png)



That's because you're using the Copy activity. Use a mapping data flow, where multi-character delimiters are supported.
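For reference, a DelimitedText dataset with a multi-character delimiter might look roughly like the fragment below. The dataset, linked-service, container, and file names are placeholders, and (per this thread) the || value is only honored when the dataset is used from a mapping data flow, not from the Copy activity:

```json
{
    "name": "PipePipeDelimitedFile",
    "properties": {
        "type": "DelimitedText",
        "linkedServiceName": {
            "referenceName": "AzureBlobStorageLS",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "incoming",
                "fileName": "phone.txt"
            },
            "columnDelimiter": "||",
            "firstRowAsHeader": true
        }
    }
}
```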


1 Answer

Answered by NandanHegde-7720; last commented on by MarkKromer-2402

Hey,
Since your file delimiter is || and the ADF Copy activity to date supports only a single character as the column delimiter, I would suggest the following steps to get the file into Synapse:

1) Create a blob storage account.
2) Copy the file from on-premises into blob storage as-is (use a Binary-type dataset as both source and sink).
3) In Synapse, create an external table and all its necessary parameters:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-external-tables?tabs=hadoop
You can specify || as the field terminator in Synapse, which is what we require.
4) Create a stored procedure that loads the data from the external table into your actual table.
5) Call that stored procedure in ADF after the file-copy activity.
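Steps 3 and 4 above might look roughly like the T-SQL sketch below, using the sample file's columns. All object names (PipePipeFormat, BlobStage, ext.phone, dbo.phone, dbo.usp_load_phone) and the LOCATION path are illustrative assumptions; the external data source is assumed to exist already:

```sql
-- Step 3: an external file format using the multi-character terminator ||
-- (external file formats accept multi-character field terminators,
-- unlike the ADF Copy activity)
CREATE EXTERNAL FILE FORMAT PipePipeFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = '||', FIRST_ROW = 2)
);

-- External table over the file copied into blob storage;
-- DATA_SOURCE "BlobStage" is a hypothetical existing external data source
CREATE EXTERNAL TABLE ext.phone (
    phone_id     INT,
    user_id      INT,
    customer_id  INT,
    area_code    VARCHAR(10),
    phone_number VARCHAR(20),
    extension    VARCHAR(10),
    phone_type   VARCHAR(20)
)
WITH (
    LOCATION = '/incoming/phone/',
    DATA_SOURCE = BlobStage,
    FILE_FORMAT = PipePipeFormat
);

-- Step 4: a stored procedure that ADF can call after the copy activity
CREATE PROCEDURE dbo.usp_load_phone
AS
BEGIN
    INSERT INTO dbo.phone
    SELECT * FROM ext.phone;
END;
```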


How do I set the Binary type in the dataset source and sink?


You need to create a new dataset of type File (my assumption is that File was also your original on-premises source), and then under format you need to select Binary:

(screenshots: dataset.png, format.png)

And make sure your sink dataset is blob storage with the Binary format.

For more details: in the dataset, choose the format you want; to copy the file as-is, select the "Binary" type.
https://docs.microsoft.com/en-us/azure/data-factory/format-binary
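As a rough sketch, a Binary source dataset over an on-premises file share might look like the fragment below; the dataset name, linked-service name, folder, and file name are all placeholder assumptions:

```json
{
    "name": "SourceFileBinary",
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": "OnPremFileServerLS",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "FileServerLocation",
                "folderPath": "exports",
                "fileName": "phone.txt"
            }
        }
    }
}
```

The sink dataset would be the same shape, with a blob storage linked service and an "AzureBlobStorageLocation" in place of the file-server location.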


Just use a data flow instead. The Copy activity doesn't support multi-character delimiters.
