question

MehmetBASERDEM-9440 asked ShaikMaheer-MSFT commented

Updates and Inserts on CDM Folders

Hi all,


After the initial load into the CDM folders, how can I load data incrementally? I have delta records coming from Change Data Capture. How can I apply just these delta records to the CDM folders? I don't want to do a full data upload, because my data set is huge, around 3 TB.

Thanks

Mehmet Başerdem

azure-data-factory

Hi @MehmetBASERDEM-9440 ,

Thank you for posting your query on the Microsoft Q&A platform.

Incremental load into an existing file is not possible. Files are always overwritten and cannot be updated in place, so all new incoming rows should be written to new files.

Could you please share the details below so that we can understand the scenario better and provide a detailed resolution?

  • What are your source and sink dataset types?

  • What is the file format in the target?

Thank you.


Hello Shaik Maheer,

Source will be SQL database and sink will be a CDM folder on Azure Data Lake / Azure Storage.

Output will be CDM format / JSON & CSV files. Format details can be found here: https://docs.microsoft.com/en-us/azure/data-factory/format-common-data-model



If incremental load is not possible, then updates are not possible at all.


Hi @MehmetBASERDEM-9440 ,

Following up to check: did the answer provided below help you? If so, please accept the answer so that the community also benefits. Thank you.


1 Answer

MarkKromer-MSFT answered

Since CDM folders are stored as either CSV or Parquet files in ADLS Gen2 storage accounts, you will need to implement a partitioned folder strategy: write each day's changes into a separate folder whose name identifies that partition.
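
To illustrate the partitioned folder idea, here is a minimal Python sketch, not an Azure Data Factory feature or the official CDM tooling. It groups CDC delta rows by change day and writes each group as a new CSV file under a day-named folder, so files from earlier loads are never rewritten. The function name `write_delta_partitions`, the `date=YYYY-MM-DD` folder naming, the `change_time` column, and the header row are all assumptions made for illustration. A local path stands in for the ADLS Gen2 container, and the CDM metadata would also have to cover the new files, which this sketch does not do.

```python
import csv
import uuid
from collections import defaultdict
from datetime import datetime
from pathlib import Path


def write_delta_partitions(root, entity, records, date_field="change_time"):
    """Write CDC delta rows as new files under day-partitioned folders.

    Hypothetical helper: existing files are never opened or rewritten,
    every call only adds new files. All rows are assumed to share the
    same set of columns.
    """
    # Group the delta rows by the calendar day on which the change occurred.
    by_day = defaultdict(list)
    for row in records:
        day = datetime.fromisoformat(row[date_field]).date().isoformat()
        by_day[day].append(row)

    written = []
    for day, rows in sorted(by_day.items()):
        # Assumed layout: <root>/<entity>/date=YYYY-MM-DD/
        folder = Path(root) / entity / f"date={day}"
        folder.mkdir(parents=True, exist_ok=True)
        # A unique file name ensures that earlier loads are not overwritten.
        target = folder / f"delta-{uuid.uuid4().hex}.csv"
        with target.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        written.append(target)
    return written
```

With this layout, a 3 TB initial load stays untouched and each CDC batch only adds small files. Readers that need the current state of a row would still have to resolve updates themselves, for example by keeping the latest change per key across partitions.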
