question

CloudRock avatar image
0 Votes"
CloudRock asked Samy-7940 edited

Azure Data Factory - Degree of copy parallelism

Hello,

I'm running an Azure Data Factory that copies multiple tables from on prem SQL server to an Azure Data Lake.

So, I set many Copy activities through Az Data Factory Designer to execute parallel copies (each activity is carrying on the extract of one table).

For better resources optimization, I would like to know if there is a way to copy multiple tables with one Copy activity ?

I heard of "degree of copy parallelism", but don't know how to use it ?

Rgds,



azure-data-factory
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

1 Answer

VaibhavChaudhari avatar image
0 Votes"
VaibhavChaudhari answered Samy-7940 edited

For better resources optimization, I would like to know if there is a way to copy multiple tables with one Copy activity ?

No, each copy activity will copy single table. To copy 10 tables data, you would need to run 10 copy activities

I heard of "degree of copy parallelism", but don't know how to use it ?

This is to increase maximum number of threads. for ex. if we copy a folder from one data lake to another, increasing this number will increase the copy throughput as it will copy more number of files at once.

For reference - copy-activity-performance-features

For better orchestration and to avoid creating pipeline, copy activity every time - you can have a flow based on control table as shown here - tutorial-bulk-copy-portal



Please don't forget to Accept Answer and Up-vote if the response helped -- Vaibhav

· 5
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Hello @VaibhavChaudhari

Thanks for explanation.

I walked through the article and see how to make ForEach activity. For 10 tables copy (ex.), How many copy activities should be set under ForEach activity ?

In the article, I saw an initiation with first table from the source dataset and then input the following query in source section : SELECT * FROM [@{item().TABLE_SCHEMA}].[@{item().TABLE_NAME}] which should initiate the copy of the all present tables.

But the run failed.

Rgds








0 Votes 0 ·

In foreach activity, you should have only one copy activity. The flow should be metadata driven (by control table)

This video is good for reference - Azure Data Factory | Copy multiple tables in Bulk with Lookup & ForEach


If you are comfortable with this metadata driven flow, go for it. Otherwise you can also go with individual pipelines, copy activities separately and run them in parallel.

0 Votes 0 ·

Hi @bassamballaji-4529 ,

Not sure if you have already looked into this . May be you are trying to do something similar .

https://social.msdn.microsoft.com/Forums/en-US/d3d414b2-79b0-4e91-9335-e4c93675443a/dynamic-loading-tables-while-ignoring-conflicing-columns-dynamic-schema-mapping?forum=AzureDataFactory

Let me know how it goes .

Thanks
Himanshu

0 Votes 0 ·

Hi

The video explains well what I'm trying to do.

Best Rgds,

0 Votes 0 ·

Hi @VaibhavChaudhari ,your comment "No, each copy activity will copy single table. To copy 10 tables data, you
would need to run 1 copy activities" got me really confused here. Where as below thread explicitly mentions that " To use one Copy activity for multiple tables, you'd need to wrap a single parameterized Copy activity in a ForEach
activity. The ForEach can scale to run multiple sources at one time by setting isSequential to false and setting the batchCount value to the number of threads you want. The default batch count is 20 and the max is 50. Copy Parallelism on a single Copy activity just uses more threads to concurrently copy partitions of data from the same data source. "

https://stackoverflow.com/questions/63488899/azure-data-factory-degree-of-copy-parallelism

Which I also believe should be the case. Could you please clarify. Thanks

0 Votes 0 ·