YangChowmun-0538 asked:

ADF transfer rate is low

I am pulling data from Google BigQuery using a self-hosted IR and saving it into an Azure Database for MySQL instance using an Azure managed IR with VNet integration. Currently the transfer rate is slow: it takes about an hour to transfer 1 GB of data.

I have read the official documentation about tuning performance by adjusting the DIUs and parallel copies. I have tried increasing these parameters, but the throughput does not change much.
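For reference, both settings live on the copy activity's typeProperties. A minimal sketch of what I adjusted (dataset references omitted; names and values are illustrative, not my exact configuration):

```json
{
    "name": "CopyBigQueryToMySql",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "GoogleBigQuerySource" },
        "sink": { "type": "AzureMySqlSink", "writeBatchSize": 10000 },
        "dataIntegrationUnits": 32,
        "parallelCopies": 8
    }
}
```

As I understand it, dataIntegrationUnits only takes effect when the copy runs on an Azure IR; with a self-hosted IR on the source side, the machine capacity and parallelCopies are what matter.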

I have checked the Azure MySQL database: memory consumption is about 50% on 2 vCores. I tried increasing to 4 vCores and memory consumption dropped to about 25%, so I presume the database is not the bottleneck for the process.

How should I identify the bottleneck of the process and improve the throughput?
Any advice would be greatly appreciated!


azure-data-factory


Have you gone through the performance tuning doc below? Please have a look.

https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-performance-troubleshooting#performance-tuning-tips

Thanks


Yes, I saw the document and have tried some of the suggested tuning, but overall the performance did not improve drastically. Logically, how long should a transfer of 1 GB of data from BigQuery to Azure MySQL take?

Samy-7940 replied:

This heavily depends on network bandwidth; generally, the higher the bandwidth, the more quickly the data gets copied. A few pointers for your reference, if that helps:

Azure provides a set of enterprise-grade data storage and data warehouse solutions, and Copy Activity offers a highly optimized data loading experience that is easy to configure and set up. With just a single copy activity, you can achieve:

  • Loading data into Azure Synapse Analytics at 1.2 GBps. For a walkthrough with a use case, see Load 1 TB into Azure Synapse Analytics under 15 minutes with Azure Data Factory.

  • Loading data into Azure Blob storage at 1.0 GBps.

  • Loading data into Azure Data Lake Store at 1.0 GBps.

Thanks


1 Answer

ShaikMaheer-MSFT answered:

Hi @YangChowmun-0538 ,

Thank you for posting your query on the Microsoft Q&A platform.

Data movement throughput can depend on many factors, such as:

  • Network bandwidth between source and destination data stores.

  • Source or destination data store input/output operations per second (IOPS) and bandwidth.

  • In case of Azure IR, how many DIUs are defined.

  • In case of self-hosted IR, the machine capacity and the number of nodes in use.

  • How the data is being copied: either a single copy activity, or multiple copy activities over partitioned data, etc.

Below are a few recommendations you can try to increase data movement throughput.

  • When using Azure integration runtime (IR), you can specify up to 256 data integration units (DIUs) for each copy activity, in a serverless manner.

  • When using self-hosted IR, you can take either of the following approaches:
    Manually scale up the machine.
    Scale out to multiple machines (up to 4 nodes); a single copy activity will partition its file set across all nodes.

  • Using ForEach to partition the data and spawn multiple concurrent copy activities (see the sketch below).

In case of self-hosted IR, the recommendation is to use a dedicated machine to host the IR, separate from the server hosting the data store. Start with the default value for the parallel copy setting and a single node for the self-hosted IR.
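To illustrate the ForEach approach, here is a minimal sketch of a pipeline fragment. It assumes the BigQuery table can be filtered on some partitioning key; the parameter name, query, and activity names are illustrative, and dataset references are omitted for brevity:

```json
{
    "name": "CopyPartitions",
    "type": "ForEach",
    "typeProperties": {
        "items": {
            "value": "@pipeline().parameters.partitionList",
            "type": "Expression"
        },
        "isSequential": false,
        "batchCount": 8,
        "activities": [
            {
                "name": "CopyOnePartition",
                "type": "Copy",
                "typeProperties": {
                    "source": {
                        "type": "GoogleBigQuerySource",
                        "query": "SELECT * FROM mydataset.mytable WHERE partition_id = @{item()}"
                    },
                    "sink": { "type": "AzureMySqlSink" }
                }
            }
        ]
    }
}
```

With isSequential set to false, up to batchCount copy activities run concurrently, so each slice is written to MySQL in parallel rather than through a single writer.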

Please check the documentation below, where many recommendations for increasing copy activity performance are listed.
https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-performance

Hope this will help. Please let us know if any further queries.


  • Please don't forget to click on Accept Answer or the upvote button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how

  • Want a reminder to come back and check responses? Here is how to subscribe to a notification

  • If you are interested in joining the VM program and helping shape the future of Q&A: here is how you can be part of Q&A Volunteer Moderators




I have found the root cause. The bottleneck is the IOPS on the MySQL side. For Azure Database for MySQL, provisioned IOPS is proportional to the storage size; I had started with 5 GB, which results in a cap of only 100 IOPS, and the transfer is quite slow at 100 IOPS. (If I understand the tiering correctly, on the General Purpose tier IOPS scale at about 3 IOPS per GB of provisioned storage with a 100 IOPS floor, so increasing the storage raises the IOPS ceiling.)


Hi @YangChowmun-0538 ,

Awesome. Happy to know that you found the cause. Thank you for sharing.
