Why is the Copy-Activity slower than a Stored Procedure in Azure Data Factory?

RenPanning-1598 60 Reputation points
2024-04-12T12:19:33.7+00:00

I am building an Azure Data Factory pipeline to mimic the Stored-Procedure-based workflow, where the first step is to read several JSON-files from a connected Azure Blob Storage and copy the data to our Azure SQL Database. However, the Dataflow takes about 40 minutes to copy 4-5GB of data spread over roughly 50 files in the same folder and the Stored Procedure we were using before took only 10 minutes. Even a Copy-Activity is slow.

What's weird is that the monitoring view of the database shows DTU usage at 100% in both scenarios. How can the Dataflow use so many resources despite having much lower throughput? I have already consulted the following pages (and a few more), but none have helped:

What am I doing wrong, and if I'm not doing anything in particular wrong, why would I ever bother using ADF to copy anything at all?

Azure Blob Storage
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
2,432 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,571 questions
{count} votes

Accepted answer
  1. Vinodh247-1375 11,206 Reputation points
    2024-04-14T06:45:59.8733333+00:00

    Dataflow is a serverless data transformation service in Azure Data Factory. It processes data in parallel using Spark-based execution. It’s designed for complex transformations, data enrichment, and aggregations. However, it may introduce overhead or delay due to its parallel execution model.

    The DTU limit of a database is determined by CPU, reads, writes, and memory available to the database. However, because the SQL Database engine typically uses all available memory for its data cache to improve performance, the avg_memory_usage_percent value will usually be close to 100 percent, regardless of current database load. Therefore, even though memory does indirectly influence the DTU limit, it is not used in the DTU utilization formula.

    1 person found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful