
JogalekarMandar-9214 asked

Azure Data Factory tempdb usage in data flow

I am using a data flow to copy a large amount of data to an Azure SQL database.

I received this error:

Operation on target Data flow1 failed: {"StatusCode":"DFExecutorUserError","Message":"Job
failed due to reason: at Sink 'xxxx':
shaded.msdataflow.com.microsoft.sqlserver.jdbc.SQLServerException: The database 'tempdb' has
reached its size quota. Partition or delete data, drop indexes, or consult the documentation
for possible resolutions.","Details":"at Sink 'xxx':
shaded.msdataflow.com.microsoft.sqlserver.jdbc.SQLServerException: The database 'tempdb' has
reached its size quota. Partition or delete data, drop indexes, or consult the documentation
for possible resolutions."}
To resolve this, I created a different schema, unchecked the "Use tempdb" option in the data flow sink settings, and provided the name of the schema I created.

I continue to get the same error, and there is not much information available about solving it.

azure-data-factory azure-sql-database

HimanshuSinha-MSFT answered

Hello @JogalekarMandar-9214 ,
As mentioned by @AlbertoMorillo, the change you made causes the staging table to be created in the user DB instead of tempDB, as before. If I were you, I would try the following:

  • Upgrade the DB to a higher tier and scale it back down after the ingestion (@AlbertoMorillo also suggested this). The downside is that this will increase your Azure bill.

  • Set the batch size on the sink. The challenge is finding the correct batch size, and this will be comparatively slower than the option above.
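
The first option can be scripted so the scale-up and scale-down bracket the ingestion. A minimal T-SQL sketch, run against the logical server's master database (for example from an ADF Lookup or Stored Procedure activity); `MyDb` and the `Premium`/`P2` and `Standard`/`S2` tiers are placeholders, not values from this thread:

```sql
-- Before the ingestion: scale up to a tier with more tempdb headroom.
-- 'MyDb', 'Premium', and 'P2' are placeholders; choose what fits your workload.
ALTER DATABASE MyDb MODIFY (EDITION = 'Premium', SERVICE_OBJECTIVE = 'P2');

-- After the data flow finishes: scale back down to control cost.
ALTER DATABASE MyDb MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S2');
```

Note that `ALTER DATABASE ... MODIFY` returns immediately while the scale operation completes asynchronously, so the pipeline should not start the heavy copy the instant the statement returns.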

Thanks
Himanshu
Please do consider clicking on "Accept Answer" and "Up-vote" on the post that helps you, as it can be beneficial to other community members


(screenshot attached: image.png, 18.9 KiB)

AlbertoMorillo answered

By default, Data Factory uses a global temporary table to store data as part of the loading process. You can alternatively uncheck the "Use TempDB" option on the Sink transformation and instead ask Data Factory to store the temporary staging table in a schema of the user database that is being used for this Sink.

Another possible workaround is to scale the database up to the Premium tier. You can also have ADF scale up the database prior to the ETL process and scale it back down when finished.
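
Because the scale operation is asynchronous, a pipeline that scales the database before the ETL step should wait until the new tier is actually in effect. A hedged sketch using the standard `DATABASEPROPERTYEX` function (`MyDb` is a placeholder database name):

```sql
-- Poll until the requested service objective is in effect;
-- an ADF Until activity wrapping a Lookup that runs this query works well.
SELECT DATABASEPROPERTYEX('MyDb', 'ServiceObjective') AS current_objective,
       DATABASEPROPERTYEX('MyDb', 'Edition')          AS current_edition;
```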


JogalekarMandar-9214 commented:

I did uncheck the "Use TempDB" option and instead provided my own schema. I can also see ADF creating the temporary table there, but I still see the same error during the sink. Does ADF use tempdb for any purpose other than what is configurable?

JogalekarMandar-9214 answered

In case anyone faces this issue: the only solution that worked for me was to increase my DB tier to a compute-heavy one at a significant price.

I used the Business Critical tier of Azure SQL with 4 vCores.


BalasubramanianNNagarajan-4131 answered

I am also facing a similar issue in a data flow. As per the documentation, Data Factory uses a global temporary table to store data as part of the loading process. You can alternatively uncheck the "Use TempDB" option on the Sink transformation and instead ask Data Factory to store the temporary staging table in the user database that is being used for this Sink.

I also unchecked the tempdb option in the sink and specified the schema TMP. I can see the temp table being created in the TMP schema, and I monitored the tempdb usage on the server. The query does an upsert of 20 million rows into an existing table of 10 million rows.

  1. Why is tempDB space used even though I specified a user schema?

  2. I upgraded the SQL DB from Hyperscale Gen5 2 vCores to Hyperscale Gen5 16 vCores, and it still consumed all 512 GB. How do I find the right vCore size for the query that is running?
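
On question 1: even when the staging table lives in a user schema, the engine can still spill sorts, hash joins, and version-store activity from a large upsert into tempdb, so tempdb pressure does not necessarily disappear. To see which category is consuming the space, a diagnostic query against the standard `sys.dm_db_file_space_usage` DMV can help (a sketch; run it while the data flow is active):

```sql
-- Breakdown of tempdb consumption: user objects (e.g. global temp tables),
-- internal objects (sort/hash spills), version store, and remaining free space.
SELECT SUM(user_object_reserved_page_count)     * 8 / 1024 AS user_objects_mb,
       SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb,
       SUM(version_store_reserved_page_count)   * 8 / 1024 AS version_store_mb,
       SUM(unallocated_extent_page_count)       * 8 / 1024 AS free_space_mb
FROM tempdb.sys.dm_db_file_space_usage;
```

If `internal_objects_mb` dominates, the space is going to spills from the upsert itself rather than to the staging table, which would explain why moving the staging table to a user schema did not help.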

@PoelvanderRERon-6976
@MarkKromer-MSFT


