Debasish22-0229 asked PRADEEPCHEEKATLA-MSFT commented

Azure Databricks Spark not performing

Hi All,

My requirement is to process approximately 1 TB of data stored in an Azure container. The container holds millions of multi-part JSON files.

For this I was using HDInsight, which processed the data in approximately 45 minutes with this configuration:

Worker nodes (1-4, autoscale): 16 cores, 112 GB
Head nodes (2): 4 cores, 28 GB

We then planned to migrate to an Azure Databricks Spark cluster.

Configuration of the cluster used:

Worker nodes (4-10, autoscale): 8 cores, 56 GB, memory optimized
Head nodes: 4 cores, 28 GB

However, the job ran for more than 2.5 hours without completing. I can see that it used 4 worker nodes at maximum utilization, but it did not scale up to use the remaining worker nodes to speed up the process.

Can anyone tell me whether I am doing something wrong here?
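
For a rough sense of scale, the node sizes quoted in the question can be multiplied out. The sketch below is only arithmetic on those figures. It assumes the listed cores and memory are per worker node and that HDInsight ran at its 4-worker autoscale maximum, neither of which the post states explicitly. The `WorkerPool` class is a hypothetical helper for illustration, not part of any Spark or Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerPool:
    """Hypothetical helper: a group of identical worker nodes."""
    name: str
    workers: int
    cores_per_worker: int      # assumed to be a per-node figure
    memory_gb_per_worker: int  # assumed to be a per-node figure

    @property
    def total_cores(self) -> int:
        return self.workers * self.cores_per_worker

    @property
    def total_memory_gb(self) -> int:
        return self.workers * self.memory_gb_per_worker


# Figures taken from the question; the HDInsight worker count (4) is an
# assumption, since the post only gives the autoscale range 1-4.
hdinsight_max = WorkerPool("HDInsight, 4 workers (assumed)", 4, 16, 112)
databricks_observed = WorkerPool("Databricks, 4 workers (observed)", 4, 8, 56)
databricks_max = WorkerPool("Databricks, 10 workers (autoscale max)", 10, 8, 56)

for pool in (hdinsight_max, databricks_observed, databricks_max):
    print(f"{pool.name}: {pool.total_cores} cores, {pool.total_memory_gb} GB")
```

Under those assumptions, four Databricks workers provide half the cores and half the memory of four HDInsight workers, so some slowdown would be expected while the cluster stays at its minimum size. This does not by itself explain why autoscaling did not add nodes.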


Hello @Debasish22-0229,

Welcome to the MS Q&A platform.

This issue looks strange. For a deeper investigation and immediate assistance, you may file a support ticket if you have a support plan.


Hello @Debasish22-0229,

Did you get a chance to open a support ticket for this issue?


0 Answers