
YaroslawKhomenkoC-2238 asked YaroslawKhomenkoC-2238 commented

Connect pipeline to established Spark Pool

Is there a way to speed up a pipeline that executes a notebook on a Spark pool?
The pipeline itself runs for 30-35 seconds, but the cluster start takes around 3 minutes, which is totally inefficient.

For example, in Databricks jobs you can execute a job on a cluster that is already running, so you don't need to wait for the cluster to start. How can I achieve the same in Synapse?

Thanks in advance,
Yaro

azure-synapse-analytics

1 Answer

SamaraSoucy-MSFT answered YaroslawKhomenkoC-2238 commented

I'm guessing you are referring to the idle pools feature in Databricks. In Synapse, you can prevent the Spark pool from being deallocated by turning off auto-shutdown and paying the related costs, but you currently can't avoid creating a fresh cluster for a given job.

We always take requests like this from the forums back to the product teams, but you can also put a feature request on feedback.azure.com since the vote system there helps with feature prioritization.
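
As a rough sketch of the auto-shutdown option mentioned above: assuming it corresponds to the Spark pool's auto-pause setting, and assuming the Azure CLI's `az synapse spark pool update` command with its `--enable-auto-pause` flag is available in your environment, the snippet below only builds and prints the command so you can review it before running it. The pool, workspace, and resource group names are hypothetical placeholders. Note that this keeps the pool from pausing (at extra cost); it does not let a pipeline attach to an already running cluster.

```python
import shlex


def build_disable_auto_pause_command(pool_name, workspace_name, resource_group):
    """Build the Azure CLI argument list that turns off auto-pause on a Spark pool.

    Assumption: 'auto-shutdown' in the answer maps to the pool's auto-pause
    setting, exposed by the CLI as --enable-auto-pause.
    """
    return [
        "az", "synapse", "spark", "pool", "update",
        "--name", pool_name,
        "--workspace-name", workspace_name,
        "--resource-group", resource_group,
        "--enable-auto-pause", "false",
    ]


if __name__ == "__main__":
    # Hypothetical names; replace with your own pool, workspace, and resource group.
    cmd = build_disable_auto_pause_command("mysparkpool", "myworkspace", "my-rg")
    # Print the command for review instead of executing it.
    print(shlex.join(cmd))
```

To actually apply the change, you could paste the printed command into a shell where you are logged in with the Azure CLI.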



Thanks for the reply. I will look into creating a request on the feedback portal.
