question

1 Vote
rt-0367 asked MartinJaffer-MSFT commented

Controlling the queuing of pipeline in Azure Data Factory

Hello,

I was looking for a solution for queuing runs of a pipeline when a run with the same set of parameters is already in progress.

For example, if I have a generic pipeline with one parameter, I would only want concurrent runs when the parameter values differ between the runs.
If two requests were made to run the pipeline, first with parameter "A" and then with parameter "B", I would want these to execute in parallel.
However, if two requests were made with the same parameter "A", I would want Data Factory to queue the second request until the first one finishes.

One option would be to have a separate pipeline for each parameter value. Concurrency could then be controlled at the pipeline level; however, this would lead to a large number of pipelines to create, which we did not want to manage. Is there another alternative to control the queuing of pipelines?

Thanks.

azure-data-factory


1 Answer

0 Votes
MartinJaffer-MSFT answered MartinJaffer-MSFT commented

Hello @rt-0367 and welcome to Microsoft Q&A. Thank you for your well-considered question.

At this time, there is no feature to control pipeline concurrency with respect to parameter values.

Other than the option you outlined, I can think of one more possibility: a sort of home-brew custom scheduler. The details depend upon whether you are starting pipelines manually or via triggers.

If you are triggering manually, you may want to build an application outside of Data Factory. This application would receive requests for pipeline runs and query the Data Factory service to either get the status of existing runs or start a new run. It could use the REST API, PowerShell, or another SDK/API/CLI. This approach has an advantage over the next option: it can provide feedback to the user.
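As a minimal sketch of such an external scheduler's decision logic (the function name and the simplified run-record shape are assumptions; in practice the list of in-progress runs would come from the Data Factory REST API's queryPipelineRuns operation):

```python
def is_duplicate_in_progress(requested_params, in_progress_runs):
    """Return True if any in-progress run was started with the same parameters.

    requested_params: dict of parameter name -> value for the new request.
    in_progress_runs: list of run records, each with a "parameters" key
    (a simplified shape of what queryPipelineRuns returns).
    """
    return any(run.get("parameters") == requested_params
               for run in in_progress_runs)


# Example: one run with parameter value "A" is already in progress.
running = [{"runId": "1111", "status": "InProgress",
            "parameters": {"source": "A"}}]

# A request with parameter "B" may start immediately...
print(is_duplicate_in_progress({"source": "B"}, running))  # False -> start it
# ...while a second request with parameter "A" should be queued.
print(is_duplicate_in_progress({"source": "A"}, running))  # True -> queue it
```

The scheduler would call this check before invoking createRun, and either start the run or hold the request until the conflicting run finishes.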

While roundabout, it is possible to use the ADF Web activity to query the ADF service and get the list of current or past pipeline runs. Suppose we have a pipeline that takes as input the parameters to start another pipeline with. This pipeline would query the service, check for in-progress runs, and compare parameters. Depending upon the result, it could trigger the desired pipeline and pass the parameters, wait and check again later, or halt.
Your triggers would point to this "proxy" pipeline. This would necessitate one "proxy" pipeline for every "business" pipeline. I hope this increase would be smaller than the permutation of parameter values.
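For reference, the Web activity in the "proxy" pipeline could POST to the queryPipelineRuns endpoint to find in-progress runs of the business pipeline. A sketch of the request, with placeholder subscription, resource group, factory, and pipeline names:

```json
POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroup}/providers/Microsoft.DataFactory/factories/{factoryName}/queryPipelineRuns?api-version=2018-06-01

{
  "lastUpdatedAfter": "2021-01-01T00:00:00Z",
  "lastUpdatedBefore": "2021-01-02T00:00:00Z",
  "filters": [
    { "operand": "PipelineName", "operator": "Equals", "values": ["MyBusinessPipeline"] },
    { "operand": "Status", "operator": "Equals", "values": ["InProgress"] }
  ]
}
```

The proxy pipeline would then compare the parameters of the returned runs against its own input parameters before deciding whether to invoke the business pipeline.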

Did I communicate effectively?



Yes, thank you, this was very helpful.


Thank you for letting me know, @97171780 / @rt-0367. If this answered your question, please mark it as the accepted answer. Let me know if you have follow-up questions or need more information.

Also, you can share this product feedback in the feedback forum.

