question

LeeHarper-5286 asked YutongTie-MSFT commented

Azure ML designer: force pipeline to execute with only an underlying data change

We are deploying an ML model through the Azure ML designer. Over time the underlying data changes and so the model needs to be regularly retrained. The actual designer pipeline and the dataset definition (a query on a SQL database) are not changed, only the underlying data in the Azure SQL database.

Right now, the pipeline API can be triggered, but the pipeline does not actually execute (as expected, since nothing in its definition has changed). This is equivalent to the default allow_reuse = True in the Azure ML SDK. Is there a way to disable this setting (or set it to False) in the designer, so that triggering the API forces the pipeline to re-execute every time we want to retrain (e.g. once a week) as new data comes in, and a new model version is generated on each run?
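To make the behaviour concrete, here is a toy model of step-output reuse. It is only a sketch of the general idea and not Azure ML's actual implementation: the class ToyPipelineRunner, the function step_fingerprint, and the allow_reuse flag on run_step are all hypothetical names used for illustration. The point is that the reuse key is built from the step and dataset definition (the SQL query text), not from the rows the query returns, so a data-only change looks like "nothing changed".

```python
import hashlib
import json


def step_fingerprint(step_name, params, dataset_definition):
    """Hash the step's definition only; the rows behind the query are never read."""
    payload = json.dumps(
        {"step": step_name, "params": params, "dataset": dataset_definition},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToyPipelineRunner:
    """Hypothetical runner that caches step outputs by definition fingerprint."""

    def __init__(self):
        self._cache = {}
        self.executions = 0  # counts how often a step really ran

    def run_step(self, step_name, params, dataset_definition, train_fn, allow_reuse=True):
        key = step_fingerprint(step_name, params, dataset_definition)
        if allow_reuse and key in self._cache:
            # Definition unchanged: return the previous output without running.
            return self._cache[key]
        self.executions += 1
        result = train_fn()
        self._cache[key] = result
        return result


if __name__ == "__main__":
    runner = ToyPipelineRunner()
    rows = [1, 2, 3]                      # stands in for the SQL table contents
    query = "SELECT * FROM training_data"  # hypothetical dataset definition

    print(runner.run_step("train", {}, query, lambda: sum(rows)))
    rows.append(4)                        # underlying data changes, query does not
    print(runner.run_step("train", {}, query, lambda: sum(rows)))
    print(runner.run_step("train", {}, query, lambda: sum(rows), allow_reuse=False))
```

In this sketch the second call returns the stale result because the fingerprint is identical, while the third call, with reuse switched off, retrains on the new rows. That is the behaviour I am trying to get from the designer.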

To be clear, the training takes around 20 minutes, and the compute cluster it runs on has a 120-second scale-down time, so cost considerations (i.e. the reason this feature is enabled by default) are not a concern.

Thanks in advance for any help.

azure-machine-learning

I've seen a way of doing this with the Import Data module's "regenerate output" setting, but I would much rather use datasets so we can keep the SQL queries versioned if possible.


1 Answer

YutongTie-MSFT answered YutongTie-MSFT commented

Hello,

Use the following steps to update a module pipeline parameter:

At the top of the canvas, select the gear icon.
In the Pipeline parameters section, you can view and update the name and default value of each of your pipeline parameters.
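Once a pipeline parameter exists, its value can be overridden each time the published pipeline is triggered. The sketch below only builds the JSON body for such a trigger request and assumes the published-pipeline REST body uses the "ExperimentName" and "ParameterAssignments" keys. The parameter name retrain_stamp and the helper build_trigger_payload are hypothetical, and this thread does not confirm that changing a parameter value alone forces every module to re-run; that likely depends on a module actually consuming the parameter.

```python
import json
from datetime import datetime, timezone


def build_trigger_payload(experiment_name, run_stamp_param="retrain_stamp", now=None):
    """Build a trigger body whose parameter value differs on every call.

    run_stamp_param is a hypothetical pipeline parameter name; it must match
    a parameter defined in the designer's Pipeline parameters section.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "ExperimentName": experiment_name,
        "ParameterAssignments": {run_stamp_param: now.strftime("%Y%m%dT%H%M%SZ")},
    }


if __name__ == "__main__":
    body = build_trigger_payload("weekly-retraining")
    # Send this as the JSON body of a POST to the published pipeline's REST
    # endpoint, with an Azure AD bearer token in the Authorization header.
    print(json.dumps(body, indent=2))
```

Using a timestamp means no two scheduled triggers submit an identical set of parameter values.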

[Image: 84664-image.png]

Hope this helps. Thanks.

Regards,
Yutong



Thanks for the response. I added this in, but the pipeline is still short-circuiting; the parameter didn't seem to have any effect. Is there a setting in one of the modules that I also need to change so that the pipeline behaves like allow_reuse = False in the SDK?


Hello Lee,

One of my customers had the same issue as you. The workaround for him was to recreate the pipeline with the parameters, after which it worked. Could you please give that a try?

https://docs.microsoft.com/en-us/answers/questions/344626/index.html

Regards,
Yutong
