Use pipeline parameters to retrain models in the designer
In this how-to article, you learn how to use Azure Machine Learning designer to retrain a machine learning model using pipeline parameters. You use published pipelines to automate your workflow and set parameters to train your model on new data. Pipeline parameters let you reuse existing pipelines for different jobs.
In this article, you learn how to:
- Train a machine learning model.
- Create a pipeline parameter.
- Publish your training pipeline.
- Retrain your model with new parameters.
Prerequisites
- An Azure Machine Learning workspace
- Complete part 1 of this how-to series, Transform data in the designer
If you do not see graphical elements mentioned in this document, such as buttons in studio or designer, you may not have the right level of permissions to the workspace. Please contact your Azure subscription administrator to verify that you have been granted the correct level of access. For more information, see Manage users and roles.
This article also assumes that you have some knowledge of building pipelines in the designer. For a guided introduction, complete the tutorial.
The pipeline used in this article is an altered version of the Income prediction sample pipeline on the designer homepage. It uses the Import Data module instead of the sample dataset to show you how to train models using your own data.
Create a pipeline parameter
Pipeline parameters let you build versatile pipelines that can be resubmitted later with different parameter values. Common scenarios include updating datasets or changing hyperparameters for retraining. Create pipeline parameters to set variables dynamically at runtime.
Pipeline parameters can be added to data source or module parameters in a pipeline. When the pipeline is resubmitted, the values of these parameters can be specified.
For this example, you will change the training data path from a fixed value to a parameter, so that you can retrain your model on different data. You can also add other module parameters as pipeline parameters according to your use case.
Select the Import Data module.
This example uses the Import Data module to access data in a registered datastore. However, you can follow similar steps if you use alternative data access patterns.
In the module detail pane, to the right of the canvas, select your data source.
Enter the path to your data. You can also select Browse path to browse your file tree.
Hover over the Path field, and select the ellipsis that appears above it.
Select Add to pipeline parameter.
Provide a parameter name and a default value.
You can also detach a module parameter from a pipeline parameter in the module detail pane, in the same way that you add one.
You can inspect and edit your pipeline parameters by selecting the Settings gear icon next to the title of your pipeline draft.
- After detaching a pipeline parameter, you can delete it in the Settings pane.
- You can also add a pipeline parameter in the Settings pane, and then apply it to a module parameter.
Submit the pipeline run.
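Conceptually, a pipeline parameter is a name with a default value that a later submission can override. The following Python sketch models that behavior in plain code; it is not the designer's implementation or an Azure Machine Learning API. The `PipelineParam` class, the `resolve_parameters` function, the `training_data_path` name, and both file paths are hypothetical, and rejecting unknown parameter names is a choice made for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class PipelineParam:
    """Hypothetical stand-in for a pipeline parameter: a name plus a default."""
    name: str
    default_value: Any


def resolve_parameters(
    declared: List[PipelineParam],
    overrides: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the values a run would use: each default, replaced by any override."""
    overrides = overrides or {}
    unknown = set(overrides) - {p.name for p in declared}
    if unknown:
        # Sketch-only choice: fail loudly on names that were never declared.
        raise KeyError(f"Unknown pipeline parameter(s): {sorted(unknown)}")
    return {p.name: overrides.get(p.name, p.default_value) for p in declared}


# Hypothetical parameter attached to the Import Data path.
declared = [PipelineParam("training_data_path", "data/us-income.csv")]

# First submission: no overrides, so the default path is used.
first_run = resolve_parameters(declared)

# Resubmission: the same pipeline, pointed at different data.
retrain_run = resolve_parameters(
    declared, {"training_data_path": "data/non-us-income.csv"}
)
```

The pipeline definition stays the same across both runs; only the parameter value changes, which is what makes the pipeline reusable for retraining.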
Publish a training pipeline
Publish a pipeline to a pipeline endpoint so that you can easily reuse it later. A pipeline endpoint exposes a REST endpoint that you can call to invoke the pipeline. In this example, your pipeline endpoint lets you reuse your pipeline to retrain a model on different data.
Select Publish above the designer canvas.
Select or create a pipeline endpoint.
You can publish multiple pipelines to a single endpoint. Each pipeline in an endpoint is assigned a version number, which you can specify when you call the pipeline endpoint.
Retrain your model
Now that you have a published training pipeline, you can use it to retrain your model on new data. You can submit runs to a pipeline endpoint from the studio workspace or programmatically.
Submit runs by using the studio portal
Use the following steps to submit a parameterized pipeline endpoint run from the studio portal:
- Go to the Endpoints page in your studio workspace.
- Select the Pipeline endpoints tab. Then, select your pipeline endpoint.
- Select the Published pipelines tab. Then, select the pipeline version that you want to run.
- Select Submit.
- In the setup dialog box, specify the parameter values for the run. For this example, update the data path to train your model using a non-US dataset.
Submit runs by using code
You can find the REST endpoint of a published pipeline in the overview panel. By calling the endpoint, you can rerun the published pipeline to retrain your model.
To make a REST call, you need an OAuth 2.0 bearer-type authentication header. For information about setting up authentication to your workspace and making a parameterized REST call, see Build an Azure Machine Learning pipeline for batch scoring.
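As a rough illustration of what a parameterized REST call might look like, the following Python sketch builds the POST request with the standard library only. The `build_retrain_request` helper, the endpoint URL, the token placeholder, the experiment name, and the `training_data_path` parameter are all hypothetical. The `ExperimentName` and `ParameterAssignments` body fields follow the pattern shown in Azure Machine Learning SDK v1 documentation for published pipelines; treat them as an assumption and check them against the referenced article before relying on them.

```python
import json
import urllib.request
from typing import Any, Dict


def build_retrain_request(
    endpoint_url: str,
    access_token: str,
    experiment_name: str,
    parameters: Dict[str, Any],
) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits a pipeline run."""
    body = {
        "ExperimentName": experiment_name,   # assumed field name
        "ParameterAssignments": parameters,  # assumed field name
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # OAuth 2.0 bearer token obtained separately.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_retrain_request(
    endpoint_url="https://example.invalid/pipeline-endpoint",  # placeholder URL
    access_token="<access-token>",                             # placeholder token
    experiment_name="retrain-income-model",                    # hypothetical name
    parameters={"training_data_path": "data/non-us-income.csv"},
)

# To actually submit the run, replace the placeholders and send the request:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the headers and body before the call reaches your workspace.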
In this article, you learned how to create a parameterized training pipeline endpoint using the designer.
For a complete walkthrough of how you can deploy a model to make predictions, see the designer tutorial to train and deploy a regression model.
To learn how to publish a pipeline and submit a run to a pipeline endpoint using the SDK, see this article.