Tutorial: Get started with a Python script in Azure Machine Learning (part 1 of 3)
In this tutorial, you run your first Python script in the cloud with Azure Machine Learning. This tutorial is part 1 of a three-part tutorial series.
This tutorial avoids the complexity of training a machine learning model. You will run a "Hello World" Python script in the cloud. You will learn how a control script is used to configure and create a run in Azure Machine Learning.
In this tutorial, you will:
- Create and run a "Hello world!" Python script.
- Create a Python control script to submit "Hello world!" to Azure Machine Learning.
- Understand the Azure Machine Learning concepts in the control script.
- Submit and run the "Hello world!" script.
- View your code output in the cloud.
Prerequisites
- Complete Quickstart: Set up your workspace to get started with Azure Machine Learning to create a workspace, compute instance, and compute cluster to use in this tutorial series.
Create and run a Python script
This tutorial will use the compute instance as your development computer. First create a few folders and the script:
- Sign in to the Azure Machine Learning studio and select your workspace if prompted.
- On the left, select Notebooks.
- In the Files toolbar, select +, then select Create new folder.
- Name the folder get-started.
- To the right of the folder name, use the ... to create another folder under get-started.
- Name the new folder src. Use the Edit location link if the file location is not correct.
- To the right of the src folder, use the ... to create a new file in the src folder.
- Name your file hello.py. Switch the File type to Python (.py).
Copy this code into your file:
# src/hello.py
print("Hello world!")
Your project folder structure now looks like this: a get-started folder that contains a src folder, which in turn contains hello.py.
Test your script
You can run your code locally, which in this case means on the compute instance. Running code locally has the benefit of interactive debugging of code.
If you have previously stopped your compute instance, start it now with the Start compute tool to the right of the compute dropdown. Wait about a minute for the state to change to Running.
Select Save and run script in terminal to run the script.
You'll see the output of the script in the terminal window that opens. Close the tab and select Terminate to close the session.
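If you prefer to script these steps rather than use the studio UI, the folder layout and the local test can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original tutorial: the helper names `create_project` and `run_locally` are hypothetical, and a temporary directory stands in for your user folder on the compute instance.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def create_project(root: Path) -> Path:
    """Create get-started/src/hello.py under root and return the script path."""
    src_dir = root / "get-started" / "src"
    src_dir.mkdir(parents=True, exist_ok=True)
    script = src_dir / "hello.py"
    script.write_text('# src/hello.py\nprint("Hello world!")\n')
    return script


def run_locally(script: Path) -> str:
    """Run the script with the current interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with an error
    )
    return result.stdout


# A temporary directory stands in for your user folder on the compute instance.
with tempfile.TemporaryDirectory() as tmp:
    hello_script = create_project(Path(tmp))
    local_output = run_locally(hello_script)
    print(local_output, end="")
```

Running the script through `subprocess` mirrors what Save and run script in terminal does: the code executes on the machine you are working on, not on the compute cluster.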
Create a control script
A control script allows you to run your hello.py script on different compute resources. You use the control script to control how and where your machine learning code is run.
Select the ... at the end of the get-started folder to create a new file. Create a Python file called run-hello.py and copy/paste the following code into that file:
# get-started/run-hello.py
from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig

ws = Workspace.from_config()
experiment = Experiment(workspace=ws, name='day1-experiment-hello')

config = ScriptRunConfig(source_directory='./src', script='hello.py', compute_target='cpu-cluster')

run = experiment.submit(config)
aml_url = run.get_portal_url()
print(aml_url)
If you used a different name when you created your compute cluster, make sure to adjust the name in the code compute_target='cpu-cluster' as well.
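One way to avoid editing the control script each time the cluster name changes is to read the name from an environment variable and fall back to the tutorial's default. The sketch below is an illustration only: `AML_COMPUTE_TARGET`, `resolve_compute_target`, and `build_config` are hypothetical names introduced here, not part of the Azure Machine Learning SDK.

```python
import os

DEFAULT_COMPUTE_TARGET = "cpu-cluster"  # cluster name used in this tutorial


def resolve_compute_target(env=None) -> str:
    """Return the compute cluster name, preferring AML_COMPUTE_TARGET if set.

    AML_COMPUTE_TARGET is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    name = env.get("AML_COMPUTE_TARGET", "").strip()
    return name or DEFAULT_COMPUTE_TARGET


def build_config(source_directory="./src", script="hello.py"):
    """Build the same ScriptRunConfig as run-hello.py with the resolved name."""
    # Imported here so the helper above can be used without the SDK installed.
    from azureml.core import ScriptRunConfig

    return ScriptRunConfig(
        source_directory=source_directory,
        script=script,
        compute_target=resolve_compute_target(),
    )
```

In run-hello.py you could then replace the `ScriptRunConfig(...)` call with `config = build_config()`. Whichever name is resolved must still match a compute cluster that exists in your workspace.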
Understand the code
Here's a description of how the control script works:
ws = Workspace.from_config()
Workspace connects to your Azure Machine Learning workspace, so that you can communicate with your Azure Machine Learning resources.
experiment = Experiment( ... )
Experiment provides a simple way to organize multiple runs under a single name. Later you can see how experiments make it easy to compare metrics between dozens of runs.
config = ScriptRunConfig( ... )
ScriptRunConfig wraps your hello.py code and passes it to your workspace. As the name suggests, you can use this class to configure how you want your script to run in Azure Machine Learning. It also specifies what compute target the script will run on. In this code, the target is the compute cluster that you created in the setup tutorial.
run = experiment.submit(config)
Submits your script. This submission is called a run. A run encapsulates a single execution of your code. Use a run to monitor the script progress, capture the output, analyze the results, visualize metrics, and more.
aml_url = run.get_portal_url()
The run object provides a handle on the execution of your code. Monitor its progress from the Azure Machine Learning studio with the URL that's printed from the Python script.
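The control script returns as soon as the run is submitted. If you would rather have the terminal block until the run finishes, the run handle can be used for that. The following is a minimal sketch under the assumption that `experiment` and `config` are the objects created in run-hello.py; `submit_and_wait` is a hypothetical helper name, while `wait_for_completion` and `get_status` are methods of the SDK's run object.

```python
def submit_and_wait(experiment, config):
    """Submit config, print the studio URL, and block until the run ends.

    `experiment` is expected to be an azureml.core.Experiment and `config`
    a ScriptRunConfig, as created in run-hello.py.
    """
    run = experiment.submit(config)
    print("Studio URL:", run.get_portal_url())
    # Streams log output to the terminal until the run finishes.
    run.wait_for_completion(show_output=True)
    return run.get_status()
```

In run-hello.py, the last three lines could be replaced with `print(submit_and_wait(experiment, config))`. The studio link is still printed first, so you can follow the run in the browser while the terminal waits.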
Submit and run your code in the cloud
Select Save and run script in terminal to run your control script, which in turn runs hello.py on the compute cluster that you created in the setup tutorial.
In the terminal, you may be asked to sign in to authenticate. Copy the code and follow the link to complete this step.
Once you're authenticated, you'll see a link in the terminal. Select the link to view the run.
You may see some Failure to load... warnings in the terminal. You can ignore these warnings. Use the link at the bottom of these warnings to view your output.
View the output
- In the page that opens, you'll see the run status.
- When the status of the run is Completed, select Outputs + logs at the top of the page.
- Select 70_driver_log.txt to view the output of your run.
Monitor your code in the cloud in the studio
The output from your script contains a link to the run in the studio.
Follow the link. At first, you'll see a status of Queued or Preparing. The very first run will take 5-10 minutes to complete, because the following steps occur:
- A Docker image is built in the cloud.
- The compute cluster is resized from 0 to 1 node.
- The Docker image is downloaded to the compute.
Subsequent runs are much quicker (~15 seconds) because the Docker image is cached on the compute. You can test this by resubmitting the control script after the first run has completed.
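The status changes described above can also be observed from Python by polling the run handle. This is a rough sketch rather than the tutorial's method: `watch_status` is a hypothetical helper, the polling interval and timeout are arbitrary choices, and the set of terminal states is an assumption based on the status names the SDK reports.

```python
import time

# States after which the run no longer changes (assumed set for this sketch).
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def watch_status(run, poll_seconds=10.0, timeout_seconds=1800.0, sleep=time.sleep):
    """Poll run.get_status() and return the distinct states seen, in order."""
    seen = []
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = run.get_status()
        # Record and print a state only when it differs from the previous one.
        if not seen or seen[-1] != status:
            seen.append(status)
            print("Run status:", status)
        if status in TERMINAL_STATES:
            return seen
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Run still {status} after {timeout_seconds} seconds")
        sleep(poll_seconds)
```

Calling `watch_status(run)` right after `experiment.submit(config)` on a first run should show the run sitting in the early states for several minutes while the image is built and the cluster scales up. On a resubmission, those states should pass much faster.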
Wait about 10 minutes. You'll see a message that the run has completed. Then use Refresh to see the status change to Completed. Once the job completes, go to the Outputs + logs tab. There you can see a 70_driver_log.txt file that looks like this:
1: [2020-08-04T22:15:44.407305] Entering context manager injector.
2: [context_manager_injector.py] Command line Options: Namespace(inject=['ProjectPythonPath:context_managers.ProjectPythonPath', 'RunHistory:context_managers.RunHistory', 'TrackUserError:context_managers.TrackUserError', 'UserExceptions:context_managers.UserExceptions'], invocation=['hello.py'])
3: Starting the daemon thread to refresh tokens in background for process with pid = 31263
4: Entering Run History Context Manager.
5: Preparing to call script [ hello.py ] with arguments:
6: After variable expansion, calling script [ hello.py ] with arguments:
7:
8: Hello world!
9: Starting the daemon thread to refresh tokens in background for process with pid = 31263
10:
11:
12: The experiment completed successfully. Finalizing run...
13: Logging experiment finalizing status in history service.
14: [2020-08-04T22:15:46.541334] TimeoutHandler __init__
15: [2020-08-04T22:15:46.541396] TimeoutHandler __enter__
16: Cleaning up all outstanding Run operations, waiting 300.0 seconds
17: 1 items cleaning up...
18: Cleanup took 0.1812913417816162 seconds
19: [2020-08-04T22:15:47.040203] TimeoutHandler __exit__
On line 8, you see the "Hello world!" output.
The 70_driver_log.txt file contains the standard output from a run. This file can be useful when you're debugging remote runs in the cloud.
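When a driver log grows beyond a few lines, it can help to search it programmatically instead of scrolling in the studio. The sketch below is one possible approach: `find_in_log` and `download_driver_log` are hypothetical helpers, and the remote path `azureml-logs/70_driver_log.txt` is an assumption about where the file is stored for a run (`run.get_file_names()` lists the files a run actually produced). The sample text is a shortened stand-in for a real log, so its line numbers differ from the listing above.

```python
def find_in_log(log_text: str, needle: str) -> list:
    """Return 1-based line numbers of log lines that contain needle."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if needle in line
    ]


def download_driver_log(run, local_path="70_driver_log.txt") -> str:
    """Download the driver log from a finished run and return its text.

    The remote path is an assumption; run.get_file_names() lists the
    files that a given run actually produced.
    """
    run.download_file(
        name="azureml-logs/70_driver_log.txt",
        output_file_path=local_path,
    )
    with open(local_path, encoding="utf-8") as handle:
        return handle.read()


# Shortened stand-in for a real driver log, without the viewer's line prefixes.
sample_log = "\n".join([
    "Entering Run History Context Manager.",
    "Preparing to call script [ hello.py ] with arguments:",
    "",
    "Hello world!",
    "",
    "The experiment completed successfully. Finalizing run...",
])
print(find_in_log(sample_log, "Hello world!"))
```

With a real run, `find_in_log(download_driver_log(run), "Hello world!")` would locate the script's output in the downloaded file.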
In this tutorial, you took a simple "Hello world!" script and ran it on Azure. You saw how to connect to your Azure Machine Learning workspace, create an experiment, and submit your hello.py code to the cloud.
In the next tutorial, you build on these learnings by running something more interesting than a "Hello world!" script.
If you want to finish the tutorial series here and not progress to the next step, remember to clean up your resources.