Exercise - Train and evaluate a classification model
Data scientists generally use specialized machine learning frameworks to train and evaluate models, including classification models.
One of the most commonly used machine learning frameworks for Python is scikit-learn. In this hands-on exercise, you'll use it to train and evaluate a classification model.
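As a rough preview of the train-and-evaluate workflow, here is a minimal sketch using scikit-learn. It is not the notebook's code: the synthetic dataset from `make_classification`, the choice of `LogisticRegression`, the 70/30 split, and all variable names are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Assumption: a synthetic binary classification dataset stands in for real data.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# Hold back 30% of the rows so the model is evaluated on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y
)

# Train a simple classifier (logistic regression is one choice among many).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate: compare predicted labels with the true labels of the test set.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
cm = confusion_matrix(y_test, predictions)

print("Accuracy:", accuracy)
print("Confusion matrix:")
print(cm)
```

Splitting the data before training matters here: accuracy measured on the training rows alone would tend to overstate how well the model generalizes.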
Before you start
To complete the exercise, you'll need:
- A Microsoft Azure subscription. If you don't already have one, you can sign up for a free trial at https://azure.microsoft.com/free.
- An Azure Machine Learning workspace with a compute instance and the ml-basics repository cloned.
This module is one of many that make use of an Azure Machine Learning workspace. If you are completing this module as part of the Create machine learning models learning path or in preparation for the Azure Data Scientist certification, consider creating the workspace once and reusing it in other modules. After completing the exercise, be sure to follow the Clean Up instructions to stop compute resources, and retain the workspace if you plan to reuse it.
Create an Azure Machine Learning workspace
If you don't already have an Azure Machine Learning workspace in your Azure subscription, follow these steps to create one:
- Sign in to the Azure portal using the Microsoft account associated with your Azure subscription.
- Select ＋Create a resource, search for Machine Learning, and create a new Machine Learning resource with the following settings:
  - Workspace Name: enter a unique name of your choice
  - Subscription: your Azure subscription
  - Resource group: create a new resource group with a unique name
  - Location: choose any available location
- Wait for your workspace resource to be created (it can take a few minutes). Then go to it in the portal, and on the Overview page for your workspace, launch Azure Machine Learning studio (or navigate to https://ml.azure.com), and sign in using your Microsoft account.
- In Azure Machine Learning studio, toggle the ☰ icon at the top left to view the various pages in the interface. You can use these pages to manage the resources in your workspace.
Create a compute instance
To run the notebook used in this exercise, you will need a compute instance in your Azure Machine Learning workspace.
- In Azure Machine Learning studio, view the Compute page for your workspace (under Manage).
- On the Compute Instances tab, if you already have a compute instance, start it; otherwise create a new compute instance with the following settings:
  - Compute name: enter a unique name
  - Virtual Machine type: CPU
  - Virtual Machine size: Standard_DS11_v2
- Wait for the compute instance to start (this may take a minute or so).
Clone the ml-basics repository
The files used in this module (and other related modules) are published in the MicrosoftDocs/ml-basics GitHub repository. If you haven't already done so, use the following steps to clone the repository to your Azure Machine Learning workspace:
- In Azure Machine Learning studio, on the Compute page, view your running compute instance.
- Use the Jupyter link to open Jupyter Notebooks in a new browser tab.
- In the Jupyter page, on the New menu, select Terminal. This will open a new tab with a terminal shell.
- In the terminal shell, run the following commands to change the current directory to the Users directory, and clone the ml-basics repository, which contains the notebook and files you will use in this exercise:
    cd Users
    git clone https://github.com/microsoftdocs/ml-basics
After the commands have completed and the checkout of the files is done, close the terminal tab and view the home page in your Jupyter notebook file explorer. Then open the Users folder; it should contain an ml-basics folder with the files you will use in the rest of this exercise.
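If you'd rather confirm the clone from a notebook cell than from the file explorer, something like the following sketch could work. The helper name `list_notebooks` is hypothetical, and the relative path `Users/ml-basics` assumes the code runs from the Jupyter home directory described above; adjust it if your working directory differs.

```python
from pathlib import Path


def list_notebooks(repo_dir):
    """Return the sorted names of the .ipynb files directly inside repo_dir.

    Returns an empty list if the folder does not exist.
    """
    repo_dir = Path(repo_dir)
    if not repo_dir.is_dir():
        return []
    return sorted(p.name for p in repo_dir.glob("*.ipynb"))


# Assumption: the current directory is the Jupyter home that contains Users.
notebooks = list_notebooks(Path("Users") / "ml-basics")
if notebooks:
    print("Notebooks found:", notebooks)
else:
    print("No notebooks found; check that the clone completed.")
```

An empty result usually means either the clone did not finish or the code is running from a different directory than expected.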
We highly recommend using Jupyter in an Azure Machine Learning workspace for this exercise. This setup ensures that the correct version of Python and the various packages you will need are installed, and after creating the workspace once, you can reuse it in other modules. If you prefer to complete the exercise in a Python environment on your own computer, you can do so. You'll find details for configuring a local development environment that uses Visual Studio Code at Running the labs on your own computer. Be aware that if you choose to do this, the instructions in the exercise may not match your notebook's user interface.
After you've created a Jupyter environment and cloned the ml-basics repository, you're ready to explore classification.
- In Jupyter, open the Classification.ipynb notebook in the ml-basics folder and follow the instructions it contains.
- When you've finished, close and halt all notebooks.
Clean up
If you used a compute instance in an Azure Machine Learning workspace to complete the exercise, use these steps to clean up.
- Close all Jupyter notebooks and the Jupyter home page.
- In Azure Machine Learning studio, on the Compute page, select your compute instance and stop it.
If you don't intend to complete other modules that require the Azure Machine Learning workspace, you can delete the resource group you created for it from your Azure subscription.