Install the Azure Machine Learning SDK for Python

This article describes the different installation options for the SDK.

Default install

Use azureml-core.

pip install azureml-core

Then install any other packages required for your particular job.

Upgrade install

Tip

We recommend that you always keep azureml-core updated to the latest version.

Upgrade a previous version:

pip install --upgrade azureml-core

Check version

Verify your SDK version:

pip show azureml-core

To see all packages in your environment:

pip list

You can also show the SDK version in Python, but this version will not include the minor version.

import azureml.core
print(azureml.core.VERSION)
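
If you want the full version string from Python rather than from pip show, one option is to read it from the installed package metadata. The following is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata). The helper name installed_version is hypothetical and is not part of the SDK.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a distribution, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("azureml-core")
    if version is None:
        print("azureml-core is not installed in this environment")
    else:
        print(f"azureml-core {version}")
```

The helper returns None instead of raising, so you can use it as a quick check before importing azureml.core.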

To learn more about how to configure your development environment for Azure Machine Learning service, see Configure your development environment.

Other azureml packages

The SDK contains many other optional packages that you can install. These include dependencies that aren't required for all use-cases, so they are not included in the default installation in order to avoid bloating the environment. The following table outlines some of these optional packages and their use-cases.

Additional package  Use-case
azureml-accel-models Accelerates deep neural networks on FPGAs with the Azure ML Hardware Accelerated Models Service.
azureml-train-automl Provides classes for building and running automated machine learning experiments. Also installs common data science packages including pandas, numpy, and scikit-learn. Install azureml-train-automl.

If you want to submit automated ML runs on a remote compute and don't need to do any ML locally, we recommend using the thin client package, azureml-train-automl-client, which is part of the azureml-sdk default installation.

See the additional use-case guidance for more information on working with the full automl SDK or its thin client, azureml-train-automl-client.
azureml-contrib Installs azureml-contrib-* packages, which include experimental functionality or preview features.
azureml-datadrift Contains functionality to detect when model training data has drifted from its scoring data.
azureml-interpret Used for model interpretability, including feature and class importance for blackbox and whitebox models.
azureml-widgets Provides support for interactive widgets in a Jupyter notebook environment. You don't need to install it if you aren't running in a Jupyter notebook (for example, if you are building in PyCharm) or if you don't need widgets enabled.
azureml-contrib-services Provides functionality for scoring scripts to request raw HTTP access.
azureml-tensorboard Provides classes and methods for exporting experiment run history and launching TensorBoard for visualizing experiment performance and structure.

For a full list of available packages, see AzureML on PyPI.
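
To see which of these optional packages are already present in an environment, you can filter the installed distributions by name, much like filtering the output of pip list. The following is a minimal sketch, assuming Python 3.8 or later. The function name installed_packages_with_prefix is hypothetical and is not part of the SDK.

```python
from importlib import metadata
from typing import Dict


def installed_packages_with_prefix(prefix: str = "azureml") -> Dict[str, str]:
    """Map distribution name to version for installed packages whose name starts with prefix."""
    found = {}
    for dist in metadata.distributions():
        # Skip distributions whose metadata is missing or has no Name field.
        name = dist.metadata.get("Name") if dist.metadata else None
        if name and name.lower().startswith(prefix.lower()):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    for name, version in sorted(installed_packages_with_prefix().items()):
        print(f"{name}=={version}")
```

An empty result means that no package with that prefix is installed in the current environment.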

Additional use-case guidance

If your use-case is described below, note the guidance and any recommended actions.

Use-case Guidance
Using automl  Install the full azureml-train-automl SDK in a new 64-bit Python environment. A new 64-bit environment is required because of a dependency on the LightGBM framework. This package installs and pins specific versions of data science packages for compatibility, which requires a clean environment.

The thin client package, azureml-train-automl-client, doesn't install additional data science packages or require a clean Python environment. We recommend azureml-train-automl-client if you only need to submit automated ML runs to a remote compute and don't need to submit local runs or download your model locally.
Using Azure Databricks In the Azure Databricks environment, use the library sources detailed in this guide for installing the SDK. Also, see these tips for further information on working with Azure Machine Learning SDK for Python on Azure Databricks.
Using Azure Data Science Virtual Machine Azure Data Science Virtual Machines created after September 27, 2018 come with the Python SDK preinstalled.
Running Azure Machine Learning tutorials or notebooks If you are using an older version of the SDK than the one mentioned in the tutorial or notebook, you should upgrade your SDK. Some functionality in the tutorials and notebooks may require additional Python packages such as matplotlib, scikit-learn, or pandas. Instructions in each tutorial and notebook will show you which packages are required.
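
Because the full azureml-train-automl package needs a 64-bit Python environment, it can help to confirm the bitness of the interpreter before you install. The following is a minimal sketch that uses only the standard library. The function name python_bitness is hypothetical and is not part of the SDK.

```python
import struct
import sys


def python_bitness() -> int:
    """Return the pointer width of the running interpreter in bits (typically 32 or 64)."""
    # struct.calcsize("P") is the size of a C pointer in bytes for this build.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    bits = python_bitness()
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is a {bits}-bit build")
    if bits != 64:
        print("azureml-train-automl needs a 64-bit Python environment")
```

Run the check with the interpreter of the environment you plan to install into, since different environments on the same machine can use different builds.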

Troubleshooting

  • Pip Installation: Dependencies are not guaranteed to be consistent with single-line installation:

    This is a known limitation of pip: it does not have a functioning dependency resolver when you install several packages in a single line. It only looks at the first unique dependency.

    In the following code, azureml-datadrift and azureml-train-automl are both installed using a single-line pip install.

      pip install azureml-datadrift azureml-train-automl
    

    For this example, let's say azureml-datadrift requires version > 1.0 and azureml-train-automl requires version < 1.2. If the latest version of azureml-datadrift is 1.3, then both packages get upgraded to 1.3, regardless of the azureml-train-automl package requirement for an older version.

    To ensure that the appropriate versions are installed for your packages, install them on separate lines, as in the following code. Order doesn't matter here, because pip explicitly downgrades packages as part of the next call, so the appropriate version dependencies are applied.

       pip install azureml-datadrift
       pip install azureml-train-automl 
    
  • Explanation package not guaranteed to be installed when installing the azureml-train-automl-client:

    When running a remote AutoML run with model explanation enabled, you will see the error message "Please install azureml-explain-model package for model explanations." This is a known issue. As a workaround, follow one of the steps below:

    1. Install azureml-explain-model locally.
        pip install azureml-explain-model
    
    2. Disable the explainability feature entirely by passing model_explainability=False in the AutoML configuration.
        from azureml.train.automl import AutoMLConfig

        # compute_target, aml_run_config, prepped_data, and automl_settings
        # are assumed to be defined earlier in your script.
        automl_config = AutoMLConfig(task='classification',
                                     path='.',
                                     debug_log='automated_ml_errors.log',
                                     compute_target=compute_target,
                                     run_configuration=aml_run_config,
                                     featurization='auto',
                                     model_explainability=False,
                                     training_data=prepped_data,
                                     label_column_name='Survived',
                                     **automl_settings)
    
  • Pandas errors: Typically seen during an AutoML experiment:

    When manually setting up your environment using pip, you may notice errors (especially from pandas) due to unsupported package versions being installed.

    For example, ModuleNotFoundError: No module named 'pandas.core.internals.managers'; 'pandas.core.internals' is not a package

    To prevent such errors, install the AutoML SDK by using automl_setup.cmd:

    1. Open an Anaconda prompt and clone the GitHub repository for a set of sample notebooks.
    git clone https://github.com/Azure/MachineLearningNotebooks.git
    
    2. cd to the how-to-use-azureml/automated-machine-learning folder where the sample notebooks were cloned, and then run:
    automl_setup
    
  • KeyError: 'brand' when running AutoML on local compute or Azure Databricks cluster

    If a new environment was created after June 10, 2020, by using SDK 1.7.0 or earlier, training might fail with this error due to an update in the py-cpuinfo package. (Environments created on or before June 10, 2020, are unaffected, as are experiments run on remote compute because cached training images are used.) To work around this issue, take either of the following two steps:

    • Update the SDK version to 1.8.0 or later (this also downgrades py-cpuinfo to 5.0.0):

      pip install --upgrade azureml-sdk[automl]
      
    • Downgrade the installed version of py-cpuinfo to 5.0.0:

      pip install py-cpuinfo==5.0.0
      
  • Error message: Cannot uninstall 'PyYAML'

    Azure Machine Learning SDK for Python: PyYAML is a distutils installed project. Therefore, we cannot accurately determine which files belong to it if there is a partial uninstall. To continue installing the SDK while ignoring this error, use:

    pip install --upgrade azureml-sdk[notebooks,automl] --ignore-installed PyYAML
    
  • Azure Machine Learning SDK installation failing with an exception: ModuleNotFoundError: No module named 'ruamel' or 'ImportError: No module named ruamel.yaml'

    This issue occurs when you install the Azure Machine Learning SDK for Python with the latest pip (>20.1.1) in the conda base environment. It affects all released versions of the SDK. Use one of the following workarounds:

    • Avoid installing the Python SDK in the conda base environment. Instead, create your own conda environment and install the SDK in that newly created user environment. The latest pip should work in the new conda environment.

    • When you create images in Docker, where you cannot switch away from the conda base environment, pin pip<=20.1.1 in the Dockerfile.

    conda install -c r -y conda python=3.6.2 pip=20.1.1
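
For the single-line pip installation item at the top of this list, the one-package-per-call approach can also be scripted when you set up environments automatically. The following is a minimal sketch, not an official tool. The function names build_install_commands and install_one_at_a_time are hypothetical, and dry_run defaults to True so that nothing is installed unless you opt in.

```python
import subprocess
import sys
from typing import List, Sequence


def build_install_commands(packages: Sequence[str]) -> List[List[str]]:
    """Build one 'pip install' command per package, using the current interpreter's pip."""
    return [[sys.executable, "-m", "pip", "install", pkg] for pkg in packages]


def install_one_at_a_time(packages: Sequence[str], dry_run: bool = True) -> List[List[str]]:
    """Run each install as a separate pip call; with dry_run=True, only return the commands."""
    commands = build_install_commands(packages)
    if not dry_run:
        for command in commands:
            # check=True stops at the first failing install instead of continuing silently.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in install_one_at_a_time(["azureml-datadrift", "azureml-train-automl"]):
        print(" ".join(command))
```

Calling pip through sys.executable keeps the installs in the same environment as the running interpreter, which avoids accidentally installing into a different Python.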
    

Next steps

Try these next steps to learn how to use the Azure Machine Learning service SDK for Python:

  1. Read the overview to learn about key classes and design patterns with code samples.
  2. Follow this tutorial to begin creating experiments and models.