Track model development

Model development is iterative, and it can be hard to keep track of your work as you develop and optimize a model. In Azure Databricks, you can use MLflow tracking to record that process, including the parameter settings or combinations you have tried and how they affected the model's performance.

MLflow tracking uses experiments and runs to log and track your model development. A run is a single execution of model code. During an MLflow run, you can log model parameters and results. An experiment is a collection of related runs. Within an experiment, you can compare and filter runs to understand how your model performs and how its performance depends on the parameter settings, input data, and so on.

The notebooks in this article provide simple examples to help you quickly get started using MLflow to track your model development. For more details on using MLflow tracking in Azure Databricks, see Track machine learning training runs.

Use autologging to track model development

MLflow can automatically log training code written in many ML frameworks. This is the easiest way to get started using MLflow tracking.

This example notebook shows how to use autologging with scikit-learn. For information about autologging with other Python libraries, see Automatically log training runs to MLflow.

MLflow Autologging Quick Start Python notebook

Get notebook

Use the logging API to track model development

This notebook illustrates how to use the MLflow logging API. Using the logging API gives you more control over the metrics logged and lets you log additional artifacts such as tables or plots.

This example notebook shows how to use the Python logging API. MLflow also has REST, R, and Java APIs.

MLflow Logging API Quick Start Python notebook

Get notebook

End-to-end example

This tutorial notebook presents an end-to-end example of training a model in Azure Databricks. It covers loading data, visualizing the data, setting up a parallel hyperparameter optimization, and using MLflow to review the results, register the model, and perform inference on new data using the registered model in a Spark UDF.

Requirements

Databricks Runtime 6.5 ML or above.

MLflow end-to-end example notebook

Get notebook