Autologging in Microsoft Fabric

Synapse Data Science in Microsoft Fabric includes autologging, which significantly reduces the amount of code you need to write to log the parameters, metrics, and items of a machine learning model during training. This article describes how autologging works in Synapse Data Science and how to configure, customize, and disable it.

Autologging extends MLflow Tracking capabilities and is deeply integrated into the Synapse Data Science in Microsoft Fabric experience. Autologging can capture various metrics, including accuracy, loss, F1 score, and custom metrics you define. By using autologging, developers and data scientists can easily track and compare the performance of different models and experiments without manual tracking.

Supported frameworks

Autologging supports a wide range of machine learning frameworks, including TensorFlow, PyTorch, scikit-learn, and XGBoost. To learn more about the framework-specific properties that autologging captures, see the MLflow documentation.

Configuration

Autologging works by automatically capturing values of input parameters, output metrics, and output items of a machine learning model as it's being trained. This information is logged to your Microsoft Fabric workspace, where you can access and visualize it by using the MLflow APIs or the corresponding experiment and model items in your Microsoft Fabric workspace.
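
As a minimal sketch of reading logged runs back through the MLflow APIs, the following example uses mlflow.search_runs, which returns a pandas DataFrame by default. The experiment name "sample-experiment" and the metric name "accuracy" are hypothetical placeholders, and both helper functions are illustrative rather than part of Microsoft Fabric or MLflow.

```python
def metric_filter(metric: str, threshold: float) -> str:
    """Build an MLflow search filter such as 'metrics.accuracy > 0.9'.

    Assumes a simple metric name (letters, digits, underscores); names with
    other characters need quoting in the MLflow filter syntax.
    """
    return f"metrics.{metric} > {threshold}"


def top_runs(experiment_name: str, metric: str, threshold: float, limit: int = 5):
    """Return runs whose metric exceeds the threshold, best first."""
    import mlflow  # imported here so metric_filter works without MLflow installed

    return mlflow.search_runs(
        experiment_names=[experiment_name],
        filter_string=metric_filter(metric, threshold),
        order_by=[f"metrics.{metric} DESC"],
        max_results=limit,
    )


# Hypothetical usage in a notebook:
# runs = top_runs("sample-experiment", "accuracy", 0.9)
# print(runs[["run_id", "metrics.accuracy"]])
```

The same runs also appear in the experiment item in the workspace, so a query like this is mainly useful when you want to compare runs programmatically.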

When you launch a Synapse Data Science notebook, Microsoft Fabric calls mlflow.autolog() to instantly enable tracking and load the corresponding dependencies. As you train models in your notebook, MLflow automatically tracks this model information.

The configuration happens automatically behind the scenes when you run import mlflow. The default configuration for the notebook mlflow.autolog() hook is:


mlflow.autolog(
    log_input_examples=False,
    log_model_signatures=True,
    log_models=True,
    disable=False,
    exclusive=True,
    disable_for_unsupported_versions=True,
    silent=True
)
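
To make the relationship between these defaults and your own settings concrete, the following sketch stores the values listed above in a dictionary and merges overrides on top of them. The names FABRIC_AUTOLOG_DEFAULTS and autolog_config are hypothetical helpers written for this illustration. The helper only recognizes the seven options shown above, although mlflow.autolog() may accept others.

```python
# Defaults listed above for the notebook autolog hook.
FABRIC_AUTOLOG_DEFAULTS = {
    "log_input_examples": False,
    "log_model_signatures": True,
    "log_models": True,
    "disable": False,
    "exclusive": True,
    "disable_for_unsupported_versions": True,
    "silent": True,
}


def autolog_config(**overrides):
    """Return a copy of the defaults with overrides applied (hypothetical helper)."""
    unknown = set(overrides) - set(FABRIC_AUTOLOG_DEFAULTS)
    if unknown:
        raise TypeError(f"Unknown autolog option(s): {sorted(unknown)}")
    return {**FABRIC_AUTOLOG_DEFAULTS, **overrides}


config = autolog_config(log_input_examples=True)
print(config)

# In a notebook, the merged settings could then be applied with:
# import mlflow
# mlflow.autolog(**config)
```

Merging into a copy leaves the defaults untouched, so you can return to the original behavior by calling autolog_config() with no arguments.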

Customization

To customize logging behavior, call mlflow.autolog() with your own arguments. Its parameters let you enable model logging (log_models), collect input samples (log_input_examples), configure warnings (silent), or enable logging for added content that you specify (exclusive).

Track more metrics, parameters, and properties

For runs created with MLflow, update the MLflow autologging configuration to track additional metrics, parameters, files, and metadata as follows:

  1. Update the mlflow.autolog() call to set exclusive=False.

    mlflow.autolog(
        log_input_examples=False,
        log_model_signatures=True,
        log_models=True,
        disable=False,
        exclusive=False,  # Update this property to enable custom logging
        disable_for_unsupported_versions=True,
        silent=True
    )
    
  2. Use the MLflow tracking APIs to log additional parameters and metrics. The following example logs a custom parameter and a custom metric alongside the properties that autologging captures.

    import mlflow
    mlflow.autolog(exclusive=False)
    
    with mlflow.start_run():
      mlflow.log_param("parameter name", "example value")
      # <add model training code here>
      mlflow.log_metric("metric name", 20)
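
As a sketch of how a computed value might take the place of the fixed number in the example above, the following code derives an F1 score from label lists and passes it to a logging callable. The function names, the metric name custom_f1, and the positive-label convention are assumptions made for this illustration. By default the callable is mlflow.log_metric, and any function with the same (name, value) shape can stand in for it.

```python
from typing import Callable, Optional, Sequence


def f1_from_labels(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Compute the F1 score for one positive class from two label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def log_custom_f1(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    log_metric: Optional[Callable[[str, float], object]] = None,
) -> float:
    """Log the F1 score under the hypothetical metric name 'custom_f1'."""
    if log_metric is None:
        import mlflow  # resolved lazily so the metric math runs without MLflow

        log_metric = mlflow.log_metric
    score = f1_from_labels(y_true, y_pred)
    log_metric("custom_f1", score)
    return score


# Hypothetical usage inside a run, after mlflow.autolog(exclusive=False):
# with mlflow.start_run():
#     log_custom_f1(y_test, model.predict(X_test))
```

Passing the logger as an argument keeps the metric calculation independent of the tracking backend, which makes it easy to check the value before it is logged.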
    

Disable Microsoft Fabric autologging

You can disable Microsoft Fabric autologging for a specific notebook session. You can also disable autologging across all notebooks by using the workspace setting.

Note

If autologging is disabled, you must manually log your parameters and metrics by using the MLflow APIs.

Disable autologging for a notebook session

To disable Microsoft Fabric autologging for a specific notebook session, call mlflow.autolog() and set disable=True.

import mlflow
mlflow.autolog(disable=True)
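
With autologging disabled, parameters and metrics are recorded only if you log them yourself. The following sketch shows one way that manual flow might look, using mlflow.log_params and mlflow.log_metrics inside a run. The function run_with_manual_logging, its train_fn argument, and the tracker argument are hypothetical names for this illustration. The tracker defaults to the mlflow module.

```python
def run_with_manual_logging(params, train_fn, tracker=None):
    """Disable autologging, then log params and metrics explicitly.

    params:   dict of hyperparameters to record.
    train_fn: callable that takes params and returns a dict of metric values.
    tracker:  object exposing autolog, start_run, log_params, and log_metrics;
              defaults to the mlflow module.
    """
    if tracker is None:
        import mlflow as tracker  # resolved lazily; only needed for real logging

    tracker.autolog(disable=True)  # nothing is captured automatically from here on
    with tracker.start_run():
        tracker.log_params(params)    # record the inputs
        metrics = train_fn(params)    # the training code goes here
        tracker.log_metrics(metrics)  # record the results
    return metrics


# Hypothetical usage in a notebook:
# run_with_manual_logging({"max_depth": 3}, lambda p: {"accuracy": 0.91})
```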

Disable autologging for all notebooks and sessions

Workspace administrators can enable or disable Microsoft Fabric autologging for all notebooks and sessions in their workspace by using the workspace settings. To enable or disable Synapse Data Science autologging:

  1. In your Synapse Data Science workspace, select Workspace settings.

    Screenshot of the Synapse Data Science page with Workspace settings highlighted.

  2. On the Workspace settings screen, expand Data Engineering/Science on the left navigation bar and select Spark settings.

  3. On the Spark settings screen, select the Automatic log tab.

  4. Set Automatically track machine learning experiments and models to On or Off.

  5. Select Save.

    Screenshot of the Data Science workspace setting for autologging.