Machine learning tutorial

Note

Databricks Runtime ML is a comprehensive tool for developing and deploying machine learning models with Azure Databricks. It includes the most popular machine learning and deep learning libraries, as well as MLflow, a machine learning platform API for tracking and managing the end-to-end machine learning lifecycle. For details, see Machine learning and deep learning.

The Apache Spark machine learning library (MLlib) lets data scientists focus on their data problems and models instead of on the complexities of working with distributed data (such as infrastructure, configurations, and so on). The tutorial notebook takes you through the steps of loading and preprocessing data, training a model using an MLlib algorithm, evaluating model performance, tuning the model, and making predictions. It also illustrates the use of MLlib pipelines and the MLflow machine learning platform.

Notebook

Use the notebook that corresponds to the Databricks Runtime version on your cluster. For more machine learning examples, see Machine learning and deep learning.

Get started with MLlib notebook (Databricks Runtime 7.0 and above)

Get notebook

Get started with MLlib notebook (Databricks Runtime 5.5 LTS or 6.x)

Get notebook