您现在访问的是微软AZURE全球版技术文档网站,若需要访问由世纪互联运营的MICROSOFT AZURE中国区技术文档网站,请访问 https://docs.azure.cn.

什么是机器学习?What is machine learning?

机器学习是一项数据科研技术,可以让计算机根据现有的数据来预测将来的行为、结果和趋势。Machine learning is a data science technique that allows computers to use existing data to forecast future behaviors, outcomes, and trends. 使用机器学习,计算机可以在不需显式编程的情况下进行学习。By using machine learning, computers learn without being explicitly programmed. 机器学习工具使用 AI 系统,这种系统可用于识别模式并创建与数据的体验相关的关联。Machine learning tools use AI systems which provide the ability to identify patterns and create associations from experience with the data.

自动机器学习预测或预测可以使应用程序和设备更智能。Automated machine learning forecasts or predictions can make applications and devices smarter. 例如,在网上购物时,机器学习可根据购买的产品帮助推荐其他产品。For example, when you shop online, machine learning helps recommend other products you might want based on what you've bought. 或者,在刷信用卡时,机器学习可将这笔交易与交易数据库进行比较,帮助检测诈骗。Or when your credit card is swiped, machine learning compares the transaction to a database of transactions and helps detect fraud. 当吸尘器机器人打扫房间时,机器学习可帮助它确定作业是否已完成。And when your robot vacuum cleaner vacuums a room, machine learning helps it decide whether the job is done.

适用于每个任务的机器学习工具Machine learning tools to fit each task

Azure 机器学习为其机器学习工作流提供了开发人员和数据科学家所需的所有工具,包括:Azure Machine Learning provides all the tools developers and data scientists need for their machine learning workflows, including:

甚至可以使用 MLflow 跟踪指标,并部署模型kubeflow 以生成端到端工作流管道。You can even use MLflow to track metrics and deploy models or kubeflow to build end-to-end workflow pipelines.

在 Python 或 R 中构建机器学习模型Build machine learning models in Python or R

开始使用 Azure 机器学习 Python SDKR SDK 在本地计算机上训练。Start training on your local machine using the Azure Machine Learning Python SDK or R SDK. 然后,横向扩展到云。Then, you can scale out to the cloud. 使用多种可用 计算目标(如 Azure 机器学习计算和 Azure Databricks,以及使用 高级超参数优化服务),你可以使用云的强大功能更快地生成更好的模型。With many available compute targets, like Azure Machine Learning compute and Azure Databricks, and with advanced hyperparameter tuning services, you can build better models faster by using the power of the cloud. 也可使用 SDK 自动完成模型训练和优化You can also automate model training and tuning using the SDK.

用无代码工具构建机器学习模型Build machine learning models with no-code tools

对于无代码或低代码训练和部署,请尝试:For code-free or low-code training and deployment, try:

  • Azure 机器学习设计器(预览版)Azure Machine Learning designer (preview)

    使用设计器可在不编写任何代码的情况下准备数据、训练、测试、部署、管理和跟踪机器学习模型。Use the designer to prep data, train, test, deploy, manage, and track machine learning models without writing any code. 不需要编程,只需以可视方式连接数据集和模块即可构建模型。There is no programming required, you visually connect datasets and modules to construct your model. 尝试设计器教程Try out the designer tutorial.

    有关详细信息,请参阅 Azure 机器学习设计器概述一文Learn more in the Azure Machine Learning designer overview article.

  • 自动机器学习 (AutoML) UIAutomated machine learning (AutoML) UI

    了解如何在易于使用的界面中创建 AutoML 试验Learn how to create AutoML experiments in the easy-to-use interface.

MLOps:部署和生命周期管理MLOps: Deploy and lifecycle management

(MLOps 的机器学习操作) 基于可提高工作流效率的 DevOps 原则和实践。Machine learning operations (MLOps) is based on DevOps principles and practices that increase the efficiency of workflows. 例如持续集成、持续交付和持续部署。For example, continuous integration, delivery, and deployment. MLOps 将这些原理应用到机器学习过程,其目标是:MLOps applies these principles to the machine learning process, with the goal of:

  • 更快地试验和开发模型Faster experimentation and development of models
  • 更快地将模型部署到生产环境Faster deployment of models into production
  • 质量保证Quality assurance

有了正确的模型以后,即可轻松地将其用在 Web 服务中、IoT 设备上或 Power BI 中。When you have the right model, you can easily use it in a web service, on an IoT device, or from Power BI. 有关详细信息,请参阅使用 Azure 机器学习部署模型For more information, see Deploy models with Azure Machine Learning.

然后,可以使用 适用于 PythonAzure 机器学习 studio机器学习 CLI的 Azure 机器学习 SDK 来管理已部署的模型。Then you can manage your deployed models by using the Azure Machine Learning SDK for Python, Azure Machine Learning studio, or the Machine learning CLI.

可以使用这些模型并对大量数据进行 实时异步 返回预测。These models can be consumed and return predictions in Real time or asynchronously on large quantities of data.

使用高级机器学习管道,可以在每一步(从数据准备、模型训练和评估一直到部署)进行协作。And with advanced machine learning pipelines, you can collaborate on each step from data preparation, model training and evaluation, through deployment. 使用 Pipelines 可以:Pipelines allow you to:

  • 自动完成云中的端到端机器学习过程Automate the end-to-end machine learning process in the cloud
  • 重用组件并仅在需要时重新运行步骤Reuse components and only rerun steps when needed
  • 在每个步骤中使用不同的计算资源Use different compute resources in each step
  • 运行批量评分任务Run batch scoring tasks

如果要使用脚本来自动执行机器学习工作流,则 机器学习 CLI 提供了执行常见任务的命令行工具,例如提交定型运行或部署模型。If you want to use scripts to automate your machine learning workflow, the Machine learning CLI provides command-line tools that perform common tasks, such as submitting a training run or deploying a model.

若要开始使用 Azure 机器学习,请参阅后续步骤To get started using Azure Machine Learning, see Next steps.

自动化机器学习Automated Machine Learning

在试验阶段,数据科学家花费了大量时间来循环访问模型。Data scientists spend an inordinate amount of time iterating over models during the experimentation phase. 由于工作的单调和不具挑战性,在构建可接受的模型之前,尝试使用不同的算法和超参数组合的整个过程对于数据科学家而言非常严重。The whole process of trying out different algorithms and hyperparameter combinations until an acceptable model is built is extremely taxing for data scientists, due to the monotonous and non-challenging nature of work. 尽管这是一项在模型效力方面产生巨大收益的练习,但有时成本太大,时间和资源也可能会有负面回报, (投资回报) 。While this is an exercise that yields massive gains in terms of the model efficacy, it sometimes costs too much in terms of time and resources and thus may have a negative return on investment (ROI).

这就是自动化机器学习 (AutoML 的) 。This is where automated machine learning (AutoML) comes in. 它使用概率 matrix 上的研究论文中的概念因子分解,并实现基于所显示数据的试探法来尝试智能选择的算法和 hypermeter 设置的自动管道,并将其考虑到给定的问题或方案。It uses the concepts from the research paper on probabilistic matrix factorization and implements an automated pipeline of trying out intelligently-selected algorithms and hypermeter settings, based on the heuristics of the data presented, keeping into consideration the given problem or scenario. 此管道的结果是一组最适合于给定问题和数据集的模型。The result of this pipeline is a set of models that are best suited for the given problem and dataset.

有关 AutoML 的详细信息,请参阅 AutoML And MLOps with Azure 机器学习For more information on AutoML, see AutoML and MLOps with Azure Machine Learning.

负责的 MLResponsible ML

在 AI 系统的整个开发和使用过程中,信任必须是核心。Throughout the development and use of AI systems, trust must be at the core. 具体包括信任平台、过程和模型。Trust in the platform, process, and models. 由于 AI 和自治系统将更多的信息集成到了社会结构中,因此请务必主动努力预测并缓解这些技术的意想不到后果。As AI and autonomous systems integrate more into the fabric of society, it's important to proactively make an effort to anticipate and mitigate the unintended consequences of these technologies.

  • 了解模型并为公平构建: 说明模型行为和发现对预测影响最大的功能。Understand your models and build for fairness: Explain model behavior and uncover features that have the most impact on predictions. 在模型定型和推理过程中,将内置 explainers 用于玻璃箱和黑色框模型。Use built-in explainers for both glass-box and black-box models during model training and inference. 使用交互式可视化效果来比较模型并执行模拟分析以提高模型准确性。Use interactive visualizations to compare models and perform what-if analysis to improve model accuracy. 使用先进的算法,针对公平测试模型。Test your models for fairness using state-of-the-art algorithms. 在整个机器学习生命周期中缓解 unfairness,比较缓解后的模型,并根据需要进行公平权衡与准确性权衡。Mitigate unfairness throughout the machine learning lifecycle, compare mitigated models, and make intentional fairness versus accuracy trade-offs as desired.
  • 保护数据隐私和机密性: 使用差异隐私中的最新创新来维护隐私,这些模型在数据中注入准确级别的统计噪音,以限制敏感信息的泄露。Protect data privacy and confidentiality: Build models that preserve privacy using the latest innovations in differential privacy, which injects precise levels of statistical noise in data to limit the disclosure of sensitive information. 确定数据泄漏并智能地限制重复查询,以管理暴露风险。Identify data leaks and intelligently limit repeat queries to manage exposure risk. 使用加密和机密机器学习 (即将为机器学习专门设计) 技术,使用机密数据安全构建模型。Use encryption and confidential machine learning (coming soon) techniques specifically designed for machine learning to securely build models using confidential data.
  • 控制和管理机器学习过程的每个步骤: 访问内置功能,以自动跟踪沿袭并在机器学习生命周期内创建审核试用版。Control and govern through every step of the machine learning process: Access built-in capabilities to automatically track lineage and create an audit trial across the machine learning lifecycle. 通过跟踪数据集、模型、试验、代码等,获取机器学习过程的完全可见性。Obtain full visibility into the machine learning process by tracking datasets, models, experiments, code, and more. 使用自定义标记来实现模型数据表、文档关键模型元数据、提高责任,并确保负责流程。Use custom tags to implement model data sheets, document key model metadata, increase accountability, and ensure responsible process.

详细了解如何实现 责任 MLLearn more about how to implement Responsible ML.

与其他服务集成Integration with other services

Azure 机器学习适用于 Azure 平台上的其他服务,并且还与诸如 Git 和 MLflow 的开源工具集成。Azure Machine Learning works with other services on the Azure platform, and also integrates with open-source tools such as Git and MLflow.

后续步骤Next steps