What's new in SQL Server Machine Learning Services

Applies to: SQL Server 2016 (13.x) and later

This article describes the capabilities and features included in each version of SQL Server Machine Learning Services. Machine learning capabilities are added to SQL Server in each release as we continue to expand, extend, and deepen the integration between the data platform, advanced analytics, and data science.

New in SQL Server 2019

This release adds the top-requested features for Python and R machine learning operations in SQL Server. For more information about all of the features in this release, see What's New in SQL Server 2019 and Release Notes for SQL Server 2019.

Note

For the what's new documentation on Java in SQL Server 2019, see What's new in SQL Server Language Extensions?

Below are the new features for SQL Server Machine Learning Services, available on both Windows and Linux:

New in SQL Server 2017

This release adds Python support and industry-leading machine learning algorithms. SQL Server 2017 marks the introduction of SQL Server Machine Learning Services (In-Database), renamed from R Services to reflect the new scope, with language support for both Python and R.

For all feature announcements, see What's New in SQL Server 2017.

R enhancements

The R component of SQL Server Machine Learning Services is the next generation of SQL Server 2016 R Services, with updated versions of base R, RevoScaleR, and other packages.

New capabilities for R include package management, with the following highlights:

R libraries

Package | Description
MicrosoftML | In this release, MicrosoftML is included in a default R installation, eliminating the upgrade step required in the previous SQL Server 2016 R Services. MicrosoftML provides state-of-the-art machine learning algorithms and data transformations that can be scaled or run in remote compute contexts. Algorithms include customizable deep neural networks, fast decision trees and decision forests, linear regression, and logistic regression.

Python integration for in-database analytics

Python is a language that offers great flexibility and power for a variety of machine learning tasks. Open-source libraries for Python include several platforms for customizable neural networks, as well as popular libraries for natural language processing.

Because Python is integrated with the database engine, you can keep analytics close to the data and eliminate the costs and security risks associated with data movement. You can deploy machine learning solutions based on Python using tools like Visual Studio. Your production applications can get predictions, models, or visuals from the Python 3.5 runtime using SQL Server data access methods.

T-SQL and Python integration is supported through the sp_execute_external_script system stored procedure. You can call any Python code using this stored procedure. Code runs in a secure, dual architecture that enables enterprise-grade deployment of Python models and scripts, callable from an application using a simple stored procedure. Additional performance gains are achieved by streaming data from SQL to Python processes and by MPI ring parallelization.
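As a minimal sketch of what such a call looks like from a client application, the Python snippet below assembles the T-SQL statement that invokes sp_execute_external_script. The helper names quote_nstring and build_external_script_call and the sample query are hypothetical and exist only for illustration; @language, @script, and @input_data_1 are parameters of the stored procedure, and InputDataSet and OutputDataSet are its default variable names for input and output data. Sending the statement to the server (for example through an ODBC driver) is left out.

```python
def quote_nstring(text: str) -> str:
    """Render text as a T-SQL Unicode string literal."""
    # T-SQL escapes a single quote inside a literal by doubling it.
    return "N'" + text.replace("'", "''") + "'"


def build_external_script_call(python_script: str, input_query: str) -> str:
    """Return a T-SQL statement that runs python_script over input_query.

    Hypothetical helper: it only builds the statement text, it does not
    connect to SQL Server.
    """
    return (
        "EXEC sp_execute_external_script\n"
        "    @language = N'Python',\n"
        f"    @script = {quote_nstring(python_script)},\n"
        f"    @input_data_1 = {quote_nstring(input_query)};"
    )


# By default the query result reaches the script as InputDataSet, and
# whatever the script assigns to OutputDataSet is returned to the caller.
embedded_script = "OutputDataSet = InputDataSet"
statement = build_external_script_call(embedded_script, "SELECT 1 AS value")
print(statement)
```

The embedded script here simply echoes its input back; a real script would typically load a model and score or transform InputDataSet before assigning OutputDataSet.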

You can use the T-SQL PREDICT function to perform native scoring on a pre-trained model that has been previously saved in the required binary format.
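The shape of a native scoring query can be sketched as follows. This is an illustration under assumptions, not a definitive recipe: the helper build_predict_query, the table dbo.new_data, and the output column Score are hypothetical, and the output column name and type must match what the stored model actually produces. The query also assumes that the calling batch has already declared @model as a varbinary(max) variable holding the serialized model.

```python
import re

# Accept only plain one- or two-part identifiers such as "Score" or "dbo.new_data".
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_predict_query(data_table: str, score_column: str) -> str:
    """Return a T-SQL query that scores data_table with PREDICT.

    Hypothetical helper: assumes the batch has already loaded the
    serialized model into a varbinary(max) variable named @model.
    """
    for name in (data_table, score_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    return (
        f"SELECT d.*, p.{score_column}\n"
        f"FROM PREDICT(MODEL = @model, DATA = {data_table} AS d)\n"
        f"WITH ({score_column} float) AS p;"
    )


# Table and column names below are placeholders for illustration.
query = build_predict_query("dbo.new_data", "Score")
print(query)
```

Because PREDICT runs inside the database engine, a query of this shape does not start an external R or Python process.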

Python libraries

Package | Description
revoscalepy | Python equivalent of RevoScaleR. You can create Python models for linear and logistic regressions, decision trees, boosted trees, and random forests, all parallelizable and capable of being run in remote compute contexts. This package supports use of multiple data sources and remote compute contexts. The data scientist or developer can execute Python code on a remote SQL Server to explore data or build models without moving data.
microsoftml | Python equivalent of the MicrosoftML R package.

Pre-trained models

Pre-trained models are available for both Python and R. Use these models for image recognition and positive-negative sentiment analysis, to generate predictions on your own data.

Standalone Server as a shared feature in SQL Server Setup

This release also adds SQL Server Machine Learning Server (Standalone), a fully independent data science server, supporting statistical and predictive analytics in R and Python. As with R Services, this server is the next version of SQL Server 2016 R Server (Standalone). With the standalone server, you can distribute and scale R or Python solutions with no dependencies on SQL Server.

New in SQL Server 2016

This release introduced machine learning capabilities into SQL Server through SQL Server 2016 R Services, an in-database analytics engine for processing R scripts on resident data within a database engine instance.

Additionally, SQL Server 2016 R Server (Standalone) was released as a way to install R Server on a Windows server. Initially, SQL Server Setup provided the only way to install R Server for Windows. In later releases, developers and data scientists who wanted R Server on Windows could use another standalone installer to achieve the same goal. The standalone server in SQL Server is functionally equivalent to the standalone server product, Microsoft R Server for Windows.

For all feature announcements, see What's New in SQL Server 2016.

Release | Feature update
CU additions | Real-time scoring relies on native C++ libraries to read a model stored in an optimized binary format, and then generate predictions without having to call the R runtime. This makes scoring operations much faster. With real-time scoring, you can run a stored procedure or perform real-time scoring from R code. Real-time scoring is also available for SQL Server 2016, if the instance is upgraded to the latest release of Microsoft R Server.
Initial release | R integration for in-database analytics.

R packages for calling R functions in T-SQL, and vice versa. RevoScaleR functions provide R analytics at scale by chunking data into component parts, coordinating and managing distributed processing, and aggregating results. In SQL Server 2016 R Services (In-Database), the RevoScaleR engine is integrated with a database engine instance, bringing data and analytics together in the same processing context.

T-SQL and R integration is provided through sp_execute_external_script. You can call any R code using this stored procedure. This secure infrastructure enables enterprise-grade deployment of R models and scripts that can be called from an application using a simple stored procedure. Additional performance gains are achieved by streaming data from SQL to R processes and by MPI ring parallelization.

You can use the T-SQL PREDICT function to perform native scoring on a pre-trained model that has been previously saved in the required binary format.

Linux support

SQL Server 2019 adds Linux support for R and Python when you install the machine learning packages with a database engine instance. For more information, see Install SQL Server Machine Learning Services on Linux.

On Linux, SQL Server 2017 does not have R or Python integration, but you can use native scoring on Linux because that functionality is available through T-SQL PREDICT, which runs on Linux. Native scoring enables high-performance scoring from a pre-trained model, without calling or even requiring an R runtime.

Next steps