What's new in SQL Server Machine Learning Services?

Applies to: SQL Server 2016 (13.x) and later versions

This article describes the new capabilities and features included in each version of SQL Server Machine Learning Services. Machine learning capabilities are added to SQL Server in each release as we continue to expand, extend, and deepen the integration between the data platform, advanced analytics, and data science.

Note

Feature capabilities and installation options vary between versions of SQL Server. Use the version selector dropdown to choose the appropriate version of SQL Server.

New in SQL Server 2022

Beginning with SQL Server 2022 (16.x), runtimes for R, Python, and Java are no longer installed with SQL Server Setup. Instead, install any desired custom runtimes and packages. For more information, see Install SQL Server 2022 Machine Learning Services (Python and R) on Windows or Install SQL Server Machine Learning Services (Python and R) on Linux.

New in SQL Server 2019

This release adds the top-requested features for Python and R machine learning operations in SQL Server. For more information about all of the features in this release, see What's New in SQL Server 2019 and Release Notes for SQL Server 2019.

For what's new in Java and C# support in SQL Server 2019, see What's new in SQL Server Language Extensions?

Below are the new features for SQL Server Machine Learning Services, available on both Windows and Linux:

New in SQL Server 2017

This release adds Python support and industry-leading machine learning algorithms. Renamed to reflect the new scope, SQL Server 2017 marks the introduction of SQL Server Machine Learning Services (In-Database), with language support for both Python and R.

For all feature announcements in this release, see What's New in SQL Server 2017.

R enhancements

The R component of SQL Server Machine Learning Services is the next generation of SQL Server 2016 R Services, with updated versions of base R, RevoScaleR, and other packages.

New capabilities for R include package management, with the following highlights:

R libraries

Package | Description
MicrosoftML | In this release, MicrosoftML is included in a default R installation, eliminating the upgrade step required in the previous SQL Server 2016 R Services. MicrosoftML provides state-of-the-art machine learning algorithms and data transformations that can be scaled or run in remote compute contexts. Algorithms include customizable deep neural networks, fast decision trees and decision forests, linear regression, and logistic regression.

Python integration for in-database analytics

Python is a language that offers great flexibility and power for a variety of machine learning tasks. Open-source libraries for Python include several platforms for customizable neural networks, as well as popular libraries for natural language processing.

Because Python is integrated with the database engine, you can keep analytics close to the data and eliminate the costs and security risks associated with data movement. You can deploy machine learning solutions based on Python using tools like Visual Studio. Your production applications can get predictions, models, or visuals from the Python 3.5 runtime using SQL Server data access methods.

T-SQL and Python integration is supported through the sp_execute_external_script system stored procedure. You can call any Python code using this stored procedure. Code runs in a secure, dual architecture that enables enterprise-grade deployment of Python models and scripts, callable from an application using a simple stored procedure. Additional performance gains are achieved by streaming data from SQL to Python processes and MPI ring parallelization.
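As a rough illustration of the call shape described above, the following sketch composes a T-SQL statement that invokes sp_execute_external_script with a Python script. The helper names (quote_nvarchar, build_external_script_call) and the dbo.Sales table are hypothetical and exist only for this example. InputDataSet and OutputDataSet are the default names of the data frames passed into and out of the Python process. The sketch only builds the statement text; running it requires a SQL Server instance with Machine Learning Services enabled.

```python
def quote_nvarchar(text: str) -> str:
    """Return text as a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + text.replace("'", "''") + "'"


def build_external_script_call(script: str, input_query: str) -> str:
    """Compose an EXECUTE statement for sp_execute_external_script.

    Hypothetical helper: it only assembles text and does not connect to a server.
    """
    return (
        "EXECUTE sp_execute_external_script\n"
        "    @language = N'Python',\n"
        f"    @script = {quote_nvarchar(script)},\n"
        f"    @input_data_1 = {quote_nvarchar(input_query)};"
    )


# The script runs inside the Python process launched by SQL Server.
# InputDataSet holds the rows returned by @input_data_1;
# whatever is assigned to OutputDataSet is returned to the caller.
python_script = "OutputDataSet = InputDataSet.head(10)"

# dbo.Sales is a hypothetical table used only for illustration.
statement = build_external_script_call(
    python_script, "SELECT amount FROM dbo.Sales"
)
print(statement)
```

An application could send the resulting statement through any SQL Server data access method, or wrap it in its own stored procedure so that callers never see the Python code.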

You can use the T-SQL PREDICT function to perform native scoring on a pre-trained model that has been previously saved in the required binary format.
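To make the shape of a native scoring query more concrete, here is a minimal sketch that assembles a PREDICT query as text. It assumes the serialized model has already been loaded into a T-SQL variable named @model. The helper names, the dbo.NewData table, and the Score output column are hypothetical; the actual output column name and type depend on the model that was trained.

```python
import re

# Accepts simple one-part or two-part names such as "Score" or "dbo.NewData".
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def check_identifier(name: str) -> str:
    """Reject anything that is not a plain identifier, to avoid SQL injection."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def build_predict_query(data_table: str, score_column: str) -> str:
    """Compose a SELECT that scores rows of data_table with PREDICT.

    Hypothetical helper: assumes a varbinary(max) variable named @model
    already holds the serialized model, and that the score is a float.
    """
    table = check_identifier(data_table)
    column = check_identifier(score_column)
    return (
        f"SELECT d.*, p.{column}\n"
        f"FROM PREDICT(MODEL = @model, DATA = {table} AS d)\n"
        f"WITH ({column} float) AS p;"
    )


query = build_predict_query("dbo.NewData", "Score")
print(query)
```

Because PREDICT runs inside the database engine, a query of this shape returns the input rows together with the predicted values without starting an external runtime.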

Python libraries

Package | Description
revoscalepy | Python-equivalent of RevoScaleR. You can create Python models for linear and logistic regressions, decision trees, boosted trees, and random forests, all parallelizable, and capable of being run in remote compute contexts. This package supports use of multiple data sources and remote compute contexts. The data scientist or developer can execute Python code on a remote SQL Server, to explore data or build models without moving data.
microsoftml | Python-equivalent of the MicrosoftML R package.

Pre-trained models

Pre-trained models are available for both Python and R. Use these models for image recognition and positive-negative sentiment analysis, to generate predictions on your own data.

Standalone Server as a shared feature in SQL Server Setup

This release also adds SQL Server Machine Learning Server (Standalone), a fully independent data science server, supporting statistical and predictive analytics in R and Python. As with R Services, this server is the next version of SQL Server 2016 R Server (Standalone). With the standalone server, you can distribute and scale R or Python solutions with no dependencies on SQL Server.

New in SQL Server 2016

This release introduced machine learning capabilities into SQL Server through SQL Server 2016 R Services, an in-database analytics engine for processing R script on resident data within a database engine instance.

Additionally, SQL Server 2016 R Server (Standalone) was released as a way to install R Server on a Windows server. Initially, SQL Server Setup provided the only way to install R Server for Windows. In later releases, developers and data scientists who wanted R Server on Windows could use another standalone installer to achieve the same goal. The standalone server in SQL Server is functionally equivalent to the standalone server product, Microsoft R Server for Windows.

For all feature announcements in this release, see What's New in SQL Server 2016.

Release | Feature update
CU additions | Real-time scoring relies on native C++ libraries to read a model stored in an optimized binary format, and then generate predictions without having to call the R runtime. This makes scoring operations much faster. With real-time scoring, you can run a stored procedure or perform real-time scoring from R code. Real-time scoring is also available for SQL Server 2016, if the instance is upgraded to the latest release of Microsoft R Server.
Initial release | R integration for in-database analytics.

R packages for calling R functions in T-SQL, and vice versa. RevoScaleR functions provide R analytics at scale by chunking data into component parts, coordinating and managing distributed processing, and aggregating results. In SQL Server 2016 R Services (In-Database), the RevoScaleR engine is integrated with a database engine instance, bringing data and analytics together in the same processing context.
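The chunk-and-aggregate idea can be sketched in a few lines. The following Python example is a conceptual illustration only, not the RevoScaleR implementation: it splits a list into chunks, computes a partial count and sum for each chunk, and combines the partial results into a mean. All function names are hypothetical.

```python
from typing import Iterator, List, Tuple


def chunks(values: List[float], size: int) -> Iterator[List[float]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(values), size):
        yield values[start:start + size]


def partial_stats(chunk: List[float]) -> Tuple[int, float]:
    """Summarize one chunk as (row count, sum); only this summary is kept."""
    return len(chunk), sum(chunk)


def chunked_mean(values: List[float], chunk_size: int) -> float:
    """Compute the mean by aggregating per-chunk summaries."""
    total_count = 0
    total_sum = 0.0
    for chunk in chunks(values, chunk_size):
        count, subtotal = partial_stats(chunk)
        total_count += count
        total_sum += subtotal
    if total_count == 0:
        raise ValueError("no data")
    return total_sum / total_count


data = [2.0, 4.0, 6.0, 8.0, 10.0]
print(chunked_mean(data, chunk_size=2))  # prints 6.0
```

Because each chunk is reduced to a small summary before the results are combined, the per-chunk work could in principle be distributed across processes, and the full data set never needs to be held in memory at once.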

T-SQL and R integration through sp_execute_external_script. You can call any R code using this stored procedure. This secure infrastructure enables enterprise-grade deployment of R models and scripts that can be called from an application using a simple stored procedure. Additional performance gains are achieved by streaming data from SQL to R processes and MPI ring parallelization.

You can use the T-SQL PREDICT function to perform native scoring on a pre-trained model that has been previously saved in the required binary format.

Linux support

SQL Server 2019 adds Linux support for R and Python when you install the machine learning packages with a database engine instance. For more information, see Install SQL Server Machine Learning Services on Linux.

On Linux, SQL Server 2017 does not have R or Python integration, but you can use native scoring because that functionality is available through T-SQL PREDICT, which runs on Linux. Native scoring enables high-performance scoring from a pre-trained model without calling, or even requiring, an R runtime.

Next steps