This topic describes a feature available in SQL Server 2016 and SQL Server 2017 that supports scoring on machine learning models in near realtime.
Native scoring is a special implementation of realtime scoring that uses the native T-SQL PREDICT function for very fast scoring, and is available only in SQL Server 2017. For more information, see Native scoring.
How realtime scoring works
Realtime scoring is supported in both SQL Server 2017 and SQL Server 2016, on specific model types created by using supported RevoScaleR or MicrosoftML algorithms. It uses native C++ libraries to generate scores, based on user input provided to a machine learning model stored in a special binary format.
Because a trained model can be used for scoring without having to call an external language runtime, the overhead of multiple processes is reduced. This supports much faster prediction performance for production scoring scenarios. Because the data never leaves SQL Server, results can be generated and inserted into a new table without any data translation between R and SQL.
Realtime scoring is a multi-step process:
- The stored procedure that does scoring must be enabled on a per-database basis.
- You load the pre-trained model in binary format.
- You provide new input data, either tabular or single rows, as input to the model.
- To generate scores, call the sp_rxPredict stored procedure.
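The steps above can be sketched in T-SQL roughly as follows. This is a minimal illustration, not a definitive implementation: the table `dbo.ml_models`, its columns, the model name, and the input table `dbo.new_loans` are all hypothetical, and the sketch assumes the model was already serialized into the binary format that realtime scoring expects (for example, with the RevoScaleR `rxSerializeModel` function) and that realtime scoring has been enabled on the database.

```sql
-- Load the pre-trained model (binary format) from a table.
-- dbo.ml_models, model_name, and model are hypothetical names.
DECLARE @model varbinary(max) = (
    SELECT model
    FROM dbo.ml_models
    WHERE model_name = N'loan_chargeoff_model'
);

-- Generate scores for new input rows.
-- The query text is passed as a string; dbo.new_loans and its
-- columns are hypothetical and must match the model's input variables.
EXEC sp_rxPredict
    @model = @model,
    @inputData = N'SELECT loan_amount, interest_rate, credit_score FROM dbo.new_loans';
```

The input query can return a single row or a full table, which corresponds to the "tabular or single rows" input described above.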
For code examples and instructions, see How to perform native scoring or realtime scoring.
For an example of how rxPredict can be used for scoring, see End to End Loan ChargeOff Prediction Built Using Azure HDInsight Spark Clusters and SQL Server 2016 R Service.
If you are working exclusively in R code, you can also use the rxPredict function for fast scoring.
Realtime scoring is supported on these platforms:
- SQL Server 2017 Machine Learning Services
- SQL Server 2016 R Services, with an upgrade of the R Services instance to Microsoft R Server 9.1.0 or later
- Machine Learning Server (Standalone)
On SQL Server, you must enable the realtime scoring feature in advance. This is because the feature requires installation of CLR-based libraries into SQL Server.
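Because the feature depends on CLR-based libraries, enabling it typically starts with turning on CLR integration for the instance. The following is a sketch under that assumption; the registration command in the comment reflects the RegisterRExt.exe utility as commonly documented, and `databasename` is a placeholder, so verify the exact steps for your version before running anything.

```sql
-- Enable CLR integration on the instance (requires suitable permissions).
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;

-- The realtime scoring libraries are then installed per database from an
-- elevated command prompt, outside of T-SQL. Assumed form of the command:
--   RegisterRExt.exe /installRts /database:databasename
-- Repeat the registration for each database that needs sp_rxPredict.
```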
For information about realtime scoring in a distributed environment based on Microsoft R Server, see the publishService function in the mrsDeploy package, which supports publishing models for realtime scoring as a new web service running on R Server.
The model must be trained in advance using one of the supported rx algorithms. For details, see Supported algorithms. Realtime scoring with sp_rxPredict supports both RevoScaleR and MicrosoftML algorithms.
Realtime scoring does not use an interpreter; therefore, any functionality that might require an interpreter is not supported during the scoring step. These might include:
- Models using the rxNaiveBayes algorithms are not currently supported.
- RevoScaleR models that use an R transformation function, or a formula that contains a transformation, such as A ~ log(B), are not supported in realtime scoring. To use a model of this type, we recommend that you perform the transformation on the input data before passing the data to realtime scoring.
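The recommendation to transform the input data before scoring can be sketched as follows. This is an illustration under stated assumptions: the model is assumed to have been trained on a precomputed column (here called `logB`, a hypothetical name) instead of using `log(B)` inside the formula, and the tables `dbo.ml_models` and `dbo.scoring_input` are hypothetical as well.

```sql
-- Hypothetical model table and name.
DECLARE @model varbinary(max) = (
    SELECT model
    FROM dbo.ml_models
    WHERE model_name = N'model_trained_on_logB'
);

-- Apply the transformation in the input query, so no R interpreter
-- is needed at scoring time. T-SQL LOG() is the natural logarithm,
-- which matches the default behavior of log() in R.
EXEC sp_rxPredict
    @model = @model,
    @inputData = N'SELECT LOG(B) AS logB FROM dbo.scoring_input WHERE B > 0';
```

Filtering out non-positive values of `B` avoids a domain error from `LOG`; whether those rows should be dropped or handled differently depends on the data.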
Realtime scoring is currently optimized for fast predictions on smaller data sets, ranging from a few rows to hundreds of thousands of rows. On very large datasets, using rxPredict might be faster.
Models marked with * also support native scoring with the PREDICT function.
Transformations supplied by MicrosoftML
Unsupported model types
The following model types are not supported:
- Models containing other, unsupported types of R transformations
- Models using the rxNaiveBayes algorithms in RevoScaleR
- PMML models
- Models created using other R libraries from CRAN or other repositories
- Models containing any kind of R transformation other than those listed here
sp_rxPredict returns an inaccurate message when a NULL value is passed as the model: "System.Data.SqlTypes.SqlNullValueException:Data in Null".
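One way to avoid this misleading message is to check for a NULL model before calling the procedure. The following sketch assumes a hypothetical model table `dbo.ml_models` and input table `dbo.scoring_input`; the error text is illustrative only.

```sql
-- Hypothetical model table and name; the lookup yields NULL
-- when no matching row exists.
DECLARE @model varbinary(max) = (
    SELECT model
    FROM dbo.ml_models
    WHERE model_name = N'my_model'
);

IF @model IS NULL
BEGIN
    -- Fail with a clear message instead of the SqlNullValueException.
    RAISERROR(N'Model not found: no row matched the requested model name.', 16, 1);
    RETURN;
END;

EXEC sp_rxPredict
    @model = @model,
    @inputData = N'SELECT * FROM dbo.scoring_input';
```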