Tutorial: Deploy a predictive model in R with SQL machine learning

Applies to: SQL Server 2016 (13.x) and later Azure SQL Managed Instance

In part four of this four-part tutorial series, you'll deploy a machine learning model developed in R into SQL Server Machine Learning Services or on Big Data Clusters.

In part four of this four-part tutorial series, you'll deploy a machine learning model developed in R into SQL Server using Machine Learning Services.

In part four of this four-part tutorial series, you'll deploy a machine learning model developed in R into SQL Server using SQL Server R Services.

In part four of this four-part tutorial series, you'll deploy a machine learning model developed in R into Azure SQL Managed Instance using Machine Learning Services.

In this article, you'll learn how to:

  • Create a stored procedure that generates the machine learning model
  • Store the model in a database table
  • Create a stored procedure that makes predictions using the model
  • Execute the model with new data

In part one, you learned how to restore the sample database.

In part two, you learned how to import a sample database and then prepare the data to be used for training a predictive model in R.

In part three, you learned how to create and train multiple machine learning models in R, and then choose the most accurate one.

Prerequisites

Part four of this tutorial assumes you fulfilled the prerequisites of part one and completed the steps in part two and part three.

Create a stored procedure that generates the model

In part three of this tutorial series, you decided that a decision tree (dtree) model was the most accurate. Now, using the R scripts you developed, create a stored procedure (generate_rental_model) that trains and generates the dtree model using rpart from the R package.

Run the following commands in Azure Data Studio.

USE [TutorialDB]
DROP PROCEDURE IF EXISTS generate_rental_model;
GO
CREATE PROCEDURE generate_rental_model (@trained_model VARBINARY(max) OUTPUT)
AS
BEGIN
    EXECUTE sp_execute_external_script @language = N'R'
        , @script = N'
rental_train_data$Month   <- factor(rental_train_data$Month);
rental_train_data$Day     <- factor(rental_train_data$Day);
rental_train_data$Holiday <- factor(rental_train_data$Holiday);
rental_train_data$Snow    <- factor(rental_train_data$Snow);
rental_train_data$WeekDay <- factor(rental_train_data$WeekDay);

#Create a dtree model and train it using the training data set
library(rpart);
model_dtree <- rpart(RentalCount ~ Month + Day + WeekDay + Snow + Holiday, data = rental_train_data);
#Serialize the model before saving it to the database table
trained_model <- as.raw(serialize(model_dtree, connection=NULL));
'
        , @input_data_1 = N'
            SELECT RentalCount
                 , Year
                 , Month
                 , Day
                 , WeekDay
                 , Snow
                 , Holiday
            FROM dbo.rental_data
            WHERE Year < 2015
            '
        , @input_data_1_name = N'rental_train_data'
        , @params = N'@trained_model varbinary(max) OUTPUT'
        , @trained_model = @trained_model OUTPUT;
END;
GO

Store the model in a database table

Create a table in the TutorialDB database and then save the model to the table.

  1. Create a table (rental_models) for storing the model.

    USE TutorialDB;
    DROP TABLE IF EXISTS rental_models;
    GO
    CREATE TABLE rental_models (
          model_name VARCHAR(30) NOT NULL DEFAULT('default model') PRIMARY KEY
        , model VARBINARY(MAX) NOT NULL
        );
    GO
    
  2. Save the model to the table as a binary object, with the model name "DTree".

    -- Save model to table
    TRUNCATE TABLE rental_models;
    
    DECLARE @model VARBINARY(MAX);
    
    EXECUTE generate_rental_model @model OUTPUT;
    
    INSERT INTO rental_models (
          model_name
        , model
        )
    VALUES (
         'DTree'
        , @model
        );
    
    SELECT *
    FROM rental_models;
    

Create a stored procedure that makes predictions

Create a stored procedure (predict_rentalcount_new) that makes predictions using the trained model and a set of new data.

-- Stored procedure that takes model name and new data as input parameters and predicts the rental count for the new data
USE [TutorialDB]
DROP PROCEDURE IF EXISTS predict_rentalcount_new;
GO
CREATE PROCEDURE predict_rentalcount_new (
      @model_name VARCHAR(100)
    , @input_query NVARCHAR(MAX)
    )
AS
BEGIN
    DECLARE @model VARBINARY(MAX) = (
            SELECT model
            FROM rental_models
            WHERE model_name = @model_name
            );

    EXECUTE sp_execute_external_script @language = N'R'
        , @script = N'
#Convert types to factors
rentals$Month   <- factor(rentals$Month);
rentals$Day     <- factor(rentals$Day);
rentals$Holiday <- factor(rentals$Holiday);
rentals$Snow    <- factor(rentals$Snow);
rentals$WeekDay <- factor(rentals$WeekDay);

#Before using the model to predict, we need to unserialize it
rental_model <- unserialize(model);

#Call prediction function
rental_predictions <- predict(rental_model, rentals);
rental_predictions <- data.frame(rental_predictions);
'
        , @input_data_1 = @input_query
        , @input_data_1_name = N'rentals'
        , @output_data_1_name = N'rental_predictions'
        , @params = N'@model varbinary(max)'
        , @model = @model
    WITH RESULT SETS(("RentalCount_Predicted" FLOAT));
END;
GO

Execute the model with new data

Now you can use the stored procedure predict_rentalcount_new to predict the rental count from new data.

-- Use the predict_rentalcount_new stored procedure with the model name and a set of features to predict the rental count
EXECUTE dbo.predict_rentalcount_new @model_name = 'DTree'
    , @input_query = '
        SELECT CONVERT(INT,  3) AS Month
             , CONVERT(INT, 24) AS Day
             , CONVERT(INT,  4) AS WeekDay
             , CONVERT(INT,  1) AS Snow
             , CONVERT(INT,  1) AS Holiday
        ';
GO

You should see a result similar to the following.

RentalCount_Predicted
332.571428571429

You have successfully created, trained, and deployed a model in a database. You then used that model in a stored procedure to predict values based on new data.

Clean up resources

When you've finished using the TutorialDB database, delete it from your server.

Next steps

In part four of this tutorial series, you learned how to:

  • Create a stored procedure that generates the machine learning model
  • Store the model in a database table
  • Create a stored procedure that makes predictions using the model
  • Execute the model with new data

To learn more about using R in Machine Learning Services, see: