Create and run simple R scripts in Azure SQL Database Machine Learning Services (preview)
In this quickstart, you'll create and run a set of simple R scripts using the public preview of Machine Learning Services (with R) in Azure SQL Database. You'll learn how to wrap a well-formed R script in the stored procedure sp_execute_external_script and execute the script in a SQL database.
Azure SQL Database Machine Learning Services is currently in preview. Sign up for the preview is currently closed.
If you don't have an Azure subscription, create an account before you begin.
To run the example code in these exercises, you must first have an Azure SQL database with Machine Learning Services (with R) enabled. During the public preview, Microsoft will onboard you and enable machine learning for your existing or new database. Follow the steps in Sign up for the preview.
Make sure you've installed the latest SQL Server Management Studio (SSMS). You can run R scripts using other database management or query tools, but in this quickstart you'll use SSMS.
This quickstart requires that you configure a server-level firewall rule. For information on how to do this, see Create server-level firewall rule.
Run a simple script
To run an R script, you'll pass it as an argument to the system stored procedure, sp_execute_external_script.
In the following steps, you'll run this example R script in your SQL database:
a <- 1 b <- 2 c <- a/b d <- a*b print(c(c, d))
Open SQL Server Management Studio and connect to your SQL database.
If you need help connecting, see Quickstart: Use SQL Server Management Studio to connect and query an Azure SQL database.
Pass the complete R script to the sp_execute_external_script stored procedure.
The script is passed through the
@scriptargument. Everything inside the
@scriptargument must be valid R code.
EXECUTE sp_execute_external_script @language = N'R' , @script = N' a <- 1 b <- 2 c <- a/b d <- a*b print(c(c, d)) '
If you get any errors, it might be because the public preview of Machine Learning Services (with R) is not enabled for your SQL database. See Prerequisites above.
If you're an administrator, you can run external code automatically. You can grant permission to other users using the command:
GRANT EXECUTE ANY EXTERNAL SCRIPT TO <username>.
The correct result is calculated and the R
It should look something like this.
STDOUT message(s) from external script: 0.5 2
Run a Hello World script
A typical example script is one that just outputs the string "Hello World". Run the following command.
EXECUTE sp_execute_external_script @language = N'R' , @script = N'OutputDataSet<-InputDataSet' , @input_data_1 = N'SELECT 1 AS hello' WITH RESULT SETS(([Hello World] INT)); GO
Inputs to this stored procedure include:
|@language||defines the language extension to call, in this case, R|
|@script||defines the commands passed to the R runtime. Your entire R script must be enclosed in this argument, as Unicode text. You could also add the text to a variable of type nvarchar and then call the variable|
|@input_data_1||data returned by the query, passed to the R runtime, which returns the data to SQL Server as a data frame|
|WITH RESULT SETS||clause defines the schema of the returned data table for SQL Server, adding "Hello World" as the column name, int for the data type|
The command outputs the following text:
Use inputs and outputs
By default, sp_execute_external_script accepts a single dataset as input, which typically you supply in the form of a valid SQL query. It then returns a single R data frame as output.
For now, let's use the default input and output variables of sp_execute_external_script: InputDataSet and OutputDataSet.
Create a small table of test data.
CREATE TABLE RTestData (col1 INT NOT NULL) INSERT INTO RTestData VALUES (1); INSERT INTO RTestData VALUES (10); INSERT INTO RTestData VALUES (100); GO
SELECTstatement to query the table.
SELECT * FROM RTestData
Run the following R script. It retrieves the data from the table using the
SELECTstatement, passes it through the R runtime, and returns the data as a data frame. The
WITH RESULT SETSclause defines the schema of the returned data table for SQL Database, adding the column name NewColName.
EXECUTE sp_execute_external_script @language = N'R' , @script = N'OutputDataSet <- InputDataSet;' , @input_data_1 = N'SELECT * FROM RTestData;' WITH RESULT SETS(([NewColName] INT NOT NULL));
Now let's change the names of the input and output variables. The default input and output variable names are InputDataSet and OutputDataSet, this script changes the names to SQL_in and SQL_out:
EXECUTE sp_execute_external_script @language = N'R' , @script = N' SQL_out <- SQL_in;' , @input_data_1 = N' SELECT 12 as Col;' , @input_data_1_name = N'SQL_in' , @output_data_1_name = N'SQL_out' WITH RESULT SETS(([NewColName] INT NOT NULL));
Note that R is case-sensitive. The input and output variables used in the R script (SQL_out, SQL_in) need to match the values defined with
@output_data_1_name, including case.
Only one input dataset can be passed as a parameter, and you can return only one dataset. However, you can call other datasets from inside your R code and you can return outputs of other types in addition to the dataset. You can also add the OUTPUT keyword to any parameter to have it returned with the results.
You also can generate values just using the R script with no input data (
@input_data_1is set to blank).
The following script outputs the text "hello" and "world".
EXECUTE sp_execute_external_script @language = N'R' , @script = N' mytextvariable <- c("hello", " ", "world"); OutputDataSet <- as.data.frame(mytextvariable); ' , @input_data_1 = N'' WITH RESULT SETS(([Col1] CHAR(20) NOT NULL));
Check R version
If you would like to see which version of R is installed in your SQL database, run the following script.
EXECUTE sp_execute_external_script @language = N'R' , @script = N'print(version)'; GO
STDOUT message(s) from external script: _ platform x86_64-w64-mingw32 arch x86_64 os mingw32 system x86_64, mingw32 status major 3 minor 4.4 year 2018 month 03 day 15 svn rev 74408 language R version.string R version 3.4.4 (2018-03-15) nickname Someone to Lean On
List R packages
Microsoft provides a number of R packages pre-installed with Machine Learning Services in your SQL database.
To see a list of which R packages are installed, including version, dependencies, license, and library path information, run the following script.
EXEC sp_execute_external_script @language = N'R' , @script = N' OutputDataSet <- data.frame(installed.packages()[,c("Package", "Version", "Depends", "License", "LibPath")]);' WITH result sets(( Package NVARCHAR(255) , Version NVARCHAR(100) , Depends NVARCHAR(4000) , License NVARCHAR(1000) , LibPath NVARCHAR(2000) ));
The output is from
installed.packages() in R and is returned as a result set.
To create a machine learning model using R in SQL Database, follow this quickstart:
For more information on Azure SQL Database Machine Learning Services with R (preview), see the following articles.