Install SQL Server Machine Learning Services on Windows

APPLIES TO: yesSQL Server (Windows only) noAzure SQL Database noAzure SQL Data Warehouse noParallel Data Warehouse

Starting in SQL Server 2017, R and Python support for in-database analytics is provided in SQL Server Machine Learning Services, the successor to SQL Server R Services introduced in SQL Server 2016. Function libraries are available in R and Python and run as external script on a database engine instance.

This article explains how to install the machine learning component by running the SQL Server setup wizard, and following the on-screen prompts.

Pre-install checklist

  • SQL Server 2017 (or greater) Setup is required if you want to install Machine Learning Services with R, Python, or Java language support. If instead you have SQL Server 2016 installation media, you can install SQL Server 2016 R Services (In-Database) to get R language support.

  • A database engine instance is required. You cannot install just R or Python features, although you can add them incrementally to an existing instance.

  • Installing Machine Learning Services is not supported on a failover cluster in SQL Server 2017. However, it is supported with SQL Server 2019.

  • Do not install Machine Learning Services on a domain controller. The Machine Learning Services portion of setup will fail.

  • Do not install Shared Features > Machine Learning Server (Standalone) on the same computer running an in-database instance. A standalone server will compete for the same resources, undermining the performance of both installations.

  • Side-by-side installation with other versions of R and Python is supported but not recommended. It's supported because SQL Server instance uses its own copies of the open-source R and Anaconda distributions. But it's not recommended because running code that uses R and Python on the SQL Server computer outside SQL Server can lead to various problems:

    • You use a different library and different executable, and get different results, than you do when you are running in SQL Server.
    • R and Python scripts running in external libraries cannot be managed by SQL Server, leading to resource contention.

Important

After setup is complete, be sure to complete the post-configuration steps described in this article. These steps include enabling SQL Server to use external scripts, and adding accounts required for SQL Server to run R and Python jobs on your behalf. Configuration changes generally require a restart of the instance, or a restart of the Launchpad service.

Get the installation media

The download location for SQL Server depends on the edition:

  • SQL Server Enterprise, Standard, and Express Editions are licensed for production use. For Enterprise and Standard Editions, contact your software vendor for the installation media. You can find purchasing information and a directory of Microsoft partners on the Microsoft purchasing website.

  • Free editions are available at SQL Server Downloads.

Run Setup

For local installations, you must run Setup as an administrator. If you install SQL Server from a remote share, you must use a domain account that has read and execute permissions on the remote share.

  1. Start the setup wizard for SQL Server 2017. You can download

  2. On the Installation tab, select New SQL Server stand-alone installation or add features to an existing installation.

    Install Machine Learning Services in-database

  3. On the Feature Selection page, select these options:

    • Database Engine Services

      To use R and Python with SQL Server, you must install an instance of the database engine. You can use either a default or a named instance.

    • Machine Learning Services (In-Database)

      This option installs the database services that support R and Python script execution.

    • R

      Check this option to add the Microsoft R packages, interpreter, and open-source R.

    • Python

      Check this option to add the Microsoft Python packages, the Python 3.5 executable, and select libraries from the Anaconda distribution.

      Feature options for R and Python

      Note

      Do not select the option for Machine Learning Server (Standalone). The option to install Machine Learning Server under Shared Features is intended for use on a separate computer.

  4. On the Consent to Install R page, select Accept. This license agreement covers Microsoft R Open, which includes a distribution of the open-source R base packages and tools, together with enhanced R packages and connectivity providers from the Microsoft development team.

  5. On the Consent to Install Python page, select Accept. The Python open-source licensing agreement also covers Anaconda and related tools, plus some new Python libraries from the Microsoft development team.

    Agreement to Python license

    Note

    If the computer you are using does not have internet access, you can pause setup at this point to download the installers separately. For more information, see Install machine learning components without internet access.

    Select Accept, wait until the Next button becomes active, and then select Next.

  6. On the Ready to Install page, verify that these selections are included, and select Install.

    • Database Engine Services
    • Machine Learning Services (In-Database)
    • R or Python, or both

      Note of the location of the folder under the path ..\Setup Bootstrap\Log where the configuration files are stored. When setup is complete, you can review the installed components in the Summary file.

  7. After setup is complete, if you are instructed to restart the computer, do so now. It is important to read the message from the Installation Wizard when you have finished with Setup. For more information, see View and Read SQL Server Setup Log Files.

Enable script execution

  1. Open SQL Server Management Studio.

    Tip

    You can download and install the appropriate version from this page: Download SQL Server Management Studio (SSMS).

    You can also try out the preview release of Azure Data Studio, which supports administrative tasks and queries against SQL Server.

  2. Connect to the instance where you installed Machine Learning Services, click New Query to open a query window, and run the following command:

    sp_configure
    

    The value for the property, external scripts enabled, should be 0 at this point. That is because the feature is turned off by default. The feature must be explicitly enabled by an administrator before you can run R or Python scripts.

  3. To enable the external scripting feature, run the following statement:

    EXEC sp_configure  'external scripts enabled', 1
    RECONFIGURE WITH OVERRIDE
    

    If you have already enabled the feature for the R language, don't run reconfigure a second time for Python. The underlying extensibility platform supports both languages.

Restart the service

When the installation is complete, restart the database engine before continuing to the next, enabling script execution.

Restarting the service also automatically restarts the related SQL Server Trusted Launchpad service.

You can restart the service using the right-click Restart command for the instance in SSMS, or by using the Services panel in Control Panel, or by using SQL Server Configuration Manager.

Verify installation

Use the following steps to verify that all components used to launch external script are running.

  1. In SQL Server Management Studio, open a new query window, and run the following command:

    EXEC sp_configure  'external scripts enabled'
    

    The run_value should now be set to 1.

  2. Open the Services panel or SQL Server Configuration Manager, and verify SQL Server Launchpad service is running. You should have one service for every database engine instance that has R or Python installed. For more information about the service, see Extensibility framework.

  3. If Launchpad is running, you should be able to run simple R and Python scripts to verify that external scripting runtimes can communicate with SQL Server.

    Open a new Query window in SQL Server Management Studio, and then run a script such as the following:

    • For R

      EXEC sp_execute_external_script  @language =N'R',
      @script=N'
      OutputDataSet <- InputDataSet;
      ',
      @input_data_1 =N'SELECT 1 AS hello'
      WITH RESULT SETS (([hello] int not null));
      GO
      
    • For Python

      EXEC sp_execute_external_script  @language =N'Python',
      @script=N'
      OutputDataSet = InputDataSet;
      ',
      @input_data_1 =N'SELECT 1 AS hello'
      WITH RESULT SETS (([hello] int not null));
      GO
      

    Results

    The script can take a little while to run, the first time the external script runtime is loaded. The results should be something like this:

    hello
    1

Note

Columns or headings used in the Python script are not returned, by design. To add column names for your output, you must specify the schema for the return data set. Do this by using the WITH RESULTS parameter of the stored procedure, naming the columns and specifying the SQL data type.

For example, you can add the following line to generate an arbitrary column name: WITH RESULT SETS ((Col1 AS int))

Additional configuration

If the external script verification step was successful, you can run R or Python commands from SQL Server Management Studio, Visual Studio Code, or any other client that can send T-SQL statements to the server.

If you got an error when running the command, review the additional configuration steps in this section. You might need to make additional appropriate configurations to the service or database.

At the instance level, additional configuration might include:

On the database, you might need the following configuration updates:

Note

Whether additional configuration is required depends on your security schema, where you installed SQL Server, and how you expect users to connect to the database and run external scripts.

Enable implied authentication for SQL Restricted User Group (SQLRUserGroup) account group

If you need to run scripts from a remote data science client, and you are using Windows authentication, additional configuration is required to give worker accounts running R and Python processes permission to sign in to the SQL Server instance on your behalf. This behavior is called implied authentication, and is implemented by the database engine to support secure execution of external scripts in SQL Server 2016 and SQL Server 2017.

Note

If you use a SQL login for running scripts in a SQL Server compute context, this extra step is not required.

  1. In SQL Server Management Studio, in Object Explorer, expand Security. Then right-click Logins, and select New Login.
  2. In the Login - New dialog box, select Search.
  3. Select Object Types, and select Groups. Clear everything else.
  4. In Enter the object name to select, type SQLRUserGroup, and select Check Names.
  5. The name of the local group associated with the instance's Launchpad service should resolve to something like instancename\SQLRUserGroup. Select OK.
  6. By default, the group is assigned to the public role, and has permission to connect to the database engine.
  7. Select OK.

In SQL Server 2017 and earlier, a number of local Windows user accounts are created for the purpose of running tasks under the security token of the SQL Server Trusted Launchpad service. You can view these accounts in the Windows user group, SQLRUserGroup. By default, 20 worker accounts are created, which is usually more than enough for running external script jobs.

These accounts are used as follows. When a user sends a Python or R script from an external client, SQL Server activates an available worker account, maps it to the identity of the calling user, and runs the script on behalf of the user. If the script, which is executing external to SQL Server, has to retrieve data or resources from SQL Server, the connection back to SQL Server requires a log in. Creating a database login for SQLRUserGroup also the connection to succeed.

In SQL Server 2019, worker accounts are replaced with AppContainers, with processes executing under the SQL Server Launchpad service. Although the worker accounts are no longer used, you are still required to add a database login for SQLRUsergroup if implied authentication is needed. Just as the worker accounts did not have login permission, the Launchpad service identity does not either. Creating a login for SQLRUserGroup, which consists of the Launchpad service in this release, allows implied authentication to work.

Give users permission to run external scripts

If you installed SQL Server yourself, and you are running R or Python scripts in your own instance, you typically execute scripts as an administrator. Thus, you have implicit permission over various operations and all data in the database.

Most users, however, do not have such elevated permissions. For example, users in an organization who use SQL logins to access the database generally do not have elevated permissions. Therefore, for each user who is using R or Python, you must grant users of Machine Learning Services the permission to run external scripts in each database where the language is used. Here's how:

USE <database_name>
GO
GRANT EXECUTE ANY EXTERNAL SCRIPT  TO [UserName]

Note

Permissions are not specific to the supported script language. In other words, there are not separate permission levels for R script versus Python script. If you need to maintain separate permissions for these languages, install R and Python on separate instances.

Give your users read, write, or data definition language (DDL) permissions to databases

While a user is running scripts, the user might need to read data from other databases. The user might also need to create new tables to store results, and write data into tables.

For each Windows user account or SQL login that is running R or Python scripts, ensure that it has the appropriate permissions on the specific database: db_datareader, db_datawriter, or db_ddladmin.

For example, the following Transact-SQL statement gives the SQL login MySQLLogin the rights to run T-SQL queries in the ML_Samples database. To run this statement, the SQL login must already exist in the security context of the server.

USE ML_Samples
GO
EXEC sp_addrolemember 'db_datareader', 'MySQLLogin'

For more information about the permissions included in each role, see Database-level roles.

Create an ODBC data source for the instance on your data science client

You might create a machine learning solution on a data science client computer. If you need to run code by using the SQL Server computer as the compute context, you have two options: access the instance by using a SQL login, or by using a Windows account.

  • For SQL logins: Ensure that the login has appropriate permissions on the database where you are reading data. You can do this by adding Connect to and SELECT permissions, or by adding the login to the db_datareader role. To create objects, assign DDL_admin rights. If you must save data to tables, add to the db_datawriter role.

  • For Windows authentication: You might need to create an ODBC data source on the data science client that specifies the instance name and other connection information. For more information, see ODBC data source administrator.

Suggested optimizations

Now that you have everything working, you might also want to optimize the server to support machine learning, or install pretrained models.

Add more worker accounts

If you expect many users to be running scripts concurrently, you can increase the number of worker accounts that are assigned to the Launchpad service. For more information, see Modify the user account pool for SQL Server Machine Learning Services.

Optimize the server for script execution

The default settings for SQL Server setup are intended to optimize the balance of the server for a variety of services that are supported by the database engine, which might include extract, transform, and load (ETL) processes, reporting, auditing, and applications that use SQL Server data. Therefore, under the default settings, you might find that resources for machine learning are sometimes restricted or throttled, particularly in memory-intensive operations.

To ensure that machine learning jobs are prioritized and resourced appropriately, we recommend that you use SQL Server Resource Governor to configure an external resource pool. You might also want to change the amount of memory that's allocated to the SQL Server database engine, or increase the number of accounts that run under the SQL Server Trusted Launchpad service.

If you are using Standard Edition and do not have Resource Governor, you can use Dynamic Management Views (DMVs) and Extended Events, as well as Windows event monitoring, to help manage the server resources. For more information, see Monitoring and managing R Services and Monitoring and managing Python Services.

Install additional R packages

The R solutions you create for SQL Server can call basic R functions, functions from the proprietary packages installed with SQL Server, and third-party R packages compatible with the version of open-source R installed by SQL Server.

Packages that you want to use from SQL Server must be installed in the default library that is used by the instance. If you have a separate installation of R on the computer, or if you installed packages to user libraries, you won't be able to use those packages from T-SQL.

The process for installing and managing R packages is different in SQL Server 2016 and SQL Server 2017. In SQL Server 2016, a database administrator must install R packages that users need. In SQL Server 2017, you can set up user groups to share packages on a per-database level, or configure database roles to enable users to install their own packages. For more information, see Install new R packages in SQL Server.

Get help

Need help with installation or upgrade? For answers to common questions and known issues, see the following article:

To check the installation status of the instance and fix common issues, try these custom reports.

Next steps

R developers can get started with some simple examples, and learn the basics of how R works with SQL Server. For your next step, see the following links:

Python developers can learn how to use Python with SQL Server by following these tutorials:

To view examples of machine learning that are based on real-world scenarios, see Machine learning tutorials.