使用 RevoScaleR 安装 R 包Use RevoScaleR to install R packages

适用于:Applies to: 是SQL Server 2016 (13.x)SQL Server 2016 (13.x)yesSQL Server 2016 (13.x)SQL Server 2016 (13.x) 是SQL Server 2017 (14.x)SQL Server 2017 (14.x)yesSQL Server 2017 (14.x)SQL Server 2017 (14.x)适用于:Applies to: 是SQL Server 2016 (13.x)SQL Server 2016 (13.x)yesSQL Server 2016 (13.x)SQL Server 2016 (13.x) 是SQL Server 2017 (14.x)SQL Server 2017 (14.x)yesSQL Server 2017 (14.x)SQL Server 2017 (14.x)

本文介绍如何使用 RevoScaleR(9.0.1 版及更高版本)函数在附带机器学习服务或 R Services 的 SQL Server 上安装 R 包。This article describes how to use RevoScaleR (version 9.0.1 and later) functions to install R packages on SQL Server with Machine Learning Services or R Services. 远程非管理员可以使用 RevoScaleR 函数在 SQL Server 上安装包,而无需直接访问服务器。The RevoScaleR functions can be used by remote, non-administrators to install packages on SQL Server without direct access to the server.

备注

SQL Server R Services 客户必须执行组件升级才能获取 RevoScaleR 包管理函数。SQL Server R Services customers must do a component upgrade to get RevoScaleR package management functions. 有关如何检索包版本和内容的说明,请参阅获取 R 包信息For instructions on how to retrieve package version and contents, see Get R package information.

用于包管理的 RevoScaleR 函数RevoScaleR functions for package management

下表描述了用于 R 包安装和管理的函数。The following table describes the functions used for R package installation and management.

函数Function 说明Description
rxSqlLibPathsrxSqlLibPaths 确定远程 SQL Server 上实例库的路径。Determine the path of the instance library on the remote SQL Server.
rxFindPackagerxFindPackage 获取远程 SQL Server 上的一个或多个包的路径。Gets the path for one or more packages on the remote SQL Server.
rxInstallPackagesrxInstallPackages 从远程 R 客户端调用此函数,从指定存储库或通过读取本地保存的压缩包,将包安装到 SQL Server 计算上下文。Call this function from a remote R client to install packages in a SQL Server compute context, either from a specified repository, or by reading locally saved zipped packages. 此函数检查依赖项并确保任何相关包可安装到 SQL Server,就像本地计算上下文中的 R 包安装一样。This function checks for dependencies and ensures that any related packages can be installed to SQL Server, just like R package installation in the local compute context. 若要使用此选项,必须在服务器和数据库上启用包管理。To use this option, you must have enabled package management on the server and database. 客户端和服务器环境必须具有相同版本的 RevoScaleR。Both client and server environments must have the same version of RevoScaleR.
rxInstalledPackagesrxInstalledPackages 在指定的计算上下文中获取安装的包的列表。Gets a list of packages installed in the specified compute context.
rxSyncPackagesrxSyncPackages 对于指定的计算上下文,在文件系统和数据库之间复制有关包库的信息。Copy information about a package library between the file system and database, for the specified compute context.
rxRemovePackagesrxRemovePackages 删除指定计算上下文中的包。Removes packages from a specified compute context. 它还会计算依赖项,并确保删除 SQL Server 上其他包不再使用的包以释放资源。It also computes dependencies and ensures that packages that are no longer used by other packages on SQL Server are removed, to free up resources.

必备条件Prerequisites

  • 在 SQL Server 上启用远程管理。Remote management enabled on SQL Server. 有关详细信息,请参阅在 SQL Server 上启用远程 R 包管理For more information, see Enable remote R package management on SQL Server.

  • 客户端和服务器环境中的 RevoScaleR 版本相同。RevoScaleR versions are the same on both client and server environments. 有关详细信息,请参阅获取 R 包信息For more information, see Get R package information.

  • 你有权连接到服务器和数据库,以及运行 R 命令。You have permission to connect to the server and a database, and to run R commands. 你必须是数据库角色的成员,该角色允许你在指定的实例和数据库上安装包。You must be a member of a database role that allows you to install packages on the specified instance and database.

    • 共享范围 中的包可以由属于指定数据库中 rpkgs-shared 角色的用户进行安装。Packages in shared scope can be installed by users belonging to the rpkgs-shared role in a specified database. 此角色中的所有用户都可以卸载共享包。All users in this role can uninstall shared packages.

    • 专用范围 中的包可以由属于数据库中 rpkgs-private 角色的任何用户进行安装。Packages in private scope can be installed by any user belonging to the rpkgs-private role in a database. 但是,用户只能查看并卸载自己的包。However, users can see and uninstall only their own packages.

    • 数据库所有者可以使用共享包或专用包。Database owners can work with shared or private packages.

客户端连接Client connections

在同一网络上,客户端工作站可以是 Microsoft R ClientMicrosoft Machine Learning Server(数据科学家经常使用免费开发人员版)。A client workstation can be Microsoft R Client or a Microsoft Machine Learning Server (data scientists often use the free developer edition) on the same network.

从远程 R 客户端调用包管理函数时,必须先使用 RxInSqlServer 函数创建计算上下文对象。When calling package management functions from a remote R client, you must create a compute context object first, using the RxInSqlServer function. 此后,针对你使用的每个包管理函数,将计算上下文作为参数传递。Thereafter, for each package management function that you use, pass the compute context as an argument.

用户标识通常在设置计算上下文时指定。User identity is typically specified when setting the compute context. 如果在创建计算上下文时未指定用户名和密码,则将使用运行 R 代码的用户标识。If you don't specify a user name and password when you create the compute context, the identity of the user running the R code is used.

  1. 在 R 命令行中,定义实例和数据库的连接字符串。From an R command line, define a connection string to the instance and database.

  2. 利用连接字符串,使用 RxInSqlServer 构造函数来定义 SQL Server 计算上下文。Use the RxInSqlServer constructor to define a SQL Server compute context, using the connection string.

    sqlcc <- RxInSqlServer(connectionString = myConnString, shareDir = sqlShareDir, wait = sqlWait, consoleOutput = sqlConsoleOutput)
    
  3. 创建要安装的包列表,并将列表保存在字符串变量中。Create a list of the packages you want to install and save the list in a string variable.

    packageList <- c("e1071", "mice")
    
  4. 调用 rxInstallPackages,并传递计算上下文和包含包名称的字符串变量。Call rxInstallPackages and pass the compute context and the string variable containing the package names.

    rxInstallPackages(pkgs = packageList, verbose = TRUE, computeContext = sqlcc)
    

    如果需要相关的包,也会安装它们,前提是客户端上有 Internet 连接。If dependent packages are required, they are also installed, assuming an internet connection is available on the client.

    使用建立连接的用户凭据,在该用户的默认范围安装包。Packages are installed using the credentials of the user making the connection, in the default scope for that user.

在存储过程中调用包管理函数Call package management functions in stored procedures

可以在 sp_execute_external_script 中运行包管理函数。You can run package management functions inside sp_execute_external_script. 执行此操作时,将使用存储过程调用方的安全上下文执行该函数。When you do so, the function is executed using the security context of the stored procedure caller.

示例Examples

本节将提供一些示例,演示在以计算上下文形式连接到 SQL Server 实例或数据库时,如何从远程客户端使用这些函数。This section provides examples of how to use these functions from a remote client when connecting to a SQL Server instance or database as the compute context.

对于所有示例,必须提供连接字符串或需要连接字符串的计算上下文。For all examples, you must provide either a connection string, or a compute context, which requires a connection string. 此示例提供了一种为 SQL Server 创建计算上下文的方法:This example provides one way to create a compute context for SQL Server:

instance_name <- "computer-name/instance-name";
database_name <- "TestDB";
sqlWait= TRUE;
sqlConsoleOutput <- TRUE;
connString <- paste("Driver=SQL Server;Server=", instance_name, ";Database=", database_name, ";Trusted_Connection=true;", sep="");
sqlcc <- RxInSqlServer(connectionString = connString, wait = sqlWait, consoleOutput = sqlConsoleOutput, numTasks = 4);

根据服务器所在的位置和安全模型,你可能需要在连接字符串中提供域和子网规范,或使用 SQL 登录名。Depending on where the server is located, and the security model, you might need to provide a domain and subnet specification in the connection string, or use a SQL login. 例如:For example:

connStr <- "Driver=SQL Server;Server=myserver.financeweb.contoso.com;Database=Finance;Uid=RUser1;Pwd=RUserPassword"

获取远程 SQL Server 计算上下文中的包路径Get package path on a remote SQL Server compute context

此示例获取计算上下文 上,RevoScaleRsqlcc 包的路径。This example gets the path for the RevoScaleR package on the compute context, sqlcc.

sqlPackagePaths <- rxFindPackage(package = "RevoScaleR", computeContext = sqlcc)
print(sqlPackagePaths)

结果Results

"C:/Program Files/Microsoft SQL Server/MSSQL14.MSSQLSERVER/R_SERVICES/library/RevoScaleR""C:/Program Files/Microsoft SQL Server/MSSQL14.MSSQLSERVER/R_SERVICES/library/RevoScaleR"

提示

如果已启用了查看 SQL 控制台输出的选项,则可以从 print 语句前面的函数获取状态消息。If you have enabled the option to see SQL console output, you might get status messages from the function that precedes the print statement. 测试完代码后,在计算上下文构造函数中将 consoleOutput 设置为 FALSE 以消除消息。After you have finished testing your code, set consoleOutput to FALSE in the compute context constructor to eliminate messages.

获取多个包的位置Get locations for multiple packages

以下示例获取计算上下文 上,RevoScaleR 和 latticesqlcc 包的路径。The following example gets the paths for the RevoScaleR and lattice packages, on the compute context, sqlcc. 获取有关多个包的信息时,传递包含包名称的字符串向量。To get information about multiple packages, pass a string vector containing the package names.

packagePaths <- rxFindPackage(package = c("RevoScaleR", "lattice"), computeContext = sqlcc)
print(packagePaths)

获取远程计算上下文中的包版本Get package versions on a remote compute context

从 R 控制台运行此命令来获取在计算上下文 sqlServer 中安装的包的生成号和版本号。Run this command from an R console to get the build number and version numbers for packages installed on the compute context, sqlServer.

sqlPackages <- rxInstalledPackages(fields = c("Package", "Version", "Built"), computeContext = sqlServer)

在 SQL Server 上安装包Install a package on SQL Server

此示例将 forecast 包及其依赖项安装到计算上下文。This example installs the forecast package and its dependencies into the compute context.

pkgs <- c("forecast")
rxInstallPackages(pkgs = pkgs, verbose = TRUE, scope = "private", computeContext = sqlcc)

从 SQL Server 删除包Remove a package from SQL Server

此示例将 forecast 包及其依赖项从计算上下文删除。This example removes the forecast package and its dependencies from the compute context.

pkgs <- c("forecast")
rxRemovePackages(pkgs = pkgs, verbose = TRUE, scope = "private", computeContext = sqlcc)

在数据库和文件系统之间同步包Synchronize packages between database and file system

下面的示例将检查数据库 TestDB ,并确定是否在文件系统中安装了所有包。The following example checks the database TestDB, and determines whether all packages are installed in the file system. 如果缺少某些包,则它们将安装在文件系统中。If some packages are missing, they are installed in the file system.

# Instantiate the compute context
connectionString <- "Driver=SQL Server;Server=myServer;Database=TestDB;Trusted_Connection=True;"
computeContext <- RxInSqlServer(connectionString = connectionString )

# Synchronize the packages in the file system for all scopes and users
rxSyncPackages(computeContext=computeContext, verbose=TRUE)

包同步基于每个数据库和每个用户运作。Package synchronization works on a per database and per user basis. 有关详细信息,请参阅 SQL Server 的 R 包同步For more information, see R package synchronization for SQL Server.

使用存储过程列出 SQL Server 中的包Use a stored procedure to list packages in SQL Server

从 Management Studio 或另一个支持 T-SQL 的工具运行此命令,以使用存储过程中的 rxInstalledPackages 获取当前实例中已安装包的列表。Run this command from Management Studio or another tool that supports T-SQL, to get a list of installed packages on the current instance, using rxInstalledPackages in a stored procedure.

EXEC sp_execute_external_script 
  @language=N'R', 
  @script=N'
    myPackages <- rxInstalledPackages();
    OutputDataSet <- as.data.frame(myPackages);
    '

rxSqlLibPaths 函数可用于确定 SQL Server 机器学习服务使用的活动库。The rxSqlLibPaths function can be used to determine the active library used by SQL Server Machine Learning Services. 此脚本只能返回当前服务器的库路径。This script can return only the library path for the current server.

declare @instance_name nvarchar(100) = @@SERVERNAME, @database_name nvarchar(128) = db_name();
exec sp_execute_external_script 
  @language = N'R',
  @script = N'
    connStr <- paste("Driver=SQL Server;Server=", instance_name, ";Database=", database_name, ";Trusted_Connection=true;", sep="");
    .libPaths(rxSqlLibPaths(connStr));
    print(.libPaths());
  ', 
  @input_data_1 = N'', 
  @params = N'@instance_name nvarchar(100), @database_name nvarchar(128)',
  @instance_name = @instance_name, 
  @database_name = @database_name;

另请参阅See also