Install pre-trained machine learning models on SQL Server

Applies to: SQL Server 2016 (13.x) and later

This article explains how to use PowerShell to add free pre-trained machine learning models for sentiment analysis and image featurization to a SQL Server instance that has R or Python integration. The pre-trained models are built by Microsoft, are ready to use, and are added to an instance as a post-install task. For more information about these models, see the Research and resources section of this article.

Once installed, the pre-trained models are an implementation detail that powers specific functions in the MicrosoftML (R) and microsoftml (Python) libraries. You should not (and cannot) view, customize, or retrain the models, nor can you treat them as independent resources in custom code or pair them with other functions.

To use the pre-trained models, call the functions listed in the following table.

R function (MicrosoftML) | Python function (microsoftml) | Usage
getSentiment | get_sentiment | Generates a positive-negative sentiment score over text inputs.
featurizeImage | featurize_image | Extracts features from image file inputs.

Prerequisites

Machine learning algorithms are computationally intensive. We recommend 16 GB of RAM for low-to-moderate workloads, including completion of the tutorial walkthroughs using all of the sample data.

You must have administrator rights on the computer and on SQL Server to add pre-trained models.

External scripts must be enabled, and the SQL Server LaunchPad service must be running. The installation instructions provide the steps for enabling and verifying these capabilities.

The pre-trained models are installed into the MicrosoftML R package or the microsoftml Python package.

SQL Server Machine Learning Services includes both language versions of the machine learning library, so this prerequisite is met with no further action on your part. Because the libraries are present, you can use the PowerShell script described in this article to add the pre-trained models to these libraries.

On SQL Server R Services, the pre-trained models are installed into the MicrosoftML R package.

SQL Server R Services, which is R only, does not include the MicrosoftML package out of the box. To add MicrosoftML, you must do a component upgrade. One advantage of the component upgrade is that you can add the pre-trained models at the same time, which makes running the PowerShell script unnecessary. However, if you already upgraded but did not add the pre-trained models at that time, you can run the PowerShell script as described in this article. It works for both versions of SQL Server. Before you do, confirm that the MicrosoftML library exists at C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\R_SERVICES\library.

Check whether pre-trained models are installed

The install paths for R and Python models are as follows:

  • For R: C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\R_SERVICES\library\MicrosoftML\mxLibs\x64

  • For Python: C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\Lib\site-packages\microsoftml\mxLibs

The model file names are listed below:

  • AlexNet_Updated.model
  • ImageNet1K_mean.xml
  • pretrained.model
  • ResNet_101_Updated.model
  • ResNet_18_Updated.model
  • ResNet_50_Updated.model

If the models are already installed, skip ahead to the validation step to confirm availability.
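
The file check described above can also be scripted. The following is a minimal sketch, not part of the installation script: the helper name `missing_model_files` is hypothetical, and the path in the usage section assumes the default instance folder shown above, so adjust it for your own instance.

```python
from pathlib import Path

# Model file names, taken from the list in this article.
MODEL_FILES = [
    "AlexNet_Updated.model",
    "ImageNet1K_mean.xml",
    "pretrained.model",
    "ResNet_101_Updated.model",
    "ResNet_18_Updated.model",
    "ResNet_50_Updated.model",
]


def missing_model_files(mxlibs_dir):
    """Return the expected model file names that are absent from mxlibs_dir.

    Hypothetical helper for illustration only.
    """
    folder = Path(mxlibs_dir)
    return [name for name in MODEL_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # Assumption: default instance with R integration; change as needed.
    r_dir = (r"C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER"
             r"\R_SERVICES\library\MicrosoftML\mxLibs\x64")
    missing = missing_model_files(r_dir)
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All pre-trained model files are present.")
```

An empty result suggests the models are already in place; a non-empty result lists the files that the installation script would still need to add.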

Download the installation script

Go to https://aka.ms/mlm4sql to download the file Install-MLModels.ps1.

Execute with elevated privileges

  1. Start PowerShell. On the taskbar, right-click the PowerShell program icon and select Run as administrator.

  2. Enter the fully qualified path to the installation script file and include the instance name. Assuming the Downloads folder and a default instance, the command might look like this:

    PS C:\WINDOWS\system32> C:\Users\<user-name>\Downloads\Install-MLModels.ps1 MSSQLSERVER
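
If you prefer to drive the script from automation, the command line above can be assembled programmatically. This is only a sketch under stated assumptions: the helper name `build_install_command` is hypothetical, the instance name is passed as the single positional argument exactly as in the command above, and `-NoProfile` and `-File` are standard `powershell.exe` parameters. The process that launches the command must itself be running with administrator rights.

```python
from pathlib import Path


def build_install_command(script_path, instance_name="MSSQLSERVER"):
    """Build the argument list that runs Install-MLModels.ps1 for one instance.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    return [
        "powershell.exe",
        "-NoProfile",             # skip user profile scripts
        "-File", str(script_path),
        instance_name,            # default instance is MSSQLSERVER
    ]


if __name__ == "__main__":
    # Assumption: the script was saved to the current user's Downloads folder.
    script = Path.home() / "Downloads" / "Install-MLModels.ps1"
    print(build_install_command(script))
```

To execute the command, pass the returned list to `subprocess.run(..., check=True)` from an elevated session.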
    

Output

On an internet-connected SQL Server Machine Learning Services default instance with R and Python, you should see messages similar to the following.

MSSQL14.MSSQLSERVER
     Verifying R models [9.2.0.24]
     Downloading R models [C:\Users\<user-name>\AppData\Local\Temp]
     Installing R models [C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\R_SERVICES\]
     Verifying Python models [9.2.0.24]
     Installing Python models [C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\]
PS C:\WINDOWS\system32>

Verify installation

First, check for the new files in the mxLibs folder. Next, run demo code to confirm that the models are installed and functional.

R verification steps

  1. Start RGUI.EXE at C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\R_SERVICES\bin\x64.

  2. Paste the following R script at the command prompt.

    # Load MicrosoftML, which provides rxFeaturize and getSentiment
    library(MicrosoftML)
    
    # Create the data
    CustomerReviews <- data.frame(Review = c(
    "I really did not like the taste of it",
    "It was surprisingly quite good!",
    "I will never ever ever go to that place again!!"),
    stringsAsFactors = FALSE)
    
    # Get the sentiment scores
    sentimentScores <- rxFeaturize(data = CustomerReviews, 
                                    mlTransforms = getSentiment(vars = list(SentimentScore = "Review")))
    
    # Let's translate the score to something more meaningful
    sentimentScores$PredictedRating <- ifelse(sentimentScores$SentimentScore > 0.6, 
                                            "AWESOMENESS", "BLAH")
    
    # Let's look at the results
    sentimentScores
    
  3. Press Enter to view the sentiment scores. The output should look like this:

    > sentimentScores
                                            Review SentimentScore
    1           I really did not like the taste of it      0.4617899
    2                 It was surprisingly quite good!      0.9601924
    3 I will never ever ever go to that place again!!      0.3103435
    PredictedRating
    1            BLAH
    2     AWESOMENESS
    3            BLAH
    

Python verification steps

  1. Start Python.exe at C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES.

  2. Paste the following Python script at the command prompt.

    import pandas
    from microsoftml import rx_featurize, get_sentiment
    
    # Create the data
    customer_reviews = pandas.DataFrame(data=dict(review=[
                "I really did not like the taste of it",
                "It was surprisingly quite good!",
                "I will never ever ever go to that place again!!"]))
    
    # Get the sentiment scores
    sentiment_scores = rx_featurize(
        data=customer_reviews,
        ml_transforms=[get_sentiment(cols=dict(scores="review"))])
    
    # Let's translate the score to something more meaningful
    sentiment_scores["eval"] = sentiment_scores.scores.apply(
                lambda score: "AWESOMENESS" if score > 0.6 else "BLAH")
    print(sentiment_scores)
    
  3. Press Enter to print the scores. The output should look like this:

    >>> print(sentiment_scores)
                                                review    scores         eval
    0            I really did not like the taste of it  0.461790         BLAH
    1                  It was surprisingly quite good!  0.960192  AWESOMENESS
    2  I will never ever ever go to that place again!!  0.310344         BLAH
    >>>
    

Note

If the demo scripts fail, check the file location first. On systems that have multiple instances of SQL Server, or for instances that run side by side with standalone versions, the installation script can misread the environment and place the files in the wrong location. Usually, manually copying the files to the correct mxLibs folder fixes the problem.
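
The manual copy can be sketched as follows. This is an illustrative assumption rather than a supported tool: the helper name `copy_model_files` is hypothetical, both folder paths must be supplied by you, and the script needs write access to the target mxLibs folder (typically an elevated session).

```python
import shutil
from pathlib import Path

# Model file names, taken from the list earlier in this article.
MODEL_FILES = [
    "AlexNet_Updated.model",
    "ImageNet1K_mean.xml",
    "pretrained.model",
    "ResNet_101_Updated.model",
    "ResNet_18_Updated.model",
    "ResNet_50_Updated.model",
]


def copy_model_files(source_dir, target_dir):
    """Copy any known model files found in source_dir into target_dir.

    Hypothetical helper. Returns the names of the files that were copied.
    """
    source, target = Path(source_dir), Path(target_dir)
    if not target.is_dir():
        # Refuse to guess: the correct mxLibs folder should already exist.
        raise NotADirectoryError(f"Target folder does not exist: {target}")
    copied = []
    for name in MODEL_FILES:
        candidate = source / name
        if candidate.is_file():
            shutil.copy2(candidate, target / name)  # keeps file timestamps
            copied.append(name)
    return copied
```

Call `copy_model_files(wrong_folder, correct_mxlibs_folder)` with the folder where the script placed the files and the mxLibs folder of the instance you intend to use, then rerun the demo script.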

Examples using pre-trained models

The following link includes example code that invokes the pre-trained models.

Research and resources

The models currently available are deep neural network (DNN) models for sentiment analysis and image classification. All pre-trained models were trained by using Microsoft's Computational Network Toolkit (CNTK).

The configuration of each network was based on the following reference implementations:

  • ResNet-18
  • ResNet-50
  • ResNet-101
  • AlexNet

For more information about the algorithms used in these deep learning models, and how they are implemented and trained using CNTK, see these articles:

See also