Tutorial: Use R to create a machine learning model

APPLIES TO: Basic edition and Enterprise edition

In this tutorial, you'll use the Azure Machine Learning R SDK to create a logistic regression model that predicts the likelihood of a fatality in a car accident. You'll see how the Azure Machine Learning cloud resources work with R to provide a scalable environment for training and deploying a model.

In this tutorial, you perform the following tasks:

  • Create an Azure Machine Learning workspace
  • Clone a notebook folder with the files necessary to run this tutorial into your workspace
  • Open RStudio from your workspace
  • Load data and prepare for training
  • Upload data to a datastore so it is available for remote training
  • Create a compute resource to train the model remotely
  • Train a caret model to predict probability of fatality
  • Deploy a prediction endpoint
  • Test the model from R

If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning today.

Create a workspace

An Azure Machine Learning workspace is a foundational resource in the cloud that you use to experiment, train, and deploy machine learning models. It ties your Azure subscription and resource group to an easily consumed object in the service.

You create a workspace via the Azure portal, a web-based console for managing your Azure resources.

  1. Sign in to the Azure portal by using the credentials for your Azure subscription.

  2. In the upper-left corner of the Azure portal, select + Create a resource.

  3. Use the search bar to find Machine Learning.

  4. Select Machine Learning.

  5. In the Machine Learning pane, select Create to begin.

  6. Provide the following information to configure your new workspace:

    Field | Description
    Workspace name | Enter a unique name that identifies your workspace. In this example, we use docs-ws. Names must be unique across the resource group. Use a name that's easy to recall and to differentiate from workspaces created by others.
    Subscription | Select the Azure subscription that you want to use.
    Resource group | Use an existing resource group in your subscription or enter a name to create a new resource group. A resource group holds related resources for an Azure solution. In this example, we use docs-aml.
    Location | Select the location closest to your users and the data resources to create your workspace.
    Workspace edition | Select Basic as the workspace type for this tutorial. The workspace type (Basic or Enterprise) determines the features you'll have access to and the pricing. Everything in this tutorial can be performed with either a Basic or Enterprise workspace.

  7. After you finish configuring the workspace, select Review + Create.

    Warning

    It can take several minutes to create your workspace in the cloud.

    When the process is finished, a deployment success message appears.

  8. To view the new workspace, select Go to resource.

Important

Take note of your workspace and subscription. You'll need these to ensure you create your experiment in the right place.

Clone a notebook folder

This example uses the cloud notebook server in your workspace for an install-free and pre-configured experience. Use your own environment if you prefer to have control over your environment, packages, and dependencies.

You complete the following experiment setup and run steps in Azure Machine Learning studio, a consolidated interface that includes machine learning tools for data science practitioners of all skill levels.

  1. Sign in to Azure Machine Learning studio.

  2. Select your subscription and the workspace you created.

  3. Select Notebooks on the left.

  4. Open the Samples folder.

  5. Open the R folder.

  6. Open the folder with a version number on it. This number represents the current release of the R SDK.

  7. Select the "..." at the right of the vignettes folder and then select Clone.

  8. A list of folders appears, one for each user who accesses the workspace. Select your folder to clone the vignettes folder there.

Open RStudio

Use RStudio on a compute instance or Notebook VM to run this tutorial.

  1. Select Compute on the left.

  2. Add a compute resource if one does not already exist.

  3. Once the compute is running, use the RStudio link to open RStudio.

  4. In RStudio, your vignettes folder is a few levels down from Users in the Files section on the lower right. Under vignettes, select the train-and-deploy-to-aci folder to find the files needed in this tutorial.

Important

The rest of this article contains the same content as the train-and-deploy-to-aci.Rmd file. If you are experienced with RMarkdown, feel free to use the code from that file. Or you can copy and paste the code snippets from that file, or from this article, into an R script or the command line.

Set up your development environment

The setup for your development work in this tutorial includes the following actions:

  • Install required packages
  • Connect to a workspace, so that your compute instance can communicate with remote resources
  • Create an experiment to track your runs
  • Create a remote compute target to use for training

Install required packages

This tutorial assumes you already have the Azure ML SDK installed. Go ahead and load the azuremlsdk package.

library(azuremlsdk)

The training and scoring scripts (accidents.R and accident_predict.R) have some additional dependencies. If you plan on running those scripts locally, make sure you have those required packages as well.

Load your workspace

Instantiate a workspace object from your existing workspace. The following code loads the workspace details from the config.json file. You can also retrieve a workspace using get_workspace().

ws <- load_workspace_from_config()

Create an experiment

An Azure ML experiment tracks a grouping of runs, typically from the same training script. Create an experiment to track the runs for training the caret model on the accidents data.

experiment_name <- "accident-logreg"
exp <- experiment(ws, experiment_name)

Create a compute target

By using Azure Machine Learning Compute (AmlCompute), a managed service, data scientists can train machine learning models on clusters of Azure virtual machines. Examples include VMs with GPU support. In this tutorial, you create a single-node AmlCompute cluster as your training environment. The code below creates the compute cluster for you if it doesn't already exist in your workspace.

If the cluster doesn't already exist, you may need to wait a few minutes for it to be provisioned.

cluster_name <- "rcluster"
compute_target <- get_compute(ws, cluster_name = cluster_name)
if (is.null(compute_target)) {
  vm_size <- "STANDARD_D2_V2" 
  compute_target <- create_aml_compute(workspace = ws,
                                       cluster_name = cluster_name,
                                       vm_size = vm_size,
                                       max_nodes = 1)
}

wait_for_provisioning_completion(compute_target)

Prepare data for training

This tutorial uses data from the US National Highway Traffic Safety Administration (with thanks to Mary C. Meyer and Tremika Finney). This dataset includes data from over 25,000 car crashes in the US, with variables you can use to predict the likelihood of a fatality. First, import the data into R and transform it into a new data frame, accidents, for analysis. Then export it with saveRDS() to a file named accidents.Rd.

nassCDS <- read.csv("nassCDS.csv", 
                     colClasses=c("factor","numeric","factor",
                                  "factor","factor","numeric",
                                  "factor","numeric","numeric",
                                  "numeric","character","character",
                                  "numeric","numeric","character"))
accidents <- na.omit(nassCDS[,c("dead","dvcat","seatbelt","frontal","sex","ageOFocc","yearVeh","airbag","occRole")])
accidents$frontal <- factor(accidents$frontal, labels=c("notfrontal","frontal"))
accidents$occRole <- factor(accidents$occRole)
accidents$dvcat <- ordered(accidents$dvcat, 
                          levels=c("1-9km/h","10-24","25-39","40-54","55+"))

saveRDS(accidents, file="accidents.Rd")
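
To see what the recoding steps above do without the full CSV, here is a minimal sketch on a hypothetical three-row data frame. The values in toy are made up for illustration and are not taken from nassCDS.csv; only the factor() and ordered() calls mirror the tutorial code.

```r
# Toy stand-in for a few nassCDS columns (values invented for illustration).
toy <- data.frame(
  dvcat   = c("10-24", "55+", "1-9km/h"),
  frontal = c("0", "1", "1"),
  occRole = c("driver", "pass", "driver"),
  stringsAsFactors = FALSE
)

# Relabel the 0/1 indicator, as the tutorial does for accidents$frontal.
# Levels sort as "0", "1", so "0" becomes "notfrontal" and "1" becomes "frontal".
toy$frontal <- factor(toy$frontal, labels = c("notfrontal", "frontal"))

# Convert a character column to an unordered factor.
toy$occRole <- factor(toy$occRole)

# Impact-speed bands become an ordered factor, so comparisons are defined.
toy$dvcat <- ordered(toy$dvcat,
                     levels = c("1-9km/h", "10-24", "25-39", "40-54", "55+"))

str(toy)
toy$dvcat[1] < toy$dvcat[2]   # compares "10-24" against "55+"
```

Declaring dvcat as ordered matters for modeling: R treats it as a ranked variable rather than as unrelated categories.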

Upload data to the datastore

Upload data to the cloud so that it can be accessed by your remote training environment. Each Azure Machine Learning workspace comes with a default datastore that stores the connection information to the Azure blob container that is provisioned in the storage account attached to the workspace. The following code uploads the accidents data you created above to that datastore.

ds <- get_default_datastore(ws)

target_path <- "accidentdata"
upload_files_to_datastore(ds,
                          list("./accidents.Rd"),
                          target_path = target_path,
                          overwrite = TRUE)

Train a model

For this tutorial, fit a logistic regression model on your uploaded data using your remote compute cluster. To submit a job, you need to:

  • Prepare the training script
  • Create an estimator
  • Submit the job

Prepare the training script

A training script called accidents.R has been provided for you in the same directory as this tutorial. Notice the following details inside the training script, which let it use Azure Machine Learning for training:

  • The training script takes an argument -d to find the directory that contains the training data (the estimator code later in this tutorial passes it as --data_folder). When you define and submit your job later, you point to the datastore for this argument. Azure ML will mount the storage folder to the remote cluster for the training job.
  • The training script logs the final accuracy as a metric to the run record in Azure ML using log_metric_to_run(). The Azure ML SDK provides a set of logging APIs for logging various metrics during training runs. These metrics are recorded and persisted in the experiment run record. The metrics can then be accessed at any time or viewed in the run details page in studio. See the reference for the full set of logging methods log_*().
  • The training script saves your model into a directory named outputs. The ./outputs folder receives special treatment by Azure ML. During training, files written to ./outputs are automatically uploaded to your run record by Azure ML and persisted as artifacts. By saving the trained model to ./outputs, you'll be able to access and retrieve your model file even after the run is over and you no longer have access to your remote training environment.
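
The three points above can be illustrated with a rough outline of a training script. This is a sketch under stated assumptions, not the contents of accidents.R: it uses base R glm() on simulated data instead of caret on the accidents data, parses the argument by hand, and only notes in a comment where log_metric_to_run() would be called. The column names speed and belted are hypothetical.

```r
# Hypothetical outline of a training script; the provided accidents.R differs.

# 1. Read the data folder from the command line (fall back to "." when absent).
args <- commandArgs(trailingOnly = TRUE)
data_folder <- if (length(args) >= 2 && args[1] %in% c("-d", "--data_folder")) {
  args[2]
} else {
  "."
}

# 2. Load training data. Simulated here so the sketch runs anywhere; a real
#    script would read the uploaded file from data_folder instead.
set.seed(42)
n <- 500
train <- data.frame(speed = runif(n, 0, 100), belted = rbinom(n, 1, 0.7))
train$dead <- rbinom(n, 1, plogis(-4 + 0.05 * train$speed - 1 * train$belted))

# 3. Fit a logistic regression.
fit <- glm(dead ~ speed + belted, data = train, family = binomial)

# 4. Training accuracy at a 0.5 threshold. Inside an Azure ML run, this is the
#    kind of value you would record with log_metric_to_run().
pred <- as.integer(predict(fit, type = "response") > 0.5)
accuracy <- mean(pred == train$dead)

# 5. Save the model under ./outputs so Azure ML uploads it with the run record.
dir.create("outputs", showWarnings = FALSE)
saveRDS(fit, file.path("outputs", "model.rds"))
```

Run locally, the sketch leaves a model.rds file in ./outputs, which is the same location the tutorial later downloads from the run.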

Create an estimator

An Azure ML estimator encapsulates the run configuration information needed for executing a training script on the compute target. Azure ML runs are run as containerized jobs on the specified compute target. By default, the Docker image built for your training job will include R, the Azure ML SDK, and a set of commonly used R packages. See the SDK documentation for the full list of default packages.

To create the estimator, define:

  • The directory that contains your scripts needed for training (source_directory). All the files in this directory are uploaded to the cluster node(s) for execution. The directory must contain your training script and any additional scripts required.
  • The training script that will be executed (entry_script).
  • The compute target (compute_target), in this case the AmlCompute cluster you created earlier.
  • The parameters required from the training script (script_params). Azure ML will run your training script as a command-line script with Rscript. In this tutorial you specify one argument to the script, the data directory mounting point, which you can access with ds$path(target_path).
  • Any environment dependencies required for training. The default Docker image built for training already contains the three packages (caret, e1071, and optparse) needed in the training script, so you don't need to specify additional information. If you are using R packages that are not included by default, use the estimator's cran_packages parameter to add additional CRAN packages. See the estimator() reference for the full set of configurable options.

est <- estimator(source_directory = ".",
                 entry_script = "accidents.R",
                 script_params = list("--data_folder" = ds$path(target_path)),
                 compute_target = compute_target
                 )

Submit the job on the remote cluster

Finally, submit the job to run on your cluster. submit_experiment() returns a Run object that you then use to interface with the run. In total, the first run takes about 10 minutes. For later runs, the same Docker image is reused as long as the script dependencies don't change. In that case, the image is cached and the container startup time is much faster.

run <- submit_experiment(exp, est)

You can view the run's details in the RStudio Viewer. Clicking the "Web View" link provided takes you to Azure Machine Learning studio, where you can monitor the run in the UI.

view_run_details(run)

Model training happens in the background. Wait until the model has finished training before you run more code.

wait_for_run_completion(run, show_output = TRUE)

You and colleagues with access to the workspace can submit multiple experiments in parallel, and Azure ML will take care of scheduling the tasks on the compute cluster. You can even configure the cluster to automatically scale up to multiple nodes, and scale back when there are no more compute tasks in the queue. This configuration is a cost-effective way for teams to share compute resources.
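
As a rough illustration of the autoscaling configuration mentioned above, the sketch below wraps create_aml_compute() in a helper. The helper name, the cluster name, the node counts, and the idle timeout are all hypothetical values chosen for illustration. The helper only defines the configuration and creates nothing until you call it with a workspace object.

```r
# Sketch only: defines a helper; no Azure resources are created until it is called.
# Calling it requires the azuremlsdk package and a workspace object such as ws.
create_autoscaling_cluster <- function(ws, cluster_name = "rcluster-auto") {
  azuremlsdk::create_aml_compute(
    workspace = ws,
    cluster_name = cluster_name,          # hypothetical cluster name
    vm_size = "STANDARD_D2_V2",
    min_nodes = 0,                        # scale down to zero nodes when idle
    max_nodes = 4,                        # illustrative upper bound
    idle_seconds_before_scaledown = 600   # illustrative idle timeout
  )
}
```

With min_nodes set to 0, an idle cluster releases its nodes, which is what makes sharing a cluster across a team cost-effective.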

Retrieve training results

Once your model has finished training, you can access the artifacts of your job that were persisted to the run record, including any metrics logged and the final trained model.

Get the logged metrics

In the training script accidents.R, you logged a metric from your model: the accuracy of the predictions in the training data. You can see metrics in the studio, or extract them to the local session as an R list as follows:

metrics <- get_run_metrics(run)
metrics

If you've run multiple experiments (say, using differing variables, algorithms, or hyperparameters), you can use the metrics from each run to compare and choose the model you'll use in production.
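
One way to make that comparison is sketched below. It assumes each run's metrics arrive as a named list with an Accuracy entry. The run names and accuracy values are invented for illustration and are not results from this tutorial.

```r
# Hypothetical metrics for three runs (names and values are made up).
run_metrics <- list(
  baseline    = list(Accuracy = 0.90),
  more_vars   = list(Accuracy = 0.93),
  fewer_vars  = list(Accuracy = 0.88)
)

# Pull the Accuracy entry out of each run's metric list.
acc <- vapply(run_metrics, function(m) m$Accuracy, numeric(1))

# Rank runs from best to worst and pick the top one.
ranked <- sort(acc, decreasing = TRUE)
best_run <- names(ranked)[1]
ranked
best_run
```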

Get the trained model

You can retrieve the trained model and look at the results in your local R session. The following code downloads the contents of the ./outputs directory, which includes the model file.

download_files_from_run(run, prefix="outputs/")
accident_model <- readRDS("outputs/model.rds")
summary(accident_model)

The summary shows some factors that contribute to an increase in the estimated probability of death:

  • higher impact speed
  • male driver
  • older occupant
  • passenger

It shows lower probabilities of death with:

  • presence of airbags
  • presence of seatbelts
  • frontal collision

The vehicle year of manufacture does not have a significant effect.

You can use this model to make new predictions:

newdata <- data.frame( # valid values shown below
 dvcat="10-24",        # "1-9km/h" "10-24"   "25-39"   "40-54"   "55+"  
 seatbelt="none",      # "none"   "belted"  
 frontal="frontal",    # "notfrontal" "frontal"
 sex="f",              # "f" "m"
 ageOFocc=16,          # age in years, 16-97
 yearVeh=2002,         # year of vehicle, 1955-2003
 airbag="none",        # "none"   "airbag"   
 occRole="pass"        # "driver" "pass"
 )

## predicted probability of death for these variables, as a percentage
as.numeric(predict(accident_model,newdata, type="response")*100)
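
For readers unfamiliar with type="response", the following sketch shows what it means for a logistic regression: the prediction on the response scale is the inverse logit of the linear predictor. It uses a toy glm() fit on simulated data rather than accident_model, so the variable names and coefficients are illustrative only.

```r
# Toy logistic regression on simulated data (not the accidents model).
set.seed(1)
toy <- data.frame(x = rnorm(200))
toy$y <- rbinom(200, 1, plogis(0.5 + 1.2 * toy$x))
fit <- glm(y ~ x, data = toy, family = binomial)

new <- data.frame(x = c(-1, 0, 2))

# Linear predictor (log-odds) versus predicted probability.
log_odds <- predict(fit, new, type = "link")
prob     <- predict(fit, new, type = "response")

# The response scale is the inverse logit of the link scale.
all.equal(unname(prob), unname(plogis(log_odds)))

# Multiply by 100 to express the probability as a percentage, as above.
prob * 100
```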

Deploy as a web service

With your model, you can predict the danger of death from a collision. Use Azure ML to deploy your model as a prediction service. In this tutorial, you will deploy the web service in Azure Container Instances (ACI).

Register the model

First, register the model you downloaded to your workspace with register_model(). A registered model can be any collection of files, but in this case the R model object is sufficient. Azure ML will use the registered model for deployment.

model <- register_model(ws, 
                        model_path = "outputs/model.rds", 
                        model_name = "accidents_model",
                        description = "Predict probability of auto accident")

Define the inference dependencies

To create a web service for your model, you first need to create a scoring script (entry_script), an R script that takes input variable values (in JSON format) and outputs a prediction from your model. For this tutorial, use the provided scoring file accident_predict.R. The scoring script must contain an init() method that loads your model and returns a function that uses the model to make a prediction based on the input data. See the documentation for more details.
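
The init() pattern described above can be sketched as follows. This is an illustration, not the provided accident_predict.R, which may differ. It assumes the deployed service exposes the registered model's folder through the AZUREML_MODEL_DIR environment variable and that the model file is named model.rds.

```r
library(jsonlite)

# Sketch of a scoring script: init() loads the model once and returns
# the function that handles each request.
init <- function() {
  # Assumption: the model folder is given by AZUREML_MODEL_DIR.
  model_dir <- Sys.getenv("AZUREML_MODEL_DIR")
  model <- readRDS(file.path(model_dir, "model.rds"))

  # The returned closure keeps `model` in scope between requests.
  function(data) {
    # Parse the JSON payload into a data frame of predictor values.
    vars <- as.data.frame(fromJSON(data))
    # Predicted probability, expressed as a percentage.
    prediction <- as.numeric(predict(model, vars, type = "response") * 100)
    toJSON(prediction)
  }
}
```

Because the model is loaded inside init() rather than inside the returned function, it is read from disk once at startup instead of on every request.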

Next, define an Azure ML environment for your script's package dependencies. With an environment, you specify R packages (from CRAN or elsewhere) that are needed for your script to run. You can also provide the values of environment variables that your script can reference to modify its behavior. By default, Azure ML will build the same default Docker image used with the estimator for training. Since the tutorial has no special requirements, create an environment with no special attributes.

r_env <- r_environment(name = "basic_env")

If you want to use your own Docker image for deployment instead, specify the custom_docker_image parameter. See the r_environment() reference for the full set of configurable options for defining an environment.

Now you have everything you need to create an inference config for encapsulating your scoring script and environment dependencies.

inference_config <- inference_config(
  entry_script = "accident_predict.R",
  environment = r_env)

Deploy to ACI

In this tutorial, you will deploy your service to ACI. This code provisions a single container to respond to inbound requests, which is suitable for testing and light loads. See aci_webservice_deployment_config() for additional configurable options. (For production-scale deployments, you can also deploy to Azure Kubernetes Service.)

aci_config <- aci_webservice_deployment_config(cpu_cores = 1, memory_gb = 0.5)

Now deploy your model as a web service. Deployment can take several minutes.

aci_service <- deploy_model(ws, 
                            'accident-pred', 
                            list(model), 
                            inference_config, 
                            aci_config)

wait_for_deployment(aci_service, show_output = TRUE)

Test the deployed service

Now that your model is deployed as a service, you can test the service from R using invoke_webservice(). Provide a new set of data to predict from, convert it to JSON, and send it to the service.

library(jsonlite)

newdata <- data.frame( # valid values shown below
 dvcat="10-24",        # "1-9km/h" "10-24"   "25-39"   "40-54"   "55+"  
 seatbelt="none",      # "none"   "belted"  
 frontal="frontal",    # "notfrontal" "frontal"
 sex="f",              # "f" "m"
 ageOFocc=22,          # age in years, 16-97
 yearVeh=2002,         # year of vehicle, 1955-2003
 airbag="none",        # "none"   "airbag"   
 occRole="pass"        # "driver" "pass"
 )

prob <- invoke_webservice(aci_service, toJSON(newdata))
prob

You can also get the web service's HTTP endpoint, which accepts REST client calls. You can share this endpoint with anyone who wants to test the web service or integrate it into an application.

aci_service$scoring_uri
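
A REST client call to that endpoint might look like the sketch below. The helper name score_via_rest is hypothetical, and the sketch assumes the httr package is installed and that the service accepts unauthenticated JSON POST requests. The exact shape of the response text depends on the scoring script, so the helper returns it unparsed.

```r
library(jsonlite)

# Hypothetical helper: POST a data frame as JSON to a scoring URI and
# return the raw response body as text.
score_via_rest <- function(scoring_uri, newdata) {
  resp <- httr::POST(
    scoring_uri,
    body = as.character(toJSON(newdata)),   # JSON payload as a plain string
    httr::content_type_json()
  )
  httr::stop_for_status(resp)               # raise an R error on HTTP failure
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Example call (needs the deployed service from this tutorial):
# score_via_rest(aci_service$scoring_uri, newdata)
```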

Clean up resources

Delete the resources once you no longer need them. Don't delete any resource you still plan to use.

Delete the web service:

delete_webservice(aci_service)

Delete the registered model:

delete_model(model)

Delete the compute cluster:

delete_compute(compute_target)

Delete everything

Important

The resources you created can be used as prerequisites to other Azure Machine Learning tutorials and how-to articles.

If you don't plan to use the resources you created, delete them, so you don't incur any charges:

  1. In the Azure portal, select Resource groups on the far left.

  2. From the list, select the resource group you created.

  3. Select Delete resource group.

  4. Enter the resource group name. Then select Delete.

You can also keep the resource group but delete a single workspace. Display the workspace properties and select Delete.

Next steps

  • Now that you've completed your first Azure Machine Learning experiment in R, learn more about the Azure Machine Learning SDK for R.

  • Learn more about Azure Machine Learning with R from the examples in the other vignettes folders.