
Quickstart: Run your first Azure Batch job with the .NET API

This quickstart runs an Azure Batch job from a C# application built on the Azure Batch .NET API. The app uploads several input data files to Azure Storage and then creates a pool of Batch compute nodes (virtual machines). It then creates a sample job that runs tasks to process each input file on the pool using a basic command. After completing this quickstart, you will understand the key concepts of the Batch service and be ready to try Batch with more realistic workloads at larger scale.

[Figure: Quickstart app workflow]

If you don't have an Azure subscription, create a free account before you begin.

Prerequisites

Sign in to Azure

Sign in to the Azure portal at https://portal.azure.com.

Get account credentials

For this example, you need to provide credentials for your Batch and Storage accounts. A straightforward way to get the necessary credentials is in the Azure portal. (You can also get these credentials using the Azure APIs or command-line tools.)

  1. Click All services > Batch accounts, and then click the name of your Batch account.

  2. To see the Batch credentials, click Keys. Copy the values of Batch account, URL, and Primary access key to a text editor.

  3. To see the Storage account name and keys, click Storage account. Copy the values of Storage account name and Key1 to a text editor.

Download the sample

Download or clone the sample app from GitHub. To clone the sample app repo with a Git client, use the following command:

git clone https://github.com/Azure-Samples/batch-dotnet-quickstart.git

Navigate to the directory that contains the Visual Studio solution file BatchDotNetQuickstart.sln.

Open the solution file in Visual Studio, and update the credential strings in Program.cs with the values you obtained for your accounts. For example:

// Batch account credentials
private const string BatchAccountName = "mybatchaccount";
private const string BatchAccountKey  = "xxxxxxxxxxxxxxxxE+yXrRvJAqT9BlXwwo1CwF+SwAYOxxxxxxxxxxxxxxxx43pXi/gdiATkvbpLRl3x14pcEQ==";
private const string BatchAccountUrl  = "https://mybatchaccount.mybatchregion.batch.azure.com";

// Storage account credentials
private const string StorageAccountName = "mystorageaccount";
private const string StorageAccountKey  = "xxxxxxxxxxxxxxxxy4/xxxxxxxxxxxxxxxxfwpbIC5aAWA8wDu+AFXZB827Mt9lybZB1nUcQbQiUrkPtilK5BQ==";

Note

To simplify the example, the Batch and Storage account credentials appear in clear text. In practice, we recommend that you restrict access to the credentials and refer to them in your code using environment variables or a configuration file. For examples, see the Azure Batch code samples repo.
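As one possible way to follow that recommendation, the sketch below reads a credential from an environment variable instead of a hard-coded string. It is an illustration, not part of the sample app: the class name QuickstartSettings, the method GetRequired, and the variable name BATCH_ACCOUNT_KEY are all hypothetical.

```csharp
using System;

public static class QuickstartSettings
{
    // Hypothetical variable name (an assumption); pick one that fits your environment.
    public const string BatchAccountKeyVariable = "BATCH_ACCOUNT_KEY";

    // Returns the value of a required environment variable, or throws an
    // exception that names the missing variable so misconfiguration fails fast.
    public static string GetRequired(string variableName)
    {
        string value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }
        return value;
    }
}
```

If you adopt something like this in Program.cs, the credential fields can no longer be declared const, because a const cannot be initialized from a method call; a static readonly field assigned from QuickstartSettings.GetRequired(QuickstartSettings.BatchAccountKeyVariable) would work instead.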

Build and run the app

To see the Batch workflow in action, build and run the application in Visual Studio, or at the command line with the dotnet build and dotnet run commands. After running the application, review the code to learn what each part of the application does. For example, in Visual Studio:

  • Right-click the solution in Solution Explorer, and click Build Solution.

  • Confirm the restoration of any NuGet packages, if you're prompted. If you need to download missing packages, ensure the NuGet Package Manager is installed.

Then run it. When you run the sample application, the console output is similar to the following. During execution, the app pauses at Monitoring all tasks for 'Completed' state, timeout in 00:30:00... while the pool's compute nodes start. Tasks are queued to run as soon as the first compute node is running. Go to your Batch account in the Azure portal to monitor the pool, compute nodes, job, and tasks.

Sample start: 11/16/2018 4:02:54 PM

Container [input] created.
Uploading file taskdata0.txt to container [input]...
Uploading file taskdata1.txt to container [input]...
Uploading file taskdata2.txt to container [input]...
Creating pool [DotNetQuickstartPool]...
Creating job [DotNetQuickstartJob]...
Adding 3 tasks to job [DotNetQuickstartJob]...
Monitoring all tasks for 'Completed' state, timeout in 00:30:00...

After tasks complete, you see output similar to the following for each task:

Printing task output.
Task: Task0
Node: tvm-2850684224_3-20171205t000401z
Standard out:
Batch processing began with mainframe computers and punch cards. Today it still plays a central role in business, engineering, science, and other pursuits that require running lots of automated tasks....
stderr:
...

Typical execution time is approximately 5 minutes when you run the application in its default configuration. Initial pool setup takes the most time. To run the job again, delete the job from the previous run but keep the pool. On a preconfigured pool, the job completes in a few seconds.

Review the code

The .NET app in this quickstart does the following:

  • Uploads three small text files to a blob container in your Azure storage account. These files are inputs for processing by Batch.
  • Creates a pool of compute nodes running Windows Server.
  • Creates a job and three tasks to run on the nodes. Each task processes one of the input files using a Windows command line.
  • Displays files returned by the tasks.

See the file Program.cs and the following sections for details.

Preliminaries

To interact with a storage account, the app uses the Azure Storage Client Library for .NET. It creates a reference to the account with CloudStorageAccount, and from that creates a CloudBlobClient.

CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

The app uses the blobClient reference to create a container in the storage account and to upload data files to the container. The files in storage are defined as Batch ResourceFile objects that Batch can later download to compute nodes.

List<string> inputFilePaths = new List<string>
{
    "taskdata0.txt",
    "taskdata1.txt",
    "taskdata2.txt"
};

List<ResourceFile> inputFiles = new List<ResourceFile>();

foreach (string filePath in inputFilePaths)
{
    inputFiles.Add(UploadFileToContainer(blobClient, inputContainerName, filePath));
}

The app creates a BatchClient object to create and manage pools, jobs, and tasks in the Batch service. The Batch client in the sample uses shared key authentication. (Batch also supports Azure Active Directory authentication.)

BatchSharedKeyCredentials cred = new BatchSharedKeyCredentials(BatchAccountUrl, BatchAccountName, BatchAccountKey);

using (BatchClient batchClient = BatchClient.Open(cred))
...

Create a pool of compute nodes

To create a Batch pool, the app uses the BatchClient.PoolOperations.CreatePool method to set the number of nodes, VM size, and a pool configuration. Here, a VirtualMachineConfiguration object specifies an ImageReference to a Windows Server image published in the Azure Marketplace. Batch supports a wide range of Linux and Windows Server images in the Azure Marketplace, as well as custom VM images.

The number of nodes (PoolNodeCount) and VM size (PoolVMSize) are defined constants. By default, the sample creates a pool of two Standard_A1_v2 nodes. The suggested size offers a good balance of performance and cost for this quick example.

The Commit method submits the pool to the Batch service.


private static VirtualMachineConfiguration CreateVirtualMachineConfiguration(ImageReference imageReference)
{
    return new VirtualMachineConfiguration(
        imageReference: imageReference,
        nodeAgentSkuId: "batch.node.windows amd64");
}

private static ImageReference CreateImageReference()
{
    return new ImageReference(
        publisher: "MicrosoftWindowsServer",
        offer: "WindowsServer",
        sku: "2016-datacenter-smalldisk",
        version: "latest");
}

private static void CreateBatchPool(BatchClient batchClient, VirtualMachineConfiguration vmConfiguration)
{
    try
    {
        CloudPool pool = batchClient.PoolOperations.CreatePool(
            poolId: PoolId,
            targetDedicatedComputeNodes: PoolNodeCount,
            virtualMachineSize: PoolVMSize,
            virtualMachineConfiguration: vmConfiguration);

        pool.Commit();
    }
...

Create a Batch job

A Batch job is a logical grouping of one or more tasks. A job includes settings common to the tasks, such as priority and the pool to run tasks on. The app uses the BatchClient.JobOperations.CreateJob method to create a job on your pool.

The Commit method submits the job to the Batch service. Initially the job has no tasks.

try
{
    CloudJob job = batchClient.JobOperations.CreateJob();
    job.Id = JobId;
    job.PoolInformation = new PoolInformation { PoolId = PoolId };

    job.Commit();
}
...

Create tasks

The app creates a list of CloudTask objects. Each task processes an input ResourceFile object using a CommandLine property. In the sample, the command line runs the Windows type command to display the input file. This command is a simple example for demonstration purposes. When you use Batch, the command line is where you specify your app or script. Batch provides a number of ways to deploy apps and scripts to compute nodes.

Then, the app adds tasks to the job with the AddTask method, which queues them to run on the compute nodes.

for (int i = 0; i < inputFiles.Count; i++)
{
    string taskId = String.Format("Task{0}", i);
    string inputFilename = inputFiles[i].FilePath;
    string taskCommandLine = String.Format("cmd /c type {0}", inputFilename);

    CloudTask task = new CloudTask(taskId, taskCommandLine);
    task.ResourceFiles = new List<ResourceFile> { inputFiles[i] };
    tasks.Add(task);
}

batchClient.JobOperations.AddTask(JobId, tasks);
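To preview the task IDs and command lines that the loop above produces without calling the Batch service, the naming logic can be isolated in a small helper. The following is a minimal sketch that uses only the .NET base class library; the class TaskCommandLines and its Build method are hypothetical names and are not part of the sample or the Batch SDK.

```csharp
using System;
using System.Collections.Generic;

public static class TaskCommandLines
{
    // Builds (task ID, command line) pairs with the same format strings as the
    // sample's loop: IDs are "Task0", "Task1", ... and each command line runs
    // the Windows type command on one input file.
    public static List<KeyValuePair<string, string>> Build(IReadOnlyList<string> inputFileNames)
    {
        var result = new List<KeyValuePair<string, string>>();
        for (int i = 0; i < inputFileNames.Count; i++)
        {
            string taskId = String.Format("Task{0}", i);
            string commandLine = String.Format("cmd /c type {0}", inputFileNames[i]);
            result.Add(new KeyValuePair<string, string>(taskId, commandLine));
        }
        return result;
    }
}
```

For example, TaskCommandLines.Build(new[] { "taskdata0.txt", "taskdata1.txt" }) pairs Task0 with cmd /c type taskdata0.txt and Task1 with cmd /c type taskdata1.txt. File names that contain spaces would likely need quoting in the command line; the sample's input files do not.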

View task output

The app creates a TaskStateMonitor to monitor the tasks to make sure they complete. Then, the app uses the CloudTask.ComputeNodeInformation property to display the stdout.txt file generated by each completed task. When the task runs successfully, the output of the task command is written to stdout.txt:

foreach (CloudTask task in completedtasks)
{
    // ComputeNodeId is already a string, so no String.Format call is needed.
    string nodeId = task.ComputeNodeInformation.ComputeNodeId;
    Console.WriteLine("Task: {0}", task.Id);
    Console.WriteLine("Node: {0}", nodeId);
    Console.WriteLine("Standard out:");
    Console.WriteLine(task.GetNodeFile(Constants.StandardOutFileName).ReadAsString());
}

Clean up resources

The app automatically deletes the storage container it creates, and gives you the option to delete the Batch pool and job. You are charged for the pool while the nodes are running, even if no jobs are scheduled. When you no longer need the pool, delete it. When you delete the pool, all task output on the nodes is deleted.

When no longer needed, delete the resource group, Batch account, and storage account. To do so in the Azure portal, select the resource group for the Batch account and click Delete resource group.

Next steps

In this quickstart, you ran a small app built using the Batch .NET API to create a Batch pool and a Batch job. The job ran sample tasks, and the app downloaded the output created on the nodes. Now that you understand the key concepts of the Batch service, you are ready to try Batch with more realistic workloads at larger scale. To learn more about Azure Batch, and walk through a parallel workload with a real-world application, continue to the Batch .NET tutorial.