
Use an alert to trigger an Azure Automation runbook

You can use Azure Monitor to monitor base-level metrics and logs for most services in Azure. To automate tasks based on alerts, you can call Azure Automation runbooks from action groups or from classic alerts. This article shows you how to configure and run a runbook by using alerts.

Alert types

You can use Automation runbooks with three alert types:

  • Common alerts
  • Activity log alerts
  • Near real-time metric alerts

Note

The common alert schema standardizes the consumption experience for alert notifications in Azure. Historically, the three alert types in Azure (metric, log, and activity log) have each had their own email templates, webhook schemas, and so on. To learn more, see Common alert schema.

When an alert calls a runbook, the actual call is an HTTP POST request to the webhook. The body of the POST request contains a JSON-formatted object with properties that describe the alert. The following table lists links to the payload schema for each alert type:

Alert | Description | Payload schema
--- | --- | ---
Common alert | The common alert schema, which standardizes the consumption experience for alert notifications in Azure. | Common alert payload schema
Activity log alert | Sends a notification when any new event in the Azure activity log matches specific conditions. For example, when a Delete VM operation occurs in myProductionResourceGroup or when a new Azure Service Health event with an Active status appears. | Activity log alert payload schema
Near real-time metric alert | Sends a notification faster than metric alerts when one or more platform-level metrics meet specified conditions. For example, when the value for CPU % on a VM is greater than 90 and the value for Network In is greater than 500 MB for the past 5 minutes. | Near real-time metric alert payload schema

Because each type of alert provides different data, each alert type is handled differently. In the next section, you learn how to create a runbook that handles different types of alerts.
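To make the payload structure concrete, the following sketch parses a trimmed body in the common alert schema, the way a runbook might. The sample is illustrative only: the rule name, subscription ID, and resource names are placeholders, and most fields of the real schema are omitted. See the payload schema links in the table for the full definitions.

```powershell
# Illustrative, trimmed payload in the common alert schema.
# All values are placeholders, not output captured from a real alert.
$sampleBody = @'
{
  "schemaId": "azureMonitorCommonAlertSchema",
  "data": {
    "essentials": {
      "alertRule": "HighCpuOnMyVm",
      "severity": "Sev3",
      "signalType": "Metric",
      "monitorCondition": "Fired",
      "alertTargetIDs": [
        "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/myVM"
      ]
    }
  }
}
'@

# Convert the JSON text into an object and read the properties a runbook needs.
$body        = ConvertFrom-Json -InputObject $sampleBody
$essentials  = $body.data.essentials
$schemaId    = $body.schemaId
$condition   = $essentials.monitorCondition
$firstTarget = $essentials.alertTargetIDs[0]

Write-Output "schemaId: $schemaId"
Write-Output "monitorCondition: $condition"
Write-Output "first target: $firstTarget"
```

PowerShell property access is case-insensitive, so alertTargetIDs and alertTargetIds refer to the same property.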

Create a runbook to handle alerts

To use Automation with alerts, you need a runbook with logic that processes the alert JSON payload passed to it. The following example runbook must be called from an Azure alert.

As described in the preceding section, each type of alert has a different schema. The script receives the webhook data from the alert in the WebhookData runbook input parameter. It then evaluates the JSON payload to determine which alert type is being used.

This example uses an alert from a VM. It retrieves the VM data from the payload and then uses that information to stop the VM. The connection must be set up in the Automation account where the runbook runs. When you use alerts to trigger runbooks, check the alert status in the triggered runbook, because the runbook is triggered each time the alert changes state. Alerts have multiple states; the two most common are Activated and Resolved. Checking the state in your runbook logic ensures that the runbook doesn't act more than once for the same alert. The example in this article acts only on alerts in the Activated state, which the common alert schema reports as Fired.
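The state check described above can be isolated in a small helper. This is a minimal sketch: Test-AlertIsFiring is a hypothetical function name, not part of any Azure module, and it assumes that only the status values Activated and Fired should trigger an action.

```powershell
# Hypothetical helper: returns $true only for the states that should trigger an action.
# "Activated" is used by the older schemas and "Fired" by the common alert schema.
function Test-AlertIsFiring {
    param(
        [string] $Status
    )
    # -in performs a case-insensitive comparison against each listed value.
    return $Status -in @('Activated', 'Fired')
}

# Usage: every other state, including Resolved, is ignored.
foreach ($state in 'Activated', 'Fired', 'Resolved') {
    if (Test-AlertIsFiring -Status $state) {
        Write-Output "$state - take action"
    }
    else {
        Write-Output "$state - no action"
    }
}
```

Keeping the check in one place makes it easier to adjust if you later decide to react to other states, for example to restart a VM when an alert resolves.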

The runbook uses the AzureRunAsConnection connection asset (the Run As account) to authenticate with Azure so that it can perform the management action against the VM.

Use this example to create a runbook called Stop-AzureVmInResponsetoVMAlert. You can modify the PowerShell script and use it with many different resources.

  1. Go to your Azure Automation account.

  2. Under Process Automation, select Runbooks.

  3. At the top of the list of runbooks, select + Create a runbook.

  4. On the Add Runbook page, enter Stop-AzureVmInResponsetoVMAlert for the runbook name. For the runbook type, select PowerShell. Then, select Create.

  5. Copy the following PowerShell example into the Edit page.

    [OutputType("PSAzureOperationResponse")]
    param
    (
        [Parameter (Mandatory=$false)]
        [object] $WebhookData
    )
    $ErrorActionPreference = "stop"
    
    if ($WebhookData)
    {
        # Get the data object from WebhookData
        $WebhookBody = (ConvertFrom-Json -InputObject $WebhookData.RequestBody)
    
        # Get the info needed to identify the VM (depends on the payload schema)
        $schemaId = $WebhookBody.schemaId
        Write-Verbose "schemaId: $schemaId" -Verbose
        if ($schemaId -eq "azureMonitorCommonAlertSchema") {
            # This is the common Metric Alert schema (released March 2019)
            $Essentials = [object] ($WebhookBody.data).essentials
            # Get the first target only as this script doesn't handle multiple
            $alertTargetIdArray = (($Essentials.alertTargetIds)[0]).Split("/")
            $SubId = ($alertTargetIdArray)[2]
            $ResourceGroupName = ($alertTargetIdArray)[4]
            $ResourceType = ($alertTargetIdArray)[6] + "/" + ($alertTargetIdArray)[7]
            $ResourceName = ($alertTargetIdArray)[-1]
            $status = $Essentials.monitorCondition
        }
        elseif ($schemaId -eq "AzureMonitorMetricAlert") {
            # This is the near-real-time Metric Alert schema
            $AlertContext = [object] ($WebhookBody.data).context
            $SubId = $AlertContext.subscriptionId
            $ResourceGroupName = $AlertContext.resourceGroupName
            $ResourceType = $AlertContext.resourceType
            $ResourceName = $AlertContext.resourceName
            $status = ($WebhookBody.data).status
        }
        elseif ($schemaId -eq "Microsoft.Insights/activityLogs") {
            # This is the Activity Log Alert schema
            $AlertContext = [object] (($WebhookBody.data).context).activityLog
            $SubId = $AlertContext.subscriptionId
            $ResourceGroupName = $AlertContext.resourceGroupName
            $ResourceType = $AlertContext.resourceType
            $ResourceName = (($AlertContext.resourceId).Split("/"))[-1]
            $status = ($WebhookBody.data).status
        }
        elseif ($null -eq $schemaId) {
            # This is the original Metric Alert schema
            $AlertContext = [object] $WebhookBody.context
            $SubId = $AlertContext.subscriptionId
            $ResourceGroupName = $AlertContext.resourceGroupName
            $ResourceType = $AlertContext.resourceType
            $ResourceName = $AlertContext.resourceName
            $status = $WebhookBody.status
        }
        else {
            # Schema not supported
            Write-Error "The alert data schema - $schemaId - is not supported."
        }
    
        Write-Verbose "status: $status" -Verbose
        if (($status -eq "Activated") -or ($status -eq "Fired"))
        {
            Write-Verbose "resourceType: $ResourceType" -Verbose
            Write-Verbose "resourceName: $ResourceName" -Verbose
            Write-Verbose "resourceGroupName: $ResourceGroupName" -Verbose
            Write-Verbose "subscriptionId: $SubId" -Verbose
    
            # Determine code path depending on the resourceType
            if ($ResourceType -eq "Microsoft.Compute/virtualMachines")
            {
                # This is a Resource Manager VM
                Write-Verbose "This is a Resource Manager VM." -Verbose
    
                # Authenticate to Azure with service principal and certificate and set subscription
                Write-Verbose "Authenticating to Azure with service principal and certificate" -Verbose
                $ConnectionAssetName = "AzureRunAsConnection"
                Write-Verbose "Get connection asset: $ConnectionAssetName" -Verbose
                $Conn = Get-AutomationConnection -Name $ConnectionAssetName
                if ($null -eq $Conn)
                {
                    throw "Could not retrieve connection asset: $ConnectionAssetName. Check that this asset exists in the Automation account."
                }
                Write-Verbose "Authenticating to Azure with service principal." -Verbose
                Add-AzAccount -ServicePrincipal -Tenant $Conn.TenantID -ApplicationId $Conn.ApplicationID -CertificateThumbprint $Conn.CertificateThumbprint | Write-Verbose
                Write-Verbose "Setting subscription to work against: $SubId" -Verbose
                Set-AzContext -SubscriptionId $SubId -ErrorAction Stop | Write-Verbose
    
                # Stop the Resource Manager VM
                Write-Verbose "Stopping the VM - $ResourceName - in resource group - $ResourceGroupName -" -Verbose
                Stop-AzVM -Name $ResourceName -ResourceGroupName $ResourceGroupName -Force
            }
            else {
                # ResourceType not supported
                Write-Error "$ResourceType is not a supported resource type for this runbook."
            }
        }
        else {
            # The alert status was not 'Activated' or 'Fired' so no action taken
            Write-Verbose ("No action taken. Alert status: " + $status) -Verbose
        }
    }
    else {
        # Error
        Write-Error "This runbook is meant to be started from an Azure alert webhook only."
    }
    
  6. Select Publish to save and publish the runbook.
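The common-schema branch of the script splits the first target resource ID on "/" and picks segments by position. The following sketch wraps that logic in a function so that you can try it outside Azure Automation. ConvertFrom-AlertTargetId is a hypothetical name, and the resource ID is a placeholder.

```powershell
# Hypothetical helper that mirrors the positional parsing used in the runbook.
# Expected shape of the ID:
# /subscriptions/<sub>/resourceGroups/<rg>/providers/<namespace>/<type>/<name>
function ConvertFrom-AlertTargetId {
    param(
        [Parameter(Mandatory = $true)]
        [string] $ResourceId
    )
    $parts = $ResourceId.Split('/')
    # A leading "/" yields an empty first element, so a valid ID has at least 9 parts.
    if ($parts.Length -lt 9) {
        throw "Unexpected resource ID format: $ResourceId"
    }
    [pscustomobject]@{
        SubscriptionId    = $parts[2]
        ResourceGroupName = $parts[4]
        ResourceType      = $parts[6] + '/' + $parts[7]
        ResourceName      = $parts[-1]
    }
}

# Usage with a placeholder resource ID.
$target = ConvertFrom-AlertTargetId -ResourceId '/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/myVM'
Write-Output ("{0} ({1}) in {2}" -f $target.ResourceName, $target.ResourceType, $target.ResourceGroupName)
```

Like the runbook, this sketch assumes a two-segment resource type such as Microsoft.Compute/virtualMachines. IDs of child resources contain more segments and would need different handling.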

Create the alert

Alerts use action groups, which are collections of actions that the alert triggers. Runbooks are just one of the many actions that you can use with action groups.

  1. In your Automation account, select Alerts under Monitoring.

  2. Select + New alert rule.

  3. Under Resource, click Select. On the Select a resource page, select the VM to alert on, and then click Done.

  4. Under Condition, click Add condition. Select the signal you want to use, for example Percentage CPU, and then click Done.

  5. On the Configure signal logic page, enter your Threshold value under Alert logic, and then click Done.

  6. Under Action groups, select Create New.

  7. On the Add action group page, give your action group a name and a short name.

  8. Give the action a name. For the action type, select Automation Runbook.

  9. Select Edit Details. On the Configure Runbook page, under Runbook source, select User.

  10. Select your subscription and Automation account, and then select the Stop-AzureVmInResponsetoVMAlert runbook.

  11. Select Yes for Enable the common alert schema.

  12. To create the action group, select OK.

    You can use this action group in the activity log alerts and near real-time alerts that you create.

  13. Under Alert Details, add an alert rule name and description, and then click Create alert rule.

Next steps