Create, view, and manage log alerts using Azure Monitor

Overview

Log alerts let you use a Log Analytics query to evaluate resource logs at a set frequency and fire an alert based on the results. Rules can trigger one or more actions using Action Groups. Learn more about the functionality and terminology of log alerts.

This article shows you how to create and manage log alerts using Azure Monitor. Alert rules are defined by three components:

  • Target: A specific Azure resource to monitor.
  • Criteria: The logic to evaluate. If it is met, the alert fires.
  • Action: Notifications or automation, such as email, SMS, or a webhook.

You can also create log alert rules using Azure Resource Manager templates, which are described in a separate article.

Note

Log data from a Log Analytics workspace can be sent to the Azure Monitor metrics store. Metric alerts behave differently, which may be more desirable depending on the data you are working with. For information on which logs you can route to metrics and how, see Metric Alert for Logs.

Create a log alert rule with the Azure portal

Here are the steps to get started writing queries for alerts:

  1. Go to the resource you would like to alert on.

  2. Under Monitor, select Logs.

  3. Query the log data that can indicate the issue. You can use the alert query examples topic to understand what you can discover, or get started on writing your own query. Also, learn how to create optimized alert queries.

  4. Select the '+ New Alert Rule' button to start the alert creation flow.

Note

We recommend creating alerts at scale when you use resource access mode for logs: a single rule then runs on multiple resources through a resource group or subscription scope. Alerting at scale reduces rule management overhead. To be able to target the resources, include the resource ID column in the results. Learn more about splitting alerts by dimensions.

Log alert for Log Analytics and Application Insights

  1. If the query syntax is correct, historical data for the query appears as a graph, with the option to adjust the chart period from the last six hours to the last week.

    If your query results contain summarized data or project specific columns without a time column, the chart shows a single value.

  2. Use the Period option to choose the time range over which to assess the specified condition.

  3. Log alerts can be based on two types of Measures:

    1. Number of results - Count of records returned by the query.
    2. Metric measurement - Aggregate value calculated using summarize, grouped by the chosen expressions and the bin() selection. For example:
    // Reported errors
    union Event, Syslog // Event table stores Windows event records, Syslog stores Linux records
    | where EventLevelName == "Error" // EventLevelName is used in the Event (Windows) records
    or SeverityLevel== "err" // SeverityLevel is used in Syslog (Linux) records
    | summarize AggregatedValue = count() by Computer, bin(TimeGenerated, 15m)
    
  4. For metric measurement alert logic, you can optionally specify how to split the alerts by dimensions using the Aggregate on option. The row grouping expression must be unique and sorted.

    Note

    Because bin() can result in uneven time intervals, the alert service automatically converts the bin() function to a bin_at() function with an appropriate time at runtime, so that results are anchored to a fixed point.

    Note

    Splitting by alert dimensions is only available in the current scheduledQueryRules API. If you use the legacy Log Analytics Alert API, you will need to switch. Learn more about switching. Resource-centric alerting at scale is only supported in API version 2020-05-01-preview and above.

  5. Next, based on the preview data, set the Operator, Threshold Value, and Frequency.

  6. You can also optionally set the number of violations that trigger an alert by using Total or Consecutive Breaches.

  7. Select Done.

  8. Define the Alert rule name and Description, and select the alert Severity. These details are used in all alert actions. Additionally, you can choose not to activate the alert rule on creation by clearing Enable rule upon creation.

  9. To suppress rule actions for a time after an alert is fired, use the Suppress Alerts option. The rule still runs and creates alerts, but actions are not triggered, which reduces noise. The mute actions value must be greater than the alert frequency to be effective.

  10. Specify whether the alert rule should trigger one or more Action Groups when the alert condition is met.

    Note

    Refer to the Azure subscription service limits for limits on the actions that can be performed.

  11. You can optionally customize actions in log alert rules:

    • Custom Email Subject: Overrides the email subject of email actions. You can't modify the body of the mail, and this field isn't for email addresses.
    • Include custom JSON payload: Overrides the webhook JSON used by Action Groups, assuming the action group contains a webhook action. Learn more about webhook actions for log alerts.

  12. If all fields are set correctly, select the Create alert rule button to create the alert.

    Within a few minutes, the alert is active and triggers as previously described.

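The bin() to bin_at() conversion mentioned in the note for step 4 can be illustrated with a short Python sketch. This is not the alert service's implementation: the helper names are hypothetical, and using the evaluation time as the fixed point is an assumption, since this article only says that an appropriate time is chosen at runtime.

```python
from datetime import datetime, timedelta, timezone

# Reference point for the plain bin() grid in this sketch (an assumption; any
# midnight gives the same grid for bin sizes that divide a day evenly).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bin_at(value, size, fixed_point):
    """Round value down to the bin grid that has fixed_point as a boundary."""
    steps = (value - fixed_point) // size  # floor division, also for negative offsets
    return fixed_point + steps * size


def bin_floor(value, size):
    """Round value down to a multiple of size, mimicking bin()."""
    return bin_at(value, size, EPOCH)


# Assumed scenario: the rule is evaluated at 10:07 UTC over a 30-minute window.
evaluation_time = datetime(2021, 1, 1, 10, 7, tzinfo=timezone.utc)
record_time = datetime(2021, 1, 1, 9, 40, tzinfo=timezone.utc)
size = timedelta(minutes=15)

plain = bin_floor(record_time, size)                   # 09:30, a bin that starts before the window
anchored = bin_at(record_time, size, evaluation_time)  # 09:37, aligned with the window start
print(plain.time(), anchored.time())
```

With plain bin(), the 09:37 to 10:07 window is covered by three bins (09:30, 09:45, 10:00), two of which are only partially inside the window. Anchoring the grid at the evaluation time yields two complete 15-minute bins instead.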

Create a log alert for Log Analytics and Application Insights from alerts management

Note

Creation from alerts management is currently not supported for resource-centric logs.

  1. In the portal, select Monitor, then choose Alerts.

  2. Select New Alert Rule.

  3. The Create Alert pane appears. It has four parts:

    • The resource to which the alert applies.
    • The condition to check.
    • The actions to take if the condition is true.
    • The details to name and describe the alert.

  4. Select the Select Resource button. Filter by choosing the Subscription and Resource Type, then select a resource. Ensure the resource has logs available.

  5. Next, use the Add Condition button to view the list of signal options available for the resource. Select the Custom log search option.

    Note

    The alerts portal lists saved queries from Log Analytics and Application Insights, which can be used as template alert queries.

  6. Once selected, write, paste, or edit the alerting query in the Search Query field.

  7. Continue to the next steps described in the previous section.

Log alert for all other resource types

Note

There are currently no additional charges for API version 2020-05-01-preview and resource-centric log alerts. Pricing for features that are in preview will be announced in the future, and notice will be provided before billing starts. Should you choose to continue using the new API version and resource-centric log alerts after the notice period, you will be billed at the applicable rate.

  1. Start from the Condition tab:

    1. Check that the Measure, Aggregation type, and Aggregation granularity are correct.

      1. By default, the rule counts the number of results in the last 5 minutes.
      2. If summarized query results are detected, the rule is updated automatically within a few seconds to capture that.
    2. Choose alert splitting by dimensions, if needed:

      • The Resource ID column is selected automatically, if detected, and changes the context of the fired alert to the record's resource.
      • The Resource ID column can be deselected to fire alerts on subscriptions or resource groups. Deselecting is useful when query results are based on cross-resources, for example, a query that checks whether 80% of the resource group's virtual machines are experiencing high CPU usage.
      • Up to six additional splits can also be selected for any number or text column types using the dimensions table.
      • Alerts are fired separately for each unique combination of split values, and the alert payload includes this information.

    3. The Preview chart shows query evaluation results over time. You can change the chart period or select different time series that resulted from unique alert splitting by dimensions.

    4. Next, based on the preview data, set the Alert logic: Operator, Threshold Value, and Frequency.

    5. You can optionally set the Number of violations to trigger the alert in the Advanced options section.

  2. In the Actions tab, select or create the required action groups.

  3. In the Details tab, define the Alert rule details and Project details. You can optionally choose not to Start running now, or to Mute Actions for a period after the alert rule fires.

    Note

    Log alert rules are currently stateless and fire an action every time an alert is created, unless muting is defined.

  4. In the Tags tab, set any required tags on the alert rule resource.

  5. In the Review + create tab, a validation runs and reports any issues. Review and approve the rule definition.

  6. If all fields are correct, select the Create button to complete the alert rule creation. All alerts can be viewed from alerts management.

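Splitting by dimensions, as configured on the Condition tab above, means that each unique combination of dimension values is evaluated and alerted on separately. The following Python sketch illustrates that idea only. The function name, the sample rows, and the use of a sum as the aggregation are assumptions for illustration, not the service's actual logic.

```python
from collections import defaultdict


def evaluate_split(rows, dimensions, value_column, threshold):
    """Return one alert per unique dimension combination whose summed value exceeds threshold."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)  # unique combination of split values
        totals[key] += row[value_column]
    return [
        {"dimensions": dict(zip(dimensions, key)), "value": total}
        for key, total in sorted(totals.items())
        if total > threshold
    ]


# Hypothetical query results, already summarized per time bin.
rows = [
    {"_ResourceId": "vm-a", "Computer": "web01", "AggregatedValue": 3},
    {"_ResourceId": "vm-a", "Computer": "web01", "AggregatedValue": 4},
    {"_ResourceId": "vm-b", "Computer": "db01", "AggregatedValue": 1},
]

alerts = evaluate_split(rows, ["_ResourceId", "Computer"], "AggregatedValue", threshold=5)
for alert in alerts:
    print(alert)
```

Only the vm-a/web01 combination exceeds the threshold here, so a single alert is produced, and its payload carries the dimension values that identify it.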

View and manage log alerts in the Azure portal

  1. In the portal, select the relevant resource or the Monitor service. Then select Alerts in the Monitor section.

  2. Alerts management displays all alerts that fired. Learn more about alerts management.

    Note

    Log alert rules are currently stateless and do not resolve.

  3. Select the Manage alert rules button on the top bar to edit rules.

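Because log alert rules are stateless, every evaluation that meets the condition creates a new alert, and actions run each time unless they are muted. The Python sketch below models that behavior under stated assumptions: the function name is hypothetical, times are plain minute counts, and the treatment of the exact end of the mute window is a guess rather than documented behavior.

```python
def actions_triggered(breach_times, mute_minutes=0):
    """Return the evaluation times (in minutes) at which actions run.

    breach_times lists the evaluations where the condition was met.
    An alert is created at every one of them; only the actions are muted.
    """
    fired = []
    muted_until = None
    for t in breach_times:
        if muted_until is not None and t < muted_until:
            continue  # alert still created, actions suppressed
        fired.append(t)
        if mute_minutes:
            muted_until = t + mute_minutes
    return fired


# Assumed rule: evaluated every 15 minutes, condition met on every evaluation.
evaluations = [0, 15, 30, 45, 60]

no_mute = actions_triggered(evaluations)
short_mute = actions_triggered(evaluations, mute_minutes=10)  # shorter than the frequency
long_mute = actions_triggered(evaluations, mute_minutes=40)   # longer than the frequency
print(no_mute, short_mute, long_mute)
```

In this model a mute period shorter than the evaluation frequency suppresses nothing, because it has already expired by the next evaluation. This is consistent with the guidance that the mute actions value must be greater than the alert frequency.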

Managing log alerts using PowerShell

Note

This article has been updated to use the new Azure PowerShell Az module. You can still use the AzureRM module, which will continue to receive bug fixes until at least December 2020. To learn more about the new Az module and AzureRM compatibility, see Introducing the new Azure PowerShell Az module. For Az module installation instructions, see Install Azure PowerShell.

Note

PowerShell is not currently supported in API version 2020-05-01-preview.

The PowerShell cmdlets listed below are available to manage rules with the Scheduled Query Rules API.

Note

ScheduledQueryRules PowerShell cmdlets can only manage rules created in the current Scheduled Query Rules API. Log alert rules created using the legacy Log Analytics Alert API can be managed using PowerShell only after switching to the Scheduled Query Rules API.

Here are example steps for creating a log alert rule using PowerShell:

$source = New-AzScheduledQueryRuleSource -Query 'Heartbeat | summarize AggregatedValue = count() by bin(TimeGenerated, 5m), _ResourceId' -DataSourceId "/subscriptions/a123d7efg-123c-1234-5678-a12bc3defgh4/resourceGroups/contosoRG/providers/microsoft.OperationalInsights/workspaces/servicews"

$schedule = New-AzScheduledQueryRuleSchedule -FrequencyInMinutes 15 -TimeWindowInMinutes 30

$metricTrigger = New-AzScheduledQueryRuleLogMetricTrigger -ThresholdOperator "GreaterThan" -Threshold 2 -MetricTriggerType "Consecutive" -MetricColumn "_ResourceId"

$triggerCondition = New-AzScheduledQueryRuleTriggerCondition -ThresholdOperator "LessThan" -Threshold 5 -MetricTrigger $metricTrigger

$aznsActionGroup = New-AzScheduledQueryRuleAznsActionGroup -ActionGroup "/subscriptions/a123d7efg-123c-1234-5678-a12bc3defgh4/resourceGroups/contosoRG/providers/microsoft.insights/actiongroups/sampleAG" -EmailSubject "Custom email subject" -CustomWebhookPayload "{ `"alert`":`"#alertrulename`", `"IncludeSearchResults`":true }"

$alertingAction = New-AzScheduledQueryRuleAlertingAction -AznsAction $aznsActionGroup -Severity "3" -Trigger $triggerCondition

New-AzScheduledQueryRule -ResourceGroupName "contosoRG" -Location "Region Name for your Application Insights App or Log Analytics Workspace" -Action $alertingAction -Enabled $true -Description "Alert description" -Schedule $schedule -Source $source -Name "Alert Name"
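The metric trigger in the example above uses -MetricTriggerType "Consecutive" with a GreaterThan threshold of 2. The difference between Total and Consecutive breaches can be sketched in Python as follows. This is an illustration under assumptions, not the service's code: the function name is hypothetical, and the input is simply one boolean per time bin indicating whether the aggregated value breached the condition.

```python
def breach_count(breaches, trigger_type):
    """Count breaches across time bins as Total or as the longest Consecutive run."""
    if trigger_type == "Total":
        return sum(1 for b in breaches if b)
    if trigger_type == "Consecutive":
        longest = run = 0
        for b in breaches:
            run = run + 1 if b else 0  # a non-breaching bin resets the run
            longest = max(longest, run)
        return longest
    raise ValueError(f"unknown trigger type: {trigger_type}")


# Hypothetical evaluation: four of five bins breached, but never more than two in a row.
bins = [True, True, False, True, True]
threshold = 2  # mirrors -Threshold 2 with -ThresholdOperator "GreaterThan"

fires_on_total = breach_count(bins, "Total") > threshold
fires_on_consecutive = breach_count(bins, "Consecutive") > threshold
print(fires_on_total, fires_on_consecutive)
```

With these sample bins, a Total trigger would fire because four breaches exceed the threshold, while a Consecutive trigger would not, because the longest uninterrupted run is two.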

You can also create the log alert with PowerShell by using a template and a parameters file:

Connect-AzAccount

Select-AzSubscription -SubscriptionName <yourSubscriptionName>

New-AzResourceGroupDeployment -Name AlertDeployment -ResourceGroupName ResourceGroupofTargetResource `
  -TemplateFile mylogalerttemplate.json -TemplateParameterFile mylogalerttemplate.parameters.json

Managing log alerts using CLI

Note

Azure CLI support is only available for scheduledQueryRules API version 2020-05-01-preview and above. Previous API versions can use the Azure Resource Manager CLI with templates, as described below. If you use the legacy Log Analytics Alert API, you will need to switch to use the CLI. Learn more about switching.

The previous sections described how to create, view, and manage log alert rules using the Azure portal. This section describes how to do the same using the cross-platform Azure CLI. The quickest way to start using Azure CLI is through Azure Cloud Shell, which this article uses.

  1. Go to the Azure portal and select Cloud Shell.

  2. At the prompt, you can run a command with the --help option to learn more about the command and how to use it. For example, the following command shows the list of commands available for creating, viewing, and managing log alerts:

    az monitor scheduled-query --help
    
  3. You can create a log alert rule that monitors the count of system event errors:

    az monitor scheduled-query create -g {ResourceGroup} -n {nameofthealert} --scopes {vm_id} --condition "count \'union Event, Syslog | where TimeGenerated > ago(1h) | where EventLevelName == \"Error\" or SeverityLevel== \"err\"\' > 2" --description {descriptionofthealert}
    
  4. You can view all the log alerts in a resource group using the following command:

    az monitor scheduled-query list -g {ResourceGroup}
    
  5. You can see the details of a particular log alert rule using the name or the resource ID of the rule:

    az monitor scheduled-query show -g {ResourceGroup} -n {AlertRuleName}
    
    az monitor scheduled-query show --ids {RuleResourceId}
    
  6. You can disable a log alert rule using the following command:

    az monitor scheduled-query update -g {ResourceGroup} -n {AlertRuleName} --enabled false
    
  7. You can delete a log alert rule using the following command:

    az monitor scheduled-query delete -g {ResourceGroup} -n {AlertRuleName}
    

You can also use the Azure Resource Manager CLI with template files:

az login

az deployment group create \
    --name AlertDeployment \
    --resource-group ResourceGroupofTargetResource \
    --template-file mylogalerttemplate.json \
    --parameters @mylogalerttemplate.parameters.json

On successful creation, 201 is returned. On successful update, 200 is returned.

Next steps