
Set up basic alerts

A key part of managing resources is getting notified when problems occur. Alerts proactively notify you of critical conditions, based on triggers from metrics, logs, or service-health issues. As part of onboarding the Azure server management services, you can set up alerts and notifications that keep your IT teams aware of any problems.

Azure Monitor alerts

Azure Monitor offers alerting capabilities that notify you, via email or messaging, when things go wrong. These capabilities are based on a common data-monitoring platform that includes logs and metrics generated by your servers and other resources. By using a common set of tools in Azure Monitor, you can analyze data that's combined from multiple resources and use it to trigger alerts. These triggers can include:

  • Metric values.
  • Log search queries.
  • Activity log events.
  • The health of the underlying Azure platform.
  • Tests for website availability.
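
To illustrate the first trigger type, a metric value, the following is a minimal conceptual sketch of how a threshold rule might be evaluated over a window of recent samples. It does not call any Azure API. The `MetricRule` class, the `should_fire` function, the 90 percent threshold, and the three-sample window are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class MetricRule:
    """Hypothetical threshold rule; not an Azure Monitor type."""
    name: str
    threshold: float
    operator: str  # "gt" (greater than) or "lt" (less than)
    window: int    # number of most recent samples to average


def should_fire(rule, samples):
    """Return True when the windowed average crosses the threshold."""
    recent = samples[-rule.window:]
    if len(recent) < rule.window:
        # Not enough data yet to evaluate the full window.
        return False
    average = mean(recent)
    if rule.operator == "gt":
        return average > rule.threshold
    return average < rule.threshold


# Assumed example values: fire when average CPU exceeds 90% over 3 samples.
cpu_rule = MetricRule(name="High CPU use", threshold=90.0, operator="gt", window=3)

print(should_fire(cpu_rule, [50, 95, 96, 97]))  # average of last 3 is 96
print(should_fire(cpu_rule, [95, 96, 50]))      # average of last 3 is about 80.3
```

Averaging over a window rather than testing a single sample is one common way to avoid alerting on short spikes; the window length is a tuning choice.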

For a more detailed description of the sources of monitoring data that this service collects, see the list of Azure Monitor data sources.

For details about manually creating and managing alerts by using the Azure portal, see the Azure Monitor documentation.

In this guide, we recommend that you create a set of 15 alerts for basic infrastructure monitoring. Find the deployment scripts in the Alert Toolkit GitHub repository.

This package creates alerts for:

  • Low disk space
  • Low available memory
  • High CPU use
  • Unexpected shutdowns
  • Corrupted file systems
  • Common hardware failures
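
Several of the conditions above, such as unexpected shutdowns and corrupted file systems, are event-based rather than metric-based. As a rough sketch of that idea, the following counts watched events per computer and reports which alerts would be raised. It is not the toolkit's implementation: the category labels, the record fields (`computer`, `category`), and the function names are hypothetical, and a real log search query would match on event IDs or message text specific to your operating system and OEM hardware.

```python
from collections import Counter

# Hypothetical category labels used only for this illustration.
WATCHED_CATEGORIES = {
    "unexpected_shutdown",
    "filesystem_corruption",
    "hardware_failure",
}


def count_alert_events(events):
    """Count watched events per (computer, category) pair."""
    counts = Counter()
    for event in events:
        if event["category"] in WATCHED_CATEGORIES:
            counts[(event["computer"], event["category"])] += 1
    return counts


def alerts_to_raise(events, min_count=1):
    """Return sorted (computer, category) pairs seen at least min_count times."""
    counts = count_alert_events(events)
    return sorted(key for key, n in counts.items() if n >= min_count)


# Assumed sample log records.
sample_events = [
    {"computer": "srv01", "category": "unexpected_shutdown"},
    {"computer": "srv01", "category": "user_logon"},
    {"computer": "srv02", "category": "filesystem_corruption"},
    {"computer": "srv01", "category": "unexpected_shutdown"},
]

print(alerts_to_raise(sample_events))
```

Raising `min_count` would suppress alerts for one-off events, which may or may not be appropriate for conditions as serious as file system corruption.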

The package uses HPE server hardware as an example. Change the settings in the associated configuration file to reflect your OEM hardware. You can also add more performance counters to the configuration file. To deploy the package, run the New-CoreAlerts.ps1 file.

Next steps

Learn about operations and security mechanisms that support your ongoing operations.