Standard Load Balancer diagnostics with metrics, alerts and resource health

Azure Standard Load Balancer exposes the following diagnostic capabilities:

  • Multi-dimensional metrics and alerts: Provides multi-dimensional diagnostic capabilities through Azure Monitor for standard load balancer configurations. You can monitor, manage, and troubleshoot your standard load balancer resources.

  • Resource health: The Resource Health status of your Load Balancer is available on the Resource Health page under Monitor. This automatic check informs you of the current availability of your Load Balancer resource.

This article provides a quick tour of these capabilities and offers ways to use them for Standard Load Balancer.

Multi-dimensional metrics

Azure Load Balancer provides multi-dimensional metrics through Azure Metrics in the Azure portal, which helps you get real-time diagnostic insights into your load balancer resources.

The various Standard Load Balancer configurations provide the following metrics:

| Metric | Resource type | Description | Recommended aggregation |
| --- | --- | --- | --- |
| Data path availability | Public and internal load balancer | Standard Load Balancer continuously exercises the data path from within a region to the load balancer front end, all the way to the SDN stack that supports your VM. As long as healthy instances remain, the measurement follows the same path as your application's load-balanced traffic. The data path that your customers use is also validated. The measurement is invisible to your application and does not interfere with other operations. | Average |
| Health probe status | Public and internal load balancer | Standard Load Balancer uses a distributed health-probing service that monitors your application endpoint's health according to your configuration settings. This metric provides an aggregate or per-endpoint filtered view of each instance endpoint in the load balancer pool. You can see how Load Balancer views the health of your application, as indicated by your health probe configuration. | Average |
| SYN (synchronize) count | Public and internal load balancer | Standard Load Balancer does not terminate Transmission Control Protocol (TCP) connections or interact with TCP or UDP packet flows. Flows and their handshakes are always between the source and the VM instance. To better troubleshoot your TCP protocol scenarios, you can use SYN packet counters to understand how many TCP connection attempts are made. The metric reports the number of TCP SYN packets that were received. | Sum |
| SNAT connection count | Public load balancer | Standard Load Balancer reports the number of outbound flows that are masqueraded to the public IP address front end. Source network address translation (SNAT) ports are an exhaustible resource. This metric can indicate how heavily your application relies on SNAT for outbound originated flows. Counters for successful and failed outbound SNAT flows are reported and can be used to troubleshoot and understand the health of your outbound flows. | Sum |
| Allocated SNAT ports | Public load balancer | Standard Load Balancer reports the number of SNAT ports allocated per backend instance. | Average |
| Used SNAT ports | Public load balancer | Standard Load Balancer reports the number of SNAT ports that are utilized per backend instance. | Average |
| Byte count | Public and internal load balancer | Standard Load Balancer reports the data processed per front end. You may notice that the bytes are not distributed equally across the backend instances. This is expected, because Azure's Load Balancer algorithm is based on flows. | Sum |
| Packet count | Public and internal load balancer | Standard Load Balancer reports the packets processed per front end. | Sum |

Note

When you distribute traffic from an internal load balancer through an NVA or firewall, the SYN count, byte count, and packet count metrics are not available and show as zero.

Note

Max and min aggregations are not available for the SYN count, packet count, SNAT connection count, and byte count metrics.

View your load balancer metrics in the Azure portal

The Azure portal exposes the load balancer metrics through the Metrics page, which is available on both the load balancer resource page for a particular resource and the Azure Monitor page.

To view the metrics for your Standard Load Balancer resources:

  1. Go to the Metrics page and do either of the following:
    • On the load balancer resource page, select the metric type in the drop-down list.
    • On the Azure Monitor page, select the load balancer resource.
  2. Set the appropriate metric aggregation type.
  3. Optionally, configure the required filtering and grouping.
  4. Optionally, configure the time range and aggregation. By default, time is displayed in UTC.

Note

Time aggregation is important when interpreting certain metrics, because data is sampled once per minute. If time aggregation is set to five minutes and the metric aggregation type Sum is used for metrics such as SNAT allocation, your graph displays five times the total allocated SNAT ports.
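The effect described in this note can be reproduced with simple arithmetic. The following sketch assumes a hypothetical backend instance with 1,024 allocated SNAT ports that is sampled once per minute, and compares the Sum and Average aggregations over a five-minute window. The sample values and the `aggregate` helper are illustrative only and are not part of Azure Monitor.

```python
from statistics import mean


def aggregate(samples, window, how):
    """Group per-minute samples into buckets of `window` minutes and aggregate each."""
    buckets = [samples[i:i + window] for i in range(0, len(samples), window)]
    if how == "sum":
        return [sum(bucket) for bucket in buckets]
    if how == "average":
        return [mean(bucket) for bucket in buckets]
    raise ValueError(f"unsupported aggregation: {how}")


# Hypothetical data: 1,024 ports allocated, reported once per minute for 10 minutes.
allocated_per_minute = [1024] * 10

five_min_sum = aggregate(allocated_per_minute, 5, "sum")      # inflated by the window size
five_min_avg = aggregate(allocated_per_minute, 5, "average")  # matches the real allocation
```

With a five-minute window, Sum adds five samples of the same gauge value, which is why Average (or a one-minute time aggregation) is the safer choice for metrics such as allocated or used SNAT ports.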

Figure: Data Path Availability metric for Standard Load Balancer

Retrieve multi-dimensional metrics programmatically via APIs

For API guidance on retrieving multi-dimensional metric definitions and values, see the Azure Monitoring REST API walkthrough. You can write these metrics to a storage account by adding a diagnostic setting for the 'All Metrics' category.
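As a rough illustration of what such a request looks like, the sketch below only builds the URL for an Azure Monitor metrics query and does not send it. The resource ID is a placeholder, and the `api-version` value and the metric name `VipAvailability` (assumed here to be the REST name for data path availability) are assumptions that should be checked against the REST API walkthrough and the metric definitions returned for your resource. Authentication, which requires a bearer token, is omitted.

```python
from urllib.parse import urlencode

ARM_ENDPOINT = "https://management.azure.com"

# Placeholder resource ID; substitute your own subscription, group, and load balancer.
LB_RESOURCE_ID = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/myResourceGroup"
    "/providers/Microsoft.Network/loadBalancers/myLoadBalancer"
)


def metrics_url(resource_id, metric, aggregation, timespan,
                interval="PT1M", api_version="2018-01-01"):
    """Build (but do not send) a metrics query URL for one resource."""
    query = urlencode({
        "api-version": api_version,   # assumed version; verify before use
        "metricnames": metric,        # assumed REST metric name
        "aggregation": aggregation,
        "timespan": timespan,         # ISO 8601 start/end pair
        "interval": interval,         # one-minute granularity
    })
    return f"{ARM_ENDPOINT}{resource_id}/providers/Microsoft.Insights/metrics?{query}"


example_url = metrics_url(
    LB_RESOURCE_ID,
    metric="VipAvailability",
    aggregation="Average",
    timespan="2021-01-01T00:00:00Z/2021-01-01T01:00:00Z",
)
```

The helper keeps the aggregation explicit so that the recommended aggregation for each metric (Average or Sum) is a deliberate choice in every query.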

Is the data path up and available for my Load Balancer Frontend?

The data path availability metric describes the health of the data path within the region to the compute host where your VMs are located. The metric reflects the health of the Azure infrastructure. You can use the metric to:

  • Monitor the external availability of your service.
  • Dig deeper and understand whether the platform on which your service is deployed is healthy, or whether your guest OS or application instance is healthy.
  • Isolate whether an event is related to your service or the underlying data plane. Do not confuse this metric with the health probe status ("Backend Instance availability").

To get the Data Path Availability for your Standard Load Balancer resources:

  1. Make sure the correct load balancer resource is selected.
  2. In the Metric drop-down list, select Data Path Availability.
  3. In the Aggregation drop-down list, select Avg.
  4. Additionally, add a filter on the Frontend IP address or Frontend port dimension, set to the required front-end IP address or front-end port, and then group by the selected dimension.

Figure: Load Balancer Frontend probing details

The metric is generated by an active, in-band measurement. A probing service within the region originates traffic for the measurement. The service is activated as soon as you create a deployment with a public front end, and it continues until you remove the front end.

A packet matching your deployment's front end and rule is generated periodically. It traverses the region from the source to the host where a VM in the back-end pool is located. The load balancer infrastructure performs the same load balancing and translation operations as it does for all other traffic. This probe is in-band on your load-balanced endpoint. After the probe arrives on the compute host, where a healthy VM in the back-end pool is located, the compute host generates a response to the probing service. Your VM does not see this traffic.

Data path availability fails for the following reasons:

  • Your deployment has no healthy VMs remaining in the back-end pool.
  • An infrastructure outage has occurred.

For diagnostic purposes, you can use the Data Path Availability metric together with the health probe status.

Use Average as the aggregation for most scenarios.

Are the Backend Instances for my Load Balancer responding to probes?

The health probe status metric describes the health of your application deployment, as determined by the health probe that you configure for your load balancer. The load balancer uses the status of the health probe to determine where to send new flows. Health probes originate from an Azure infrastructure address and are visible within the guest OS of the VM.

To get the health probe status for your Standard Load Balancer resources:

  1. Select the Health Probe Status metric with the Avg aggregation type.
  2. Apply a filter on the required Frontend IP address or port (or both).

Health probes fail for the following reasons:

  • You configure a health probe to a port that is not listening, is not responding, or is using the wrong protocol. If your service uses direct server return (DSR, or floating IP) rules, make sure that the service is listening on the IP address of the NIC's IP configuration and not just on the loopback that's configured with the front-end IP address.
  • Your probe is not permitted by the Network Security Group, the VM's guest OS firewall, or the application layer filters.

Use Average as the aggregation for most scenarios.

How do I check my outbound connection statistics?

The SNAT connections metric describes the volume of successful and failed connections for [outbound flows](./load-balancer-outbound-connections.md).

A failed connections volume greater than zero indicates SNAT port exhaustion. You must investigate further to determine what may be causing these failures. SNAT port exhaustion manifests as a failure to establish an outbound flow. Review the article about outbound connections to understand the scenarios and mechanisms at work, and to learn how to mitigate and design to avoid SNAT port exhaustion.

To get SNAT connection statistics:

  1. Select the SNAT Connections metric type and Sum as the aggregation.
  2. Group by Connection State so that successful and failed SNAT connection counts are represented by different lines.

*Figure: Load Balancer SNAT connection count*

How do I check my SNAT port usage and allocation?

The Used SNAT Ports metric tracks how many SNAT ports are being consumed to maintain outbound flows. This indicates how many unique flows are established between an internet source and a backend VM or virtual machine scale set that is behind a load balancer and does not have a public IP address. By comparing the number of SNAT ports you are using with the Allocated SNAT Ports metric, you can determine whether your service is experiencing, or is at risk of, SNAT exhaustion and the resulting outbound flow failure.

If your metrics indicate a risk of outbound flow failure, see the outbound connections article and take steps to mitigate the risk to ensure service health.

To view SNAT port usage and allocation:

  1. Set the time aggregation of the graph to 1 minute to ensure that the desired data is displayed.
  2. Select Used SNAT Ports and/or Allocated SNAT Ports as the metric type and Average as the aggregation.
    • By default, these metrics are the average number of SNAT ports allocated to or used by each backend VM or virtual machine scale set, corresponding to all frontend public IPs mapped to the Load Balancer, aggregated over TCP and UDP.
    • To view the total SNAT ports used by or allocated for the load balancer, use the metric aggregation Sum.
  3. Filter to a specific Protocol Type, a set of Backend IPs, and/or Frontend IPs.
  4. To monitor health per backend or frontend instance, apply splitting.
    • Note: Splitting allows only a single metric to be displayed at a time.
  5. For example, to monitor SNAT usage for TCP flows per machine, aggregate by Average, split by Backend IPs, and filter by Protocol Type.

Figure: Average TCP SNAT port allocation and usage for a set of backend VMs

*Figure: TCP SNAT port usage per backend instance*

How do I check inbound/outbound connection attempts for my service?

The SYN packets metric describes the volume of TCP SYN packets that arrived or were sent (for [outbound flows](./load-balancer-outbound-connections.md)) and that are associated with a specific front end. You can use this metric to understand TCP connection attempts to your service.

Use Sum as the aggregation for most scenarios.

*Figure: Load Balancer SYN count*

How do I check my network bandwidth consumption?

The bytes and packet counters metric describes the volume of bytes and packets that are sent or received by your service on a per-front-end basis.

Use Sum as the aggregation for most scenarios.

To get byte or packet count statistics:

  1. Select the Byte Count and/or Packet Count metric type, with Sum as the aggregation.
  2. Do either of the following:
    • Apply a filter on a specific front-end IP, front-end port, back-end IP, or back-end port.
    • Get overall statistics for your load balancer resource without any filtering.

*Figure: Load Balancer byte count*

How do I diagnose my load balancer deployment?

By using a combination of the Data Path Availability and Health Probe Status metrics on a single chart, you can identify where to look for the problem and resolve it. You can gain assurance that Azure is working correctly and use this knowledge to conclusively determine that the configuration or application is the root cause.

You can use health probe metrics to understand how Azure views the health of your deployment according to the configuration you have provided. Looking at health probes is always a great first step in monitoring or determining a cause.

You can take it a step further and use the Data Path Availability metric to gain insight into how Azure views the health of the underlying data plane that's responsible for your specific deployment. When you combine both metrics, you can isolate where the fault might be, as illustrated in this example:

Figure: Combining Data Path Availability and Health Probe Status metrics

The chart displays the following information:

  • The infrastructure hosting your VMs was unavailable and at 0 percent at the beginning of the chart. Later, the infrastructure was healthy and the VMs were reachable, and more than one VM was placed in the back end. This information is indicated by the blue trace for data path availability, which was later at 100 percent.
  • The health probe status, indicated by the purple trace, was at 0 percent at the beginning of the chart. The circled area in green highlights where the health probe status became healthy, at which point the customer's deployment was able to accept new flows.

The chart allows customers to troubleshoot the deployment on their own without having to guess or ask support whether other issues are occurring. The service was unavailable because health probes were failing due to either a misconfiguration or a failed application.
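The reasoning behind this combined chart can be written down as a small decision function. This is a minimal sketch, not an Azure feature: the function name, the returned labels, and the 100 percent "healthy" threshold are assumptions chosen for illustration, and both inputs are assumed to be Average-aggregated percentages for the same time window.

```python
def locate_fault(data_path_availability, health_probe_status, healthy_threshold=100.0):
    """Suggest where to look first, given two Average-aggregated percentages (0-100)."""
    platform_ok = data_path_availability >= healthy_threshold
    probes_ok = health_probe_status >= healthy_threshold

    if platform_ok and probes_ok:
        return "healthy"
    if platform_ok:
        # Azure's data plane is reachable, but probes fail:
        # suspect probe configuration, NSG or firewall rules, or the application.
        return "configuration or application"
    if probes_ok:
        # Probes succeed but the data path is degraded: suspect the platform.
        return "infrastructure or data plane"
    # Both failing: data path availability also drops when no healthy VM remains,
    # so start with the health probes before assuming an outage.
    return "no healthy backend or infrastructure outage"


diagnosis = locate_fault(data_path_availability=100.0, health_probe_status=0.0)
```

In practice the threshold would likely be set below 100 to tolerate brief dips; the value here is only a starting point.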

Configure alerts for multi-dimensional metrics

Azure Standard Load Balancer supports easily configurable alerts for multi-dimensional metrics. Configure custom thresholds for specific metrics to trigger alerts with varying levels of severity, which enables a touchless resource monitoring experience.

To configure alerts:

  1. Go to the alert sub-blade for the load balancer.
  2. Create a new alert rule:
    1. Configure the alert condition.
    2. (Optional) Add an action group for automated repair.
    3. Assign an alert severity, name, and description that enable an intuitive reaction.

Inbound availability alerting

To alert on inbound availability, you can create two separate alerts using the data path availability and health probe status metrics. Customers may have different scenarios that require specific alerting logic, but the following examples are helpful for most configurations.

Using data path availability, you can fire alerts whenever a specific load balancing rule becomes unavailable. You can configure this alert by setting an alert condition for the data path availability and splitting by all current and future values for both Frontend Port and Frontend IP Address. Setting the alert logic to be less than or equal to 0 causes this alert to fire whenever any load balancing rule becomes unresponsive. Set the aggregation granularity and frequency of evaluation according to your desired evaluation.

With health probe status, you can alert when a given backend instance fails to respond to the health probe for a significant amount of time. Set up your alert condition to use the health probe status metric and split by Backend IP Address and Backend Port. This ensures that you can alert separately on each individual backend instance's ability to serve traffic on a specific port. Use the Average aggregation type and set the threshold value according to how frequently your backend instance is probed and what you consider to be your healthy threshold.

You can also alert at the backend pool level by not splitting by any dimension and using the Average aggregation type. This lets you set up alert rules such as "alert when 50% of my backend pool members are unhealthy."
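The pool-level rule can be illustrated with a short calculation. The sketch below assumes that each backend instance reports a health probe status of 100 (healthy) or 0 (unhealthy) and that the un-split Average is simply the mean of those values; the IP addresses, function names, and the 50 percent cutoff are hypothetical and are not an Azure Monitor API.

```python
def pool_health(per_instance_status):
    """Average health probe status (0-100) across all backend instances."""
    return sum(per_instance_status.values()) / len(per_instance_status)


def should_alert(per_instance_status, unhealthy_fraction=0.5):
    """Fire when at least `unhealthy_fraction` of the pool is unhealthy."""
    healthy_pct_threshold = 100.0 * (1.0 - unhealthy_fraction)
    return pool_health(per_instance_status) <= healthy_pct_threshold


# Hypothetical pool: one of two instances is failing its probe.
half_down = {"10.0.0.4": 100, "10.0.0.5": 0}
# Hypothetical pool: one of four instances is failing its probe.
quarter_down = {"10.0.0.4": 100, "10.0.0.5": 100, "10.0.0.6": 0, "10.0.0.7": 100}

alert_half = should_alert(half_down)        # average is 50
alert_quarter = should_alert(quarter_down)  # average is 75
```

In an alert rule, the equivalent condition would be "Average health probe status less than or equal to 50" with no dimension splitting.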

Outbound availability alerting

To alert on outbound availability, you can configure two separate alerts using the SNAT Connection Count and Used SNAT Ports metrics.

To detect outbound connection failures, configure an alert using SNAT Connection Count, filtered to Connection State = Failed. Use the Total aggregation. You can then also split by Backend IP Address, set to all current and future values, to alert separately for each backend instance that experiences failed connections. Set the threshold to be greater than zero, or to a higher number if you expect to see some outbound connection failures.

Through Used SNAT Ports, you can alert on a higher risk of SNAT exhaustion and outbound connection failure. Make sure that you split by Backend IP address and Protocol when using this alert, and use the Average aggregation. Set the threshold to be greater than the percentage of the ports allocated per instance that you deem unsafe. For example, you may configure a low-severity alert when a backend instance uses 75% of its allocated ports and a high-severity alert when it uses 90% or 100% of its allocated ports.
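The tiered thresholds in this example can be expressed as a simple utilization check. This is a minimal sketch: the function name and severity labels are hypothetical, the 75% and 90% defaults come from the example above, and the inputs are assumed to be the Average Used SNAT Ports and Allocated SNAT Ports values for one backend instance and one protocol.

```python
def snat_severity(used_ports, allocated_ports, low=0.75, high=0.90):
    """Return 'high', 'low', or None based on the share of allocated SNAT ports in use."""
    if allocated_ports <= 0:
        raise ValueError("allocated_ports must be positive")
    utilization = used_ports / allocated_ports
    if utilization >= high:
        return "high"
    if utilization >= low:
        return "low"
    return None  # below every threshold: no alert


# Hypothetical readings for an instance with 1,024 allocated ports.
severity_ok = snat_severity(100, 1024)
severity_low = snat_severity(768, 1024)   # exactly 75% in use
severity_high = snat_severity(950, 1024)  # roughly 93% in use
```

Because the allocated port count can differ between deployments, expressing the threshold as a fraction of the allocation keeps the same rule usable across instances.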

Resource health status

Health status for Standard Load Balancer resources is exposed through the existing Resource health under Monitor > Service Health. It is evaluated every two minutes by measuring Data Path Availability, which determines whether your Frontend Load Balancing endpoints are available.

| Resource health status | Description |
| --- | --- |
| Available | Your standard load balancer resource is healthy and available. |
| Degraded | Your standard load balancer has platform or user-initiated events impacting performance. The Data Path Availability metric has reported less than 90% but greater than 25% health for at least two minutes. You will experience moderate to severe performance impact. Follow the troubleshooting RHC guide to determine whether there are user-initiated events impacting your availability. |
| Unavailable | Your standard load balancer resource is not healthy. The Data Path Availability metric has reported less than 25% health for at least two minutes. You will experience significant performance impact or lack of availability for inbound connectivity. There may be user or platform events causing unavailability. Follow the troubleshooting RHC guide to determine whether there are user-initiated events impacting your availability. |
| Unknown | Resource health status for your standard load balancer resource has not been updated yet or has not received Data Path Availability information for the last 10 minutes. This state should be transient and will reflect the correct status as soon as data is received. |
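The thresholds in this table can be sketched as a classification over recent Data Path Availability samples. This is an approximation for illustration only, not the actual Resource Health implementation: it assumes one sample per minute for the last two minutes, treats an empty list as "no data received", and classifies a value of exactly 25% as Degraded because the table does not specify that boundary.

```python
def resource_health(recent_availability):
    """Classify health from the last two minutes of availability samples (0-100)."""
    if not recent_availability:
        return "Unknown"  # no data path availability data received
    # A state applies only if every sample in the window is below its threshold,
    # so the best (highest) sample decides the outcome.
    best = max(recent_availability)
    if best < 25:
        return "Unavailable"
    if best < 90:
        return "Degraded"
    return "Available"


status_available = resource_health([100, 100])
status_degraded = resource_health([80, 60])
status_unavailable = resource_health([10, 0])
status_unknown = resource_health([])
```

Using the highest sample in the window means a single healthy reading keeps the resource out of the Degraded or Unavailable state, which mirrors the "for at least two minutes" wording in the table.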

To view the health of your public Standard Load Balancer resources:

  1. Select Monitor > Service Health.

    Figure: The Service Health link on Azure Monitor

  2. Select Resource Health, and then make sure that Subscription ID and Resource Type = Load Balancer are selected.

    Figure: Select resource for health view

  3. In the list, select the Load Balancer resource to view its historical health status.

    Figure: Load Balancer resource health view

Generic resource health status descriptions are available in the RHC documentation. Statuses specific to Azure Load Balancer are listed in the table earlier in this section.

Next steps