Load Balancer health probes

When using load-balancing rules with Azure Load Balancer, you need to specify health probes so that Load Balancer can detect the status of backend endpoints. The configuration of the health probe and the probe responses determine which backend pool instances receive new flows. You can use health probes to detect the failure of an application on a backend endpoint. You can also generate a custom response to a health probe and use the health probe for flow control to manage load or planned downtime. When a health probe fails, Load Balancer stops sending new flows to the respective unhealthy instance. Outbound connectivity is not affected; only inbound connectivity is.

Health probes support multiple protocols. The availability of a specific health probe protocol varies by Load Balancer SKU. The behavior of the service also varies by Load Balancer SKU, as shown in this table:

|                     | Standard SKU                             | Basic SKU                              |
| Probe types         | TCP, HTTP, HTTPS                         | TCP, HTTP                              |
| Probe down behavior | All probes down, all TCP flows continue. | All probes down, all TCP flows expire. |

Important

Review this document in its entirety, including the important design guidance below, to create a reliable service.

Important

Load Balancer health probes originate from the IP address 168.63.129.16 and must not be blocked if probes are to mark your instance as up. See Probe source IP address for details.

Important

Regardless of the configured time-out threshold, HTTP(S) Load Balancer health probes automatically mark an instance as down if the server returns any status code other than HTTP 200 OK or if the connection is terminated via TCP reset.

Probe configuration

Health probe configuration consists of the following elements:

  • Duration of the interval between individual probes
  • Number of probe responses that have to be observed before the probe transitions to a different state
  • Protocol of the probe
  • Port of the probe
  • HTTP path to use for HTTP GET when using HTTP(S) probes

Note

A probe definition is not mandatory or checked for when using Azure PowerShell, Azure CLI, templates, or the API. Probe validation tests are only done when using the Azure portal.

Understanding the application signal, detection of the signal, and the reaction of the platform

The number of probe responses applies to both:

  • the number of successful probes that allow an instance to be marked as up, and
  • the number of timed-out probes that cause an instance to be marked as down.

The timeout and interval values specified determine whether an instance is marked as up or down. The duration of the interval multiplied by the number of probe responses determines the period during which the probe responses have to be detected. The service reacts after the required number of probes has been reached.

We can illustrate the behavior with an example. If you set the number of probe responses to 2 and the interval to 5 seconds, 2 probe time-out failures must be observed within a 10-second interval. Because the time at which a probe is sent is not synchronized with the moment your application changes state, we can bound the time to detect with two scenarios:

  1. If your application starts producing a time-out probe response just before the first probe arrives, detection takes 10 seconds (2 x 5-second intervals) plus the time between the application starting to signal a time-out and the first probe arriving. You can assume this detection takes slightly over 10 seconds.
  2. If your application starts producing a time-out probe response just after the first probe arrives, detection does not begin until the next probe arrives (and times out), plus another 10 seconds (2 x 5-second intervals). You can assume this detection takes just under 15 seconds.

For this example, once detection has occurred, the platform takes a small amount of time to react to this change. This means that, depending on

  1. when the application begins changing state,
  2. when this change is detected and meets the required criteria (number of probes sent at the specified interval), and
  3. when the detection has been communicated across the platform,

you can assume the reaction to a time-out probe response takes between just over 10 seconds and slightly over 15 seconds after the signal from the application changes. This example illustrates what is taking place; it is not possible to forecast an exact duration beyond the rough guidance given here.
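The bounds in the example above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not a statement of how the platform computes anything: the function name `detection_window` is hypothetical, and the result ignores the small platform reaction time mentioned above.

```python
def detection_window(interval_s: float, number_of_probes: int) -> tuple[float, float]:
    """Return rough (lower, upper) bounds, in seconds, for detecting a failing instance.

    lower: the failure starts just before a probe arrives.
    upper: the failure starts just after a probe arrives, so one extra
           interval passes before the first timed-out probe is counted.
    """
    if interval_s <= 0 or number_of_probes < 1:
        raise ValueError("interval must be positive and number_of_probes at least 1")
    lower = interval_s * number_of_probes
    upper = lower + interval_s
    return lower, upper


# Values from the example: 2 probe responses at a 5-second interval.
low, high = detection_window(5, 2)
print(f"detection takes roughly between {low} and {high} seconds")
```

Treat the two numbers as approximate limits only; actual detection lands slightly above the lower bound or slightly below the upper one.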

Note

The health probe probes all running instances in the backend pool. If an instance is stopped, it is not probed until it has been started again.

Probe types

The protocol used by the health probe can be configured to one of the following: TCP, HTTP, or HTTPS.

The available protocols depend on the Load Balancer SKU used:

|              | TCP | HTTP | HTTPS |
| Standard SKU | Yes | Yes  | Yes   |
| Basic SKU    | Yes | Yes  | No    |

TCP probe

TCP probes initiate a connection by performing a three-way open TCP handshake with the defined port. TCP probes terminate a connection with a four-way close TCP handshake.

The minimum probe interval is 5 seconds and the minimum number of unhealthy responses is 2. The total duration of all intervals cannot exceed 120 seconds.
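The limits above can be checked before a probe definition is deployed. The sketch below is illustrative only: `validate_probe` is a hypothetical helper, and it assumes that "the total duration of all intervals" means the interval multiplied by the number of probes.

```python
MIN_INTERVAL_S = 5   # minimum probe interval stated above
MIN_PROBES = 2       # minimum number of unhealthy responses stated above
MAX_TOTAL_S = 120    # maximum total duration of all intervals stated above


def validate_probe(interval_s: int, number_of_probes: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means the values fit."""
    problems = []
    if interval_s < MIN_INTERVAL_S:
        problems.append(f"interval {interval_s}s is below the {MIN_INTERVAL_S}s minimum")
    if number_of_probes < MIN_PROBES:
        problems.append(f"numberOfProbes {number_of_probes} is below the minimum of {MIN_PROBES}")
    # Assumption: total duration = interval x number of probes.
    if interval_s * number_of_probes > MAX_TOTAL_S:
        problems.append(f"total duration exceeds {MAX_TOTAL_S}s")
    return problems


print(validate_probe(5, 2))   # values used in the template examples in this article
print(validate_probe(60, 3))  # too long in total
```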

A TCP probe fails when:

  • The TCP listener on the instance doesn't respond at all during the timeout period. A probe is marked down once the configured number of probe requests has timed out without an answer.
  • The probe receives a TCP reset from the instance.

The following illustrates how you could express this kind of probe configuration in a Resource Manager template:

    {
      "name": "tcp",
      "properties": {
        "protocol": "Tcp",
        "port": 1234,
        "intervalInSeconds": 5,
        "numberOfProbes": 2
      }
    }
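To check from a test machine how a backend listener would answer a TCP probe, the success and failure conditions described above can be approximated with a plain socket connect. This is a local sketch under stated assumptions, not the Load Balancer implementation: `tcp_probe` is a hypothetical helper, and real probes originate from 168.63.129.16 rather than from your own host.

```python
import socket


def tcp_probe(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        # create_connection performs the three-way handshake; leaving the
        # "with" block closes the connection again.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers a refused connection (TCP reset), a timeout, and an unreachable host.
        return False
```

For example, `tcp_probe("10.0.0.4", 1234)` (a hypothetical backend address, with the port taken from the template above) returns `False` when nothing is listening on that port.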

HTTP / HTTPS probe

Note

The HTTPS probe is only available for Standard Load Balancer.

HTTP and HTTPS probes build on the TCP probe and issue an HTTP GET with the specified path. Both of these probes support relative paths for the HTTP GET. HTTPS probes are the same as HTTP probes with the addition of a Transport Layer Security (TLS, formerly known as SSL) wrapper. The health probe is marked up when the instance responds with an HTTP status 200 within the timeout period. By default, the health probe attempts to check the configured health probe port every 15 seconds. The minimum probe interval is 5 seconds. The total duration of all intervals cannot exceed 120 seconds.

HTTP / HTTPS probes can also be useful for implementing your own logic to remove instances from load balancer rotation if the probe port is also the listener for the service itself. For example, you might decide to remove an instance that is above 90% CPU by returning a non-200 HTTP status.
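The 90% CPU example could look roughly like the following endpoint. It is a sketch under several assumptions: the `/health` path, port 8080, and the 503 status are illustrative choices, and the 1-minute load average per CPU (Unix only) stands in for a real CPU utilization measurement.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

CPU_THRESHOLD = 0.90  # mirrors the 90% figure in the example above


def health_status(load_per_cpu: float, threshold: float = CPU_THRESHOLD) -> int:
    """Return the HTTP status the probe endpoint should send."""
    return 200 if load_per_cpu <= threshold else 503


def current_load_per_cpu() -> float:
    """1-minute load average divided by CPU count (Unix only; a rough proxy)."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the hypothetical /health path reports instance health.
        status = health_status(current_load_per_cpu()) if self.path == "/health" else 404
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the health endpoint until interrupted."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the listener; the probe's `requestPath` would then point at `/health`. Any non-200 status takes the instance out of rotation, so the choice of 503 here is a convention rather than a requirement.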

Note

The HTTPS probe requires certificates that have a minimum signature hash of SHA256 in the entire chain.

If you use Cloud Services and have web roles that use w3wp.exe, you also achieve automatic monitoring of your website. Failures in your website code return a non-200 status to the load balancer probe.

An HTTP / HTTPS probe fails when:

  • The probe endpoint returns an HTTP response code other than 200 (for example, 403, 404, or 500). This marks down the health probe immediately.
  • The probe endpoint doesn't respond at all within the shorter of the probe interval and the 30-second timeout period. Multiple probe requests might go unanswered before the probe is marked as not running, which happens once the sum of all timeout intervals has been reached.
  • The probe endpoint closes the connection via a TCP reset.

The following illustrates how you could express this kind of probe configuration in a Resource Manager template:

    {
      "name": "http",
      "properties": {
        "protocol": "Http",
        "port": 80,
        "requestPath": "/",
        "intervalInSeconds": 5,
        "numberOfProbes": 2
      }
    },
    {
      "name": "https",
      "properties": {
        "protocol": "Https",
        "port": 443,
        "requestPath": "/",
        "intervalInSeconds": 5,
        "numberOfProbes": 2
      }
    }

Guest agent probe (Classic only)

Cloud service roles (worker roles and web roles) use a guest agent for probe monitoring by default. A guest agent probe is a last-resort configuration. Always use a health probe explicitly with a TCP or HTTP probe. A guest agent probe is not as effective as explicitly defined probes for most application scenarios.

A guest agent probe is a check of the guest agent inside the VM. The guest agent listens and responds with an HTTP 200 OK response only when the instance is in the Ready state. (Other states are Busy, Recycling, or Stopping.)

For more information, see Configure the service definition file (csdef) for health probes or Get started by creating a public load balancer for cloud services.

If the guest agent fails to respond with HTTP 200 OK, the load balancer marks the instance as unresponsive. It then stops sending flows to that instance. The load balancer continues to check the instance.

If the guest agent responds with an HTTP 200, the load balancer sends new flows to that instance again.

When you use a web role, the website code typically runs in w3wp.exe, which isn't monitored by the Azure fabric or guest agent. Failures in w3wp.exe (for example, HTTP 500 responses) aren't reported to the guest agent. Consequently, the load balancer doesn't take that instance out of rotation.

Probe up behavior

TCP, HTTP, and HTTPS health probes are considered healthy and mark the backend endpoint as healthy when:

  • The health probe is successful once after the VM boots.
  • The specified number of probes required to mark the backend endpoint as healthy has been achieved.

Any backend endpoint that has achieved a healthy state is eligible to receive new flows.

Note

If the health probe fluctuates, the load balancer waits longer before it puts the backend endpoint back in the healthy state. This extra wait time protects the user and the infrastructure and is an intentional policy.
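The up and down transitions described in this article can be pictured as a small state tracker. The sketch below is a simplified model, not the platform's algorithm: the `ProbeTracker` class and its result labels are hypothetical, it assumes the required probe responses must be consecutive, it requires the full count of successes even right after boot, and it does not model the extra wait applied to fluctuating probes.

```python
from dataclasses import dataclass


@dataclass
class ProbeTracker:
    """Tracks one backend endpoint's health from a stream of probe results."""
    number_of_probes: int
    healthy: bool = False
    _successes: int = 0
    _timeouts: int = 0

    def observe(self, result: str) -> bool:
        """Record one result ('ok', 'timeout', 'non200', 'reset'); return current health."""
        if result == "ok":
            self._timeouts = 0
            self._successes += 1
            if self._successes >= self.number_of_probes:
                self.healthy = True
        elif result == "timeout":
            self._successes = 0
            self._timeouts += 1
            if self._timeouts >= self.number_of_probes:
                self.healthy = False
        elif result in ("non200", "reset"):
            # A non-200 status or a TCP reset marks the probe down immediately.
            self._successes = 0
            self._timeouts = 0
            self.healthy = False
        else:
            raise ValueError(f"unknown probe result: {result!r}")
        return self.healthy


tracker = ProbeTracker(number_of_probes=2)
for outcome in ["ok", "ok", "timeout", "timeout", "ok", "ok", "non200"]:
    print(outcome, "->", "up" if tracker.observe(outcome) else "down")
```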

Probe down behavior

TCP connections

New TCP connections will succeed to the remaining healthy backend endpoints.

If a backend endpoint's health probe fails, established TCP connections to this backend endpoint continue.

If all probes for all instances in a backend pool fail, no new flows are sent to the backend pool. Standard Load Balancer permits established TCP flows to continue. Basic Load Balancer terminates all existing TCP flows to the backend pool.

Load Balancer is a pass-through service (it does not terminate TCP connections), and the flow is always between the client and the VM's guest OS and application. A pool with all probes down causes the frontend to not respond to TCP connection open attempts (SYN), because there is no healthy backend endpoint to receive the flow and respond with a SYN-ACK.

UDP datagrams

UDP datagrams are delivered to healthy backend endpoints.

UDP is connectionless and there is no flow state tracked for UDP. If any backend endpoint's health probe fails, existing UDP flows move to another healthy instance in the backend pool.

If all probes for all instances in a backend pool fail, existing UDP flows terminate for Basic and Standard Load Balancers.
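The behavior when every probe in a backend pool is down can be summarized as a small lookup. The function name `existing_flow_outcome` and the string labels are hypothetical; the table entries restate the TCP and UDP paragraphs above.

```python
def existing_flow_outcome(protocol: str, sku: str) -> str:
    """What happens to existing flows when all probes in the backend pool are down."""
    outcomes = {
        ("tcp", "standard"): "continue",   # established TCP flows are permitted to continue
        ("tcp", "basic"): "terminate",     # all existing TCP flows are terminated
        ("udp", "standard"): "terminate",  # existing UDP flows terminate on both SKUs
        ("udp", "basic"): "terminate",
    }
    return outcomes[(protocol.lower(), sku.lower())]


print(existing_flow_outcome("TCP", "Standard"))
print(existing_flow_outcome("UDP", "Standard"))
```

In every case, no new flows reach the pool until at least one backend endpoint is marked up again.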

Probe source IP address

Load Balancer uses a distributed probing service for its internal health model. The probing service resides on each host where VMs reside and can be programmed on demand to generate health probes per the customer's configuration. The health probe traffic flows directly between the probing service that generates the health probe and the customer VM. All Load Balancer health probes originate from the IP address 168.63.129.16 as their source. You can use IP address space inside a VNet that is not RFC1918 space. Using a globally reserved, Microsoft-owned IP address reduces the chance of an IP address conflict with the IP address space you use inside the VNet. This IP address is the same in all regions and does not change. It is not a security risk, because only the internal Azure platform component can source a packet from this IP address.

The AzureLoadBalancer service tag identifies this source IP address in your network security groups and permits health probe traffic by default.

In addition to Load Balancer health probes, the following operations use this IP address:

  • Enables the VM Agent to communicate with the platform to signal that it is in a "Ready" state.
  • Enables communication with the DNS virtual server to provide filtered name resolution to customers that do not define custom DNS servers. This filtering ensures that customers can only resolve the hostnames of their deployment.
  • Enables the VM to obtain a dynamic IP address from the DHCP service in Azure.

Design guidance

Health probes are used to make your service resilient and allow it to scale. A misconfiguration or bad design pattern can impact the availability and scalability of your service. Review this entire document and consider what the impact on your scenario is when a probe response is marked down or marked up, and how it affects the availability of your application scenario.

When you design the health model for your application, you should probe a port on a backend endpoint that reflects the health of that instance and of the application service you are providing. The application port and the probe port are not required to be the same. In some scenarios, it may be desirable for the probe port to be different from the port your application provides service on.

Sometimes it can be useful for your application to generate a health probe response not only to detect your application's health, but also to signal directly to Load Balancer whether your instance should receive new flows. You can manipulate the probe response to allow your application to create backpressure and throttle delivery of new flows to an instance by failing the health probe, or to prepare for maintenance of your application and initiate draining of your scenario. When using Standard Load Balancer, a probe down signal always allows TCP flows to continue until idle timeout or connection closure.

For UDP load balancing, you should generate a custom health probe signal from the backend endpoint and use a TCP, HTTP, or HTTPS health probe targeting the corresponding listener to reflect the health of your UDP application.

When using HA Ports load-balancing rules with Standard Load Balancer, all ports are load balanced and a single health probe response must reflect the status of the entire instance.

Do not translate or proxy a health probe through the instance that receives the health probe to another instance in your VNet, because this configuration can lead to cascading failures in your scenario. Consider the following scenario: a set of third-party appliances is deployed in the backend pool of a Load Balancer resource to provide scale and redundancy for the appliances, and the health probe is configured to probe a port that the third-party appliance proxies or translates to other virtual machines behind the appliance. If you probe the same port you are using to translate or proxy requests to the other virtual machines behind the appliance, any failed probe response from a single virtual machine behind the appliance marks the appliance itself as dead. This configuration can lead to a cascading failure of the entire application scenario as a result of a single backend endpoint behind the appliance. The trigger can be an intermittent probe failure that causes Load Balancer to mark down the original destination (the appliance instance), which in turn can disable your entire application scenario. Probe the health of the appliance itself instead. The selection of the probe that determines the health signal is an important consideration for network virtual appliance (NVA) scenarios, and you must consult your application vendor about the appropriate health signal for such scenarios.

If you don't allow the source IP of the probe in your firewall policies, the health probe fails because it is unable to reach your instance. In turn, Load Balancer marks down your instance due to the health probe failure. This misconfiguration can cause your load-balanced application scenario to fail.

For Load Balancer's health probe to mark up your instance, you must allow this IP address in any Azure network security groups and local firewall policies. By default, every network security group includes the service tag AzureLoadBalancer to permit health probe traffic.

If you wish to test a health probe failure or mark down an individual instance, you can use a network security group to explicitly block the health probe (destination port or source IP) and simulate the failure of a probe.

Do not configure your VNet with the Microsoft-owned IP address range that contains 168.63.129.16. Such configurations collide with the IP address of the health probe and can cause your scenario to fail.
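A planned VNet address space can be screened for this collision with the standard `ipaddress` module. This is a minimal sketch: `collides_with_probe_source` is a hypothetical helper, and it only tests whether a prefix contains the single probe address, since the article does not state the full Microsoft-owned range.

```python
import ipaddress

# Source address of all Load Balancer health probes, as stated above.
PROBE_SOURCE = ipaddress.ip_address("168.63.129.16")


def collides_with_probe_source(address_space: str) -> bool:
    """Return True if the given CIDR prefix contains the probe source address."""
    return PROBE_SOURCE in ipaddress.ip_network(address_space, strict=False)


for prefix in ["10.0.0.0/16", "168.63.128.0/17"]:
    print(prefix, "collides" if collides_with_probe_source(prefix) else "ok")
```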

If you have multiple interfaces on your VM, you need to ensure that you respond to the probe on the interface you received it on. You may need to apply source network address translation to this address in the VM on a per-interface basis.

Do not enable TCP timestamps. Enabling TCP timestamps can cause health probes to fail because the VM's guest OS TCP stack drops TCP packets, which results in Load Balancer marking down the respective endpoint. TCP timestamps are routinely enabled by default on security-hardened VM images and must be disabled.
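On a Linux guest, the current setting can be read from the `net.ipv4.tcp_timestamps` sysctl. The sketch below assumes a Linux VM where that sysctl is exposed under `/proc`; the helper names are hypothetical, and other guest operating systems expose the setting differently.

```python
from pathlib import Path

# Linux exposes the net.ipv4.tcp_timestamps sysctl at this path.
SYSCTL_PATH = Path("/proc/sys/net/ipv4/tcp_timestamps")


def timestamps_enabled(raw: str) -> bool:
    """Interpret the sysctl value: 0 means disabled, any other value means enabled."""
    return int(raw.strip()) != 0


def check_host() -> bool:
    """Read the live setting on this machine (Linux only)."""
    return timestamps_enabled(SYSCTL_PATH.read_text())
```

If `check_host()` returns `True`, the guidance above suggests disabling TCP timestamps on that instance before relying on health probes.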

Monitoring

Both public and internal Standard Load Balancer expose per-endpoint and backend endpoint health probe status as multi-dimensional metrics through Azure Monitor. These metrics can be consumed by other Azure services or partner applications.

Basic public Load Balancer exposes health probe status summarized per backend pool via Azure Monitor logs. Azure Monitor logs are not available for internal Basic Load Balancers. You can use Azure Monitor logs to check the public load balancer probe health status and probe count. Logging can be used with Power BI or Azure Operational Insights to provide statistics about load balancer health status.

Limitations

  • HTTPS probes do not support mutual authentication with a client certificate.
  • You should assume health probes will fail when TCP timestamps are enabled.

Next steps