Implement the Circuit Breaker pattern

As noted earlier, you should handle faults that might take a variable amount of time to recover from, as might happen when you try to connect to a remote service or resource. Handling this type of fault can improve the stability and resiliency of an application.

In a distributed environment, calls to remote resources and services can fail due to transient faults, such as slow network connections and timeouts, or because resources are responding slowly or are temporarily unavailable. These faults typically correct themselves after a short time, and a robust cloud application should be prepared to handle them by using a strategy like the "Retry pattern".

However, there can also be situations where faults are due to unanticipated events that might take much longer to fix. These faults can range in severity from a partial loss of connectivity to the complete failure of a service. In these situations, it might be pointless for an application to continually retry an operation that's unlikely to succeed.

Instead, the application should be coded to accept that the operation has failed and handle the failure accordingly.

Using HTTP retries carelessly could result in creating a Denial of Service (DoS) attack within your own software. When a microservice fails or performs slowly, multiple clients might repeatedly retry failed requests. That creates a dangerous risk of exponentially increasing traffic targeted at the failing service.

Therefore, you need some kind of defense barrier so that excessive requests stop when it isn't worth continuing to try. That defense barrier is precisely the circuit breaker.

The Circuit Breaker pattern has a different purpose than the "Retry pattern". The "Retry pattern" enables an application to retry an operation in the expectation that the operation will eventually succeed. The Circuit Breaker pattern prevents an application from performing an operation that's likely to fail. An application can combine these two patterns. However, the retry logic should be sensitive to any exception returned by the circuit breaker, and it should abandon retry attempts if the circuit breaker indicates that a fault is not transient.
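To make the interplay between the two patterns concrete, here is a minimal, framework-free sketch. The names `SimpleCircuitBreaker`, `CircuitOpenException`, and `RetryWithBreaker` are hypothetical and exist only for this illustration; the code is not thread-safe and is not how Polly implements its policies. It only shows a retry loop that gives up as soon as the breaker reports an open circuit.

```csharp
using System;

// Hypothetical, simplified breaker for illustration only (not thread-safe).
public sealed class SimpleCircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _breakDuration;
    private int _consecutiveFailures;
    private DateTime? _openedAtUtc;

    public SimpleCircuitBreaker(int failureThreshold, TimeSpan breakDuration)
    {
        _failureThreshold = failureThreshold;
        _breakDuration = breakDuration;
    }

    // Open while the break duration has not elapsed since the breaker tripped.
    public bool IsOpen =>
        _openedAtUtc.HasValue && DateTime.UtcNow - _openedAtUtc.Value < _breakDuration;

    public T Execute<T>(Func<T> operation)
    {
        if (IsOpen)
            throw new CircuitOpenException(); // fail fast: the operation is not invoked

        try
        {
            T result = operation();
            _consecutiveFailures = 0;         // a success closes the circuit
            _openedAtUtc = null;
            return result;
        }
        catch (Exception)
        {
            _consecutiveFailures++;
            if (_consecutiveFailures >= _failureThreshold)
                _openedAtUtc = DateTime.UtcNow; // trip (or re-trip) the breaker
            throw;
        }
    }
}

public sealed class CircuitOpenException : Exception
{
    public CircuitOpenException() : base("Circuit is open; the call was not attempted.") { }
}

public static class RetryWithBreaker
{
    // Retries up to maxAttempts, but abandons retrying once the circuit is open.
    public static T Execute<T>(SimpleCircuitBreaker breaker, Func<T> operation, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return breaker.Execute(operation);
            }
            catch (CircuitOpenException)
            {
                throw; // the breaker says the fault is not transient: stop retrying
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Treated as transient: loop and try again (a real policy would back off here).
            }
        }
    }
}
```

With a threshold of 2 and up to 5 attempts against an operation that always throws, the operation is invoked twice; the third attempt is rejected by the breaker and the retry loop stops with a `CircuitOpenException`.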

Implement the Circuit Breaker pattern with IHttpClientFactory and Polly

As when implementing retries, the recommended approach for circuit breakers is to take advantage of proven .NET libraries like Polly and its native integration with IHttpClientFactory.

Adding a circuit breaker policy to your IHttpClientFactory outgoing middleware pipeline is as simple as adding a single incremental piece of code to what you already have when using IHttpClientFactory.

The only addition here to the code used for HTTP call retries is the code where you add the Circuit Breaker policy to the list of policies to use, as shown in the following incremental code, part of the ConfigureServices() method.

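// ConfigureServices() - Startup.cs
services.AddHttpClient<IBasketService, BasketService>()
        .SetHandlerLifetime(TimeSpan.FromMinutes(5))  // Sample. Default lifetime is 2 minutes
        .AddPolicyHandler(GetRetryPolicy())           // Retry policy from the HTTP call retries code
        .AddPolicyHandler(GetCircuitBreakerPolicy()); // New: adds the circuit breaker policy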

The AddPolicyHandler() method is what adds policies to the HttpClient objects you'll use. In this case, it's adding a Polly policy for a circuit breaker.

To have a more modular approach, the Circuit Breaker Policy is defined in a separate method called GetCircuitBreakerPolicy(), as shown in the following code:

static IAsyncPolicy<HttpResponseMessage> GetCircuitBreakerPolicy()
{
    return HttpPolicyExtensions
        .HandleTransientHttpError() // Treats HttpRequestException, HTTP 5xx, and HTTP 408 as faults
        .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30));
}

In the code example above, the circuit breaker policy is configured so it breaks or opens the circuit when there have been five consecutive faults when retrying the HTTP requests. When that happens, the circuit will break for 30 seconds: in that period, calls will be failed immediately by the circuit breaker rather than actually be placed. The policy automatically interprets relevant exceptions and HTTP status codes as faults.
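When testing, it can help to see when the circuit opens and closes. The following sketch is one possible variation on the policy above, assuming the Polly (v7-style API) and Polly.Extensions.Http packages are referenced. The class and method names `CircuitBreakerPolicies` and `GetLoggingCircuitBreakerPolicy` are hypothetical, and writing to the console stands in for whatever logging the application actually uses.

```csharp
using System;
using System.Net.Http;
using Polly;
using Polly.Extensions.Http;

public static class CircuitBreakerPolicies
{
    // Same thresholds as the policy above, plus callbacks that report state changes.
    public static IAsyncPolicy<HttpResponseMessage> GetLoggingCircuitBreakerPolicy()
    {
        return HttpPolicyExtensions
            .HandleTransientHttpError() // HttpRequestException, HTTP 5xx, and HTTP 408
            .CircuitBreakerAsync(
                handledEventsAllowedBeforeBreaking: 5,
                durationOfBreak: TimeSpan.FromSeconds(30),
                onBreak: (outcome, breakDelay) =>
                    // Called when the circuit opens; outcome holds the last fault.
                    Console.WriteLine(
                        $"Circuit opened for {breakDelay.TotalSeconds} s: " +
                        (outcome.Exception?.Message ?? outcome.Result?.StatusCode.ToString())),
                onReset: () =>
                    // Called when the circuit closes again.
                    Console.WriteLine("Circuit closed; calls flow again."));
    }
}
```

Such a policy could be registered with AddPolicyHandler() in the same way as GetCircuitBreakerPolicy().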

Circuit breakers should also be used to redirect requests to a fallback infrastructure if you have issues in a particular resource that's deployed in a different environment than the client application or service that's performing the HTTP call. That way, if there's an outage in the datacenter that impacts only your backend microservices but not your client applications, the client applications can redirect to the fallback services. Polly is planning a new policy to automate this failover policy scenario.

All those features are for cases where you're managing the failover from within the .NET code, as opposed to having it managed automatically for you by Azure, with location transparency.

From a usage point of view, when using HttpClient, there's no need to add anything new here because the code is the same as when using HttpClient with IHttpClientFactory, as shown in previous sections.

Test HTTP retries and circuit breakers in eShopOnContainers

Whenever you start the eShopOnContainers solution in a Docker host, it needs to start multiple containers. Some of the containers are slower to start and initialize, like the SQL Server container. This is especially true the first time you deploy the eShopOnContainers application into Docker because it needs to set up the images and the database. The fact that some containers start slower than others can cause the rest of the services to initially throw HTTP exceptions, even if you set dependencies between containers at the docker-compose level, as explained in previous sections. Those docker-compose dependencies between containers are just at the process level. The container's entry point process might be started, but SQL Server might not be ready for queries. The result can be a cascade of errors, and the application can get an exception when trying to consume that particular container.

You might also see this type of error on startup when the application is deploying to the cloud. In that case, orchestrators might be moving containers from one node or VM to another (that is, starting new instances) when balancing the number of containers across the cluster's nodes.

The way eShopOnContainers solves those issues when starting all the containers is by using the Retry pattern illustrated earlier.

Test the circuit breaker in eShopOnContainers

There are a few ways you can break/open the circuit and test it with eShopOnContainers.

One option is to lower the number of failures allowed before breaking to 1 in the circuit breaker policy and redeploy the whole solution into Docker. With a single allowed failure, there's a good chance that an HTTP request will fail during deployment, the circuit breaker will open, and you get an error.

Another option is to use custom middleware that's implemented in the Basket microservice. When this middleware is enabled, it catches all HTTP requests and returns status code 500. You can enable the middleware by making a GET request to the failing URI, like the following:

  • GET http://localhost:5103/failing
    This request returns the current state of the middleware. If the middleware is enabled, the request returns status code 500. If the middleware is disabled, there's no response.

  • GET http://localhost:5103/failing?enable
    This request enables the middleware.

  • GET http://localhost:5103/failing?disable
    This request disables the middleware.

For instance, once the application is running, you can enable the middleware by making a request using the following URI in any browser. Note that the Basket microservice uses port 5103.

http://localhost:5103/failing?enable

You can then check the status using the URI http://localhost:5103/failing, as shown in Figure 8-5.


Figure 8-5. Checking the state of the "Failing" ASP.NET middleware – In this case, disabled.

At this point, the Basket microservice responds with status code 500 whenever you invoke it.

Once the middleware is running, you can try making an order from the MVC web application. Because the requests fail, the circuit will open.

In the following example, you can see that the MVC web application has a catch block in the logic for placing an order. If the code catches an open-circuit exception, it shows the user a friendly message telling them to wait.

public class CartController : Controller
{
    public async Task<IActionResult> Index()
    {
        try
        {
            var user = _appUserParser.Parse(HttpContext.User);
            // HTTP requests using the Typed Client (Service Agent)
            var vm = await _basketSvc.GetBasket(user);
            return View(vm);
        }
        catch (BrokenCircuitException)
        {
            // Catches error when Basket.api is in circuit-opened mode
            HandleBrokenCircuitException();
        }
        return View();
    }

    private void HandleBrokenCircuitException()
    {
        TempData["BasketInoperativeMsg"] = "Basket Service is inoperative, please try later on. (Business message due to Circuit-Breaker)";
    }
}

Here's a summary. The Retry policy tries several times to make the HTTP request and gets HTTP errors. When the number of consecutive failures reaches the threshold set for the Circuit Breaker policy (in this case, 5), the circuit opens and further calls throw a BrokenCircuitException. The result is a friendly message, as shown in Figure 8-6.

Screenshot of the MVC web application showing the Basket service inoperative error.

Figure 8-6. Circuit breaker returning an error to the UI

You can implement different logic for when to open/break the circuit. Or you can try an HTTP request against a different back-end microservice if there's a fallback datacenter or redundant back-end system.

Finally, another possibility for the CircuitBreakerPolicy is to use Isolate (which forces open and holds open the circuit) and Reset (which closes it again). These could be used to build a utility HTTP endpoint that invokes Isolate and Reset directly on the policy. Such an HTTP endpoint could also be used, suitably secured, in production for temporarily isolating a downstream system, such as when you want to upgrade it. Or it could trip the circuit manually to protect a downstream system you suspect to be faulting.
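As a rough sketch of the manual control described above, the following helper wraps a Polly circuit breaker and exposes Isolate and Reset. It assumes the Polly package (v7-style API) is referenced. `BreakerAdmin` and `CreateSample` are hypothetical names, the sample uses a non-generic policy rather than the HttpResponseMessage policy shown earlier, and the HTTP endpoint and its security are deliberately left out.

```csharp
using System;
using Polly;
using Polly.CircuitBreaker;

// Hypothetical helper; a secured utility endpoint could call these members.
public sealed class BreakerAdmin
{
    private readonly AsyncCircuitBreakerPolicy _breaker;

    public BreakerAdmin(AsyncCircuitBreakerPolicy breaker) => _breaker = breaker;

    // Closed, Open, HalfOpen, or Isolated.
    public CircuitState State => _breaker.CircuitState;

    // Forces the circuit open and holds it open until Reset is called.
    public void Isolate() => _breaker.Isolate();

    // Closes the circuit again so calls are placed normally.
    public void Reset() => _breaker.Reset();

    // Sample policy for the sketch: 5 faults allowed, 30-second break.
    public static BreakerAdmin CreateSample() =>
        new BreakerAdmin(
            Policy.Handle<Exception>()
                  .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30)));
}
```

While the circuit is isolated, calls through the policy fail immediately with an IsolatedCircuitException, which derives from BrokenCircuitException, so a catch block for BrokenCircuitException also covers it.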

Additional resources