No Caching antipattern

Antipatterns are common design flaws that can break your software or applications under stress, and they should not be overlooked. A no caching antipattern occurs when a cloud application that handles many concurrent requests repeatedly fetches the same data. This can reduce performance and scalability.

When data is not cached, it can cause a number of undesirable behaviors, including:

  • Repeatedly fetching the same information from a resource that is expensive to access, in terms of I/O overhead or latency.
  • Repeatedly constructing the same objects or data structures for multiple requests.
  • Making excessive calls to a remote service that has a service quota and throttles clients past a certain limit.

In turn, these problems can lead to poor response times, increased contention in the data store, and poor scalability.

Example of the no caching antipattern

The following example uses Entity Framework to connect to a database. Every client request results in a call to the database, even if multiple requests are fetching exactly the same data. The cost of repeated requests, in terms of I/O overhead and data access charges, can accumulate quickly.

public class PersonRepository : IPersonRepository
{
    public async Task<Person> GetAsync(int id)
    {
        using (var context = new AdventureWorksContext())
        {
            return await context.People
                .Where(p => p.Id == id)
                .FirstOrDefaultAsync()
                .ConfigureAwait(false);
        }
    }
}

You can find the complete sample here.

This antipattern typically occurs because:

  • Not using a cache is simpler to implement, and it works fine under low loads. Caching makes the code more complicated.
  • The benefits and drawbacks of using a cache are not clearly understood.
  • There is concern about the overhead of maintaining the accuracy and freshness of cached data.
  • An application was migrated from an on-premises system, where network latency was not an issue and the system ran on expensive high-performance hardware, so caching wasn't considered in the original design.
  • Developers aren't aware that caching is a possibility in a given scenario. For example, developers may not think of using ETags when implementing a web API.
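
For readers who have not used ETags, the idea can be sketched in a few lines. This is only an illustration, not part of the sample application: the `ETagHelper` class and the choice of a SHA-256 hash of the response body are assumptions, and a real web API would normally rely on its framework's conditional-request support, which also handles weak validators, `*`, and lists of tags. The sketch assumes .NET 5 or later.

```csharp
#nullable enable
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only.
public static class ETagHelper
{
    // Derives an ETag from the response body.
    // Assumed scheme: SHA-256 of the UTF-8 bytes, hex-encoded and wrapped in quotes.
    public static string Compute(string body)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(body));
        return "\"" + Convert.ToHexString(hash) + "\"";
    }

    // Returns 304 (Not Modified) when the client's If-None-Match value
    // equals the current ETag, otherwise 200 (OK).
    // Simplification: only a single, exact tag is compared.
    public static int StatusFor(string? ifNoneMatch, string currentETag) =>
        string.Equals(ifNoneMatch, currentETag, StringComparison.Ordinal) ? 304 : 200;
}
```

A client that already holds the current version sends the tag back in `If-None-Match` and receives a 304 with no body, so the payload is not transferred again.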

How to fix the no caching antipattern

The most popular caching strategy is the on-demand or cache-aside strategy.

  • On read, the application tries to read the data from the cache. If the data isn't in the cache, the application retrieves it from the data source and adds it to the cache.
  • On write, the application writes the change directly to the data source and removes the old value from the cache. It will be retrieved and added to the cache the next time it is required.

This approach is suitable for data that changes frequently. Here is the previous example updated to use the Cache-Aside pattern.

public class CachedPersonRepository : IPersonRepository
{
    private readonly PersonRepository _innerRepository;

    public CachedPersonRepository(PersonRepository innerRepository)
    {
        _innerRepository = innerRepository;
    }

    public async Task<Person> GetAsync(int id)
    {
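        // The GetAsync overload shown below requires an expiration argument.
        // 5 minutes is an assumed value; tune it for your data.
        return await CacheService.GetAsync<Person>("p:" + id, () => _innerRepository.GetAsync(id), 5).ConfigureAwait(false);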
    }
}

public class CacheService
{
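    private static ConnectionMultiplexer _connection;

    // Not shown in this excerpt: the Connection property that exposes _connection,
    // and the private GetAsync/SetAsync helpers that read from and write to the cache.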

    public static async Task<T> GetAsync<T>(string key, Func<Task<T>> loadCache, double expirationTimeInMinutes)
    {
        IDatabase cache = Connection.GetDatabase();
        T value = await GetAsync<T>(cache, key).ConfigureAwait(false);
        if (value == null)
        {
            // Value was not found in the cache. Call the lambda to get the value from the database.
            value = await loadCache().ConfigureAwait(false);
            if (value != null)
            {
                // Add the value to the cache.
                await SetAsync(cache, key, value, expirationTimeInMinutes).ConfigureAwait(false);
            }
        }
        return value;
    }
}

Notice that the GetAsync method now calls the CacheService class, rather than calling the database directly. The CacheService class first tries to get the item from Azure Cache for Redis. If the value isn't found in the cache, the CacheService invokes a lambda function that was passed to it by the caller. The lambda function is responsible for fetching the data from the database. This implementation decouples the repository from the particular caching solution, and decouples the CacheService from the database.
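
The example above covers only the read path. The write path of cache-aside (write to the data source, then remove the cached value) could be sketched as follows. This is a minimal in-memory illustration, not the sample's code: `CacheAsideStore` and its two dictionaries are hypothetical stand-ins for the Redis cache and the database, and the sketch assumes .NET 6 or later.

```csharp
#nullable enable
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for a cache plus a data source.
public sealed class CacheAsideStore
{
    private readonly ConcurrentDictionary<string, string> _cache = new();
    private readonly ConcurrentDictionary<string, string> _source = new();
    private int _sourceReads;

    // Number of times a read had to go to the (simulated) data source.
    public int SourceReads => _sourceReads;

    public Task<string?> ReadAsync(string key)
    {
        if (_cache.TryGetValue(key, out var cached))
        {
            return Task.FromResult<string?>(cached); // cache hit
        }

        // Cache miss: read from the data source.
        Interlocked.Increment(ref _sourceReads);
        if (_source.TryGetValue(key, out var loaded))
        {
            _cache[key] = loaded; // add to the cache for the next reader
            return Task.FromResult<string?>(loaded);
        }
        return Task.FromResult<string?>(null);
    }

    public Task WriteAsync(string key, string value)
    {
        _source[key] = value;         // 1. write the change to the data source
        _cache.TryRemove(key, out _); // 2. remove the old value from the cache
        return Task.CompletedTask;
    }
}
```

After a write, the next read misses the cache, loads the new value from the source, and caches it again, so readers do not see the old value once the write has completed.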

Considerations for caching strategy

  • If the cache is unavailable, perhaps because of a transient failure, don't return an error to the client. Instead, fetch the data from the original data source. However, be aware that while the cache is being recovered, the original data store could be swamped with requests, resulting in timeouts and failed connections. (After all, this is one of the motivations for using a cache in the first place.) Use a technique such as the Circuit Breaker pattern to avoid overwhelming the data source.

  • Applications that cache dynamic data should be designed to support eventual consistency.

  • For web APIs, you can support client-side caching by including a Cache-Control header in request and response messages, and using ETags to identify versions of objects. For more information, see API implementation.

  • You don't have to cache entire entities. If most of an entity is static but only a small piece changes frequently, cache the static elements and retrieve the dynamic elements from the data source. This approach can help to reduce the volume of I/O being performed against the data source.

  • In some cases, if volatile data is short-lived, it can be useful to cache it. For example, consider a device that continually sends status updates. It might make sense to cache this information as it arrives, and not write it to a persistent store at all.

  • To prevent data from becoming stale, many caching solutions support configurable expiration periods, so that data is automatically removed from the cache after a specified interval. You may need to tune the expiration time for your scenario. Data that is highly static can stay in the cache for longer periods than volatile data that may become stale quickly.

  • If the caching solution doesn't provide built-in expiration, you may need to implement a background process that occasionally sweeps the cache, to prevent it from growing without limits.

  • Besides caching data from an external data source, you can use caching to save the results of complex computations. Before you do that, however, instrument the application to determine whether the application is really CPU bound.

  • It might be useful to prime the cache when the application starts. Populate the cache with the data that is most likely to be used.

  • Always include instrumentation that detects cache hits and cache misses. Use this information to tune caching policies, such as what data to cache, and how long to hold data in the cache before it expires.

  • If the lack of caching is a bottleneck, then adding caching may increase the volume of requests so much that the web front end becomes overloaded. Clients may start to receive HTTP 503 (Service Unavailable) errors. These are an indication that you should scale out the front end.
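
Three of the considerations above (falling back to the data source when the cache is unavailable, expiring entries, and counting hits and misses) can be combined in one small wrapper. The following is a sketch under stated assumptions, not the sample's implementation: `ISimpleCache`, `InMemoryCache`, `UnavailableCache`, and `ResilientReader` are hypothetical names, values are plain strings, and the sketch assumes .NET 6 or later. It does not include a circuit breaker, so a long cache outage would still send every request to the data source.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical cache abstraction; a real implementation would wrap Redis or a similar store.
public interface ISimpleCache
{
    Task<string?> GetAsync(string key);
    Task SetAsync(string key, string value, TimeSpan timeToLive);
}

// In-memory cache with per-entry expiration, for illustration only.
public sealed class InMemoryCache : ISimpleCache
{
    private readonly ConcurrentDictionary<string, (string Value, DateTimeOffset ExpiresAt)> _items = new();

    public Task<string?> GetAsync(string key)
    {
        if (_items.TryGetValue(key, out var entry) && entry.ExpiresAt > DateTimeOffset.UtcNow)
        {
            return Task.FromResult<string?>(entry.Value);
        }
        _items.TryRemove(key, out _); // drop the entry if it exists but has expired
        return Task.FromResult<string?>(null);
    }

    public Task SetAsync(string key, string value, TimeSpan timeToLive)
    {
        _items[key] = (value, DateTimeOffset.UtcNow.Add(timeToLive));
        return Task.CompletedTask;
    }
}

// Simulates a cache that is down: every call throws.
public sealed class UnavailableCache : ISimpleCache
{
    public Task<string?> GetAsync(string key) =>
        throw new InvalidOperationException("cache unavailable");

    public Task SetAsync(string key, string value, TimeSpan timeToLive) =>
        throw new InvalidOperationException("cache unavailable");
}

public sealed class ResilientReader
{
    private readonly ISimpleCache _cache;
    private readonly TimeSpan _timeToLive;
    private long _hits, _misses, _cacheFailures;

    public ResilientReader(ISimpleCache cache, TimeSpan timeToLive)
    {
        _cache = cache;
        _timeToLive = timeToLive;
    }

    public long Hits => Interlocked.Read(ref _hits);
    public long Misses => Interlocked.Read(ref _misses);
    public long CacheFailures => Interlocked.Read(ref _cacheFailures);

    public async Task<string?> GetAsync(string key, Func<Task<string?>> loadFromSource)
    {
        try
        {
            string? cached = await _cache.GetAsync(key).ConfigureAwait(false);
            if (cached != null)
            {
                Interlocked.Increment(ref _hits);
                return cached;
            }
            Interlocked.Increment(ref _misses);
        }
        catch (Exception)
        {
            // Cache unavailable: don't fail the request, read from the source instead.
            // A production version would catch only the cache client's exception types.
            Interlocked.Increment(ref _cacheFailures);
            return await loadFromSource().ConfigureAwait(false);
        }

        string? value = await loadFromSource().ConfigureAwait(false);
        if (value != null)
        {
            try
            {
                await _cache.SetAsync(key, value, _timeToLive).ConfigureAwait(false);
            }
            catch (Exception)
            {
                Interlocked.Increment(ref _cacheFailures); // a failed cache write is not fatal
            }
        }
        return value;
    }
}
```

Exposing `Hits`, `Misses`, and `CacheFailures` to your monitoring system gives the data needed to tune the expiration time and to notice when the cache is failing over to the data source.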

How to detect a no caching antipattern

You can perform the following steps to help identify whether lack of caching is causing performance problems:

  1. Review the application design. Take an inventory of all the data stores that the application uses. For each, determine whether the application is using a cache. If possible, determine how frequently the data changes. Good initial candidates for caching include data that changes slowly, and static reference data that is read frequently.

  2. Instrument the application and monitor the live system to find out how frequently the application retrieves data or calculates information.

  3. Profile the application in a test environment to capture low-level metrics about the overhead associated with data access operations or other frequently performed calculations.

  4. Perform load testing in a test environment to identify how the system responds under a normal workload and under heavy load. Load testing should simulate the pattern of data access observed in the production environment using realistic workloads.

  5. Examine the data access statistics for the underlying data stores and review how often the same data requests are repeated.

Example diagnosis

The following sections apply these steps to the sample application described earlier.

Instrument the application and monitor the live system

Instrument the application and monitor it to get information about the specific requests that users make while the application is in production.

The following image shows monitoring data captured by New Relic during a load test. In this case, the only HTTP GET operation performed is Person/GetAsync. But in a live production environment, knowing the relative frequency with which each request is performed can give you insight into which resources should be cached.

Figure: New Relic showing server requests for the CachingDemo application

If you need a deeper analysis, you can use a profiler to capture low-level performance data in a test environment (not the production system). Look at metrics such as I/O request rates, memory usage, and CPU utilization. These metrics may show a large number of requests to a data store or service, or repeated processing that performs the same calculation.
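
If no monitoring product is available, the relative frequency of each operation can be approximated with a small in-process counter. The class below is a hypothetical sketch, not part of the sample or of New Relic: `RequestFrequencyTracker`, `Record`, and `Top` are made-up names, and a real system would export these counts to its telemetry pipeline rather than keep them in memory. It assumes .NET 6 or later.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that counts how often each operation is requested.
public sealed class RequestFrequencyTracker
{
    private readonly ConcurrentDictionary<string, long> _counts = new();

    // Call once per request, for example with "Person/GetAsync".
    public void Record(string operation) =>
        _counts.AddOrUpdate(operation, 1, (_, current) => current + 1);

    // Returns the n most frequently requested operations, most frequent first.
    public IReadOnlyList<KeyValuePair<string, long>> Top(int n) =>
        _counts.OrderByDescending(pair => pair.Value).Take(n).ToList();
}
```

The operations at the top of this list, especially those that return slowly changing data, are the first candidates to consider for caching.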

Load test the application

The following graph shows the results of load testing the sample application. The load test simulates a step load of up to 800 users performing a typical series of operations.

Figure: Performance load test results for the uncached scenario

The number of successful tests performed each second reaches a plateau, and additional requests are slowed as a result. The average test time steadily increases with the workload. The response time levels off once the user load peaks.

Examine data access statistics

Data access statistics and other information provided by a data store can give useful information, such as which queries are repeated most frequently. For example, in Microsoft SQL Server, the sys.dm_exec_query_stats management view has statistical information for recently executed queries, and sys.dm_exec_cached_plans records how many times each cached plan has been used. The text for each query is available through the sys.dm_exec_sql_text function, and its plan through the sys.dm_exec_query_plan function. You can use a tool such as SQL Server Management Studio to run the following SQL query and determine how frequently queries are performed.

SELECT UseCounts, Text, Query_Plan
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
CROSS APPLY sys.dm_exec_query_plan(plan_handle)

The UseCounts column in the results indicates how frequently each query is run. The following image shows that the third query was run more than 250,000 times, significantly more than any other query.

Figure: Results of querying the dynamic management views in SQL Server Management Studio

Here is the SQL query that is causing so many database requests:

(@p__linq__0 int)SELECT TOP (2)
[Extent1].[BusinessEntityId] AS [BusinessEntityId],
[Extent1].[FirstName] AS [FirstName],
[Extent1].[LastName] AS [LastName]
FROM [Person].[Person] AS [Extent1]
WHERE [Extent1].[BusinessEntityId] = @p__linq__0

This is the query that Entity Framework generates in the GetAsync method shown earlier.

Implement the cache strategy solution and verify the result

After you incorporate a cache, repeat the load tests and compare the results to the earlier load tests without a cache. Here are the load test results after adding a cache to the sample application.

Figure: Performance load test results for the cached scenario

The volume of successful tests still reaches a plateau, but at a higher user load. The request rate at this load is significantly higher than earlier. Average test time still increases with load, but the maximum response time is 0.05 ms, compared with 1 ms earlier, a 20× improvement.