Cache-Aside pattern

Load data on demand into a cache from a data store. This can improve performance and also helps to maintain consistency between data held in the cache and data in the underlying data store.

Context and problem

Applications use a cache to speed up repeated access to information held in a data store. However, it's impractical to expect that cached data will always be completely consistent with the data in the data store. Applications should implement a strategy that keeps the data in the cache as up to date as possible, and that can also detect and handle situations where the cached data has become stale.

Solution

Many commercial caching systems provide read-through and write-through/write-behind operations. In these systems, an application retrieves data by referencing the cache. If the data isn't in the cache, it's retrieved from the data store and added to the cache. Any modifications to data held in the cache are automatically written back to the data store as well.

For caches that don't provide this functionality, it's the responsibility of the applications that use the cache to maintain the data.

An application can emulate the functionality of read-through caching by implementing the cache-aside strategy. This strategy loads data into the cache on demand. The figure illustrates using the Cache-Aside pattern to store data in the cache.

Figure: Using the Cache-Aside pattern to store data in the cache

If an application updates information, it can follow the write-through strategy by making the modification to the data store, and by invalidating the corresponding item in the cache.

When the item is next required, the cache-aside strategy causes the updated data to be retrieved from the data store and added back into the cache.

Issues and considerations

Consider the following points when deciding how to implement this pattern:

Lifetime of cached data. Many caches implement an expiration policy that invalidates data and removes it from the cache if it's not accessed for a specified period. For cache-aside to be effective, ensure that the expiration policy matches the pattern of access for applications that use the data. Don't make the expiration period too short because this can cause applications to continually retrieve data from the data store and add it to the cache. Similarly, don't make the expiration period so long that the cached data is likely to become stale. Remember that caching is most effective for relatively static data, or data that is read frequently.
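
To make the "not accessed for a specified period" policy concrete, here is a small sketch of a sliding-expiration cache. The class name `SlidingExpirationCache` and its injectable `clock` parameter are assumptions made for illustration; real cache products implement expiration internally and configure it differently.

```python
import time
from typing import Any, Callable, Optional


class SlidingExpirationCache:
    """Hypothetical in-memory cache that drops an item once it has gone
    unread for longer than ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (value, time of last access)

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (value, self._clock())

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, last_access = entry
        if self._clock() - last_access > self._ttl:
            del self._items[key]                    # expired: treat as a miss
            return None
        self._items[key] = (value, self._clock())   # a read renews the item
        return value
```

With a short `ttl_seconds`, rarely read items expire before their next read, so the application keeps going back to the data store. With a long one, an item that is read often may never expire and can stay stale indefinitely.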

Evicting data. Most caches have a limited size compared to the data store where the data originates, and they'll evict data if necessary. Most caches adopt a least-recently-used policy for selecting items to evict, but this might be customizable. Configure the global expiration property and other properties of the cache, and the expiration property of each cached item, to ensure that the cache is cost effective. It isn't always appropriate to apply a global eviction policy to every item in the cache. For example, if a cached item is very expensive to retrieve from the data store, it can be beneficial to keep this item in the cache at the expense of more frequently accessed but less costly items.
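
As a rough illustration of least-recently-used eviction, the sketch below keeps at most `capacity` items and discards the one that was read or written longest ago. `LruCache` is a hypothetical class built on Python's `collections.OrderedDict`; it doesn't model per-item policies such as pinning expensive items.

```python
from collections import OrderedDict


class LruCache:
    """Hypothetical fixed-size cache with least-recently-used eviction."""

    def __init__(self, capacity: int):
        self._capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used item
```

An evicted item isn't lost. The next cache-aside read for it is a miss that reloads it from the data store.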

Priming the cache. Many solutions prepopulate the cache with the data that an application is likely to need as part of the startup processing. The Cache-Aside pattern can still be useful if some of this data expires or is evicted.
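
A short sketch of how priming and cache-aside can coexist. The dictionaries and the functions `prime_cache` and `get_product` are hypothetical names used only for this illustration.

```python
# Hypothetical stand-ins for the data store and the cache.
store = {"p1": "Widget", "p2": "Gadget", "p3": "Gizmo"}
cache = {}


def prime_cache(keys):
    """Prepopulate the cache at startup with items expected to be needed."""
    for key in keys:
        if key in store:
            cache[key] = store[key]


def get_product(key):
    """Cache-aside read: covers items that were never primed,
    or that have since expired or been evicted."""
    if key in cache:
        return cache[key]
    value = store.get(key)
    if value is not None:
        cache[key] = value
    return value
```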

Consistency. Implementing the Cache-Aside pattern doesn't guarantee consistency between the data store and the cache. An item in the data store can be changed at any time by an external process, and this change might not be reflected in the cache until the next time the item is loaded. In a system that replicates data across data stores, this problem can become serious if synchronization occurs frequently.

Local (in-memory) caching. A cache could be local to an application instance and stored in-memory. Cache-aside can be useful in this environment if an application repeatedly accesses the same data. However, a local cache is private and so different application instances could each have a copy of the same cached data. This data could quickly become inconsistent between caches, so it might be necessary to expire data held in a private cache and refresh it more frequently. In these scenarios, consider investigating the use of a shared or a distributed caching mechanism.
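
The divergence between private caches can be shown with two simulated application instances that share one data store. `AppInstance` and `shared_store` are hypothetical names. The sketch assumes each instance invalidates only its own cache on a write.

```python
class AppInstance:
    """Hypothetical application instance with its own private in-memory cache."""

    def __init__(self, store):
        self._store = store   # data store shared by all instances
        self._cache = {}      # private to this instance

    def read(self, key):
        if key not in self._cache:
            self._cache[key] = self._store[key]   # cache-aside load on a miss
        return self._cache[key]

    def write(self, key, value):
        self._store[key] = value
        self._cache.pop(key, None)   # only this instance's cache is invalidated


shared_store = {"price": 10}
a = AppInstance(shared_store)
b = AppInstance(shared_store)
```

If both instances read `"price"` and then `a` writes a new value, `b` keeps returning the old value until its private copy expires or is refreshed.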

When to use this pattern

Use this pattern when:

  • A cache doesn't provide native read-through and write-through operations.
  • Resource demand is unpredictable. This pattern enables applications to load data on demand. It makes no assumptions about which data an application will require in advance.

This pattern might not be suitable:

  • When the cached data set is static. If the data will fit into the available cache space, prime the cache with the data on startup and apply a policy that prevents the data from expiring.
  • For caching session state information in a web application hosted in a web farm. In this environment, you should avoid introducing dependencies based on client-server affinity.

Example

In Microsoft Azure you can use Azure Cache for Redis to create a distributed cache that can be shared by multiple instances of an application.

The following code examples use the StackExchange.Redis client, which is a Redis client library written for .NET. To connect to an Azure Cache for Redis instance, call the static ConnectionMultiplexer.Connect method and pass in the connection string. The method returns a ConnectionMultiplexer that represents the connection. One approach to sharing a ConnectionMultiplexer instance in your application is to have a static property that returns a connected instance, similar to the following example. This approach provides a thread-safe way to initialize only a single connected instance.

// Requires: using System; using System.Configuration; using StackExchange.Redis;

// Lazily create one shared connection; the connection string is read from app settings.
private static Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() =>
{
    string cacheConnection = ConfigurationManager.AppSettings["CacheConnection"].ToString();
    return ConnectionMultiplexer.Connect(cacheConnection);
});

public static ConnectionMultiplexer Connection => lazyConnection.Value;

The GetMyEntityAsync method in the following code example shows an implementation of the Cache-Aside pattern. This method retrieves an object from the cache using the read-through approach.

An object is identified by using an integer ID as the key. The GetMyEntityAsync method tries to retrieve an item with this key from the cache. If a matching item is found, it's returned. If there's no match in the cache, the GetMyEntityAsync method retrieves the object from a data store, adds it to the cache, and then returns it. The code that actually reads the data from the data store is not shown here, because it depends on the data store. Note that the cached item is configured to expire to prevent it from becoming stale if it's updated elsewhere.

// Set five minute expiration as a default
private const double DefaultExpirationTimeInMinutes = 5.0;

public async Task<MyEntity> GetMyEntityAsync(int id)
{
  // Define a unique key for this method and its parameters.
  var key = $"MyEntity:{id}";
  var cache = Connection.GetDatabase();

  // Try to get the entity from the cache.
  var json = await cache.StringGetAsync(key).ConfigureAwait(false);
  var value = string.IsNullOrWhiteSpace(json)
                ? default(MyEntity)
                : JsonConvert.DeserializeObject<MyEntity>(json);

  if (value == null) // Cache miss
  {
    // If there's a cache miss, get the entity from the original store and cache it.
    // Code has been omitted because it is data store dependent.
    value = ...;

    // Avoid caching a null value.
    if (value != null)
    {
      // Put the item in the cache with an expiration time chosen according to
      // how much stale data the application can tolerate. Passing the expiry to
      // StringSetAsync sets the value and its time-to-live in a single call.
      await cache.StringSetAsync(key, JsonConvert.SerializeObject(value), TimeSpan.FromMinutes(DefaultExpirationTimeInMinutes)).ConfigureAwait(false);
    }
  }

  return value;
}

The examples use Azure Cache for Redis to access the store and retrieve information from the cache. For more information, see Using Azure Cache for Redis and How to create a Web App with Azure Cache for Redis.

The UpdateEntityAsync method shown below demonstrates how to invalidate an object in the cache when the value is changed by the application. The code updates the original data store and then removes the cached item from the cache.

public async Task UpdateEntityAsync(MyEntity entity)
{
    // Update the object in the original data store.
    await this.store.UpdateEntityAsync(entity).ConfigureAwait(false);

    // Invalidate the current cache object.
    var cache = Connection.GetDatabase();
    var id = entity.Id;
    var key = $"MyEntity:{id}"; // The key for the cached object.
    await cache.KeyDeleteAsync(key).ConfigureAwait(false); // Delete this key from the cache.
}

Note

The order of the steps is important. Update the data store before removing the item from the cache. If you remove the cached item first, there is a small window of time when a client might fetch the item before the data store is updated. That will result in a cache miss (because the item was removed from the cache), causing the earlier version of the item to be fetched from the data store and added back into the cache. The result will be stale cache data.
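
The race described in this note can be reproduced deterministically by injecting a read at the point where another client might run. The sketch below is illustrative only: dictionaries stand in for the cache and data store, and the function names and the `concurrent_read` callback are hypothetical.

```python
def read_through(store, cache, key):
    """Cache-aside read performed by another client."""
    if key not in cache:
        cache[key] = store[key]
    return cache[key]


def update_invalidate_first(store, cache, key, new_value, concurrent_read):
    """Wrong order: invalidate the cache, then update the data store."""
    cache.pop(key, None)
    concurrent_read()          # another client reads inside the window
    store[key] = new_value


def update_store_first(store, cache, key, new_value, concurrent_read):
    """Recommended order: update the data store, then invalidate the cache."""
    store[key] = new_value
    concurrent_read()          # may still see the old cached value once
    cache.pop(key, None)
```

With `update_invalidate_first`, the injected read reloads the old value into the cache, where it remains after the store is updated. With `update_store_first`, a concurrent reader might briefly see the old value, but the cached copy is removed afterwards, so the next read loads the new one.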

The following information may be relevant when implementing this pattern:

  • Caching Guidance. Provides additional information on how you can cache data in a cloud solution, and the issues that you should consider when you implement a cache.

  • Data Consistency Primer. Cloud applications typically use data that's spread across data stores. Managing and maintaining data consistency in this environment is a critical aspect of the system, particularly the concurrency and availability issues that can arise. This primer describes issues about consistency across distributed data, and summarizes how an application can implement eventual consistency to maintain the availability of data.