Fault tolerance and storage efficiency in Azure Stack HCI

Applies to: Azure Stack HCI, version 20H2; Windows Server 2019

This topic introduces the resiliency options available in Storage Spaces Direct and outlines the scale requirements, storage efficiency, and general advantages and tradeoffs of each. It also presents some usage instructions to get you started, and references papers, blogs, and additional content where you can learn more.

If you are already familiar with Storage Spaces, you may want to skip to the Summary section.

Overview

At its heart, Storage Spaces is about providing fault tolerance, often called "resiliency," for your data. Its implementation is similar to RAID, except that it is distributed across servers and implemented in software.

As with RAID, there are a few different ways Storage Spaces can do this, which make different tradeoffs between fault tolerance, storage efficiency, and compute complexity. These broadly fall into two categories: "mirroring" and "parity," the latter sometimes called "erasure coding."

Mirroring

Mirroring provides fault tolerance by keeping multiple copies of all data. This most closely resembles RAID-1. How that data is striped and placed is non-trivial (see this blog to learn more), but any data stored using mirroring is written, in its entirety, multiple times. Each copy is written to different physical hardware (different drives in different servers) that is assumed to fail independently.

Storage Spaces offers two flavors of mirroring: "two-way" and "three-way."

Two-way mirror

Two-way mirroring writes two copies of everything. Its storage efficiency is 50 percent: to write 1 TB of data, you need at least 2 TB of physical storage capacity. Likewise, you need at least two hardware "fault domains"; with Storage Spaces Direct, that means two servers.

two-way-mirror

Warning

If you have more than two servers, we recommend using three-way mirroring instead.

Three-way mirror

Three-way mirroring writes three copies of everything. Its storage efficiency is 33.3 percent: to write 1 TB of data, you need at least 3 TB of physical storage capacity. Likewise, you need at least three hardware fault domains; with Storage Spaces Direct, that means three servers.

Three-way mirroring can safely tolerate at least two hardware problems (drive or server) at a time. For example, if you're rebooting one server when another drive or server suddenly fails, all data remains safe and continuously accessible.

three-way-mirror
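The arithmetic behind the two mirror types can be sketched in a few lines of Python. This is only an illustration of the figures quoted above; the function names are made up for this example and are not part of any Storage Spaces tooling.

```python
def mirror_efficiency(copies: int) -> float:
    """Fraction of raw capacity that holds unique data with `copies` full copies."""
    if copies < 2:
        raise ValueError("mirroring needs at least two copies")
    return 1 / copies


def raw_capacity_tb(data_tb: float, copies: int) -> float:
    """Physical capacity needed to store `data_tb` of data `copies` times."""
    return data_tb * copies


def failures_tolerated(copies: int) -> int:
    """With n copies on n independent fault domains, n - 1 of them can be lost."""
    return copies - 1


for name, copies in (("two-way", 2), ("three-way", 3)):
    print(
        f"{name} mirror: {mirror_efficiency(copies):.1%} efficient, "
        f"{raw_capacity_tb(1, copies)} TB raw per 1 TB of data, "
        f"tolerates {failures_tolerated(copies)} failure(s)"
    )
```

The same relationship explains the minimum scale: each copy needs its own fault domain, so the number of copies is also the minimum number of servers.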

Parity

Parity encoding, often called "erasure coding," provides fault tolerance using bitwise arithmetic, which can get remarkably complicated. The way this works is less obvious than mirroring, and there are many great online resources (for example, this third-party Dummies Guide to Erasure Coding) that can help you get the idea. Suffice it to say that it provides better storage efficiency without compromising fault tolerance.

Storage Spaces offers two flavors of parity: "single" parity and "dual" parity, the latter employing an advanced technique called "local reconstruction codes" at larger scales.

Important

We recommend using mirroring for most performance-sensitive workloads. To learn more about how to balance performance and capacity depending on your workload, see Plan volumes.

Single parity

Single parity keeps only one bitwise parity symbol, which provides fault tolerance against only one failure at a time. It most closely resembles RAID-5. To use single parity, you need at least three hardware fault domains; with Storage Spaces Direct, that means three servers. Because three-way mirroring provides more fault tolerance at the same scale, we discourage using single parity. However, it is available if you insist on using it, and it is fully supported.

Warning

We discourage using single parity because it can only safely tolerate one hardware failure at a time: if you're rebooting one server when another drive or server suddenly fails, you will experience downtime. If you only have three servers, we recommend using three-way mirroring. If you have four or more, see the next section.
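To make "one bitwise parity symbol" concrete, here is a minimal Python sketch of XOR parity, the textbook scheme behind RAID-5-style protection. It illustrates the general idea only and is not a description of how Storage Spaces lays out data on disk; the helper names `xor_parity` and `recover_missing` are hypothetical.

```python
from functools import reduce
from typing import Sequence


def xor_parity(symbols: Sequence[bytes]) -> bytes:
    """Bitwise XOR of equally sized byte strings."""
    if not symbols:
        raise ValueError("need at least one symbol")
    if len({len(s) for s in symbols}) != 1:
        raise ValueError("all symbols must have the same length")
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), symbols)


def recover_missing(survivors: Sequence[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data symbol from the survivors and the parity."""
    return xor_parity([*survivors, parity])


data = [b"\x0f\xf0\xaa", b"\x33\x55\x01"]     # two data symbols on two servers
parity = xor_parity(data)                      # one parity symbol on a third server
rebuilt = recover_missing([data[1]], parity)   # the server holding data[0] fails
print(rebuilt == data[0])
```

Because there is only one parity symbol, losing a second symbol before the first is rebuilt leaves too little information to recover the data, which is why single parity tolerates only one failure at a time.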

Dual parity

Dual parity implements Reed-Solomon error-correcting codes to keep two bitwise parity symbols, thereby providing the same fault tolerance as three-way mirroring (that is, up to two failures at once), but with better storage efficiency. It most closely resembles RAID-6. To use dual parity, you need at least four hardware fault domains; with Storage Spaces Direct, that means four servers. At that scale, the storage efficiency is 50 percent: to store 2 TB of data, you need 4 TB of physical storage capacity.

dual-parity

The storage efficiency of dual parity increases the more hardware fault domains you have, from 50 percent up to 80 percent. For example, at seven fault domains (with Storage Spaces Direct, that means seven servers) the efficiency jumps to 66.7 percent: to store 4 TB of data, you need just 6 TB of physical storage capacity.

dual-parity-wide

See the Summary section for the efficiency of dual parity and local reconstruction codes at every scale.
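The efficiency figures for dual parity follow from the ratio of data symbols to total symbols. The following Python sketch assumes the "RS k+2" notation used in the Summary tables means k data symbols plus two parity symbols; the function names are illustrative only.

```python
def parity_efficiency(data_symbols: int, parity_symbols: int = 2) -> float:
    """Share of raw capacity that holds data in a k + m parity layout."""
    if data_symbols < 1 or parity_symbols < 1:
        raise ValueError("symbol counts must be positive")
    return data_symbols / (data_symbols + parity_symbols)


def raw_tb_needed(data_tb: float, data_symbols: int, parity_symbols: int = 2) -> float:
    """Physical capacity needed to store `data_tb` in a k + m parity layout."""
    return data_tb * (data_symbols + parity_symbols) / data_symbols


for k in (2, 4, 6):
    print(f"RS {k}+2: {parity_efficiency(k):.1%}")

print(raw_tb_needed(2, data_symbols=2))  # 2 TB of data at four servers (RS 2+2)
print(raw_tb_needed(4, data_symbols=4))  # 4 TB of data at seven servers (RS 4+2)
```

The number of parity symbols stays at two, so fault tolerance is unchanged; only the number of data symbols per stripe grows as more fault domains become available.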

Local reconstruction codes

Storage Spaces introduces an advanced technique developed by Microsoft Research called "local reconstruction codes," or LRC. At large scale, dual parity uses LRC to split its encoding/decoding into a few smaller groups, to reduce the overhead required to make writes or recover from failures.

With hard disk drives (HDD), the group size is four symbols; with solid-state drives (SSD), the group size is six symbols. For example, here's what the layout looks like with hard disk drives and 12 hardware fault domains (meaning 12 servers): there are two groups of four data symbols. It achieves 72.7 percent storage efficiency.

local-reconstruction-codes
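The 72.7 percent figure can be reproduced with a short Python sketch. It assumes that the notation LRC (8, 2, 1) used in the Summary tables means 8 data symbols, 2 local parity symbols (one per group), and 1 global parity symbol; that reading is consistent with "two groups of four data symbols" above, but the function itself is only an illustration.

```python
def lrc_summary(data: int, local_parity: int, global_parity: int) -> dict:
    """Basic properties of an LRC layout under the (data, local, global) reading."""
    if data % local_parity != 0:
        raise ValueError("data symbols must divide evenly into local groups")
    total = data + local_parity + global_parity
    return {
        "total_symbols": total,
        "group_size": data // local_parity,  # data symbols per local group
        "efficiency": data / total,
    }


hdd = lrc_summary(8, 2, 1)    # hybrid layout at 12 or more fault domains
ssd = lrc_summary(12, 2, 1)   # all-flash layout at 16 fault domains
print(hdd["group_size"], f"{hdd['efficiency']:.1%}")
print(ssd["group_size"], f"{ssd['efficiency']:.1%}")
```

In a local-group design of this kind, a single lost data symbol can in general be rebuilt from the rest of its group and that group's local parity, rather than from the whole stripe, which is the overhead reduction described above.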

We recommend this in-depth yet eminently readable walk-through of how local reconstruction codes handle various failure scenarios, and why they're appealing, by our very own Claus Joergensen.

Mirror-accelerated parity

A Storage Spaces Direct volume can be part mirror and part parity. Writes land first in the mirrored portion and are gradually moved into the parity portion later. Effectively, this is using mirroring to accelerate erasure coding.

To mix three-way mirror and dual parity, you need at least four fault domains, meaning four servers.

The storage efficiency of mirror-accelerated parity is in between what you'd get from using all mirror or all parity, and depends on the proportions you choose. For example, the demo at the 37-minute mark of this presentation shows various mixes achieving 46 percent, 54 percent, and 65 percent efficiency with 12 servers.
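One way to reason about how the proportions affect efficiency is a simple capacity-weighted model, sketched below in Python. This is an assumption made for illustration, not a documented formula: it treats the mirror fraction as a share of the volume's usable size and adds up the raw footprint of each part. The defaults assume three-way mirror (33.3 percent) and the 12-server hybrid dual parity layout (72.7 percent); the mirror fractions in the loop are hypothetical and are not taken from the presentation.

```python
def mixed_efficiency(
    mirror_fraction: float,
    mirror_eff: float = 1 / 3,   # three-way mirror
    parity_eff: float = 8 / 11,  # LRC (8, 2, 1), hybrid, 12 servers
) -> float:
    """Overall efficiency when `mirror_fraction` of the usable size is mirrored."""
    if not 0 <= mirror_fraction <= 1:
        raise ValueError("mirror_fraction must be between 0 and 1")
    # Raw capacity consumed per unit of usable capacity.
    footprint = mirror_fraction / mirror_eff + (1 - mirror_fraction) / parity_eff
    return 1 / footprint


for fraction in (0.5, 0.3, 0.1):
    print(f"{fraction:.0%} mirror: {mixed_efficiency(fraction):.1%}")
```

Under this model the result always lies between the all-mirror and all-parity efficiencies, and the sample fractions happen to land close to the 46, 54, and 65 percent figures quoted above.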

Important

We recommend using mirroring for most performance-sensitive workloads. To learn more about how to balance performance and capacity depending on your workload, see Plan volumes.

Summary

This section summarizes the resiliency types available in Storage Spaces Direct, the minimum scale requirements to use each type, how many failures each type can tolerate, and the corresponding storage efficiency.

Resiliency types

Resiliency          Failure tolerance   Storage efficiency
Two-way mirror      1                   50.0%
Three-way mirror    2                   33.3%
Dual parity         2                   50.0% - 80.0%
Mixed               2                   33.3% - 80.0%

Minimum scale requirements

Resiliency          Minimum required fault domains
Two-way mirror      2
Three-way mirror    3
Dual parity         4
Mixed               4

Tip

Unless you are using chassis or rack fault tolerance, the number of fault domains refers to the number of servers. The number of drives in each server does not affect which resiliency types you can use, as long as you meet the minimum requirements for Storage Spaces Direct.

Dual parity efficiency for hybrid deployments

This table shows the storage efficiency of dual parity and local reconstruction codes at each scale for hybrid deployments, which contain both hard disk drives (HDD) and solid-state drives (SSD).

Fault domains   Layout          Efficiency
2               -               -
3               -               -
4               RS 2+2          50.0%
5               RS 2+2          50.0%
6               RS 2+2          50.0%
7               RS 4+2          66.7%
8               RS 4+2          66.7%
9               RS 4+2          66.7%
10              RS 4+2          66.7%
11              RS 4+2          66.7%
12              LRC (8, 2, 1)   72.7%
13              LRC (8, 2, 1)   72.7%
14              LRC (8, 2, 1)   72.7%
15              LRC (8, 2, 1)   72.7%
16              LRC (8, 2, 1)   72.7%

Dual parity efficiency for all-flash deployments

This table shows the storage efficiency of dual parity and local reconstruction codes at each scale for all-flash deployments, which contain only solid-state drives (SSD). The parity layout can use larger group sizes and achieve better storage efficiency in an all-flash configuration.

Fault domains   Layout           Efficiency
2               -                -
3               -                -
4               RS 2+2           50.0%
5               RS 2+2           50.0%
6               RS 2+2           50.0%
7               RS 4+2           66.7%
8               RS 4+2           66.7%
9               RS 6+2           75.0%
10              RS 6+2           75.0%
11              RS 6+2           75.0%
12              RS 6+2           75.0%
13              RS 6+2           75.0%
14              RS 6+2           75.0%
15              RS 6+2           75.0%
16              LRC (12, 2, 1)   80.0%
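For capacity planning scripts, the two tables above can be encoded as a small lookup. The Python sketch below copies the ranges and values directly from the tables; the function name and data structure are hypothetical, and the lookup deliberately refuses anything outside the 2 to 16 fault domains that the tables cover.

```python
from typing import Optional, Tuple

# (first fault domain count, last fault domain count, layout, efficiency in percent)
HYBRID = [
    (4, 6, "RS 2+2", 50.0),
    (7, 11, "RS 4+2", 66.7),
    (12, 16, "LRC (8, 2, 1)", 72.7),
]
ALL_FLASH = [
    (4, 6, "RS 2+2", 50.0),
    (7, 8, "RS 4+2", 66.7),
    (9, 15, "RS 6+2", 75.0),
    (16, 16, "LRC (12, 2, 1)", 80.0),
]


def dual_parity_layout(fault_domains: int, all_flash: bool = False) -> Optional[Tuple[str, float]]:
    """Return (layout, efficiency %) from the tables, or None below four fault domains."""
    if not 2 <= fault_domains <= 16:
        raise ValueError("the tables only cover 2 to 16 fault domains")
    for first, last, layout, efficiency in (ALL_FLASH if all_flash else HYBRID):
        if first <= fault_domains <= last:
            return layout, efficiency
    return None  # dual parity is not available with two or three fault domains


print(dual_parity_layout(12))
print(dual_parity_layout(12, all_flash=True))
print(dual_parity_layout(3))
```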

Examples

Unless you have only two servers, we recommend using three-way mirroring and/or dual parity, because they offer better fault tolerance. Specifically, they ensure that all data remains safe and continuously accessible even when two fault domains (with Storage Spaces Direct, that means two servers) are affected by simultaneous failures.

Examples where everything stays online

These six examples show what three-way mirroring and/or dual parity can tolerate.

  • 1. One drive lost (includes cache drives)
  • 2. One server lost

fault-tolerance-examples-1-and-2

  • 3. One server and one drive lost
  • 4. Two drives lost in different servers

fault-tolerance-examples-3-and-4

  • 5. More than two drives lost, so long as at most two servers are affected
  • 6. Two servers lost

fault-tolerance-examples-5-and-6

...in every case, all volumes will stay online. (Make sure your cluster maintains quorum.)

Examples where everything goes offline

Over its lifetime, Storage Spaces can tolerate any number of failures, because it restores to full resiliency after each one, given sufficient time. However, at most two fault domains can safely be affected by failures at any given moment. The following are therefore examples of what three-way mirroring and/or dual parity cannot tolerate.

  • 7. Drives lost in three or more servers at once
  • 8. Three or more servers lost at once

fault-tolerance-examples-7-and-8
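The rule that separates the two sets of examples is the number of distinct fault domains affected at the same moment, not the number of failed components. The Python sketch below encodes that rule for three-way mirroring and dual parity, assuming each server is one fault domain; the function and the server names are hypothetical, and the check ignores cluster quorum, which must also be maintained.

```python
from typing import Iterable, Tuple

MAX_AFFECTED_FAULT_DOMAINS = 2  # three-way mirror and dual parity


def volumes_stay_online(failures: Iterable[Tuple[str, str]]) -> bool:
    """`failures` holds (server, component) pairs for everything currently failed."""
    affected_servers = {server for server, _component in failures}
    return len(affected_servers) <= MAX_AFFECTED_FAULT_DOMAINS


# Example 5: more than two drives lost, but only two servers affected.
example_5 = [("server1", "drive"), ("server1", "drive"), ("server2", "drive")]
# Example 7: drives lost in three servers at once.
example_7 = [("server1", "drive"), ("server2", "drive"), ("server3", "drive")]

print(volumes_stay_online(example_5))
print(volumes_stay_online(example_7))
```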

Usage

See Create volumes.

Next steps

For further reading on subjects mentioned in this article, see the following: