高可用性和 Azure SQL DatabaseHigh-availability and Azure SQL Database

Azure SQL Database 中高可用性架構的目標是確保您的資料庫在 99.99% 的時間內啟動並執行, 而不需擔心維護作業和中斷的影響。The goal of the High Availability architecture in Azure SQL Database is to guarantee that your database is up and running 99.99% of time, without worrying about the impact of maintenance operations and outages. Azure 會自動處理重要的服務工作, 例如修補、備份、Windows 和 SQL 升級, 以及未計畫的事件, 例如基礎硬體、軟體或網路失敗。Azure automatically handles critical servicing tasks, such as patching, backups, Windows and SQL upgrades, as well as unplanned events such as underlying hardware, software or network failures. 當基礎 SQL 實例修補或故障時, 如果您在應用程式中採用重試邏輯, 停機時間就不明顯。When the underlying SQL instance is patched or fails over, the downtime is not noticeable if you employ retry logic in your app. 即使在最關鍵的情況下,Azure SQL Database 也可以快速復原,確保您的資料隨時可用。Azure SQL Database can quickly recover even in the most critical circumstances ensuring that your data is always available.

高可用性解決方案的設計, 是為了確保認可的資料永遠不會因為失敗而遺失, 維護作業不會影響您的工作負載, 而且資料庫不會是您軟體架構中的單一失敗點。The high availability solution is designed to ensure that committed data is never lost due to failures, that maintenance operations do not affect your workload, and that the database will not be a single point of failure in your software architecture. 在升級或維護資料庫時,不會有維護視窗或停機時間要求您停止工作負載。There are no maintenance windows or downtimes that should require you to stop the workload while the database is upgraded or maintained.

Azure SQL Database 中使用了兩種高可用性架構模型:There are two high-availability architectural models that are used in Azure SQL Database:

  • 以計算和儲存體分隔為基礎的標準可用性模型。Standard availability model that is based on a separation of compute and storage. 它依賴遠端儲存層的高可用性和可靠性。It relies on high availability and reliability of the remote storage tier. 此架構的目標是以預算為導向的商務應用程式, 可容忍維護活動期間的效能降低。This architecture targets budget-oriented business applications that can tolerate some performance degradation during maintenance activities.
  • 以資料庫引擎進程叢集為基礎的 Premium 可用性模型。Premium availability model that is based on a cluster of database engine processes. 這是因為一律有可用資料庫引擎節點的仲裁。It relies on the fact that there is always a quorum of available database engine nodes. 此架構以高 IO 效能、高交易率為目標的任務關鍵性應用程式, 並保證在維護活動期間對工作負載的效能影響降至最低。This architecture targets mission critical applications with high IO performance, high transaction rate and guarantees minimal performance impact to your workload during maintenance activities.

Azure SQL Database 會在 SQL Server 資料庫引擎和 Windows OS 的最新穩定版本上執行, 大部分的使用者都不會注意到會持續執行升級。Azure SQL Database runs on the latest stable version of SQL Server Database Engine and Windows OS, and most users would not notice that upgrades are performed continuously.

基本、標準和一般目的服務層級可用性Basic, Standard, and General Purpose service tier availability

這些服務層級會利用標準可用性架構。These service tiers leverage the standard availability architecture. 下圖顯示四個具有分隔計算和儲存層的不同節點。The following figure shows four different nodes with the separated compute and storage layers.

分隔計算與儲存體

標準可用性模型包含兩個層級:The standard availability model includes two layers:

  • 無狀態計算層, 會執行sqlserver.exe處理常式, 而且只包含連接的 SSD 上的暫時性和快取資料, 例如 TempDB、模型資料庫、計畫快取、緩衝集區和資料行存放區集區。A stateless compute layer that runs the sqlserver.exe process and contains only transient and cached data on the attached SSD, such as TempDB, model database, plan cache, buffer pool and column store pool. 此無狀態節點由 Azure Service Fabric 操作, 可sqlserver.exe初始化、控制節點的健全狀況, 並在必要時執行容錯移轉至另一個節點。This stateless node is operated by Azure Service Fabric that initializes sqlserver.exe, controls health of the node, and performs failover to another node if necessary.
  • 具狀態資料層, 其中包含儲存在 Azure Blob 儲存體中的資料庫檔案 (.mdf/.ldf)。A stateful data layer with the database files (.mdf/.ldf) that are stored in Azure Blob storage. Azure blob 儲存體具有內建的資料可用性和冗余功能。Azure blob storage has built-in data availability and redundancy feature. 它保證即使 SQL Server 進程損毀, 也會保留資料檔案中記錄檔或頁面中的每筆記錄。It guarantees that every record in the log file or page in the data file will be preserved even if SQL Server process crashes.

每當資料庫引擎或作業系統升級, 或偵測到失敗時, Azure Service Fabric 會將無狀態 SQL Server 進程移至具有足夠可用容量的另一個無狀態計算節點。Whenever the database engine or the operating system is upgraded, or a failure is detected, Azure Service Fabric will move the stateless SQL Server process to another stateless compute node with sufficient free capacity. 移動時不會影響 Azure Blob 儲存體中的資料, 而且資料/記錄檔會附加到新初始化的 SQL Server 進程。Data in Azure Blob storage is not affected by the move, and the data/log files are attached to the newly initialized SQL Server process. 此程式保證 99.99% 的可用性, 但繁重的工作負載可能會在轉換期間遇到效能降低的情況, 因為新的 SQL Server 實例是以冷快取開始。This process guarantees 99.99% availability, but a heavy workload may experience some performance degradation during the transition since the new SQL Server instance starts with cold cache.

進階與商務關鍵性服務層級可用性Premium and Business Critical service tier availability

高階和商務關鍵服務層級會利用高階可用性模型, 它會整合單一節點上的計算資源 (SQL Server 資料庫引擎程式) 和儲存體 (本機連接的 SSD)。Premium and Business Critical service tiers leverage the Premium availability model, which integrates compute resources (SQL Server Database Engine process) and storage (locally attached SSD) on a single node. 藉由將計算和儲存體複寫至建立三到四個節點叢集的其他節點, 即可達到高可用性。High availability is achieved by replicating both compute and storage to additional nodes creating a three to four-node cluster.

資料庫引擎節點的叢集

基礎資料庫檔案 (.mdf/.ldf) 會放在附加的 SSD 儲存體上, 以提供非常低的延遲 IO 給您的工作負載。The underlying database files (.mdf/.ldf) are placed on the attached SSD storage to provide very low latency IO to your workload. 高可用性會使用類似于 SQL Server Always On 可用性群組的技術來執行。High availability is implemented using a technology similar to SQL Server Always On Availability Groups. 叢集包含可供讀寫客戶工作負載存取的單一主要複本 (SQL Server 進程), 以及最多三個包含資料複本的次要複本 (計算和儲存體)。The cluster includes a single primary replica (SQL Server process) that is accessible for read-write customer workloads, and up to three secondary replicas (compute and storage) containing copies of data. 主要節點會依序持續將變更推送至次要節點, 並確保資料至少會同步處理至一個次要複本, 然後才認可每個交易。The primary node constantly pushes changes to the secondary nodes in order and ensures that the data is synchronized to at least one secondary replica before committing each transaction. 此程式可確保如果主要節點因為任何原因而損毀, 則一律會有完全同步處理的節點來進行故障切換。This process guarantees that if the primary node crashes for any reason, there is always a fully synchronized node to fail over to. 容錯移轉是由 Azure Service Fabric 起始。The failover is initiated by the Azure Service Fabric. 一旦次要複本成為新的主要節點, 就會建立另一個次要複本, 以確保叢集擁有足夠的節點 (仲裁集)。Once the secondary replica becomes the new primary node, another secondary replica is created to ensure the cluster has enough nodes (quorum set). 一旦容錯移轉完成後, SQL 連線會自動重新導向至新的主要節點。Once failover is complete, SQL connections are automatically redirected to the new primary node.

額外的好處是, premium 可用性模型包含將唯讀 SQL 連線重新導向至其中一個次要複本的能力。As an extra benefit, the premium availability model includes the ability to redirect read-only SQL connections to one of the secondary replicas. 這項功能稱為讀取相應放大。它提供 100% 額外的計算容量, 不需要額外費用即可從主要複本關閉載入唯讀作業, 例如分析工作負載。This feature is called Read Scale-Out. It provides 100% additional compute capacity at no extra charge to off-load read-only operations, such as analytical workloads, from the primary replica.

區域備援設定Zone redundant configuration

根據預設, premium 可用性模型的節點叢集會建立在相同的資料中心內。By default, the cluster of nodes for the premium availability model is created in the same datacenter. 隨著Azure 可用性區域的引進, SQL Database 可以將叢集中不同的複本放到相同區域中的不同可用性區域。With the introduction of Azure Availability Zones, SQL Database can place different replicas in the cluster to different availability zones in the same region. 為了避免發生單點失敗,系統也會跨多個區域將控制環複寫成三個閘道環 (GW)。To eliminate a single point of failure, the control ring is also duplicated across multiple zones as three gateway rings (GW). Azure 流量管理員 (ATM) 會控制特定閘道的路由。The routing to a specific gateway ring is controlled by Azure Traffic Manager (ATM). 由於 Premium 或業務關鍵服務層級中的區域冗余設定並不會建立額外的資料庫冗余, 因此您可以不需要額外成本來加以啟用。Because the zone redundant configuration in the Premium or Business Critical service tiers does not create additional database redundancy, you can enable it at no extra cost. 藉由選取區域多餘的設定, 您可以讓 Premium 或業務關鍵資料庫彈性地復原到較大的一組失敗, 包括嚴重的資料中心中斷, 而不需要變更應用程式邏輯。By selecting a zone redundant configuration, you can make your Premium or Business Critical databases resilient to a much larger set of failures, including catastrophic datacenter outages, without any changes to the application logic. 您也可以將任何現有的進階或業務關鍵資料庫或彈性集區轉換成區域備援組態。You can also convert any existing Premium or Business Critical databases or pools to the zone redundant configuration.

由於區域的冗余資料庫在不同的資料中心內有複本, 而它們之間有一些距離, 因此增加的網路延遲可能會增加認可時間, 因而影響某些 OLTP 工作負載的效能。Because the zone redundant databases have replicas in different datacenters with some distance between them, the increased network latency may increase the commit time and thus impact the performance of some OLTP workloads. 您一律可以停用區域備援設定來回到單一區域設定。You can always return to the single-zone configuration by disabling the zone redundancy setting. 此程式是與一般服務層級升級類似的線上作業。This process is an online operation similar to the regular service tier upgrade. 在此程序結束時,資料庫或集區會從區域備援環移轉成單一區域環,或反之亦然。At the end of the process, the database or pool is migrated from a zone redundant ring to a single zone ring or vice versa.

重要

目前只有在選取區域的 Premium 和商務關鍵服務層級中才支援區域多餘的資料庫和彈性集區。Zone redundant databases and elastic pools are currently only supported in the Premium and Business Critical service tiers in select regions. 使用業務關鍵層時, 只有在選取第5代計算硬體時, 才可以使用區域冗余設定。When using the Business Critical tier, zone redundant configuration is only available when the Gen5 compute hardware is selected. 如需有關支援區域多餘資料庫之區域的最新資訊, 請參閱依區域的服務支援For up to date information about the regions that support zone redundant databases, see Services support by region.

下圖說明區域備援版的高可用性架構:The zone redundant version of the high availability architecture is illustrated by the following diagram:

高可用性架構區域備援

加速資料庫復原 (ADR)Accelerated Database Recovery (ADR)

加速資料庫復原 (ADR)是新的 SQL Database 引擎功能, 可大幅提升資料庫的可用性, 特別是在有長時間執行的交易存在時。Accelerated Database Recovery (ADR) is a new SQL database engine feature that greatly improves database availability, especially in the presence of long running transactions. ADR 目前適用於單一資料庫、彈性集區和 Azure SQL 資料倉儲。ADR is currently available for single databases, elastic pools, and Azure SQL Data Warehouse.

結論Conclusion

Azure SQL Database 具備內建的高可用性解決方案, 與 Azure 平臺緊密整合。Azure SQL Database features a built-in high availability solution, that is deeply integrated with the Azure platform. 這取決於失敗偵測和復原的 Service Fabric、用於資料保護的 Azure Blob 儲存體, 以及在可用性區域上, 以提供更高的容錯能力。It is dependent on Service Fabric for failure detection and recovery, on Azure Blob storage for data protection, and on Availability Zones for higher fault tolerance. 此外, Azure SQL database 會利用 SQL Server 的 Always On 可用性群組技術來進行複寫和容錯移轉。In addition, Azure SQL database leverages the Always On Availability Group technology from SQL Server for replication and failover. 這些技術的結合可讓應用程式完全實現混合儲存體模型的優點, 並支援最嚴苛的 Sla。The combination of these technologies enables applications to fully realize the benefits of a mixed storage model and support the most demanding SLAs.

後續步驟Next steps