Azure SQL 中的加速資料庫復原Accelerated Database Recovery in Azure SQL

適用於: Azure SQL Database Azure SQL 受控執行個體

加速資料庫復原 (ADR) 是 SQL Server database engine 功能,可透過重新設計 SQL Server 資料庫引擎復原程式,大幅改善資料庫可用性(特別是在有長時間執行的交易時)。Accelerated Database Recovery (ADR) is a SQL Server database engine feature that greatly improves database availability, especially in the presence of long running transactions, by redesigning the SQL Server database engine recovery process.

ADR 目前適用于 Azure SQL Database、Azure SQL 受控執行個體、Azure Synapse Analytics 中的資料庫,以及從 SQL Server 2019 開始的 Azure Vm 上的 SQL Server。ADR is currently available for Azure SQL Database, Azure SQL Managed Instance, databases in Azure Synapse Analytics, and SQL Server on Azure VMs starting with SQL Server 2019.

注意

在 Azure SQL Database 和 Azure SQL 受控執行個體中預設會啟用 ADR,且不支援任何產品的 ADR。ADR is enabled by default in Azure SQL Database and Azure SQL Managed Instance and disabling ADR for either product is not supported.

概觀Overview

ADR 的主要優點為:The primary benefits of ADR are:

  • 快速且一致的資料庫復原Fast and consistent database recovery

    使用 ADR,長時間執行的交易便不會影響整體復原時間,可快速且一致地進行資料庫復原,不論系統內使用中的交易數或其大小為何。With ADR, long running transactions do not impact the overall recovery time, enabling fast and consistent database recovery irrespective of the number of active transactions in the system or their sizes.

  • 瞬間交易回復Instantaneous transaction rollback

    使用 ADR,交易回復會在瞬間完成,不論交易處於使用中狀態的時間長度,或是已執行的更新數為何。With ADR, transaction rollback is instantaneous, irrespective of the time that the transaction has been active or the number of updates that has performed.

  • 積極性記錄截斷Aggressive log truncation

    使用 ADR,交易記錄會積極地進行截斷,即使有使用中的長時間執行交易,它也能防止其失去控制。With ADR, the transaction log is aggressively truncated, even in the presence of active long-running transactions, which prevents it from growing out of control.

標準資料庫復原處理序Standard database recovery process

資料庫復原會遵循 >aries 復原模式並包含三個階段,如下圖所示,並會在圖表後面詳細說明。Database recovery follows the ARIES recovery model and consists of three phases, which are illustrated in the following diagram and explained in more detail following the diagram.

目前的復原處理序

  • 分析階段Analysis phase

    從最後一個成功檢查點的開頭開始向前掃描交易記錄 (或最舊的中途分頁 LSN) 直到結束為止,以判斷資料庫停止時每個交易的狀態。Forward scan of the transaction log from the beginning of the last successful checkpoint (or the oldest dirty page LSN) until the end, to determine the state of each transaction at the time the database stopped.

  • 重做階段Redo phase

    從最舊的未認可交易直到結束的交易記錄順向掃描,可藉由重做所有獲認可的作業,來將資料庫帶到其在發生毀損時所處的狀態。Forward scan of the transaction log from the oldest uncommitted transaction until the end, to bring the database to the state it was at the time of the crash by redoing all committed operations.

  • 復原階段Undo phase

    對於每個在發生毀損時處於作用中的交易,向後周遊記錄,以復原此交易所執行的作業。For each transaction that was active as of the time of the crash, traverses the log backwards, undoing the operations that this transaction performed.

根據這項設計,SQL Server 資料庫引擎從非預期的重新開機進行復原所需的時間,) 大約是 (與當機時系統中最長的使用中交易大小成正比。Based on this design, the time it takes the SQL Server database engine to recover from an unexpected restart is (roughly) proportional to the size of the longest active transaction in the system at the time of the crash. 復原需要回復所有未完成的交易。Recovery requires a rollback of all incomplete transactions. 所需時間與交易執行的工作和其已使用時間成比例。The length of time required is proportional to the work that the transaction has performed and the time it has been active. 因此,復原程式可能會花很長的時間來顯示長時間執行的交易 (例如大型的大量插入作業或針對大型資料表) 的索引建立作業。Therefore, the recovery process can take a long time in the presence of long-running transactions (such as large bulk insert operations or index build operations against a large table).

同時,根據此設計來取消/回復大型交易也會花費很長的時間,因為它使用如前所述的相同復原階段。Also, cancelling/rolling back a large transaction based on this design can also take a long time as it is using the same Undo recovery phase as described above.

此外,當有長時間執行的交易時,SQL Server database engine 無法截斷交易記錄,因為復原和復原進程需要它們的對應記錄檔記錄。In addition, the SQL Server database engine cannot truncate the transaction log when there are long-running transactions because their corresponding log records are needed for the recovery and rollback processes. 由於這項 SQL Server 資料庫引擎的設計,有些客戶用來面對問題,那就是交易記錄檔的大小變得很大,而且會耗用大量的磁片磁碟機空間。As a result of this design of the SQL Server database engine, some customers used to face the problem that the size of the transaction log grows very large and consumes huge amounts of drive space.

加速資料庫復原程序The Accelerated Database Recovery process

ADR 藉由完全重新設計 SQL Server 資料庫引擎復原程式來解決上述問題:ADR addresses the above issues by completely redesigning the SQL Server database engine recovery process to:

  • 透過避免從最舊使用中交易的開頭進行掃描,或是掃描至開頭,來固定時間或使其能夠立即完成。Make it constant time/instant by avoiding having to scan the log from/to the beginning of the oldest active transaction. 使用 ADR,只會從最後一次成功的檢查點 (或是最舊的中途分頁記錄序號 (LSN)) 開始處理交易記錄。With ADR, the transaction log is only processed from the last successful checkpoint (or oldest dirty page Log Sequence Number (LSN)). 因此,復原時間不會受到長時間執行的交易影響。As a result, recovery time is not impacted by long running transactions.
  • 將所需的交易記錄空間減至最小,因為不再需要處理整個交易的記錄。Minimize the required transaction log space since there is no longer a need to process the log for the whole transaction. 因此,可在進行檢查點和備份時積極地截斷交易記錄。As a result, the transaction log can be truncated aggressively as checkpoints and backups occur.

從高層級而言,ADR 會透過對所有實體資料庫修改建立版本,並只復原邏輯作業 (其數量有限且可以立即進行復原) 來快速地進行資料庫復原。At a high level, ADR achieves fast database recovery by versioning all physical database modifications and only undoing logical operations, which are limited and can be undone almost instantly. 任何在發生毀損時處於作用中的交易都會標記為已中止,因此,並行使用者查詢會忽略這些交易所產生的任何版本。Any transaction that was active as of the time of a crash are marked as aborted and, therefore, any versions generated by these transactions can be ignored by concurrent user queries.

ADR 復原處理序與目前復原處理序具有相同的三個階段。The ADR recovery process has the same three phases as the current recovery process. 下圖說明這三個階段使用 ADR 運作的方式,並且會在下圖後面進行更詳細的說明。How these phases operate with ADR is illustrated in the following diagram and explained in more detail following the diagram.

ADR 復原處理序

  • 分析階段Analysis phase

    處理序與以往的處理序相同,但會加上重新建構 sLog 並複製未建立版本作業的記錄檔記錄。The process remains the same as before with the addition of reconstructing sLog and copying log records for non-versioned operations.

  • 重做 階段Redo phase

    分成兩個階段 (P)Broken into two phases (P)

    • 階段 1Phase 1

      從 sLog 重做 (最舊的未認可交易至最後一個檢查點)。Redo from sLog (oldest uncommitted transaction up to last checkpoint). 重做是快速的作業,因為它只需要處理來自 sLog 的幾個記錄。Redo is a fast operation as it only needs to process a few records from the sLog.

    • 階段 2Phase 2

      從交易記錄進行重做,會從最後一個檢查點 (而不是最舊的未認可交易) 開始Redo from Transaction Log starts from last checkpoint (instead of oldest uncommitted transaction)

  • 復原階段Undo phase

    使用 ADR 的復原階段可透過下列方式以幾乎瞬間的方式來完成:使用 sLog 來復原非版本控制的作業及具備邏輯還原的持續版本存放區 (PV),來執行以版本為基礎的資料列層級復原。The Undo phase with ADR completes almost instantaneously by using sLog to undo non-versioned operations and Persisted Version Store (PVS) with Logical Revert to perform row level version-based Undo.

ADR 復原元件ADR recovery components

ADR 的四個關鍵元件為:The four key components of ADR are:

  • 持續版本存放區 (PVS)Persisted version store (PVS)

    持續版本存放區是新的 SQL Server 資料庫引擎機制,用來保存資料庫本身所產生的資料列版本,而不是傳統 tempdb 版本存放區。The persisted version store is a new SQL Server database engine mechanism for persisting the row versions generated in the database itself instead of the traditional tempdb version store. PVS 會啟用資源隔離,以及改善可讀取之次要複本的可用性。PVS enables resource isolation as well as improves availability of readable secondaries.

  • 邏輯還原Logical revert

    邏輯還原是非同步的處理序,負責執行資料列層級版本式復原,為所有建立版本的作業提供交易立即回復和復原。Logical revert is the asynchronous process responsible for performing row-level version-based Undo - providing instant transaction rollback and undo for all versioned operations. 邏輯還原的完成方式:Logical revert is accomplished by:

    • 追蹤所有已中止的交易,並將這些交易標示為其他交易不可見。Keeping track of all aborted transactions and marking them invisible to other transactions.
    • 對所有使用者交易使用 PVS 來執行復原,而不是實際掃描交易記錄並一次復原一項變更。Performing rollback by using PVS for all user transactions, rather than physically scanning the transaction log and undoing changes one at a time.
    • 在交易中止後立即釋出所有鎖定。Releasing all locks immediately after transaction abort. 由於中止牽涉到直接標記記憶體中的變更,這個流程非常有效率,而且不需要長時間保留鎖定。Since abort involves simply marking changes in memory, the process is very efficient and therefore locks do not have to be held for a long time.
  • sLogsLog

    sLog 是一種次要的記憶體內部記錄串流,其儲存未建立版本作業的記錄檔記錄 (例如中繼資料快取無效判定、取得鎖定等)。sLog is a secondary in-memory log stream that stores log records for non-versioned operations (such as metadata cache invalidation, lock acquisitions, and so on). sLog 是:The sLog is:

    • 體積小並位於記憶體內部Low volume and in-memory
    • 透過在檢查點處理序期間進行序列化來保存在磁碟上Persisted on disk by being serialized during the checkpoint process
    • 在交易認可時定期截斷Periodically truncated as transactions commit
    • 透過只處理未建立版本的作業來加速重做和復原Accelerates redo and undo by processing only the non-versioned operations
    • 透過只保留需要的記錄檔記錄來啟用積極性交易記錄截斷Enables aggressive transaction log truncation by preserving only the required log records
  • 清除工具Cleaner

    清除工具是一種非同步處理序,會定期喚醒並清理不需要的分頁版本。The cleaner is the asynchronous process that wakes up periodically and cleans page versions that are not needed.

加速資料庫復原模式Accelerated Database Recovery Patterns

ADR 對於下列類型的工作負載產生的效益最大:The following types of workloads benefit most from ADR:

  • 具有長時間執行交易的工作負載。Workloads with long-running transactions.
  • 曾經歷使用中交易造成交易記錄大幅成長的工作負載。Workloads that have seen cases where active transactions are causing the transaction log to grow significantly.
  • 由於長時間執行的復原 ((例如非預期的服務重新開機或手動交易回復) ),而經歷很長一段時間的資料庫無法使用的工作負載。Workloads that have experienced long periods of database unavailability due to long running recovery (such as unexpected service restart or manual transaction rollback).