異動資料擷取 (SSIS)Change Data Capture (SSIS)

SQL ServerSQL Server中,異動資料擷取會針對將累加式載入從來源資料表有效執行到資料超市和資料倉儲的挑戰,提供有效的方案。In SQL ServerSQL Server, change data capture offers an effective solution to the challenge of efficiently performing incremental loads from source tables to data marts and data warehouses.

什麼是異動資料擷取?What is Change Data Capture?

來源資料表會隨著時間變更。Source tables change over time. 以這些資料表為基礎的資料超市或資料倉儲必須反映這些變更。A data mart or data warehouse that is based on those tables needs to reflect these changes. 不過,定期複製完整來源之快照集的程序會耗費太多時間和資源。However, a process that periodically copies a snapshot of the entire source consumes too much time and resources. 包括時間戳記資料行、觸發程序或複雜查詢的替代方式通常有損效能並增加複雜度。Alternate approaches that include timestamp columns, triggers, or complex queries often hurt performance and increase complexity. 需要的是變更資料結構化的可靠資料流,讓消費者可以將其輕鬆地套用到資料的目標表示法。What is needed is a reliable stream of change data that is structured so that it can easily be applied by consumers to target representations of the data. SQL ServerSQL Server 中的異動資料擷取會提供這個解決方案。Change data capture in SQL ServerSQL Server provides this solution.

Database EngineDatabase Engine 的異動資料擷取功能會擷取套用到 SQL ServerSQL Server 資料表的插入、更新與刪除活動,並以容易取用的關聯式格式,提供變更的詳細資料。The change data capture feature of the Database EngineDatabase Engine captures insert, update, and delete activity applied to SQL ServerSQL Server tables, and makes the details of the changes available in an easily-consumed, relational format. 異動資料擷取所使用的變更資料表包含鏡像追蹤來源資料表之資料行結構的資料行,以及了解逐資料列發生之變更所需的中繼資料。The change tables used by change data capture contain columns that mirror the column structure of the tracked source tables, along with the metadata needed to understand the changes that have occurred on a row by row basis.

注意

並非每個 MicrosoftMicrosoft SQL ServerSQL Server版本中都無法異動資料擷取。Change data capture is not available in every edition of MicrosoftMicrosoft SQL ServerSQL Server. 如需 SQL ServerSQL Server版本支援的功能清單,請參閱 SQL Server 2016 版本支援的功能For a list of features that are supported by the editions of SQL ServerSQL Server, see Features Supported by the Editions of SQL Server 2016.

異動資料擷取在 Integration Services 中的運作方式How Change Data Capture Works in Integration Services

Integration ServicesIntegration Services 封裝可以輕易地收集 SQL ServerSQL Server 資料庫中的異動資料,從而對資料倉儲執行有效率的累加式載入。An Integration ServicesIntegration Services package can easily harvest the change data in the SQL ServerSQL Server databases to perform efficient incremental loads to a data warehouse. 不過,在您可以使用 Integration ServicesIntegration Services 載入變更資料前,管理員必須在您要擷取變更的資料庫和資料表上啟用異動資料擷取。However, before you can use Integration ServicesIntegration Services to load change data, an administrator must enable change data capture on the database and the tables from which you want to capture changes. 如需如何在資料庫上設定異動資料擷取的詳細資訊,請參閱啟用和停用異動資料擷取 (SQL Server)For more information on how to configure change data capture on a database, see Enable and Disable Change Data Capture (SQL Server).

一旦管理員已經在資料庫上啟用異動資料擷取,您就可以建立執行累加式變更資料載入的封裝。Once an administrator has enabled change data capture on the database, you can create a package that performs an incremental load of the change data. 下圖顯示建立可從單一資料表執行累加式載入這種封裝的步驟:The following diagram shows the steps for creating such a package that performs an incremental load from a single table:

異動資料擷取封裝建立步驟Change Data Capture Package Creation Steps

如上圖所示,建立可執行累加式變更資料載入的封裝包含下列步驟:As shown in the previous diagram, creating a package that performs an incremental load of changed data involves the following steps:

步驟 1:設計控制流程Step 1: Designing the Control Flow
在封裝的控制流程中,必須定義下列工作:In the control flow in the package, the following tasks need to be defined:

  • 針對您要擷取的來源資料,計算變更間隔的開始和結束 datetime 值。Calculate the starting and ending datetime values for the interval of changes to the source data that you want to retrieve.

    若要計算這些值,請搭配 Integration ServicesIntegration Services datetime 函數使用「執行 SQL」工作或 運算式。To calculate these values, use an Execute SQL task or Integration ServicesIntegration Services expressions with datetime functions. 然後您可以用封裝變數儲存這些端點,以便稍後在封裝中使用。You then store these endpoints in package variables for use later in the package.

    如需詳細資訊,請參閱指定變更資料的間隔For more information: Specify an Interval of Change Data

  • 判斷所選間隔的變更資料是否就緒。Determine whether the change data for the selected interval is ready. 由於非同步的擷取程序可能還沒有達到所選的端點,因此這是必要的步驟。This step is necessary because the asynchronous capture process might not yet have reached the selected endpoint.

    若要判斷資料是否就緒,如果必要,開始使用「For 迴圈」容器延遲執行,直到所選間隔的變更資料就緒為止。To determine whether the data is ready, start with a For Loop container to delay execution, if necessary, until the change data for the selected interval is ready. 在迴圈容器內部,使用「執行 SQL」工作查詢由異動資料擷取所維護的時間對應資料表。Inside the loop container, use an Execute SQL task to query the time mapping tables maintained by change data capture. 然後,使用呼叫 Thread.Sleep 方法的「指令碼」工作,或搭配 WAITFOR 陳述式使用另一個「執行 SQL」工作,暫時延遲封裝的執行 (如有必要)。Then, use a Script task that calls the Thread.Sleep method, or another Execute SQL task with a WAITFOR statement, to delay the execution of the package temporarily, if necessary. 或者,使用其他「指令碼」工作記錄錯誤條件或逾時。Optionally, use another Script task to log an error condition or a timeout.

    如需詳細資訊,請參閱判斷變更資料是否就緒For more information: Determine Whether the Change Data Is Ready

  • 準備將用於查詢變更資料的查詢字串。Prepare the query string that will be used to query for the change data.

    使用「指令碼」工作或「執行 SQL」工作來組合將用於查詢變更的 SQL 陳述式。Use a Script task or an Execute SQL task to assemble the SQL statement that will be used to query for changes.

    如需詳細資訊,請參閱準備查詢變更資料For more information: Prepare to Query for the Change Data

    步驟 2:設定異動資料的查詢Step 2: Setting Up the Query for Change Data
    建立將會查詢資料的資料表值函數。Create the table-valued function that will query for the data.

    使用 Transact-SQLSQL Server Management Studio 來開發及儲存查詢。Use Transact-SQLSQL Server Management Studio to develop and save the query.

    如需詳細資訊,請參閱擷取與了解變更資料For more information: Retrieve and Understand the Change Data

    步驟 3:設計資料流程Step 3: Designing the Data Flow
    在封裝的資料流程中,必須定義下列工作:In the data flow of the package, the following tasks need to be defined:

  • 從變更資料表擷取變更資料。Retrieve the change data from the change tables.

    若要擷取資料,使用來源元件來查詢所選間隔內之變更的變更資料表。To retrieve the data, use a source component to query the change tables for the changes that fall within the selected interval. 此來源會呼叫您必須在先前已經建立的 Transact-SQL 資料表值函數。The source calls a Transact-SQL table-valued function that you must have previously created.

    如需詳細資訊,請參閱擷取與了解變更資料For more information: Retrieve and Understand the Change Data

  • 將變更分割為要處理的插入、更新與刪除。Split the changes into inserts, updates, and deletes for processing.

    若要分割變更,使用「條件式分割」轉換,將插入、更新與刪除導引到不同的輸出以便進行適當的處理。To split the changes, use a Conditional Split transformation to direct inserts, updates, and deletes to different outputs for appropriate processing.

    如需詳細資訊,請參閱處理插入、更新與刪除For more information: Process Inserts, Updates, and Deletes

  • 將插入、刪除與更新套用到目的地。Apply the inserts, deletes, and updates to the destination.

    若要將變更套用到目的地,請使用目的地元件,將插入套用到目的地。To apply the changes to the destination, use a destination component to apply the inserts to the destination. 同時,搭配參數化的 UPDATE 和 DELETE 陳述式使用「OLE DB 命令」轉換,將更新與刪除套用到目的地。Also, use OLE DB Command transformations with parameterized UPDATE and DELETE statements to apply updates and deletes to the destination. 您也可以使用目的地元件來套用更新與刪除,以便將資料列儲存到暫存資料表中。You can also apply updates and deletes by using destination components to save the rows to temporary tables. 接著,使用「執行 SQL」工作,根據暫存資料表的目的地,執行大量更新與大量刪除作業。Then, use Execute SQL tasks to perform bulk update and bulk delete operations against the destination from the temporary tables.

    如需詳細資訊,請參閱將變更套用到目的地For more information: Apply the Changes to the Destination

多個資料表的資料變更Change Data from Multiple Tables

在上圖與上述步驟中所述的程序包含來自單一資料表的累加式載入。The process outlined in the previous diagram and steps involves an incremental load from a single table. 當您必須從多個資料表執行累加式載入時,整個程序都相同。When having to perform an incremental load from multiple tables, the overall process is the same. 不過,必須變更封裝的設計以配合多個資料表的處理。However, the design of the package needs to be changed to accommodate the processing of multiple tables. 如需如何建立可從多個資料表執行累加式載入之封裝的詳細資訊,請參閱 執行多個資料表的累加式載入For more information on how to create a package that performs an incremental load from multiples tables, see Perform an Incremental Load of Multiple Tables.

異動資料擷取封裝的範例Samples of Change Data Capture Packages

Integration ServicesIntegration Services 提供兩個範例,示範如何在封裝中使用異動資料擷取。 provides two samples that demonstrate how to use change data capture in packages. 如需詳細資訊,請參閱下列主題:For more information, see the following topics:

sqlblog.com 上的部落格文章: SSIS Design Pattern – Incremental Load(SSIS 設計模式 - 累加式載入)Blog entry, SSIS Design Pattern – Incremental Load, on sqlblog.com