CDC 控制工作CDC Control Task

CDC 控制工作是用來控制異動資料擷取 (CDC) 封裝的開發週期。The CDC Control task is used to control the life cycle of change data capture (CDC) packages. 它會處理 CDC 封裝與初始載入封裝的同步處理,以及 CDC 封裝執行中所處理之記錄序號 (LSN) 範圍的管理。It handles CDC package synchronization with the initial load package, the management of Log Sequence Number (LSN) ranges that are processed in a run of a CDC package. 此外,CDC 控制工作也會處理錯誤狀況和復原。In addition, the CDC Control task deals with error scenarios and recovery.

CDC 控制工作在 SSIS 封裝變數中維護 CDC 封裝的狀態,也可以將它保存在資料庫資料表中,以便於跨封裝啟動和在一起執行通用 CDC 處理序的多個封裝之間維護狀態 (例如某個工作可能負責初始載入,而另一個工作則負責滴漏式饋送更新)。The CDC Control task maintains the state of the CDC package in an SSIS package variable and it can also persist it in a database table so that the state is maintained across package activations and between multiple packages that together perform a common CDC process (for example, one task may be responsible for the initial loading and the other for the trickle-feed updates).

CDC 控制工作支援兩個作業群組。The CDC Control task supports two groups of operations. 一個群組處理初始載入和變更處理的同步處理,另一個群組則管理 CDC 封裝執行期間 LSN 變更處理範圍及追蹤何者已成功處理。One group handles the synchronization of initial load and change processing, and the other manages the change-processing range of LSNs for a run of a CDC package and keeps track of what was processed successfully.

下列作業會處理初始載入和變更處理的同步處理:The following operations handle the synchronization of initial load and change processing:

作業Operation 描述Description
ResetCdcStateResetCdcState 此作業是用來重設與目前 CDC 內容相關聯的永續性 CDC 狀態。This operation is used to reset the persistent CDC state associated with the current CDC context. 執行此作業之後,LSN 時間戳記 sys.fn_cdc_get_max_lsn 資料表中的目前最大 LSN 就會變成下一個處理範圍的範圍開頭。After this operation is run, the current maximum LSN from the LSN-timestamp sys.fn_cdc_get_max_lsn table becomes the start of the range for the next processing range. 此作業需要來源資料庫的連接。This operation requires a connection to the source database.
MarkInitialLoadStartMarkInitialLoadStart 在初始載入封裝開始時使用此作業,以便在初始載入封裝開始讀取來源資料表之前記錄來源資料庫中目前的 LSN。This operation is used at the beginning of an initial-load package to record the current LSN in the source database before the initial-load package starts reading the source tables. 這需要來源資料庫的連接,以呼叫 sys.fn_cdc_get_max_lsnThis requires a connection to the source database to call sys.fn_cdc_get_max_lsn.

如果您在 SQL Server 2017SQL Server 2017 CDC (亦即,非 Oracle) 上工作時選取了 MarkInitialLoadStart,連線管理員中指定的使用者就必須是 db_owner 或系統管理員。If you select MarkInitialLoadStart when working on SQL Server 2017SQL Server 2017 CDC (that is, not Oracle) the user specified in the connection manager must be either db_owner or sysadmin.
MarkInitialLoadEndMarkInitialLoadEnd 在初始載入封裝結束時使用此作業,以便在初始載入封裝完成讀取來源資料表之後記錄來源資料庫中目前的 LSN。This operation is used at the end of an initial-load package to record the current LSN in the source database after the initial-load package finished reading the source tables. 這個 LSN 的決定方式如下:記錄進行此作業時的目前時間,然後在 CDC 資料庫中查詢 cdc.lsn_time_mapping 資料表,尋找該時間之後發生的變更。This LSN is determined by recording the current time when this operation occurred and then querying the cdc.lsn_time_mapping table in the CDC database looking for a change that occurred after that time.

如果您在 SQL Server 2017SQL Server 2017 CDC (亦即,非 Oracle) 上工作時選取了 MarkInitialLoadEnd,連線管理員中指定的使用者就必須是 db_owner 或系統管理員。If you select MarkInitialLoadEnd when working on SQL Server 2017SQL Server 2017 CDC (that is , not Oracle) the user specified in the connection manager must be either db_owner or sysadmin.
MarkCdcStartMarkCdcStart 此作業是在從快照集資料庫進行初始載入時使用。This operation is used when then the initial load is made from a snapshot database. 在此情況下,變更處理應該在快照集 LSN 之後立即開始。In this case, the change processing should start immediately after the snapshot LSN. 您可以指定要使用的快照集資料庫名稱,CDC 控制工作就會在 SQL Server 2017SQL Server 2017 中查詢快照集 LSN。You can specify the name of the snapshot database to use and the CDC Control task queries SQL Server 2017SQL Server 2017 for the snapshot LSN. 您還可以選擇直接指定快照集 LSN。You also have the option to directly specify the snapshot LSN.

如果您在 SQL Server 2017SQL Server 2017 CDC (亦即,非 Oracle) 上工作時選取了 MarkCdcStart,連線管理員中指定的使用者就必須是 db_owner 或系統管理員。If you select MarkCdcStart when working on SQL Server 2017SQL Server 2017 CDC (that is , not Oracle) the user specified in the connection manager must be either db_owner or sysadmin.

下列作業用來管理處理範圍:The following operations are used to manage the processing range:

作業Operation 描述Description
GetProcessingRangeGetProcessingRange 此作業是在叫用使用 CDC 來源資料流程的資料流程之前使用。This operation is used before invoking the data flow that uses the CDC Source data flow. 叫用此作業時,它會建立 CDC 來源資料流程所讀取的 LSN 範圍。It establishes a range of LSNs that the CDC Source data flow reads when invoked. 此範圍會儲存在資料流程處理期間 CDC 來源所使用的 SSIS 封裝變數中。The range is stored in an SSIS package variable that is used by the CDC Source during data-flow processing.

如需儲存之狀態的詳細資訊,請參閱 定義狀態變數For more information about the states that are stored, see Define a State Variable.
MarkProcessedRangeMarkProcessedRange 在每個 CDC 執行之後 (CDC 資料流程順利完成之後) 執行此作業,以便記錄 CDC 執行期間完整處理的最後一個 LSN。: This operation is executed after each CDC run (after the CDC data flow is completed successfully) to record the last LSN that was fully processed in the CDC run. 下次執行 GetProcessingRange 時,這個位置就是下一個處理範圍的開頭。The next time GetProcessingRange is executed, this position is the start of the processing range.

處理 CDC 狀態持續性Handling CDC State Persistency

CDC 控制工作會在啟動之間維護永續性狀態。The CDC Control task maintains a persistent state between activations. 儲存在 CDC 狀態中的資訊用來決定及維護 CDC 封裝的處理範圍,以及用於偵測錯誤狀態。The information stored in the CDC state is used to determine and maintain the processing range for the CDC package and for detecting error conditions. 永續性狀態儲存為字串。The persistent state is stored as a string. 如需詳細資訊,請參閱 定義狀態變數For more information, see Define a State Variable.

CDC 控制工作支援兩種狀態持續性類型。The CDC Control task supports two types of state persistency

  • 手動狀態持續性:在此情況下,CDC 控制工作會管理儲存在封裝變數中的狀態,但是封裝開發人員必須在呼叫 CDC 控制之前從永續性存放區讀取變數,然後在最後一次呼叫 CDC 控制而且 CDC 執行完成之後將變數寫回該永續性存放區。Manual State Persistency: In this case, the CDC Control task manages the state stored in a package variable but the package developer must read the variable from a persistent store before calling the CDC Control and then write it back to that persistent store after the CDC Control is last called and the CDC run completes.

  • 自動狀態持續性:CDC 狀態會儲存在資料庫資料表中。Automatic State Persistency: The CDC state is stored in a table in a database. 此狀態是在 [要用於儲存狀態的資料表] 屬性 (位於儲存狀態的選定連接管理員) 所指名資料表中, StateName 屬性所提供的名稱下儲存。The state is stored under a name provided in the StateName property in a table named in the Table to Use for Storing State property, which is located in a selected connection manager for storing the state. 預設是來源連接管理員,但常見作法是將它做為目標連接管理員。The default is the source connection manager but the common practice is for it to be the target connection manager. CDC 控制工作會更新狀態資料表中的狀態值,並認可此值做為環境交易的一部分。The CDC Control task updates the state value in the state table and this is committed as part of the ambient transaction.

錯誤處理Error Handling

在下列情況下,CDC 控制工作可能會報告錯誤:The CDC Control task may report an error when:

  • 它無法讀取 CDC 永續性狀態或永續性狀態更新失敗。It fails to read the persistent CDC state or when updating the persistent state fails.

  • 它無法從來源資料庫讀取目前的 LSN 資訊。It fails to read the current LSN information from the source database.

  • CDC 狀態讀取不一致。The CDC state read is not consistent.

    在所有這些情況下,CDC 控制工作都會報告錯誤,而 SSIS 則會以處理控制流程錯誤的標準方式予以處理。In all of these cases, the CDC Control task reports an error that can be handled in the standard way SSIS handles control-flow errors.

    在未呼叫標記處理的範圍的情況下,直接在取得處理範圍作業之後叫用另一個取得處理範圍作業時,CDC 控制工作可能也會報告警告。The CDC Control task may also report a warning when the Get Processing Range operation is invoked directly after another Get Processing Range operation without Mark Processed Range being called. 這種情況表示上次執行失敗,或者另一個 CDC 封裝可能正在以相同的 CDC 狀態名稱下執行。This is an indication that the previous run failed or that another CDC package may be running using the same CDC state name.

設定 CDC 控制工作Configuring the CDC Control Task

您可以透過 SSIS 設計師或以程式設計方式設定屬性。You can set properties through SSIS Designer or programmatically.

本節內容In This Section

定義狀態變數Define a State Variable

CDC 控制工作編輯器CDC Control Task Editor

使用 [CDC 控制工作編輯器] 對話方塊,即可設定 CDC 控制工作。Use the CDC Control Task Editor dialog box to configure the CDC Control task. CDC 控制工作組態包括定義 CDC 資料庫的連接、CDC 工作作業,以及狀態管理資訊。The CDC Control task configuration includes defining a connection to the CDC database, the CDC task operation and the state management information.

若要了解有關 CDC 控制工作的詳細資訊,請參閱< CDC Control Task>。To learn more about the CDC Control task, see CDC Control Task.

若要開啟 CDC 控制工作編輯器To open the CDC Control Task Editor

  1. SQL Server Data ToolsSQL Server Data Tools中,開啟具有 CDC 控制工作的 SQL Server 2017 Integration Services (SSIS)SQL Server 2017 Integration Services (SSIS) 封裝。In SQL Server Data ToolsSQL Server Data Tools, open the SQL Server 2017 Integration Services (SSIS)SQL Server 2017 Integration Services (SSIS) package that has the CDC Control task.

  2. 在 [控制流程] 索引標籤中,按兩下 CDC 控制工作。On the Control Flow tab, double-click the CDC Control task.

選項。Options

SQL Server CDC 資料庫 ADO.NET 連接管理員SQL Server CDC database ADO.NET connection manager
從清單中選取現有的連接管理員,或按一下 [新增] 建立新的連接。Select an existing connection manager from the list, or click New to create a new connection. 此連接必須指向啟用 CDC 而且包含選取之變更資料表的 [SQL Server]SQL Server 資料庫。The connection must be to a [SQL Server]SQL Server database that is enabled for CDC and where the selected change table is located.

CDC 控制作業CDC Control Operation
選取要針對此工作執行的作業。Select the operation to run for this task. 所有作業都會使用儲存在 SSIS 封裝變數中的狀態變數,這個變數會儲存狀態並且在封裝中的不同元件之間傳遞狀態。All operations use the state variable that is stored in an SSIS package variable that stores the state and passes it between the different components in the package.

  • 標記初始載入開始:此作業是在從不含快照集的使用中資料庫執行初始載入時使用。Mark initial load start: This operation is used when executing an initial load from an active database without a snapshot. 初始載入封裝開始讀取來源資料表之前,系統會在初始載入封裝的開頭叫用此作業,以便在來源資料庫中記錄目前的 LSN。It is invoked at the beginning of an initial-load package to record the current LSN in the source database before the initial-load package starts reading the source tables. 這需要來源資料庫的連接。This requires a connection to the source database.

    如果您在 SQL Server 2017SQL Server 2017 CDC (亦即非 Oracle) 上工作時選取了 [標記初始載入開始],連接管理員中指定的使用者就必須是 db_owner系統管理員If you select Mark Initial Load Start when working on SQL Server 2017SQL Server 2017 CDC (that is, not Oracle) the user specified in the connection manager must be either db_owner or sysadmin.

  • 標記初始載入結束:此作業是在從不含快照集的使用中資料庫執行初始載入時使用。Mark initial load end: This operation is used when executing an initial load from an active database without a snapshot. 初始載入封裝讀取來源資料表完成之後,系統會在初始載入封裝的結尾叫用此作業,以便在來源資料庫中記錄目前的 LSN。It is invoked at the end of an initial-load package to record the current LSN in the source database after the initial-load package finished reading the source tables. 這個 LSN 的決定方式如下:記錄進行此作業時的目前時間,然後在 CDC 資料庫中查詢 cdc.lsn_time_mapping 資料表,尋找該時間之後發生的變更。This LSN is determined by recording nthe current time when this operation occurred and then querying the cdc.lsn_time_mapping table in the CDC database looking for a change that occurred after that time

    如果您在 SQL Server 2017SQL Server 2017 CDC (亦即非 Oracle) 上工作時選取了 [標記初始載入結束],連接管理員中指定的使用者就必須是 db_owner系統管理員If you select Mark Initial Load End when working on SQL Server 2017SQL Server 2017 CDC (that is, not Oracle) the user specified in the connection manager must be either db_owner or sysadmin.

  • 標記 CDC 開始:此作業是在從快照集資料庫或靜止資料庫進行初始載入時使用。Mark CDC start: This operation is used when then the initial load is made from a snapshot database database or from a quiescence database. 系統會在初始載入封裝中的任何時間點叫用此作業。It is invoked at any point within the initial load package. 此作業所接受的參數可以是快照集 LSN、快照集資料庫的名稱 (從中自動衍生快照集 LSN),也可以保留空白 (在此情況中,目前資料庫 LSN 就會當做變更處理封裝的啟始 LSN 使用)。The operation accepts a parameter that can be a snapshot LSN, a name of a snapshot database (from which the snapshot LSN will be derived automatically) or it can be left empty, in which case the current database LSN is used as the start LSN for the change processing package.

    此作業是用來取代「標記初始載入開始/結束」作業。This operation is used instead of the Mark Initial Load Start/End operations.

    如果您在 SQL Server 2017SQL Server 2017 CDC (亦即非 Oracle) 上工作時選取了 [標記 CDC 開始],連接管理員中指定的使用者就必須是 db_owner系統管理員If you select Mark CDC Start when working on SQL Server 2017SQL Server 2017 CDC (that is, not Oracle) the user specified in the connection manager must be either db_owner or sysadmin.

  • 取得處理範圍:此作業是在叫用使用 CDC 來源資料流程的資料流程之前用於變更處理封裝中。Get processing range: This operation is used in a change processing package before invoking the data flow that uses the CDC Source data flow. 叫用此作業時,它會建立 CDC 來源資料流程所讀取的 LSN 範圍。It establishes a range of LSNs that the CDC Source data flow reads when invoked. 此範圍會儲存在資料流程處理期間 CDC 來源所使用的 SSIS 封裝變數中。The range is stored in an SSIS package variable that is used by the CDC Source during data-flow processing.

    如需儲存之可能 CDC 狀態的詳細資訊,請參閱 定義狀態變數For more information about the possible CDC states that are stored, see Define a State Variable.

  • 標記處理的範圍:此作業是在 CDC 回合結束時 (CDC 資料流程順利完成之後) 用於變更處理封裝中,以便記錄 CDC 回合中完整處理的最後一個 LSN。Mark processed range: This operation is used in a change processing package at the end of a CDC run (after the CDC data flow is completed successfully) to record the last LSN that was fully processed in the CDC run. 下次執行 GetProcessingRange 時,這個位置就會決定下一個處理範圍的開頭。The next time GetProcessingRange is executed, this position determines the start of the next processing range.

  • 重設 CDC 狀態:此作業是用來重設與目前 CDC 內容相關聯的持續性 CDC 狀態。Reset CDC state: This operation is used to reset the persistent CDC state associated with the current CDC context. 執行此作業之後,LSN 時間戳記 sys.fn_cdc_get_max_lsn 資料表中的目前最大 LSN 就會變成下一個處理範圍的範圍開頭。After this operation is run, the current maximum LSN from the LSN-timestamp sys.fn_cdc_get_max_lsn table becomes the start of the range for the next processing range. 此作業需要來源資料庫的連接。This operation requires a connection to the source database.

    此作業使用時機的範例如下:當您只想要處理新建立的變更記錄,而忽略所有舊的變更記錄時。An example of when this operation is used is when you want to process only the newly created change records and ignore all old change records.

    包含 CDC 狀態的變數Variable containing the CDC state
    選取儲存工作作業之狀態資訊的 SSIS 封裝變數。Select the SSIS package variable that stores the state information for the task operation. 您應該在開始之前定義變數。You should define a variable before you begin. 如果您選取 [自動狀態持續性],系統就會自動載入並儲存狀態變數。If you select Automatic state persistence, the state variable is loaded and saved automatically.

    如需定義狀態變數的詳細資訊,請參閱 定義狀態變數For more information about defining the state variable, see Define a State Variable.

    要啟動 CDC 的 SQL Server LSN/快照集名稱:SQL Server LSN to start the CDC/Snapshot name:
    輸入目前的來源資料庫 LSN 或從中執行初始載入以決定 CDC 啟動位置之快照集資料庫的名稱。Type the current source database LSN or the name of the snapshot database from which the initial load is performed to determine where the CDC starts. 只有當 [CDC 控制作業] 設定為 [標記 CDC 開始] 時,才能使用這個選項。This is available only if the CDC Control Operation is set to Mark CDC Start.

    如需有關這些作業的詳細資訊,請參閱< CDC Control Task>。For more information about these operations, see CDC Control Task

    自動在資料庫資料表中儲存狀態Automatically store state in a database table
    選取此核取方塊,可讓 CDC 控制工作自動在指定資料庫所包含的狀態資料表中載入並儲存 CDC 狀態。Select this check box for the CDC Control task to automatically handle loading and storing the CDC state in a state table contained in the specified database. 如果沒有選取此選項,開發人員就必須在封裝啟動時載入 CDC 狀態,而且每次 CDC 狀態變更時都必須儲存狀態。When not selected, the developer must load the CDC State when the package starts and save it whenever the CDC State changes.

    儲存狀態之資料庫的連接管理員Connection manager for the database where the state is stored
    從清單中選取現有的 ADO.NET 連接管理員,或按一下 [新增] 建立新的連接。Select an existing ADO.NET connection manager from the list, or click New to create a new connection. 這個連接會指向包含狀態資料表的 [SQL Server]SQL Server 資料庫。This connection is to a [SQL Server]SQL Server database that contains the State table. 狀態資料表包含狀態資訊。The State table contains the State information.

    只有當您已選取 [自動狀態持續性] 時,才能使用這個選項,而且它是必要參數。This is available only if Automatic state persistence is selected and it is a required parameter.

    要用於儲存狀態的資料表Table to use for storing state
    輸入要用於儲存 CDC 狀態之狀態資料表的名稱。Type the name of the state table to be used for storing the CDC state. 指定的資料表必須具有兩個名為 namestate 的資料行,而且這兩個資料行的資料類型都必須是 varchar (256)The table specified must have two columns called name and state and both columns must be of the data type varchar (256).

    您可以選擇性地選取 [新增] 取得 SQL 指令碼,以便建置含有必要資料行的新狀態資料表。You can optionally select New to get an SQL script that builds a new State table with the required columns. 選取 [自動狀態持續性] 時,開發人員必須根據上述需求建立狀態資料表。When Automatic state persistence is selected, the developer must create a state table according to the requirements listed above.

    只有當您已選取 [自動狀態持續性] 時,才能使用這個選項,而且它是必要參數。This is available only if Automatic state persistence is selected and it is a required parameter.

    狀態名稱State name
    輸入要與持續性 CDC 狀態產生關聯的名稱。Type a name to associate with the persistent CDC state. 使用相同 CDC 內容的完整載入和 CDC 封裝都將指定一般狀態名稱。The full load and CDC packages that work with the same CDC context will specify a common state name. 這個名稱是用於查閱狀態資料表中的狀態資料列。This name is used for looking up the state row in the state table