Lesson 1: Configuring Package Transactions and Checkpoints
Estimated lesson time: 30 minutes
Most relational databases, like SQL Server, perform operations in atomic units. This means that a single statement or series of statements is either successful and affects data or is not successful and the system returns the data to the state it was in before the attempted statement execution. The unit of work that needs to be completed successfully in order for the data to be applied is called a transaction.
In SSIS, transactions can be set at various levels of the package, and you can coordinate transactions with package restartability features. In other words, if a package fails at a certain point, it can be configured to start from the point of failure or from an earlier step when the package is rerun. In SSIS, this configuration process is called adding checkpoints. Checkpoints work together with transactions to enable package restartability.
Defining Package and Task Transaction Settings
You can set package transactions at the entire package level or at any control flow container level or task level. Transactions in SSIS use the Windows Distributed Transaction Coordinator (DTC); the DTC service needs to be started on the machine for transactions to work. Any service or program that is enabled to work with the DTC can be part of a transaction in SSIS.
To enable a transaction within a package, you need to set the TransactionOption property of the task or container to Required. Figure 5-1 highlights the properties of a package at the control flow level, which means they apply to the package as a whole. The TransactionOption property is the same on any control flow object.
Figure 5-1 A task's or container's TransactionOption property must be set to Required to enable a transaction within a package.
When deciding whether and how to implement a transaction, follow these guidelines:
- For transactions to be enabled in SSIS, you need to turn on the DTC service, and the tasks that you want to be part of the transaction must work with the DTC service natively.
- If a series of tasks must be completed as a single unit, in which either all the tasks are successful and committed or an error occurs and none of the tasks are committed, then place the tasks within a Sequence Container, and then set the TransactionOption property of the container to Required.
- A task can inherit the transaction setting of its parent when the TransactionOption property is set to Supported, which is the default setting when creating a task or container.
- You can prevent a task from participating in a transaction by setting its TransactionOp-tion setting to NotSupported.
- Transactions work at the control flow level and not within a data flow. This means that you can turn on a transaction for a data flow task, but you cannot turn it on separately for selected components within the data flow; either the entire data process will be successful or it will be rolled back.
Implementing Restartability Checkpoints
At times, especially if you are working with complicated or long-running packages, you will want the ability to restart a package if it fails and have it start at the point of failure. In other words, you might not want the tasks that were already successful to be run again if the package is restarted. This can be done by enabling checkpoints in the package.
Enabling restartability within a package requires, first, enabling a package to use checkpoints, and, second, setting the specific tasks and containers to write checkpoints. To turn on checkpoints within a package, follow these steps:
- Within the package, open the Properties window, and then select the Control Flow tab, which will reveal the properties of the package.
- Set the SaveCheckpoints property at the package level to True. This allows checkpoints to be saved during package execution.
- In the CheckpointFileName property, provide a valid path and file name to a checkpoint file. Packages use files to maintain their state information, so if a package fails and is then restarted, the package can read the checkpoint file to determine where it left off and to track the state information at the last successful task.
- Set the CheckpointUsage to IfExists, which causes the package to run from the beginning if the file is not present or to run from the identified point if the file exists.
IMPORTANT Checkpoint file usage
If you set the CheckpointUsage to Always, the checkpoint file must be present or the package will not start.
Figure 5-2 shows the package properties, highlighting the checkpoint properties set in steps 1 through 4, above.
After you enable checkpoints in a package, the final step is to set checkpoints at the various tasks within your package. To do this, set the FailPackageOnFailure property at each task or container to True. When you have set checkpoints for the control-flow objects, your packages are set up to restart in case of failure. Figure 5-3 shows two executions of the same package with checkpoints turned on.
Figure 5-2 Setting checkpoint properties for a package
Figure 5-3 Two executions of a package that have checkpoints turned on
As you can see, this package failed at step 3 of the first run. After the problem was corrected, the second execution of the package started at step 3 and continued to completion.
BEST PRACTICES Managing multiple tasks with transactions and restartability
Sometimes, a series of tasks will need to be rolled back if one of the tasks fails, but you want to go back only to the point of failure, not to the start of the package. To do this, you need to enable checkpoints in the package and then use a Sequence Container that holds all the associated tasks. The Sequence Container needs to have the transactions turned on by setting the TransactionOption to Required at the container level.
When a package has the restartability settings enabled and checkpoints set, it will manage the checkpoint file during its execution. The following processes happen behind the scenes for a package that has checkpoints set:
- When the package loads for execution, the checkpoint file is created.
- For every task or container that is configured for checkpoints, the checkpoint file is updated to include the latest runtime information such as variable values and the successful tasks.
- If the package fails, then the file remains in place so that the next time the package is executed, it will read the checkpoint file for execution. Alternatively, if you want the package to begin from the first task or container, the file can be manually deleted.
- If the package execution is successful, then the checkpoint file will be deleted automatically.
Practice: Implementing Package and Task Transactions
In this practice, you will turn on transactions at the container level and observe the results when a task fails within the container.
Exercise 1: Enabling Transactions
- Navigate to the Control Panel/Administrative Tools/Services console, and then start the Distributed Transaction Coordinator service.
- Open the SSIS project that you created in Chapter 4, and then open the package called MyPackage.dtsx. Or you can use the 'Start Here' project in the Source\Ch 05 folder in the installed practice files.
- From the View menu in the menu bar, open the Properties window, and then click the pushpin in the Properties window to lock the window in the open position.
- In the Control Flow Designer, select the Sequence Container (by clicking on it), and then note the Sequence Container properties listed in the Properties window.
- Set the TransactionOption to Required using the drop-down list.
- Save the package by clicking the Save icon in the toolbar.
Exercise 2: Observing a Transaction Rollback
- With the package from Practice 1 still opened, expand the Sequence Container, and drag and drop a new Execute SQL Task to the bottom of the Sequence Container workspace.
- Connect the green precedence arrow from the Data Flow to the new Execute SQL Task by dragging the green arrow from the bottom of the Data Flow onto the Execute SQL Task.
- Edit the Execute SQL Task by double-clicking on the task. In the Execute SQL Task Editor, change the Connection property to the AdventureWorks connection.
- Change the Name property within the Execute SQL Task Editor to Force Failure.
- Select OK in the Execute SQL Task Editor to return to the Control Flow.
- With the Force Failure Execute SQL Task still selected, open the Properties window and change the ForceExecutionResult property to Failure. By setting this property to Failure, you are specifying that the task should fail intentionally, which is something you might do for testing purposes.
- Open a new database query, in SSMS, against the AdventureWorks database.
- Run the following SQL statement and observe the results: SELECT COUNT(*) FROM Sales_Summary
- In the Business Intelligence Development Studio (BIDS), execute the SSIS package you just modified, which will intentionally fail at the last step.
- Stop the package execution, and rerun the query from step 8.
- Observe that even though the data flow was successful, the data was rolled back because the Sequence Container was configured with a transaction, and the last task within the Sequence Container failed.
- You add a sequence container to a package that contains several tasks, one of which calls a command on a legacy system and another of which a Data Flow Task imports data into SQL Server. Even with DTC started and transactions turned on, your sequence container fails before the tasks even run. What is the problem?
- What do you need to set in order to use checkpoint properties at the package level, after you have turned the checkpoint properties on?
Quick Check Answers
- The transactions featured in SSIS use the DTC service. However, not all systems support DTC, and a transaction cannot be forced on a non-compliant system, so the container will fail. You should remove the legacy task from the sequence container that has the transaction.
- You need to set the FailPackageOnFailure property to True for tasks to write to the checkpoint file. However, if you want to rerun any successful tasks that occur before the failure task, you need to use a sequence container around the group of related tasks that require transactions.
© Microsoft. All Rights Reserved.