SSIS Process Control
A significant advancement in SSIS is the package architecture it uses for process control. You’ve already learned that the SSIS process control architecture includes the control flow, data flow, and event handler components. Each of these process control components provides both shared and component-specific objects for you to use when designing and creating your packages.
SSIS Control Flow
SSIS package objects (containers, data flow tasks, administration tasks, precedence constraints, and variables) are elements of the control flow component of the process control architecture. The control flow is the highest-level control process. It allows you to orchestrate and manage the run-time activities of data flows and other processes within a package. In fact, you can design a control flow that uses Execute Package tasks to manage the processing sequence for a set of existing packages, an approach known as a master package. This capability allows you to combine individual packages into a highly manageable workflow process. Use precedence constraints to set the process rules and to specify sequence within the control flow. An SSIS package consists of a control flow and one or more objects. Data flow and event handler process control components are optional.
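To make the role of precedence constraints concrete, the following is a minimal conceptual sketch in Python. It is not SSIS code and does not use any SSIS API: the ControlFlow class, the task names, and the three constraint kinds (success, failure, completion) are assumptions made for illustration. The sketch also assumes that a task runs only when every incoming constraint is satisfied.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical constraint kinds; illustrative names, not SSIS identifiers.
SUCCESS, FAILURE, COMPLETION = "success", "failure", "completion"


@dataclass
class ControlFlow:
    """Toy model: named tasks plus precedence constraints between them."""
    tasks: Dict[str, Callable[[], bool]] = field(default_factory=dict)
    # Each constraint is (upstream task, downstream task, required outcome).
    constraints: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_task(self, name, action):
        # An action returns True on success and False on failure.
        self.tasks[name] = action

    def add_constraint(self, upstream, downstream, kind=SUCCESS):
        self.constraints.append((upstream, downstream, kind))

    def run(self):
        """Run every task whose constraints are met; return the run order."""
        outcome = {}      # task name -> True (succeeded) or False (failed)
        skipped = set()   # tasks whose constraints were not met
        order = []
        pending = list(self.tasks)
        progressed = True
        while pending and progressed:
            progressed = False
            for name in list(pending):
                incoming = [c for c in self.constraints if c[1] == name]
                # Wait until every upstream task has run or been skipped.
                if any(up not in outcome and up not in skipped
                       for up, _, _ in incoming):
                    continue
                pending.remove(name)
                progressed = True
                if all(self._satisfied(up, kind, outcome)
                       for up, _, kind in incoming):
                    outcome[name] = bool(self.tasks[name]())
                    order.append(name)
                else:
                    skipped.add(name)
        return order

    @staticmethod
    def _satisfied(upstream, kind, outcome):
        if upstream not in outcome:   # the upstream task was skipped
            return False
        if kind == COMPLETION:        # any finished state is acceptable
            return True
        return outcome[upstream] == (kind == SUCCESS)


flow = ControlFlow()
flow.add_task("extract", lambda: True)
flow.add_task("load", lambda: False)            # simulate a failing task
flow.add_task("notify_failure", lambda: True)
flow.add_task("archive", lambda: True)
flow.add_task("cleanup", lambda: True)
flow.add_constraint("extract", "load", SUCCESS)
flow.add_constraint("load", "notify_failure", FAILURE)
flow.add_constraint("load", "archive", SUCCESS)
flow.add_constraint("load", "cleanup", COMPLETION)
run_order = flow.run()
print(run_order)
```

Because the simulated load task fails, the archive task is skipped, while the failure and completion branches still run. In a real package you would draw these constraints in the designer rather than write code.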
SSIS Data Flow
When you want to extract, transform, and load data within a package, you add an SSIS data flow task to the package control flow. Each data flow task creates its own data flow process control component for processing at run time. You configure each data flow to manage data sources, data destinations, and optional data transformations for any kind of data manipulation your packages might require. You can have as many data flow components within a package as you need to handle all the kinds of data sources and destinations you might have.
The SSIS data flow component provides a comprehensive set of predefined data source and destination objects, so you can design and develop packages easily for most of the databases and data source files you might have within your IT environment. You can add custom data sources if you need them. Data destinations allow you to deliver data from a data flow process in a variety of formats. An SSIS package can even provide data directly to an application by exposing it through a DataReader destination object, which an application reads by using the ADO.NET DataReader interface. With this type of destination, you don’t have to place the data in a persistent data store, and you can design application integrations that deliver data in near real time.
SSIS data flow also provides a set of data transformation objects. These transformations are designed to handle most, if not all, kinds of data conversion, manipulation, standardization, merging, splitting, fuzzy matching, and other operations, so you don’t have to write complicated programming code. You will learn about many of these transformations, data sources, and destination objects later, in Part II of this book, “Designing Packages.”
SSIS Data Pipeline
The SSIS data flow process control component and its tasks are processed by the data flow engine within SSIS. A key feature of the SSIS data flow engine is the data pipeline, shown in Figure 1-1, which uses memory buffers to improve processing performance. The data pipeline enables parallel data processing options and reduces or eliminates multiple passes of reading and writing of the data during package execution and processing. This level of efficiency means you can process significantly more data in shorter periods of time than is possible if you rely simply on stored procedures for your ETL processes.
Figure 1-1 The SSIS data flow data pipeline
SSIS packages achieve maximum data processing performance because the data pipeline uses buffers to manipulate data in memory. Source data, whether it’s relational, structured as XML, or stored in files such as spreadsheets or comma-delimited text files, is converted into table-like structures containing columns and rows and loaded directly into memory buffers, without first being staged in temporary tables. Transformations within a data flow operate on the buffered data in memory, sorting, merging, modifying, and enhancing it before sending it to the next transformation or on to its final destination. By avoiding the overhead of re-reading from and writing to disk, the processes required to move and manipulate data can operate at optimal speed.
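The buffer idea can be illustrated with a small Python sketch. This is only an analogy built from Python generators, not the SSIS data flow engine: the function names, the sample columns (name, qty, price), and the two-row buffer size are all assumptions chosen for the example. Each buffer of rows passes from the source through two transformations to the destination in memory, in a single pass, with no intermediate staging.

```python
import csv
import io
from itertools import islice


def read_source(text, buffer_rows=2):
    """Source: parse delimited text into buffers (lists of row dicts)."""
    reader = csv.DictReader(io.StringIO(text))
    while True:
        buffer = list(islice(reader, buffer_rows))
        if not buffer:
            return
        yield buffer


def standardize_names(buffers):
    """Transformation: clean up a column in place inside each buffer."""
    for buffer in buffers:
        for row in buffer:
            row["name"] = row["name"].strip().title()
        yield buffer


def add_total(buffers):
    """Transformation: derive a new column from existing ones."""
    for buffer in buffers:
        for row in buffer:
            row["total"] = int(row["qty"]) * float(row["price"])
        yield buffer


def load_destination(buffers, destination):
    """Destination: append each finished buffer to an in-memory list."""
    for buffer in buffers:
        destination.extend(buffer)
    return destination


# Hypothetical sample data standing in for a comma-delimited source file.
SOURCE = (
    "name,qty,price\n"
    "  alice smith ,2,1.50\n"
    "BOB JONES,3,2.00\n"
    "carol white,1,4.25\n"
)

rows = load_destination(add_total(standardize_names(read_source(SOURCE))), [])
for row in rows:
    print(row["name"], row["total"])
```

The sketch covers only row-by-row transformations. An operation such as a sort needs to see every row before it can emit any, so it cannot stream buffers through in this simple way.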
SSIS Event Handler
The event handler process control, unlike the data flow process control, is not managed by the control flow. When you want to control processing at specific occurrences of events during package execution, you use the SSIS event handler process control component. An event handler runs in response to an event raised by the package or by a task or container within the package. Typically, event handlers are created in a package to perform special processing when data anomalies occur, to trigger other programs, or to launch other packages based upon the event state within the running package. For example, you can create an event handler that sends an e-mail alert when a task or package succeeds, fails, or simply completes.
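The following Python sketch models the idea of handlers that run in response to events raised by a task within a package. It is a conceptual illustration, not SSIS code: the Executable class and its methods are hypothetical, and the package and task names are invented. The event names OnPreExecute, OnError, and OnPostExecute are borrowed from SSIS, but the sketch’s rule that an event raised by a task also reaches handlers defined on its parent is a simplifying assumption of this model.

```python
class Executable:
    """Toy stand-in for a package, container, or task."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.handlers = {}   # event name -> list of handler callables

    def on(self, event, handler):
        """Attach a handler for the named event to this executable."""
        self.handlers.setdefault(event, []).append(handler)

    def raise_event(self, event, source=None):
        """Run local handlers, then pass the event to the parent, if any."""
        source = source or self
        for handler in self.handlers.get(event, []):
            handler(source.name, event)
        if self.parent is not None:
            self.parent.raise_event(event, source)

    def execute(self, action):
        """Run the action, raising events before, after, and on error."""
        self.raise_event("OnPreExecute")
        try:
            action()
        except Exception:
            self.raise_event("OnError")
            succeeded = False
        else:
            succeeded = True
        self.raise_event("OnPostExecute")
        return succeeded


alerts = []   # stands in for sending e-mail alert notifications

package = Executable("LoadSales")
task = Executable("Load fact table", parent=package)
package.on("OnError", lambda src, evt: alerts.append(f"ALERT: {src} raised {evt}"))
package.on("OnPostExecute", lambda src, evt: alerts.append(f"{src} finished"))


def failing_action():
    raise RuntimeError("simulated data anomaly")


result = task.execute(failing_action)
print(result, alerts)
```

Here the handlers are attached once at the package level, yet they respond to events raised by the task. In a real package, the handler body would be a workflow of its own, such as a task that sends the e-mail message.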
You will learn more about SSIS package architecture and its objects and process control components later, in Part II of this book.
© Microsoft. All Rights Reserved.