Best Practices for Integration Services Configurations

by Jamie Thomson, SQL Server MVP and Senior Technical Consultant at Conchango PLC. You can view Jamie’s blog at http://blogs.conchango.com/jamiethomson/.

Introduction

This article explains what SQL Server Integration Services configurations are used for, why you should use Integration Services configurations, and what options you have for leveraging configurations. It will also make some simple recommendations that are based on my experiences of building Integration Services packages in a real-world environment. An understanding of the terms “package”, “Business Intelligence Development Studio”, and “dtexec.exe” in the context of Integration Services is assumed.

Why Use Configurations?

Integration Services packages are used primarily to manage the movement of data from one place to another. To do that, the package needs to know where it is moving data from and where it is moving data to. (I call these places the data sinks.)

Typically Integration Services packages are built on a different environment to where they are intended to be executed in production. For that reason alone, it is important that the package developer has a way of informing the package where those data sinks are located.

What Is an Integration Services Configuration?

People often refer to the .dtsConfig file that the configuration wizard within the Business Intelligence Development Studio produces as an Integration Services configuration, but that is not the case. A .dtsConfig file is actually a collection of Integration Services configurations, and that distinction is important.

A configuration is a name-value pair. The name is a property within a package that needs to be changed, and the value is the value to be assigned to that property. In the following example configuration taken from a .dtsConfig file, the property is ServerName, and its new value is SomeServer:

<Configuration ConfiguredType="Property"
               Path="\Package.Connections[AdventureWorks].Properties[ServerName]"
               ValueType="String">
  <ConfiguredValue>SomeServer</ConfiguredValue>
</Configuration>

Keep this important point in mind: a .dtsConfig file is not a configuration, it is a collection of one or more configurations.

Here is another example taken from a configuration that is stored in a SQL Server database. In this example, I have highlighted the property to be changed and the value that will be assigned to it:

Figure 1: Example of a SQL Server configuration

There are several other methods of providing a configuration using this name-value pair notation. Those methods will be explored later in this article.

Integration Services Properties and Objects

An Integration Services package consists of many objects, most of which fall into one of the following categories:

  • Containers (this includes all tasks, as well as other containers such as the ForEach Loop Enumerator)
  • Variables
  • Connection Managers
  • Data Flow Components

However you cannot apply a configuration to an object; you can only apply a configuration to a property of an object. For example, if you wanted to change the value returned when referencing a package variable, you would apply the configuration to the Value property of the variable, not to the variable object itself. In the example property path below, you can see how we are calling out a particular variable, and then the Value property of that variable:

\Package.Variables[User::MyVar].Properties[Value]         

Property paths are integral to how configurations work, and learning the syntax can be helpful, although it is not required.

Configuration Types

There five basic types of Integration Services configurations.

XML Configuration File

This is the most commonly used configuration type. An XML file, which by default has the .dtsConfig suffix, is created that contains one or more configurations. XML configuration files are popular because they can be easily deployed onto multiple environments simply by copying the file.

A disadvantage of XML configuration files is that the path to the file must be hard-coded within the package. However this requirement can be alleviated by using indirect configurations.

“Indirect configurations” is a term used to describe the practice of using an operating system environment variable to redirect the Integration Services execution engine to the location of the XML configuration file.The term “indirect configuration” is a slight misnomer, because the environment variable does not point at a configuration, it points at a configuration file; but this has become the accepted nomenclature. 

A configuration stored in an XML configuration file stores both the property path and the value outside the package.

Environment Variable Configuration

An environment variable configuration takes the value for a property from whatever is stored in a named environment variable.

An environment variable configuration stores the property path inside the package and the value outside the package.

Note that an environment variable configuration by itself is not an example of an indirect configuration.

Parent Package Configuration

Parent package configurations are used to fetch a value from a variable in a calling package.

Just like environment variable configurations, a parent package configuration stores the property path inside the package and the value outside the package.

Registry Configuration

The value to be applied to a package property is stored in a registry entry.

Just like environment variable configurations and parent package configurations, a registry configuration stores the property path inside the package and the value outside the package.

SQL Server Configuration

SQL Server configurations are stored in a SQL Server table. The table can have any name you like, and can be in any database on any server that you like.

If you are using a SQL Server configuration, then you can and should make it an indirect configuration. As explained earlier in this article, when using indirect configurations, an environment variable stores the location of the configurations themselves.

A configuration stored in SQL Server stores both the property path and the value outside the package.

Feature Comparison of Configuration Types

This table provides a high-level feature comparison of each configuration type.

Configuration Type

Where is the property path stored?

Where is the value stored?

Indirect configurations possible?

XML Configuration File

External to the package

External to the package

Yes

Environment variable

In the package

External to the package

No

Parent package

In the package

External to the package

No

Registry

In the package

External to the package

No

SQL Server

External to the package

External to the package

Yes

Best practices for Integration Services configurations

Consider command-line options as an alternative to configurations

You may expect that I am going to recommend that you should always define configurations in your Integration Services packages. Although I generally recommend that approach, Integration Services provides an important alternative to defining configurations within packages - namely the /SET and /CONFIGFILE options of dtexec.exe:

  • The /SET option of dtexec lets the person running the package apply a value to some property in the package that is being run. The property in question is defined using the same property path syntax as discussed previously.
  • The /CONFIGFILE option of dtexec lets the person running the package tell the package to use an XML configuration file, even if one has not been defined in the package.

There are significant advantages to using /SET and /CONFIGFILE. Above all, these run-time options mean that your packages can be more portable. That is, they do not have any dependencies on the execution environment. All of the configuration options discussed previously rely on the existence of either a file location, an environment variable, a SQL Server instance, a registry entry or a parent package. However, if you don’t define any configurations in yourpackages, then /SET and /CONFIGFILE let you take your packages to any environment and still provide configuration values at run time. 

There are also disadvantages to using the /SET and /CONFIGFILE options. For example, the Execute Package Task does not have a similar feature that reproduces their behavior. Therefore, if you have a solution that uses parent and child packages, you will probably find that you need to make use of parent package configurations. Another disadvantage of /SET and /CONFIGFILE is that there is no similar feature in Business Intelligence Development Studio that lets you test those options at design time.

The /SET and /CONFIGFILE options are very powerful features, but they are most appropriate for a particular class of packages - namely ,packages that encapsulate all functionality in one place and are not expected to be called from other packages. Of course, you can use /SET and /CONFIGFILE alongside normal configurations, and in many cases that may be the best approach.

Understand the limitations of configurations

Not all properties can be set by using configurations. Consider the Transfer SQL Server Objects Task, for example:

Figure 2: Properties of the Transfer SQL Server Objects task

The TablesList, ViewsList, StoredProceduresList, UserDefinedFunctionsList, DefaultsList, UserDefinedDataTypesList, and others are all properties of the task, but they are also collections. In both SQL Server 2005 and SQL Server 2008 Integration Services, we cannot set configurations for properties that are collections.

Use Configurations Sparingly

Just because it is possible to set a configuration on a property, it may not be sensible to do so.Be prudent when planning your configurations, and only apply them where necessary. It is rare that you need more configurations than there are connection managers in your package. 

Configure Only the ConnectionString Property for Connection Managers

After reading that statement, you may ask: “But what if I want to set the username and password for each connection manager? That’s two configurations already.”

That’s true, but this leads us to my next tip.If you want to put configurations on a connection manager, then you should apply a configuration to the ConnectionString property only. There is no need to apply a configuration for each of the ServerName, UserName, Password, and InitialCatalogproperties. Doing so would be cumbersome, and many connection strings require more information than those four properties can provide. So, in the interests of flexibility and ease of administration, ignore those four properties in favor of the ConnectionString property.   

Note: http://www.connectionstrings.com is a great resource for help in building connection strings.

Use Indirect Configurations in a Team Environment

If your Integration Services solution involves multiple Integration Services developers, then it is sometimes preferable for each developer to develop locally in a sandbox environment. In this situation, each developer requires configuration definitions that are local to his or her sandbox. In this scenario you should use a flexible configuration type such as indirect configurations that can easily be adapted for each developer’s computer.

Conclusion

This article has provided an overview of configurations in Integration Services. It has explained what a configuration is and what it is not, and the different configuration types. It has explored alternatives to using configurations, and provided some simple tips for their use.

About the author**. Jamie Thomson is a SQL Server MVP based in London, UK. He specializes in SQL Server Integration Services and frequently blogs in depth about Integration Services issues, techniques, and best practices.