Data Generation Wizard

One of the new features in the Visual Studio Team System 2008 Database Edition - Power Tools release is the addition of the Data Generation Wizard. The Data Generation Wizard is an item template wizard that allows you to create and configure a data generation plan by pointing it at an existing database, which has the same schema as the underlying database project, and configure each column inside each table with a sequential data bound generator. When finished you have a configured data generation plan that pulls all rows from the existing database in to your target database.

Lets step through the process.


Step 1: Add a new Data Generation plan using the Data Generation Wizard.


Inside an existing database project, select Add... New Item... and select the "Data Generation Wizard" option. In this example I am using the good old trusted pubs database as my sample database.


Step 2: Launch the Data Generation Wizard

Once you selected the Data Generation Wizard, you will get the main page of the wizard. Above you see the default state of the wizard.


Step 3: Select your source database

The next step is to select the database that you want to pull the data from, in this example I am pulling from the pubs sample database. If you have not configured a connection to the specific target database, you can create one using the New Connection... option or you can change existing connection definitions using the Edit Connection... option.

Once you select the database there are a couple of options you can select or set, lets walk through these:

  • Set the number of rows to generate...
    This option will configure the "Rows to Insert" value of data generation plan, to exactly the number of rows that are inside the source database.
  • Replace the foreign key generator with values form the target database...
    If you check this option the wizard will pull the foreign key values from the source database, instead of leveraging the foreign key data generator
  • Number of failed rows...
    This option configures the error threshold for the number of failures to be allowed before terminating the population of a table
  • Number of failed tables...
    This option configures the error threshold for the number of failures to be allowed before the terminating execution of the data generation plan



Step 4: Run the wizard

When you hit OK, you will see a Status window which will display the progress information of the creation and configuration of the data generation plan.


Below you see the status information from creating the data generation plan for the pubs database.

Retrieving data from pubs.Data retrieval complete.Starting merge of data generation plan with database schema.Completed.Resetting all target row counts to match target databaseSetting table [dbo].[authors] to 23 rowsSetting table [dbo].[discounts] to 3 rowsSetting table [dbo].[employee] to 43 rowsSetting table [dbo].[jobs] to 14 rowsSetting table [dbo].[pub_info] to 8 rowsSetting table [dbo].[publishers] to 8 rowsSetting table [dbo].[roysched] to 86 rowsSetting table [dbo].[sales] to 21 rowsSetting table [dbo].[stores] to 6 rowsSetting table [dbo].[titleauthor] to 25 rowsSetting table [dbo].[titles] to 18 rowsPopulating columns with Sequential databound generatorConfiguring table dbo.authors to be databoundConfiguring table dbo.discounts to be databoundConfiguring table dbo.employee to be databoundConfiguring table to be databoundConfiguring table dbo.pub_info to be databoundConfiguring table dbo.publishers to be databoundConfiguring table dbo.roysched to be databoundConfiguring table dbo.sales to be databoundConfiguring table dbo.stores to be databoundConfiguring table dbo.titleauthor to be databoundConfiguring table dbo.titles to be databoundSaving to file 'c:\work\pubs\Data Generation Plans\DataTransformPlan1.dgen'Completed

As you can conclude from the status output the wizard configures the data generation plan in two phases. First it configures the row counts and secondly in configures the generators for each column inside each table. If you did not select the option to propagate row counts the first phase will be skipped.


Step 5: The resulting data generation plan

Now that the wizard is finished, lets look at the results. First we will look at the data generation plan. We will see that the row counts are configured for each respective table.


Step 6: The resulting column configuration

If we look at the column configurations we see that in our example the foreign keys are generated using the foreign key generator and the other values are pulled from the database. Note that we are using the sequential databound generator, which support multi-instancing. This means that columns values are matched per row, the regular databound generator does not have this ability to correlate column values within a single row!


Step 7: Running the generation plan

Now that we have a data generation plan, lets runs it. We created a clean database from our project named pubs2 and we are going to generate straight in to this.


If you are using the real pubs database you will find that this will fail. For a good reason, because the employee_insupd trigger inside the schema that will cause a rollback if the job id does not have a certain value. If you are using the default setting, the plan for the employee table will look like this:


The job id foreign key generator is not configured correctly to satisfy the employee_insupd trigger.

If you selected the option "Replace the foreign key generator with values form the target database..." you plan for the employee table looks like this:


Note that the job_id column is also pulled from the database, so when you used that option you will satisfy the trigger and the plan will just work!


Step 8: Showing the results

Now that we executed the plan, lets check the results, by opening up the T-SQL Editor and fire off a simple query "SELECT * FROM sales" to see if the content made it over.



The Data Generation Wizard allows you to quickly configure a data generation plan that pulls all rows from an existing SQL Server database. The wizard will only allow you to pull information from an SQL Server database, because the schema of the objects inside the database has to match the table inside the database project.

We hope this new feature saves you time!

"DataDude" Group Engineering Manager