Generate Test Data for Databases by Using Data Generators

You can use Visual Studio Team System Database Edition to create test data that you need to verify the behavior of the database objects in your database project. You can generate data that is appropriate for the schema of the database but unrelated to the production data. This approach can help you protect the privacy or security of the production data.

You define a data generation plan that specifies the details of how you want data to be generated for specific tables and columns. For each column, you specify a data generator that produces data of a particular type.

When you create a data generation plan in a database project, the plan is based on the schema of the project. If you create a data generation plan and then the schema changes, you are prompted to update the plan.

Important noteImportant Note:

If you do not update the data generation plan based on the schema changes, you cannot continue to modify the plan. You can only close and reopen the data generation plan to display the prompt again.

When you run database unit tests, you can specify a different data generation plan for each test project. Therefore, you can initialize the database to a different state for each group of tests.

Data Generators

Database Edition includes several built-in data generators for generating different kinds of data. For example, the integer generator provides random integer values, the string generator provides random strings, and the regular expression generator provides strings that match a pattern that you specify. For more information and the complete list of standard data generators, see Standard Data Generator Types and Specify Details of Data Generation for a Column.

If the generators that are included with Database Edition do not provide data that meets your requirements, you can create custom generators. For example, you might want to create a custom data generator if your database contains a column that has a check constraint that references another column. For more information, see Generate Specialized Test Data with a Custom Data Generator.

Data Generation Plans in a Team Environment

A data generation plan is an XML file that contains information about the database schema and configuration information that controls the data generation for each column in each table that you specify. A data generation plan for a database that contains about 40 tables can be about 8 MB and more than 100,000 lines long.

You cannot automatically merge changes to a .dgen file by using Team Foundation version control or another version control system. The process that you must use to manually merge large XML files is difficult and prone to errors. You can minimize problems that relate to your data generation plan by using exclusive checkout on it when you must change it.

Security of Data Generators

You can share data generation plans and custom data generators in a team environment. Before you share or use shared data generation files, you should consider the following security risks:

  • Schema Information in Data Generation Plans
    When you create a data generation plan, the .dgen file contains the schema of the tables, which might be a sensitive trade secret. When you share a .dgen file, the person with whom you share the file can see your schema. You should share data generation plans only with trusted sources.

  • Malicious Code in Data Generation Plans
    When a data generation plan contains a data bound generator, you write a Transact-SQL (T-SQL) query that runs when the plan is run. This approach allows arbitrary T-SQL to run from within a data generation plan. You should obtain data generation plans only from trusted sources and warn end users not to run data generation plans that they receive from non-trusted sources.

  • Connection Information and Custom Data Generators
    All custom data generators have access to the database connection string at run-time. A malicious custom generator could expose the connection string information. You should obtain custom data generators only from trusted sources and warn end users not to use custom data generators that they receive from non-trusted sources.

  • Malicious Code in Custom Data Generators
    Custom data generators are classes that can contain arbitrary code. When you use a custom data generator, it runs with the same permissions as the active user. This approach can run malicious code in FullTrust mode. You should obtain custom data generators only from trusted sources and warn end users not to use custom data generators that they receive from non-trusted sources.

  • Malicious Code in Installers for Custom Data Generators
    You can create deployment projects to install custom data generators. Deployment projects can contain arbitrary code. When you run an installation program for a custom data generator, the program runs with elevated permissions. This approach can run malicious code with elevated permissions. You should obtain custom data generator installers only from trusted sources and warn end users not to run installation programs for custom data generators that they receive from non-trusted sources.

Common Tasks

In the following table, you can find descriptions of common tasks that support this scenario and links to more information about how you can successfully complete those tasks.

Common Tasks

Supporting Content

Get hands-on practice: You can follow an introductory walkthrough to become familiar with how to create and run a simple database unit test.

Walkthrough: Creating and Running a Data Generation Plan

Create a data generation plan: Create a data generation plan for each unit test project or specific application tests.

How to: Create Data Generation Plans

Upgrade a data generation plan from a previous release: You can use data generation plans that you created with a previous release of Database Edition. However, you must upgrade the plan immediately after you open it.

How to: Upgrade a Data Generation Plan from a Previous Release

Specify the tables that you want to generate: You can generate data for all tables or for only those tables that you specify. For each table, you can specify a number of rows to generate. As an alternative, you can generate rows in a ratio that is based on the number of rows that you generate for another table.

How to: Specify Tables for Data Generation

How to: Specify the Number of Rows to Generate

Configure how data is generated for each column in the tables that you specify: You specify the type of data generator for each column. You can also provide constraints on the range of data values. For more advanced data generators, you can provide additional configuration information to further refine the data that you generate.

Specify Details of Data Generation for a Column

Preview the generated data: Before you write data to your database, you can preview the data for any table that is included in your data generation plan. By previewing the data, you can adjust your data generation plan without generating data iteratively.

How to: Preview a Data Generation Plan

Generate test data: After you have previewed and verified the data that you will generate, you can run the data generation plan to generate the test data. You can also configure your unit test project so that data is generated automatically whenever you run your unit tests.

How to: Run a Data Generation Plan to Generate Data

How to: Configure Database Unit Test Execution

Check in the data generation plans: Just as you must check in your unit tests to version control to share them with your team, you must also check in your data generation plans.

Using the Check In and Pending Changes Windows

How to: Check In Pending Changes

Troubleshoot problems: You can learn more about how to troubleshoot common problems with creating and running database unit tests.

Troubleshooting Database Unit Testing Issues

  • Create and Define Database Unit Tests
    Provides information and links to additional topics about how you can define unit tests to verify the behavior of your database objects. You can associate each test project with a different data generation plan, deployment configuration, and connection string.

  • Run Database Unit Tests
    Describes how, after you create your database unit tests, you run them to see the results.

  • Define Custom Conditions for Database Unit Tests
    Describes how you can test for a special condition that you cannot verify by using the default test conditions.

See Also

Concepts

Terminology Overview of Database Edition