Creating Complex Test Databases - Using Excel to convert a raw list of text into viable T-SQL syntax

As part of my series on creating databases with interesting characteristics for testing purposes, I have covered a few different engine constructs, test strategies, and scripting techniques to build interesting databases.  In this article, I am going to highlight one technique I use fairly often with leveraging Excel's text concatenation functionality to generate T-SQL script based on a list of raw text found elsewhere (online, from a document, in email, etc.).  I use this to both generate schema and data from existing lists of text.

As a tester at Microsoft, I focus on some very specific database test tasks that may not apply to the rest of the world, but the techniques I use to generate these test artifacts can be leveraged in many applications.  Recently I needed to create a table and data focusing on the sql_variant data type.  The sql_variant data type is a sort of generic data type that can store values of many other native SQL Server data types (int, datetime, varchar, etc), but not all data types (no text, image, geography, etc).  My task was to create one column in this table for each native data type that can be stored in sql_variant.  The list of the 23 supported data types that can be stored in a sql_variant column can be found in Books Online.  After copying and pasting this list of 23 data types into Excel, I can use the string concatenation function to generate the script I'm looking for.


After applying the concatenation function to all rows in the spreadsheet, we effectively have the syntax for our entire CREATE TABLE statement.

Although this is a fairly simple example, the idea can be extended to many other applications.  In the past I have needed a table containing zip code data (ex:  city, county, state, zip code).  Searching online, you can find this data, but the trick is figuring out how to insert it into your database.  There is good support for transferring data from Excel to SQL Server, but by leveraging this technique, you can customize your T-SQL syntax for your needs, giving you much more flexibility on the resulting schema and/or data in your database.

Hope you enjoy,
Sam Lester (MSFT)