Table Design Patterns

[This content is no longer valid. For the latest information on "M", "Quadrant", SQL Server Modeling Services, and the Repository, see the Model Citizen blog.]

SQL Server Modeling Services uses tables to define the characteristics of an entity in a data model. Rows in the table are instances of the entity. Some of the following SQL guidelines promote consistency with other Modeling Services tables, improving the experience of users that interact with Modeling Services data. Other guidelines are required by specific Modeling Services features.

Tip

This topic explains how to use the Modeling Services pattern for tables directly in SQL Server. If you are creating models with Microsoft code name “M”, refer to the topic on using this Modeling Services pattern in “M”. For more information, see Using Extents to Define Entities.

Guidelines for Table Design

The following list provides the SQL guidelines for designing tables for the Modeling Services database.

  1. Create Tables to Store Entity Instances of a Data Model.

  2. Create a Primary Key Named Id.

  3. Add a Folder Column to Support the Modeling Services Folder Design Pattern.

  4. Use Foreign Keys to Connect Related Tables within a Data Model.

  5. Consider Performance Implications of Table Design Choices.

  6. Restrict Access to Base Tables to the RepositoryAdministrator and RepositoryService Users.

  7. Add a timestamp Column for Use with Lifetime Services.

  8. Follow Modeling Services Design Patterns for the Localization of Strings and Resources.

Note

Database administrative privileges are required for performing many of the manual table design tasks in the Modeling Services database.

1. Create Tables to Store Entity Instances in a Data Model

Models describe entities and the relationships between entities. In the database, these entity descriptions and relationships are expressed with tables. Create a table in the Modeling Services database for each entity in a data model. The table should be owned by the SQL Server schema for that model. For more information about using schemas, see Schema Design Patterns. The following simple T-SQL example creates a new schema, Contact, and then adds a table to this schema named AddressesTable.

use Repository
go
create schema [Contact] authorization [RepositoryOwner]
go
create table [Contact].[AddressesTable](
   [Id] bigint NOT NULL,
   [Folder] [Repository.Item].[FolderId] NOT NULL,
   [Street] nvarchar(max) NOT NULL,
   [City] nvarchar(max) NOT NULL,
   [State] nvarchar(max) NOT NULL,
   [ZipCode] int NOT NULL,
   constraint [PK_Addresses] primary key clustered 
   (
      [Id] asc
   ) on [PRIMARY]
) on [PRIMARY]
go

Tables should follow Modeling Services naming conventions. In addition to making the Modeling Services database easier to manage and use, there are also some Modeling Services features that require these naming conventions. The following list contains the naming conventions for tables and table-related database constructs.

Append the Word “Table” to the End of Table Names

Append the word “Table” to the end of table names. For example, a table of address entities should be named AddressesTable. Tables in the Modeling Services database are often referred to as base tables, and typical users cannot access the table data directly. Instead, users require an updatable view to securely access the underlying table data. By adding the word “Table” to each table name, the associated views can use the more natural names of the entities, such as Addresses.

Note

Do not confuse references to base tables with references to base names. A base table, such as AddressesTable, contains the actual data that the Addresses view exposes. However, the base name for this table is Addresses. Literally, it is the name that both the table and view are based on. Note that some Modeling Services stored procedures request a base name as a parameter. In these situations, the stored procedures do not function correctly if this naming convention is not followed.

Use Plural Table Names

Tables contain collections of entities and should have a plural name to reflect this. For example, a table of addresses should be named AddressesTable rather than AddressTable.

Use the Pattern [PK_TableBaseName] for Primary Key Names

Primary key names must be unique within a schema and across all constraint names. A table’s base name is the table’s name without the appended word “Table”. For example, a table named AddressesTable has a base name of Addresses. This table should have a PRIMARY KEY constraint named PK_Addresses. This convention ensures uniqueness and makes it easy for developers to refer to the primary key of a table.

Use the Pattern [Default_TableBaseName_Column] for DEFAULT Constraint Names

DEFAULT constraint names must be unique within a schema and across all constraint names. This convention assures uniqueness and makes it easy for developers to refer to a specific default. For example, the DEFAULT constraint for a Street column in an AddressesTable table would be Default_Addresses_Street.

Use the Pattern [Check_TableBaseName_Column1_ColumnN_Description] for CHECK Constraint Names

CHECK constraint names must be unique within a schema and across all constraint names. CHECK constraints can focus on one column or compare values of multiple columns. Names of CHECK constraints should include each column name referenced along with a description of the constraint. For example, a CHECK constraint for positive numbers on a ZipCode column of an AddressesTable table would be Check_Addresses_ZipCode_Positive. This convention ensures uniqueness and makes it easy for developers to refer to a specific check clause.

Use the Pattern [Unique_TableBaseName_KeyColumn1_KeyColumnN] for UNIQUE Constraint Names

UNIQUE constraint names must be unique within a schema and across all constraint names. As with CHECK constraints, UNIQUE constraints can focus on one or more columns, and the name of the UNIQUE constraint should include these column names. For example, consider an AddressesTable table that required one entry per street address. A UNIQUE constraint would need to cover all of the address fields, as in Unique_Addresses_Street_City_State_ZipCode. This convention ensures uniqueness and makes it easy for developers to refer to a UNIQUE constraint.

Use the Pattern [FK_SourceTableBaseName_SourceColumn_TargetSchemaName.TargetTableBaseName] for Foreign Key Names

Foreign key names must be unique within a schema and across all constraint names. For more information about using foreign keys to model relationships, see Use Foreign Keys to Connect Related Tables within a Data Model. For example, consider a table of customer entities named CustomersTable that contains an Address column. This Address column uses a FOREIGN KEY constraint to reference the Id column of the AddressesTable table. Both tables are owned by the Contact SQL Server schema. In this example, the FOREIGN KEY constraint on the [Contact].[CustomersTable] table would be FK_Customers_Address_Contact.Addresses. This convention ensures uniqueness and makes it easy for developers to refer to a foreign key.

Use the Pattern [IR_KeyColumn1_KeyColumnN] for Index Names

Index names must be unique within a table. The prefix IR in this naming convention represents the words “Index Relational.” This differentiates it from an XML index with the prefix IX. An Index can include multiple columns, and an Index name should include the referenced column names. For example, an AddressesTable table could have an index on the City and State columns. This index name would be IR_City_State. This convention ensures uniqueness and makes it easy for developers to refer to an index.

Use the Pattern [IX_Column_Kind] for XML Index

Index names must be unique within a table. The placeholder Kind in this convention can be one of the following values: Primary, Value, Path, or Property. This convention ensures uniqueness and makes it easy for developers to refer to an index.

2. Create a Primary Key Named Id

Modeling Services tables should have a primary key column named Id. Using a standard name for primary key columns makes queries easier to write and understand, because join clauses often reference these primary key columns. This primary key is an identifier for each row in the table. There are additional design choices for the identifier data type as well as the method for populating the Id field during inserts. For more information, see Identifier Design Patterns.

Note

An arbitrary primary key, such as Id, is referred to as a surrogate key. The alternative would be to use a natural key that is part of the data itself (such as a government-issued identification number). Surrogate keys are preferred, because it is sometimes difficult to accurately find a column or combination of columns that will maintain uniqueness. In addition, natural keys can be much larger than surrogate keys, especially if the natural keys are text based or composed of multiple columns. Large keys increase I/O demands on the table and its indexes, decreasing database throughput.

Example

PRIMARY KEY constraints are added to a table in two ways. The CREATE TABLE statement can contain a PRIMARY KEY constraint. After the table is created, the ALTER TABLE statement can also add a PRIMARY KEY constraint. The following T-SQL example assumes that a table named PhoneNumbersTable was previously created in a schema named Contact. This example adds a primary key to the Id column of the PhoneNumbersTable table.

alter table [Contact].[PhoneNumbersTable]
add constraint PK_PhoneNumbersTable PRIMARY KEY CLUSTERED (Id)

3. Add a Folder Column to Support the Modeling Services Folder Design Pattern

Modeling Services Folders organize and secure model data. For more information about adding Folders to the Modeling Services database, see SQL Server Modeling Services Folder Design Patterns. To use Folders, tables must add a column named Folder with a data type of int. This Folder must have a FOREIGN KEY constraint that references the Id column of the [Repository.Item].[FoldersTable] table. This means that any new rows added to the table must specify an owning Folder by inserting the associated folder identifier as the value of the Folder column for that row.

Foreign keys for Folder columns that reference the [Repository.Item].[FoldersTable] table should also use cascading deletes for referential integrity. Rows with foreign keys that use cascading deletes are affected by deletions in the foreign key table. When a row is deleted from the [Repository.Item].[FoldersTable] table, all rows that reference the deleted Folder are also deleted.

To facilitate adding Folder support to a table, Modeling Services provides a stored procedure, [Repository.Item].[AddFolderForeignKey]. This procedure accepts the target table name or base name, the schema name, and the name of the Folder column. It then creates the foreign key relationship to the [Repository.Item].[FoldersTable].

Example

The following T-SQL example creates a schema named Investment and a table in that schema named StocksTable. The script then calls the [Repository.Item].[AddFolderForeignKey] stored procedure to add the FOREIGN KEY constraint from the Folder column of the StocksTable table to the Id column of the [Repository.Item].[FoldersTable] table.

use Repository
go
create schema [Investment] authorization [RepositoryOwner]
go
create table [Investment].[StocksTable](
   [Id] bigint NOT NULL,
   [Folder] [Repository.Item].[FolderId] NOT NULL,
   [StockSymbol] varchar(4) NOT NULL,
   constraint [PK_Stocks] primary key clustered 
   (
      [Id] asc
   ) on [PRIMARY]
) on [PRIMARY]
go

exec [Repository.Item].[AddFolderForeignKey] @schema = N'Investment', 
    @baseName = N'Stocks', @folderColumn = N'Folder'

/*
-- Equivalent T-SQL statement alternative to calling the stored procedure:
alter table [Investment].[StocksTable] with check 
add constraint [FK_Stocks_Folder_Repository.Item.Folders] foreign key ([Folder])
   references [Repository.Item].[FoldersTable] ([Id])
   on delete cascade
go
*/

Foreign keys are also used to express modeling relationships between two tables. For example, a CustomersTable table might have an Address column that is a reference to the Id column of an AddressesTable. This relationship should be expressed and enforced by a FOREIGN KEY constraint on the Address column. Note that each row in the CustomersTable table points to a single address, so the column name is singular to reflect this.

5. Consider Performance Implications of Table Design Choices

There are several design choices for tables that improve the performance of the Modeling Services database.

Use Normalized Table Design

High-performing relational database design begins with normalized tables. Normalization is the process of dividing large tables with many columns into a group of smaller related tables. This saves space and increases performance. For more information about normalization and its benefits, see Normalization.

Consider Vertically Partitioning Columns into Separate Tables

Some normalized tables still contain large columns or infrequently used columns. Consider vertically partitioning these columns into one or more separate tables. These tables have mirrored primary key values, and the partitioned tables maintain a FOREIGN KEY constraint back to the primary key of the main table to ensure consistency. Queries that do not require these vertically partitioned columns perform better, because the main table contains smaller rows and demands less I/O to search. Queries that require the partitioned column values can join the tables on their primary key values to get the complete set of data. Create a view that provides this join for the user and abstracts the vertical partitioning.

Use Indexes to Improve Query Performance

Proper use of indexes are critical to high-performing tables in the Modeling Services database. Possible column candidates for indexes include join columns, foreign key columns, and columns that frequently appear in query WHERE clauses. For more information about creating well-designed indexes, see General Index Design Guidelines.

Use Numeric Data Types for Enumerations

It is common to need certain database columns to act as enumerations, accepting only a predefined set of values. Instead of using a string type, such as nvarchar or nchar, use numeric enumeration columns with calculated columns providing the user-friendly descriptions. This reduces the required space per row, which increases database throughput. For example, consider an nvarchar(9) column named BookType in a BooksTable table. There could be hardcover books and paperback books. As the number of books in this BooksTable table grows larger, a significant amount of space is necessary for storing the hardcover and paperback strings in the BookType column. A better performing table would use the tinyint data type for the BookType column as in the following T-SQL statement:

create table [BooksTable](
   [Id] bigint not null,
   [Name] nvarchar(max) not null,
   [BookType] tinyint not null,
   [BookTypeName] as 
   (case [BookType] when 0 then N'hardcover'
                    when 1 then N'softcover'
                    end),
) on [PRIMARY]
go
alter table [BooksTable] with check 
add constraint [Check_Books_BookType_WithinRange] 
   check (([BookType] >= 0) and ([BookType] <= 1))
go

This example creates a BookType column with the tinyint data type, requiring much less storage than the previous nvarchar(9) data type. A CHECK constraint ensures that the enumeration values are in the right range. In this example, the value 0 represents the hardcover book type, and the value 1 represents the paperback book type. Finally, a calculated column, BookTypeName, displays a text version of the enumerations without storing the readable string for every row in the table.

This strategy applies to enumerations with a fixed list of values that change infrequently. For enumerations with values that change frequently, use a separate table to store the enumeration values. Reference that table with a foreign key. Then obtain the enumeration values in queries by joining on this lookup table.

Avoid Using text, ntext or image Column Data Types

The Modeling Services database uses transactional replication to support scale-out and geoscaling scenarios. Transactional replication does not support updates to columns with a data type of text, ntext, or image. Use columns with a data type varchar(max), nvarchar(max), or varbinary(max) instead.

Use XML Columns based on a Schema Collection for Hierarchical or Ordered Data

Some data is best expressed hierarchically or in an ordered list. When this type of data must frequently be returned as a unit, it might be best to store this structured data as XML rather than spreading the same data across many different relational tables. In these cases, use a column with an xml data type to store the structure XML data as a unit. Using XML columns increases both the performance and usability of the table.

6. Restrict Access to Base Tables to the RepositoryAdministrator Role and RepositoryService User

The Modeling Services database uses updatable views to provide access to the data in the underlying base tables. For more information about updatable views, see View Design Patterns. Because users are not meant to change data directly in the base tables, restrict access to Modeling Services tables to only the RepositoryAdministrator role and the RepositoryService user. Instead of changing the access for each table, change these permissions for each schema. The tables in a schema inherit the permissions set on its schema. For more information about correctly setting access to schemas, see Grant Control to the RepositoryAdministrator Role and the RepositoryService User.

7. Add a timestamp Column for Use with Lifetime Services

Lifetime Services is a feature of Modeling Services that enables administrators to define policies for removing old table data automatically. Regular jobs run with the SQL Server Agent to apply these policies to appropriate tables. These policies describe how frequently to remove old data from a Modeling Services table. To support Lifetime Services, tables must contain a column named TimeStamp with a data type of timestamp. Each modification of the table automatically updates the timestamp column with a database-wide relative timestamp value. Lifetime Services can use this timestamp to determine which rows have not been modified since the last time the policy was run.

Note

Note that tables that do not require lifetime services do not need a TimeStamp column.

8. Follow Modeling Services Design Patterns for the Localization of Strings and Resources

It is useful to localize strings and resources within Modeling Services tables. The Modeling Services database supports localization through its System.Globalization schema. For more information about localizing table strings and resources, see Localization Design Pattern.

See Also

Concepts

SQL Server Modeling Services Design Patterns
Table Design Patterns