Optimizing Bulk Import Performance
This topic describes options for optimizing the bulk import of data into a table in Microsoft SQL Server 2005 by using a bcp command, BULK INSERT statement, or INSERT ... SELECT * FROM OPENROWSET(BULK...) statement. To bulk import or export data as rapidly as possible, it is important to understand the factors that affect performance and the command qualifiers that are available to manage performance. Where possible, use a Transact-SQL statement to bulk import data into SQL Server because Transact-SQL is faster than bcp.
For a comparison of these methods, see About Bulk Import and Bulk Export Operations.
How to best increase the performance of a particular bulk import operation is influenced by the following factors:
- Whether the table has constraints or triggers, or both.
- The recovery model used by the database.
For more information, see Overview of the Recovery Models.
- Whether the table into which data is copied is empty.
- Whether the table has indexes.
- Whether TABLOCK is being specified.
- Whether the data is being copied from a single client or copied in parallel from multiple clients.
- Whether the data is to be copied between two computers on which SQL Server is running.
In SQL Server 2005, bulk-import optimizations are available when triggers are enabled. SQL Server 2005 uses row versioning for triggers and stores the row versions in the version store in tempdb. Before you can bulk import a large batch of data records using triggers, you may have to expand the size of tempdb to accommodate the impact of the triggers on the version store.
For information about how these factors affect bulk import scenarios, see Guidelines for Optimizing Bulk Import.
Methods for Optimizing Bulk Import
To speed up the bulk import of data, SQL Server provides you with the following methods:
Using minimal logging
The simple recovery model minimally logs most bulk operations.
For a database under the full recovery model, all row-insert operations that are performed during bulk import are fully logged in the transaction log. For large data imports, this can cause the transaction log to fill rapidly. For bulk-import operations, minimal logging is more efficient than full logging and reduces the possibility that a bulk-import operation will fill the log space. To minimally log a bulk-import operation on a database that normally uses the full recovery model, you can first switch the database to the bulk-logged recovery model. After bulk importing the data, switch the recovery model back to the full recovery model. For more information, see Considerations for Switching from the Full or Bulk-Logged Recovery Model.
Inserted rows are minimally logged if optimized bulk logging is applicable; otherwise, the inserted rows are fully logged in the transaction log. For information on when bulk-import operations are logged and how to perform minimally logged bulk-import operations, see Minimally Logged Operations and Prerequisites for Minimal Logging in Bulk Import.
Importing data in parallel from multiple clients to a single table
SQL Server allows data to be bulk imported into a single table from multiple clients in parallel. All three bulk import mechanisms support the parallel import of data. This can improve the performance of data import operations.
For more information, see Importing Data in Parallel with Table Level Locking.
For information on the use of batches when importing data and for information about the command qualifiers for managing batches, see Managing Batches for Bulk Import.
The OPENROWSET clause's BULK option does not support a controlling the batch size.
Disabling triggers may improve performance.
For more information on the impact of trigger execution on bulk-import operations and how to enable or disable triggers, see Controlling Trigger Execution When Bulk Importing Data.
For information on the impact of constraint checking on bulk-import operations and how to enable or disable the CHECK and FOREIGN KEY constraints of a table, see Controlling Constraint Checking by Bulk Import Operations.
Ordering the data in a data file
By default, a bulk-import operation assumes that a data file is unordered. If the table has a clustered index, the bcp utility and BULK INSERT statement allow you to specify how data in the data file is sorted during a bulk-import operation. It is optional for data in the data file to be sorted in the same order as the table. However, you can improve performance of the bulk-import operation if you specify the same ordering for the data file as the table.
For more information, see Controlling the Sort Order When Bulk Importing Data.
INSERT ... SELECT * FROM OPENROWSET(BULK...) statements do not support controlling the sort order.
Controlling the locking behavior
For information on how to specify the locking behavior during bulk-import operations, see Controlling the Locking Behavior for Bulk Import.
Importing data in native format
For more information, see Using Native Format to Import or Export Data and Using Unicode Native Format to Import or Export Data.
Help and Information
12 December 2006