使用 Unicode 字符格式导入或导出数据 (SQL Server)Use Unicode Character Format to Import or Export Data (SQL Server)

适用对象:是SQL Server 是Azure SQL 数据库 是Azure Synapse Analytics (SQL DW) 是并行数据仓库APPLIES TO: yesSQL Server yesAzure SQL Database yesAzure Synapse Analytics (SQL DW) yesParallel Data Warehouse

使用包含扩展/DBCS 字符的数据文件在多个 SQL ServerSQL Server 实例之间大容量传输数据时,建议使用 Unicode 字符格式。Unicode character format is recommended for bulk transfer of data between multiple instances of SQL ServerSQL Server by using a data file that contains extended/DBCS characters. 从服务器导出数据时,Unicode 字符数据格式允许使用与执行该操作的客户端不同的代码页。The Unicode character data format allows data to be exported from a server by using a code page that differs from the code page used by the client that is performing the operation. 在这种情况下,使用 Unicode 字符格式有下列优点:In such cases, use of Unicode character format has the following advantages:

  • 如果源数据和目标数据的类型为 Unicode,则使用 Unicode 字符格式可以保留所有的字符数据。If the source and destination data are Unicode data types, use of Unicode character format preserves all of the character data.

  • 如果源数据和目标数据不为 Unicode 数据类型,则使用 Unicode 字符格式可以尽可能减少丢失源数据中无法在目标数据中表示的扩展字符。If the source and destination data are not Unicode data types, use of Unicode character format minimizes the loss of extended characters in the source data that cannot be represented at the destination.

本主题内容:In this Topic:
使用 Unicode 字符格式的注意事项Considerations for Using Unicode Character Format
使用 Unicode 字符格式、bcp 和格式化文件的特殊注意事项Special Considerations for Using Unicode Character Format, bcp, and a Format File
Unicode 字符格式的命令选项Command Options for Unicode Character Format
示例测试条件Example Test Conditions
 ● 示例表 ● Sample Table
 ● 示例非 XML 格式化文件 ● Sample Non-XML Format File
示例Examples
 ● 使用 bcp 和 Unicode 字符格式导出数据 ● Using bcp and Unicode Character Format to Export Data
 ● 在不使用格式化文件的情况下使用 bcp 和 Unicode 字符格式导入数据 ● Using bcp and Unicode Character Format to Import Data without a Format File
 ● 在使用非 XML 格式化文件的情况下使用 bcp 和 Unicode 字符格式导入数据 ● Using bcp and Unicode Character Format to Import Data with a Non-XML Format File
 ● 在不使用格式化文件的情况下使用 BULK INSERT 和 Unicode 字符格式 ● Using BULK INSERT and Unicode Character Format without a Format File
 ● 在使用非 XML 格式化文件的情况下使用 BULK INSERT 和 Unicode 字符格式 ● Using BULK INSERT and Unicode Character Format with a Non-XML Format File
 ● 在使用非 XML 格式化文件的情况下使用 OPENROWSET 和 Unicode 字符格式 ● Using OPENROWSET and Unicode Character Format with a Non-XML Format File
相关任务Related Tasks

使用 Unicode 字符格式的注意事项Considerations for Using Unicode Character Format

使用 Unicode 字符格式时,请考虑下列事项:When using Unicode character format, consider the following:

使用 Unicode 字符格式、bcp 和格式化文件的特殊注意事项Special Considerations for Using Unicode Character Format, bcp, and a Format File

Unicode 字符格式数据文件遵循 Unicode 文件的约定。Unicode character format data files follow the conventions for Unicode files. 该文件的前两个字节为十六进制数字 0xFFFE。The first two bytes of the file are hexadecimal numbers, 0xFFFE. 这两个字节用作字节顺序标记 (BOM),指定在文件中高位字节是存储在前面还是后面。These bytes serve as byte-order marks (BOM), specifying whether the high-order byte is stored first or last in the file. bcp 实用工具 可能曲解字节顺序标记,并导致部分导入过程失败;可能会收到如下错误消息:The bcp Utility may misinterpret the BOM and cause part of your import process to fail; you may receive an error message similar as follows:

Starting copy...
SQLState = 22005, NativeError = 0
Error = [Microsoft][ODBC Driver 13 for SQL Server]Invalid character value for cast specification

在以下情况下,字节顺序标记可能被曲解:The BOM may be misinterpreted under the following conditions:

  • 使用 bcp 实用工具-w 切换表示 Unicode 字符The bcp Utility is used and the -w switch is used to indicate Unicode character

  • 使用格式化文件A format file is used

  • 数据文件中的第一个字段是非字符The first field in the data file is non-character

考虑以下任意解决方法是否适用于特定 情况:Consider whether any of the following workarounds may be available for your specific situation:

Unicode 字符格式的命令选项Command Options for Unicode Character Format

可以使用 bcpBULK INSERTINSERT ... 将 Unicode 字符格式数据导入表中SELECT * FROM OPENROWSET(BULK...)。对于 bcp 命令或 BULK INSERT 语句,你可以在语句中指定数据格式。You can import Unicode character format data into a table using bcp, BULK INSERT or INSERT ... SELECT * FROM OPENROWSET(BULK...). For a bcp command or BULK INSERT statement, you can specify the data format in the statement. 对于 INSERT ...SELECT * FROM OPENROWSET(BULK...) 语句,必须在格式化文件中指定数据格式。For an INSERT ... SELECT * FROM OPENROWSET(BULK...) statement, you must specify the data format in a format file.

下列命令选项支持 Unicode 字符格式:Unicode character format is supported by the following command options:

CommandCommand 选项Option 描述Description
bcpbcp -w-w 使用 Unicode 字符格式。Uses the Unicode character format.
BULK INSERTBULK INSERT DATAFILETYPE ='widechar'DATAFILETYPE ='widechar' 批量导入数据时使用 Unicode 字符格式。Uses Unicode character format when bulk importing data.
OPENROWSETOPENROWSET N/AN/A 必须使用格式化文件Must use a format file

备注

或者,您可以在格式化文件中为每个字段指定格式设置。Alternatively, you can specify formatting on a per-field basis in a format file. 有关详细信息,请参阅 用来导入或导出数据的格式化文件 (SQL Server)For more information, see Format Files for Importing or Exporting Data (SQL Server).

示例测试条件Example Test Conditions

本主题中的示例基于下面定义的表和格式化文件。The examples in this topic are based on the table, and format file defined below.

示例表Sample Table

以下脚本将创建测试数据库、名为 myWidechar 的表并用一些初始值填充表。The script below creates a test database, a table named myWidechar and populates the table with some initial values. 在 Microsoft SQL Server Management StudioSQL Server Management Studio (SSMS) 中执行以下 Transact-SQL:Execute the following Transact-SQL in Microsoft SQL Server Management StudioSQL Server Management Studio (SSMS):

CREATE DATABASE TestDatabase;
GO

USE TestDatabase;
CREATE TABLE dbo.myWidechar ( 
    PersonID smallint NOT NULL,
    FirstName nvarchar(25) NOT NULL,
    LastName nvarchar(30) NOT NULL,
    BirthDate date,
    AnnualSalary money
);

-- Populate table
INSERT TestDatabase.dbo.myWidechar
VALUES 
(1, N'ϴAnthony', N'Grosse', '02-23-1980', 65000.00),
(2, N'❤Alica', N'Fatnowna', '11-14-1963', 45000.00),
(3, N'☎Stella', N'Rossenhain', '03-02-1992', 120000.00);

-- Review Data
SELECT * FROM TestDatabase.dbo.myWidechar;

示例非 XML 格式化文件Sample Non-XML Format File

SQL Server 支持两种类型的格式化文件:非 XML 格式和 XML 格式。SQL Server support two types of format file: non-XML format and XML format. 非 XML 格式是 SQL Server 早期版本支持的原始格式。The non-XML format is the original format that is supported by earlier versions of SQL Server. 有关详细信息,请查看 非 XML 格式化文件 (SQL Server)Please review Non-XML Format Files (SQL Server) for detailed information. 下面的命令基于 的架构使用 bcp 实用工具 myWidechar.fmt生成非 XML 格式化文件 myWidecharThe following command will use the bcp utility to generate a non-xml format file, myWidechar.fmt, based on the schema of myWidechar. 若要使用 bcp 命令创建格式化文件,请指定 format 参数,并使用 nul 而不是数据文件路径。To use a bcp command to create a format file, specify the format argument and use nul instead of a data-file path. 格式化选项还需要 -f 选项。The format option also requires the -f option. 此外,对于本示例,限定符 c 用于指定字符数据, T 用于指定使用集成安全性的受信任连接。In addition, for this example, the qualifier c is used to specify character data, and T is used to specify a trusted connection using integrated security. 在命令提示符处输入以下命令:At a command prompt, enter the following commands:

bcp TestDatabase.dbo.myWidechar format nul -f D:\BCP\myWidechar.fmt -T -w

REM Review file
Notepad D:\BCP\myWidechar.fmt

重要

确保非 XML 格式化文件以回车符/换行符结尾。Ensure your non-XML format file ends with a carriage return\line feed. 否则可能会收到以下错误消息:Otherwise you will likely receive the following error message:

SQLState = S1000, NativeError = 0
Error = [Microsoft][ODBC Driver 13 for SQL Server]I/O error while reading BCP format file

示例Examples

下面的示例使用上面创建的数据库和格式化文件。The examples below use the database, and format files created above.

使用 bcp 和 Unicode 字符格式导出数据Using bcp and Unicode Character Format to Export Data

-w 切换和 OUT 命令。-w switch and OUT command. 请注意:此示例中创建的数据文件将用于所有后续示例中。Note: the data file created in this example will be used in all subsequent examples. 在命令提示符处输入以下命令:At a command prompt, enter the following commands:

bcp TestDatabase.dbo.myWidechar OUT D:\BCP\myWidechar.bcp -T -w

REM Review results
NOTEPAD D:\BCP\myWidechar.bcp

在不使用格式化文件的情况下使用 bcp 和 Unicode 字符格式导入数据Using bcp and Unicode Character Format to Import Data without a Format File

-w 切换和 IN 命令。-w switch and IN command. 在命令提示符处输入以下命令:At a command prompt, enter the following commands:

REM Truncate table (for testing)
SQLCMD -Q "TRUNCATE TABLE TestDatabase.dbo.myWidechar;"

REM Import data
bcp TestDatabase.dbo.myWidechar IN D:\BCP\myWidechar.bcp -T -w

REM Review results is SSMS

在使用非 XML 格式化文件的情况下使用 bcp 和 Unicode 字符格式导入数据Using bcp and Unicode Character Format to Import Data with a Non-XML Format File

-w-f 切换以及 IN 命令。-w and -f switches and IN command. 由于此示例讲到 bcp、格式化文件、Unicode 字符以及数据文件中的第一个数据字段为非字符,因此需要使用解决方法。A workaround will need to be used since this example involves bcp, a format file, Unicode character, and the first data field in the data file is non-character. 请参阅上面的 使用 Unicode 字符格式、bcp 和格式化文件的特殊注意事项See Special Considerations for Using Unicode Character Format, bcp, and a Format File, above. 数据文件 myWidechar.bcp 将通过添加其他记录为“虚拟”记录来进行更改,然后使用 -F 2 开关跳过该记录。The data file myWidechar.bcp will be altered by adding an additional record as a "dummy" record which will then be skipped with the -F 2 switch.

在命令提示符中,输入以下命令,然后按照修改步骤操作:At a command prompt, enter the following commands and follow the modification steps:

REM Truncate table (for testing)
SQLCMD -Q "TRUNCATE TABLE TestDatabase.dbo.myWidechar;"

REM Open data file
Notepad D:\BCP\myWidechar.bcp
REM Copy first record and then paste as new first record.  This additional record is the "dummy" record.
REM Close file.

REM Import data instructing bcp to skip dummy record with the -F 2 switch.
bcp TestDatabase.dbo.myWidechar IN D:\BCP\myWidechar.bcp -f D:\BCP\myWidechar.fmt -T -F 2

REM Review results is SSMS

REM Return data file to original state for usage in other examples
bcp TestDatabase.dbo.myWidechar OUT D:\BCP\myWidechar.bcp -T -w

在不使用格式化文件的情况下使用 BULK INSERT 和 Unicode 字符格式Using BULK INSERT and Unicode Character Format without a Format File

DATAFILETYPE 参数。DATAFILETYPE argument. 在 Microsoft SQL Server Management StudioSQL Server Management Studio (SSMS) 中执行以下 Transact-SQL:Execute the following Transact-SQL in Microsoft SQL Server Management StudioSQL Server Management Studio (SSMS):

TRUNCATE TABLE TestDatabase.dbo.myWidechar; -- for testing
BULK INSERT TestDatabase.dbo.myWidechar
    FROM 'D:\BCP\myWidechar.bcp'
    WITH (
        DATAFILETYPE = 'widechar'
        );

-- review results
SELECT * FROM TestDatabase.dbo.myWidechar;

在使用非 XML 格式化文件的情况下使用 BULK INSERT 和 Unicode 字符格式Using BULK INSERT and Unicode Character Format with a Non-XML Format File

FORMATFILE 参数。FORMATFILE argument. 在 Microsoft SQL Server Management StudioSQL Server Management Studio (SSMS) 中执行以下 Transact-SQL:Execute the following Transact-SQL in Microsoft SQL Server Management StudioSQL Server Management Studio (SSMS):

TRUNCATE TABLE TestDatabase.dbo.myWidechar; -- for testing
BULK INSERT TestDatabase.dbo.myWidechar
   FROM 'D:\BCP\myWidechar.bcp'
   WITH (
        FORMATFILE = 'D:\BCP\myWidechar.fmt'
        );

-- review results
SELECT * FROM TestDatabase.dbo.myWidechar;

在使用非 XML 格式化文件的情况下使用 OPENROWSET 和 Unicode 字符格式Using OPENROWSET and Unicode Character Format with a Non-XML Format File

FORMATFILE 参数。FORMATFILE argument. 在 Microsoft SQL Server Management StudioSQL Server Management Studio (SSMS) 中执行以下 Transact-SQL:Execute the following Transact-SQL in Microsoft SQL Server Management StudioSQL Server Management Studio (SSMS):

TRUNCATE TABLE TestDatabase.dbo.myWidechar;  -- for testing
INSERT INTO TestDatabase.dbo.myWidechar
    SELECT *
    FROM OPENROWSET (
        BULK 'D:\BCP\myWidechar.bcp', 
        FORMATFILE = 'D:\BCP\myWidechar.fmt'  
        ) AS t1;

-- review results
SELECT * FROM TestDatabase.dbo.myWidechar;

相关任务Related Tasks

使用数据格式进行批量导入或批量导出To use data formats for bulk import or bulk export

另请参阅See Also

bcp Utility bcp Utility
BULK INSERT (Transact-SQL) BULK INSERT (Transact-SQL)
OPENROWSET (Transact-SQL) OPENROWSET (Transact-SQL)
数据类型 (Transact-SQL) Data Types (Transact-SQL)
Collation and Unicode SupportCollation and Unicode Support