包含数据库的排序规则Contained Database Collations

适用对象:是SQL Server 否Azure SQL 数据库 否Azure Synapse Analytics (SQL DW) 否并行数据仓库 APPLIES TO: yesSQL Server noAzure SQL Database noAzure Synapse Analytics (SQL DW) noParallel Data Warehouse

许多属性会影响文本数据的排序顺序和相等语义,包括区分大小写、区分重音以及所用的基本语言。Various properties affect the sort order and equality semantics of textual data, including case sensitivity, accent sensitivity, and the base language being used. 对于这些特性,可通过选择数据的排序规则来表示给 SQL ServerSQL ServerThese qualities are expressed to SQL ServerSQL Server through the choice of collation for the data. 有关排序规则本身的更深入讨论,请参阅 排序规则和 Unicode 支持For a more in-depth discussion of collations themselves, see Collation and Unicode Support.

排序规则不仅适用于用户表中存储的数据,还适用于由 SQL ServerSQL Server处理的所有文本,包括元数据、临时对象、变量名称等。在这些内容的处理方面,包含数据库和非包含数据库采用不同的方式。Collations apply not only to data stored in user tables, but to all text handled by SQL ServerSQL Server, including metadata, temporary objects, variable names, etc. The handling of these differs in contained and non-contained databases. 此更改不会影响很多用户,而且有助于提供独立而统一的实例。This change will not affect many users, but helps provide instance independence and uniformity. 但是,此更改也可能导致某些混淆,并可能使同时访问包含数据库和非包含数据库的会话出现问题。But this may also cause some confusion, as well as problems for sessions that access both contained and non-contained databases.

本主题阐明更改的内容,并考察这一更改可能导致问题的领域。This topic clarifies the content of the change, and examines areas where the change may cause problems.

非包含数据库Non-Contained Databases

所有数据库都有一种默认的排序规则(可在创建或更改数据库时设置)。All databases have a default collation (which can be set when creating or altering a database. 此排序规则用于数据库中的所有元数据,以及数据库中所有字符串列的默认值。This collation is used for all metadata in the database, as well as the default for all string columns within the database. 通过使用 COLLATE 子句,用户可为任何特定的列选择不同的排序规则。Users can choose a different collation for any particular column by using the COLLATE clause.

示例 1Example 1

例如,如果我们在北京工作,则可能会使用中文排序规则:For example, if we were working in Beijing, we might use a Chinese collation:

ALTER DATABASE MyDB COLLATE Chinese_Simplified_Pinyin_100_CI_AS;  

现在,如果我们创建一个列,其默认排序规则将是此中文排序规则,但我们可以根据需要选择其他排序规则:Now if we create a column, its default collation will be this Chinese collation, but we can choose another one if we want:

CREATE TABLE MyTable  
      (mycolumn1 nvarchar,  
      mycolumn2 nvarchar COLLATE Frisian_100_CS_AS);  
GO  
SELECT name, collation_name  
FROM sys.columns  
WHERE name LIKE 'mycolumn%' ;  
GO  

下面是结果集:Here is the result set.

name            collation_name  
--------------- ----------------------------------  
mycolumn1       Chinese_Simplified_Pinyin_100_CI_AS  
mycolumn2       Frisian_100_CS_AS  

这看起来比较简单,但会引发几个问题。This appears relatively simple, but several problems arise. 由于列的排序规则取决于创建表的数据库,因此如果使用存储在 tempdb中的临时表,就会出现问题。Because the collation for a column is dependent on the database in which the table is created, problems arise with the use of temporary tables which are stored in tempdb. tempdb 的排序规则通常与实例的排序规则匹配,而不必与数据库的排序规则匹配。The collation of tempdb usually matches the collation for the instance, which does not have to match the database collation.

示例 2Example 2

假设在排序规则为 Latin1_General 的实例上使用上述(中文)数据库:For example, consider the (Chinese) database above when used on an instance with a Latin1_General collation:

CREATE TABLE T1 (T1_txt nvarchar(max)) ;  
GO  
CREATE TABLE #T2 (T2_txt nvarchar(max)) ;  
GO  

乍看起来这两个表似乎具有相同的架构,但由于数据库的排序规则不同,它们的值实际上并不兼容:At first glance, these two tables look like they have the same schema, but since the collations of the databases differ, the values are actually incompatible:

SELECT T1_txt, T2_txt  
FROM T1   
JOIN #T2   
    ON T1.T1_txt = #T2.T2_txt  

下面是结果集:Here is the result set.

消息 468,级别 16,状态 9,第 2 行Msg 468, Level 16, State 9, Line 2

无法解决等于运算中“Latin1_General_100_CI_AS_KS_WS_SC”与“Chinese_Simplified_Pinyin_100_CI_AS”之间的排序规则冲突。Cannot resolve the collation conflict between "Latin1_General_100_CI_AS_KS_WS_SC" and Chinese_Simplified_Pinyin_100_CI_AS" in the equal to operation.

通过显式排列临时表可修复此问题。We can fix this by explicitly collating the temporary table. SQL ServerSQL Server COLLATE 子句提供 DATABASE_DEFAULT 关键字,使操作在一定程度上得到了简化。makes this somewhat easier by providing the DATABASE_DEFAULT keyword for the COLLATE clause.

CREATE TABLE T1 (T1_txt nvarchar(max)) ;  
GO  
CREATE TABLE #T2 (T2_txt nvarchar(max) COLLATE DATABASE_DEFAULT);  
GO  
SELECT T1_txt, T2_txt  
FROM T1   
JOIN #T2   
    ON T1.T1_txt = #T2.T2_txt ;  

现在此代码将可以正常运行。This now runs without error.

我们还可以看到,变量的行为也依赖于排序规则。We can also see collation-dependent behavior with variables. 请考虑以下函数:Consider the following function:

CREATE FUNCTION f(@x INT) RETURNS INT  
AS BEGIN   
      DECLARE @I INT = 1  
      DECLARE @İ INT = 2  
      RETURN @x * @i  
END;  

这是一个相当特殊的函数。This is a rather peculiar function. 在区分大小写的排序规则中,return 子句中的 @i 无法绑定到 @I 或 @İ。In a case-sensitive collation, the @i in the return clause cannot bind to either @I or @İ. 在不区分大小写的 Latin1_General 排序规则中,@i 绑定到 @I,该函数返回 1。In a case-insensitive Latin1_General collation, @i binds to @I, and the function returns 1. 而在不区分大小写的 Turkish 排序规则中,@i 绑定到 @İ,该函数返回 2。But in a case-insensitive Turkish collation, @i binds to @İ, and the function returns 2. 如果在采用不同排序规则的实例之间移动数据库,则会给数据库造成严重的破坏。This can wreak havoc on a database that moves between instances with different collations.

包含的数据库Contained Databases

由于包含数据库的设计目标是让自身实现独立,因此必须切断它们对实例和 tempdb 排序规则的依赖。Since a design objective of contained databases is to make them self-contained, the dependence on the instance and tempdb collations must be severed. 为此,包含数据库引入了目录排序规则的概念。To do this, contained databases introduce the concept of the catalog collation. 目录排序规则适用于系统元数据和临时对象。The catalog collation is used for system metadata and transient objects. 下面将详细介绍这一概念。Details are provided below.

如果某个包含数据库的目录排序规则是 Latin1_General_100_CI_AS_WS_KS_SCIn a contained database, the catalog collation Latin1_General_100_CI_AS_WS_KS_SC. 则所有 SQL ServerSQL Server 实例上的所有包含数据库都采用此排序规则,而且不能更改。This collation is the same for all contained databases on all instances of SQL ServerSQL Server and cannot be changed.

数据库排序规则将得到保留,但只能用作用户数据的默认排序规则。The database collation is retained, but is only used as the default collation for user data. 默认情况下,数据库排序规则等同于 model 数据库排序规则,但在非包含数据库中,用户可通过 CREATEALTER DATABASE 命令更改数据库排序规则。By default, the database collation is equal to the model database collation, but can be changed by the user through a CREATE or ALTER DATABASE command as with non-contained databases.

新关键字 CATALOG_DEFAULT适用于 COLLATE 子句。A new keyword, CATALOG_DEFAULT, is available in the COLLATE clause. 此关键字用作包含数据库和非包含数据库中当前元数据排序规则的快捷方式。This is used as a shortcut to the current collation of metadata in both contained and non-contained databases. 换言之,在非包含数据库中, CATALOG_DEFAULT 将返回当前的数据库排序规则,因为元数据是按数据库排序规则排列的。That is, in a non-contained database, CATALOG_DEFAULT will return the current database collation, since metadata is collated in the database collation. 在包含数据库中,这两个值可能是不同的,因为用户可以更改数据库排序规则,以使其不同于目录排序规则。In a contained database, these two values may be different, since the user can change the database collation so that it does not match the catalog collation.

下表总结了非包含数据库和包含数据库中各个对象的行为:The behavior of various objects in both non-contained and contained databases is summarized in this table:

Item 非包含数据库Non-Contained Database 包含数据库Contained Database
用户数据(默认)User Data (default) COLLATEDATABASE_DEFAULT COLLATEDATABASE_DEFAULT
临时数据(默认)Temp Data (default) TempDB 排序规则TempDB Collation COLLATEDATABASE_DEFAULT
元数据Metadata DATABASE_DEFAULT/CATALOG_DEFAULTDATABASE_DEFAULT / CATALOG_DEFAULT CATALOG_DEFAULTCATALOG_DEFAULT
临时元数据Temporary Metadata TempDB 排序规则tempdb Collation CATALOG_DEFAULTCATALOG_DEFAULT
变量Variables 实例排序规则Instance Collation CATALOG_DEFAULTCATALOG_DEFAULT
Goto 标签Goto Labels 实例排序规则Instance Collation CATALOG_DEFAULTCATALOG_DEFAULT
游标名称Cursor Names 实例排序规则Instance Collation CATALOG_DEFAULTCATALOG_DEFAULT

通过上述临时表示例可以看出,此排序规则行为使大多数临时表不再需要使用显式 COLLATE 子句。If we temp table example previously described, we can see that this collation behavior eliminates the need for an explicit COLLATE clause in most temp table uses. 在包含数据库中,即使数据库和实例采用不同的排序规则,此代码现在也可以正常运行:In a contained database, this code now runs without error, even if the database and instance collations differ:

CREATE TABLE T1 (T1_txt nvarchar(max)) ;  
GO  
CREATE TABLE #T2 (T2_txt nvarchar(max));  
GO  
SELECT T1_txt, T2_txt  
FROM T1   
JOIN #T2   
    ON T1.T1_txt = #T2.T2_txt ;  

之所以能够正常运行,是因为 T1_txtT2_txt 都按包含数据库的数据库排序规则排列。This works because both T1_txt and T2_txt are collated in the database collation of the contained database.

跨越包含和非包含上下文Crossing Between Contained and Uncontained Contexts

只要包含数据库中的会话仍处于包含状态,就必须保留在它所连接到的数据库内。As long as a session in a contained database remains contained, it must remain within the database to which it connected. 此时的行为非常简单。In this case the behavior is very straightforward. 但是,如果会话跨越包含和非包含上下文,其行为就会变得比较复杂,因为必须将两组规则联系起来。But if a session crosses between contained and non-contained contexts, the behavior becomes more complex, since the two sets of rules must be bridged. 这种情况可能出现在部分包含数据库中,因为用户可对另一个数据库应用 USEThis can happen in a partially-contained database, since a user may USE to another database. 在此情况下,排序规则之间的差异按以下原则处理。In this case, the difference in collation rules is handled by the following principle.

  • 批处理的排序规则行为由开始执行批处理的数据库决定。The collation behavior for a batch is determined by the database in which the batch begins.

请注意,此决定是在发出任何命令之前做出的,包括最初的 USENote that this decision is made before any commands are issued, including an initial USE. 也即,如果批处理是在包含数据库中开始执行的,而应用于非包含数据库的第一个命令是 USE ,则仍对该批处理应用包含数据库的排序规则行为。That is, if a batch begins in a contained database, but the first command is a USE to a non-contained database, the contained collation behavior will still be used for the batch. 鉴于此,引用(例如对变量的引用)可能会产生多种可能的结果:Given this, a reference to a variable, for example, may have multiple possible outcomes:

  • 引用可能只找到一个匹配项。The reference may find exactly one match. 在此情况下,该引用可正常操作。In this case, the reference will work without error.

  • 引用在当前排序规则中可能找不到匹配项,而原本存在匹配项。The reference may not find a match in the current collation where there was one before. 这将引发一个错误,指示该变量不存在,即使已明确创建该变量也不例外。This will raise an error indicating that the variable does not exist, even though it was apparently created.

  • 引用可能会找到多个原本不同的匹配项。The reference may find multiple matches that were originally distinct. 这也将引发错误。This will also raise an error.

我们将用几个示例加以说明。We'll illustrate this with a few examples. 在这些示例中,假定存在一个名为 MyCDB 的部分包含数据库;其数据库排序规则设置为默认排序规则 Latin1_General_100_CI_AS_WS_KS_SCFor these we assume there is a partially-contained database named MyCDB with its database collation set to the default collation, Latin1_General_100_CI_AS_WS_KS_SC. 我们假设实例排序规则为 Latin1_General_100_CS_AS_WS_KS_SCWe assume that the instance collation is Latin1_General_100_CS_AS_WS_KS_SC. 两个排序规则的区别只在于是否区分大小写。The two collations differ only in case sensitivity.

示例 1Example 1

下面的示例演示引用只找到一个匹配项的情况。The following example illustrates the case where the reference finds exactly one match.

USE MyCDB;  
GO  
  
CREATE TABLE #a(x int);  
INSERT INTO #a VALUES(1);  
GO  
  
USE master;  
GO  
  
SELECT * FROM #a;  
GO  
  
Results:  
  

下面是结果集:Here is the result set.

x  
-----------  
1  

在此示例中,标识的 #a 同时绑定在不区分大小写的目录排序规则和区分大小写的实例排序规则中,代码可以正常运行。In this case, the identified #a binds in both the case-insensitive catalog collation and the case-sensitive instance collation, and the code works.

示例 2Example 2

下面的示例演示当前排序规则中原本有匹配项而引用却找不到匹配项的情况。The following example illustrates the case where the reference does not find a match in the current collation where there was one before.

USE MyCDB;  
GO  
  
CREATE TABLE #a(x int);  
INSERT INTO #A VALUES(1);  
GO  

在此示例中,#A 绑定到不区分大小写的默认排序规则中的 #a,而且插入部分可正常运行。Here, the #A binds to #a in the case-insensitive default collation, and the insert works,

下面是结果集:Here is the result set.

(1 row(s) affected)  

但如果继续运行脚本...But if we continue the script...

USE master;  
GO  
  
SELECT * FROM #A;  
GO  

将显示一个错误,因为我们尝试绑定到区分大小写的实例排序规则中的 #A;We get an error trying to bind to #A in the case-sensitive instance collation;

下面是结果集:Here is the result set.

消息 208,级别 16,状态 0,第 2 行Msg 208, Level 16, State 0, Line 2

对象名“#A”无效。Invalid object name '#A'.

示例 3Example 3

下面的示例演示引用找到多个原本不同的匹配项的情况。The following example illustrates the case where the reference finds multiple matches that were originally distinct. 首先,我们在 tempdb (与实例具有相同的区分大小写的排序规则)中开始并执行以下代码。First, we start in tempdb (which has the same case-sensitive collation as our instance) and execute the following statements.

USE tempdb;  
GO  
  
CREATE TABLE #a(x int);  
GO  
CREATE TABLE #A(x int);  
GO  
INSERT INTO #a VALUES(1);  
GO  
INSERT INTO #A VALUES(2);  
GO  

此操作成功,因为采用此排序规则的表是不同的。This succeeds, since the tables are distinct in this collation:

下面是结果集:Here is the result set.

(1 row(s) affected)  
(1 row(s) affected)  

但是,一旦进入包含数据库,我们就会发现无法再绑定到这些表。If we move into our contained database, however, we find that we can no longer bind to these tables.

USE MyCDB;  
GO  
SELECT * FROM #a;  
GO  

下面是结果集:Here is the result set.

消息 12800,级别 16,状态 1,第 2 行Msg 12800, Level 16, State 1, Line 2

对临时表名称“#a”的引用不明确,无法解析。The reference to temp table name '#a' is ambiguous and cannot be resolved. 可能的候选项是“#a”和“#A”。Possible candidates are '#a' and '#A'.

结束语Conclusion

与非包含数据库相比,包含数据库的排序规则行为略有不同。The collation behavior of contained databases differs subtly from that in non-contained databases. 此行为通常是有益的,因为它可以提供独立而简单的实例。This behavior is generally beneficial, providing instance-independence and simplicity. 某些用户可能会遇到问题,特别是当会话同时访问包含数据库和非包含数据库时。Some users may have issues, particularly when a session accesses both contained and non-contained databases.

另请参阅See Also

包含的数据库Contained Databases