Collation and Unicode SupportCollation and Unicode Support

適用於: 是SQL Server 是Azure SQL Database 是Azure SQL 資料倉儲 是平行處理資料倉儲 APPLIES TO: yesSQL Server yesAzure SQL Database yesAzure SQL Data Warehouse yesParallel Data Warehouse

[SQL Server]SQL Server 中的定序會提供資料的排序規則、大小寫和區分腔調字屬性。Collations in [SQL Server]SQL Server provide sorting rules, case, and accent sensitivity properties for your data. 與字元資料類型 (例如 charvarchar ) 搭配使用的定序會指示字碼頁,以及可針對該資料類型表示的對應字元。Collations that are used with character data types such as char and varchar dictate the code page and corresponding characters that can be represented for that data type. 不論您是安裝新的 [SQL Server]SQL Server 執行個體、還原資料庫備份,還是將伺服器連接至用戶端資料庫,都請務必了解您要使用之資料的地區設定需求、排序次序和區分大小寫與腔調字。Whether you are installing a new instance of [SQL Server]SQL Server, restoring a database backup, or connecting server to client databases, it is important that you understand the locale requirements, sorting order, and case and accent sensitivity of the data that you are working with. 若要列出您的 [SQL Server]SQL Server執行個體所提供的定序,請參閱 sys。fn_helpcollations (Transact-SQL)To list the collations available on your instance of [SQL Server]SQL Server, see sys.fn_helpcollations (Transact-SQL).

當您針對伺服器、資料庫、資料行或運算式選取定序時,就是將某些特性指派給資料,而這些特性會影響資料庫中許多作業的結果。When you select a collation for your server, database, column, or expression, you are assigning certain characteristics to your data that affects the results of many operations in the database. 例如,當您使用 ORDER BY 來建構查詢時,結果集排序次序可能會相依於套用至資料庫的定序,或在查詢運算式層級指定於 COLLATE 子句中的定序。For example, when you construct a query by using ORDER BY, the sort order of your result set might depend on the collation that is applied to the database or dictated in a COLLATE clause at the expression level of the query.

為了有效運用 [SQL Server]SQL Server中的定序支援,您必須了解本主題中定義的詞彙,以及它們與資料特性的關聯。To best use collation support in [SQL Server]SQL Server, you must understand the terms that are defined in this topic, and how they relate to the characteristics of your data.

定序詞彙Collation Terms

定序Collation

定序指定位元模式,代表資料集的每一個字元。A collation specifies the bit patterns that represent each character in a data set. 定序也可以決定排序和比較資料的規則。Collations also determine the rules that sort and compare data. [SQL Server]SQL Server 支援在單一資料庫中儲存具備不同定序的物件。supports storing objects that have different collations in a single database. 對於非 Unicode 資料行,定序設定則指定資料的字碼頁以及可代表的字元。For non-Unicode columns, the collation setting specifies the code page for the data and which characters can be represented. 在非 Unicode 資料行之間移動的資料必須從來源字碼頁轉換成目的地字碼頁。Data that is moved between non-Unicode columns must be converted from the source code page to the destination code page.

Transact-SQLTransact-SQL 陳述式在各有不同定序設定的不同資料庫內容中執行時,陳述式的結果可能不同。 Transact-SQLTransact-SQL statement results can vary when the statement is run in the context of different databases that have different collation settings. 如果可能的話,請針對組織使用標準化定序。If it is possible, use a standardized collation for your organization. 這樣您就不需要在每一個字元或 Unicode 運算式中明確指定定序。This way, you do not have to explicitly specify the collation in every character or Unicode expression. 如果您必須使用有不同定序和字碼頁設定的物件,則在編寫查詢程式碼時,必須考量定序優先順序的規則。If you must work with objects that have different collation and code page settings, code your queries to consider the rules of collation precedence. 如需詳細資訊,請參閱 定序優先順序 (Transact-SQL)For more information, see Collation Precedence (Transact-SQL).

與定序相關聯的選項是區分大小寫、區分腔調字、區分假名、區分全半形和區分變化選取器 (Variation Selector)。The options associated with a collation are case sensitivity, accent sensitivity, Kana-sensitivity, width sensitivity, variation-selector-sensitivity. SQL Server 2019 預覽SQL Server 2019 preview 引進一個適用於 UTF-8 編碼的額外選項。introduces an additional option for UTF-8 encoding. 這些選項的指定方式是將它們附加至定序名稱。These options are specified by appending them to the collation name. 例如,此定序 Japanese_Bushu_Kakusu_100_CS_AS_KS_WS_UTF8 區分大小寫、區分腔調字、區分假名、區分全半形和 UTF-8 編碼。For example, this collation Japanese_Bushu_Kakusu_100_CS_AS_KS_WS_UTF8 is case-sensitive, accent-sensitive, Kana-sensitive, width-sensitive, and UTF-8 encoded. 例如,此定序 Japanese_Bushu_Kakusu_140_CI_AI_KS_WS_VSS 不區分大小寫、不區分腔調字、區分假名、區分全半形和區分變化選取器,並使用非 Unicode 編碼。As another example, this collation Japanese_Bushu_Kakusu_140_CI_AI_KS_WS_VSS is case-insensitive, accent-insensitive, Kana-sensitive, width-sensitive, variation-selector-sensitive and uses non-Unicode encoding. 下表描述與這些不同選項相關聯的行為。The following table describes the behavior associated with these various options.

選項Option DescriptionDescription
區分大小寫 (_CS)Case-sensitive (_CS) 區分大寫和小寫字母。Distinguishes between uppercase and lowercase letters. 如果選取此選項,小寫字母會排序在大寫字母的前面。If selected, lowercase letters sort ahead of their uppercase versions. 如果未選取此選項,定序就不會區分大小寫。If this option is not selected, the collation is case-insensitive. 亦即,在排序用途上, [SQL Server]SQL Server 會將大寫和小寫字母視為相同。That is, [SQL Server]SQL Server considers the uppercase and lowercase versions of letters to be identical for sorting purposes. 指定 _CI,就可以明確地選取不區分大小寫。You can explicitly select case insensitivity by specifying _CI.
區分腔調字 (_AS)Accent-sensitive (_AS) 區分有腔調和無腔調的字元。Distinguishes between accented and unaccented characters. 例如,'a' 不等於 'ấ'。For example, 'a' is not equal to 'ấ'. 如果未選取此選項,定序就不會區分腔調。If this option is not selected, the collation is accent-insensitive. 亦即,在排序用途上, [SQL Server]SQL Server 會將有腔調和無腔調字母視為相同。That is, [SQL Server]SQL Server considers the accented and unaccented versions of letters to be identical for sorting purposes. 指定 _AI,就可以明確地選取不區分腔調字。You can explicitly select accent insensitivity by specifying _AI.
區分假名 (_KS)Kana-sensitive (_KS) 區分兩種類型的日文假名字元:平假名和片假名。Distinguishes between the two types of Japanese kana characters: Hiragana and Katakana. 如果未選取此選項,定序就不會區分假名。If this option is not selected, the collation is Kana-insensitive. 亦即,在排序用途上, [SQL Server]SQL Server 會將平假名和片假名視為相同。That is, [SQL Server]SQL Server considers Hiragana and Katakana characters to be equal for sorting purposes. 省略此選項,是指定不區分假名的唯一方法。Omitting this option is the only method of specifying Kana-insensitivity.
區分全半形 (_WS)Width-sensitive (_WS) 區分全形與半形字元。Distinguishes between full-width and half-width characters. 如果未選取此選項,在排序用途上, [SQL Server]SQL Server 會將相同字元的全形和半形表示法視為相同。If this option is not selected, [SQL Server]SQL Server considers the full-width and half-width representation of the same character to be identical for sorting purposes. 省略此選項,是指定不區分全形與半形的唯一方法。Omitting this option is the only method of specifying width-insensitivity.
區分變化選取器 (_VSS)Variation-selector-sensitive (_VSS) SQL Server 2017 (14.x)SQL Server 2017 (14.x)中首次引進如何區分日文定序 Japanese_Bushu_Kakusu_140 和 Japanese_XJIS_140 中的各種不同表意字元變化選取器。Distinguishes between various ideographic variation selectors in Japanese collations Japanese_Bushu_Kakusu_140 and Japanese_XJIS_140 first introduced in SQL Server 2017 (14.x)SQL Server 2017 (14.x). 變化序列是由基底字元加上額外的變化選取器所組成。A variation sequence consists of a base character plus an additional variation selector. 如果未選取此 _VSS 選項,則定序不區分變化選取器,而且比較時不會考慮變化選取器。If this _VSS option is not selected, the collation is variation selector insensitive, and the variation selector is not considered in the comparison. 換句話說,SQL Server 基於排序目的,會將建置在相同基底字元但使用不同變化選取器的字元視為相同。That is, SQL Server considers characters built upon the same base character with differing variation selectors to be identical for sorting purposes. 另請參閱 Unicode Ideographic Variation DatabaseSee also Unicode Ideographic Variation Database.

全文檢索搜尋索引不支援區分 Variation Selector (_VSS) 定序。Variation selector sensitive (_VSS) collations are not supported in Full-text search indexes. 全文檢索搜尋索引支援只區分腔調字 (_AS)、區分假名 (_KS) 和區分全半形 (_WS) 選項。Full-text search indexes support only Accent-Sensitive (_AS), Kana-sensitive (_KS), and Width-sensitive (_WS) options. SQL Server XML 和 CLR 引擎不支援 (_VSS) Variation Selector。SQL Server XML and CLR engines do not support (_VSS) Variation selectors.
UTF-8 (_UTF8)UTF-8 (_UTF8) 讓 UTF-8 編碼資料儲存至 [SQL Server]SQL ServerEnables UTF-8 encoded data to be stored in [SQL Server]SQL Server. 如果未選取此選項,則 [SQL Server]SQL Server 會使用適用資料類型的預設非 Unicode 編碼格式。If this option is not selected, [SQL Server]SQL Server uses the default non-Unicode encoding format for the applicable data types.

[SQL Server]SQL Server 支援下列定序集:supports the following collation sets:

Windows 定序Windows collations

Windows 定序會定義規則,以便依據相關聯的 Windows 系統地區設定儲存字元資料。Windows collations define rules for storing character data that are based on an associated Windows system locale. 如果是 Windows 定序,非 Unicode 資料的比較是使用與 Unicode 資料相同的演算法來實作。For a Windows collation, comparison of non-Unicode data is implemented by using the same algorithm as Unicode data. 基本 Windows 定序規則會指定套用字典排序時使用的字母或語言,以及用來儲存非 Unicode 字元資料的字碼頁。The base Windows collation rules specify which alphabet or language is used when dictionary sorting is applied, and the code page that is used to store non-Unicode character data. Unicode 和非 Unicode 排序都與特定 Windows 版本中的字串比較相容。Both Unicode and non-Unicode sorting are compatible with string comparisons in a particular version of Windows. 如此可讓 [SQL Server]SQL Server 中的各種資料類型取得一致性,同時讓開發人員能夠使用與 [SQL Server]SQL Server 相同的規則,在應用程式中排序字串。This provides consistency across data types within [SQL Server]SQL Server, and it also lets developers sort strings in their applications by using the same rules that are used by [SQL Server]SQL Server. 如需詳細資訊,請參閱 Windows 定序名稱 (Transact-SQL)For more information, see Windows Collation Name (Transact-SQL).

二進位定序Binary collations

二進位定序依據地區設定和資料類型所定義的字碼值順序來排序資料。Binary collations sort data based on the sequence of coded values that are defined by the locale and data type. 它們會區分大小寫。They are case sensitive. [SQL Server]SQL Server 中的二進位定序會定義所使用的地區設定和 ANSI 字碼頁。A binary collation in [SQL Server]SQL Server defines the locale and the ANSI code page that is used. 這會強制使用二進位排序次序。This enforces a binary sort order. 因為它們相對而言較為簡單,所以二進位定序可提升應用程式效能。Because they are relatively simple, binary collations help improve application performance. 如果是非 Unicode 資料類型,資料比較是依據 ANSI 字碼頁中所定義的字碼元素。For non-Unicode data types, data comparisons are based on the code points that are defined in the ANSI code page. 如果是 Unicode 資料類型,資料比較則是依據 Unicode 字碼指標。For Unicode data types, data comparisons are based on the Unicode code points. 如果是 Unicode 資料類型的二進位定序,在資料排序時不會考量地區設定。For binary collations on Unicode data types, the locale is not considered in data sorts. 例如,Latin_1_General_BIN 和 Japanese_BIN 用於 Unicode 資料時會產生相同的排序結果。For example, Latin_1_General_BIN and Japanese_BIN yield identical sorting results when they are used on Unicode data.

[SQL Server]SQL Server中有兩種二進位定序:較舊的 BIN 定序和較新的 BIN2 定序。There are two types of binary collations in [SQL Server]SQL Server; the older BIN collations and the newer BIN2 collations. BIN2 定序中,所有字元都是根據其字碼指標排序。In a BIN2 collation all characters are sorted according to their code points. BIN 定序中,只有第一個字元是根據字碼指標排序,剩餘字元則是根據其位元組值排序。In a BIN collation only the first character is sorted according to the code point, and remaining characters are sorted according to their byte values. (因為 Intel 平台是 Little Endian 架構,所以 Unicode 字碼字元一律以位元組交換的方式儲存)。(Because the Intel platform is a little endian architecture, Unicode code characters are always stored byte-swapped.)

SQL Server 定序SQL Server collations

[SQL Server]SQL Server 定序 (SQL_*) 會提供與 [SQL Server]SQL Server 舊版的排序次序相容性。collations (SQL_*) provide sort order compatibility with earlier versions of [SQL Server]SQL Server. 非 Unicode 資料的字典排序規則與 Windows 作業系統提供的任何排序常式不相容。The dictionary sorting rules for non-Unicode data are incompatible with any sorting routine that is provided by Windows operating systems. 不過,Unicode 資料的排序與 Windows 排序規則的特定版本相容。However, sorting Unicode data is compatible with a particular version of Windows sorting rules. 因為 [SQL Server]SQL Server 定序對非 Unicode 和 Unicode 資料使用不同的比較規則,所以您會看到相同資料的比較有不同的結果,這取決於基礎資料類型而定。Because [SQL Server]SQL Server collations use different comparison rules for non-Unicode and Unicode data, you see different results for comparisons of the same data, depending on the underlying data type. 如需詳細資訊,請參閱 SQL Server 定序名稱 (Transact-SQL)For more information, see SQL Server Collation Name (Transact-SQL).

注意

當您升級 [SQL Server]SQL Server 的英文執行個體時,可基於與現有 [SQL Server]SQL Server 執行個體的相容性而指定 [SQL Server]SQL Server 定序 (SQL_*)。When you upgrade an English-language instance of [SQL Server]SQL Server, [SQL Server]SQL Server collations (SQL_*) can be specified for compatibility with existing instances of [SQL Server]SQL Server. 因為 [SQL Server]SQL Server 執行個體的預設定序是在安裝期間定義,所以當下列條件成立時,請一定要小心指定定序設定:Because the default collation for an instance of [SQL Server]SQL Server is defined during setup, make sure that you specify collation settings carefully when the following are true:

  • 應用程式的程式碼視先前的 [SQL Server]SQL Server 定序行為而定。Your application code depends on the behavior of previous [SQL Server]SQL Server collations.
  • 您必須儲存反映多國語言的字元資料。You must store character data that reflects multiple languages.

您可以在 [SQL Server]SQL Server執行個體的下列層級設定定序:Setting collations are supported at the following levels of an instance of [SQL Server]SQL Server:

伺服器層級定序Server-level collations

預設伺服器定序是在 [SQL Server]SQL Server 安裝期間設定,因此也會成為系統資料庫和所有使用者資料庫的預設定序。The default server collation is set during [SQL Server]SQL Server setup, and also becomes the default collation of the system databases and all user databases. 請注意,您無法在 [SQL Server]SQL Server 安裝期間選取僅限 Unicode 定序,因為系統不支援將它們當做伺服器層級定序。Note that Unicode-only collations cannot be selected during [SQL Server]SQL Server setup because they are not supported as server-level collations.

將定序指派給伺服器之後,除非匯出所有資料庫物件和資料,並重建 master 資料庫,然後匯入所有資料庫物件和資料,否則無法變更此定序。After a collation has been assigned to the server, you cannot change the collation except by exporting all database objects and data, rebuilding the master database, and importing all database objects and data. 若不變更 [SQL Server]SQL Server執行個體的預設定序,您可以在建立新資料庫或資料庫資料行時,指定想要的定序。Instead of changing the default collation of an instance of [SQL Server]SQL Server, you can specify the desired collation at the time that you create a new database or database column.

資料庫層級定序Database-level collations

建立或修改資料庫時,您可以使用 CREATE DATABASE 或 ALTER DATABASE 陳述式的 COLLATE 子句來指定預設資料庫定序。When a database is created or modified, you can use the COLLATE clause of the CREATE DATABASE or ALTER DATABASE statement to specify the default database collation. 如果未指定定序,則會將伺服器定序指派給資料庫。If no collation is specified, the database is assigned the server collation.

除非變更伺服器的定序,否則無法變更系統資料庫的定序。You cannot change the collation of system databases except by changing the collation for the server.

資料庫定序是用於資料庫中的所有中繼資料,而且是資料庫中使用之所有字串資料行、暫存物件、變數名稱和任何其他字串的預設值。The database collation is used for all metadata in the database, and is the default for all string columns, temporary objects, variable names, and any other strings used in the database. 請注意,如果變更使用者資料庫的定序,則在資料庫中的查詢存取暫存資料表時,可能會發生定序衝突。When you change the collation of a user database, there can be collation conflicts when queries in the database access temporary tables. 暫存資料表一律儲存在 tempdb 系統資料庫中,以使用執行個體的定序。Temporary tables are always stored in the tempdb system database, which uses the collation for the instance. 如果定序導致評估字元資料發生衝突,則比較使用者資料庫與 tempdb 間之字元資料的查詢可能會失敗。Queries that compare character data between the user database and tempdb may fail if the collations cause a conflict in evaluating the character data. 您可以在查詢中指定 COLLATE 子句以解決此問題。You can resolve this by specifying the COLLATE clause in the query. 如需詳細資訊,請參閱 COLLATE (Transact-SQL)For more information, see COLLATE (Transact-SQL).

資料行層級定序Column-level collations

建立或改變資料表時,可以使用 COLLATE 子句來指定每個字元字串資料行的定序。When you create or alter a table, you can specify collations for each character-string column by using the COLLATE clause. 若未指定任何定序,就會將資料庫的預設定序指派給此資料行。If no collation is specified, the column is assigned the default collation of the database.

運算式層級定序Expression-level collations

運算式層級定序是在執行陳述式時設定的,而且它們會影響傳回結果集的方式。Expression-level collations are set when a statement is run, and they affect the way a result set is returned. 這樣可讓 ORDER BY 將結果排序成地區設定特定的結果。This enables ORDER BY sort results to be locale-specific. 使用類似下面的 COLLATE 子句來實作運算式層級定序:Use a COLLATE clause such as the following to implement expression-level collations:

SELECT name FROM customer ORDER BY name COLLATE Latin1_General_CS_AI;    

地區設定Locale

地區設定是一組與某個地點或文化特性相關聯的資訊。A locale is a set of information that is associated with a location or a culture. 這項資訊包括口語的名稱和識別碼、用來撰寫該語言的指令碼及文化習慣。This can include the name and identifier of the spoken language, the script that is used to write the language, and cultural conventions. 定序可與一個或多個地區設定產生關聯。Collations can be associated with one or more locales. 如需詳細資訊,請參閱 Microsoft 指派的地區設定識別碼For more information, see Locale IDs Assigned by Microsoft.

Code PageCode Page

字碼頁是給定的指令碼的已排序字元集,其中每一個字元與數字索引或字碼指標值相關聯。A code page is an ordered set of characters of a given script in which a numeric index, or code point value, is associated with each character. Windows 字碼頁一般稱為 「字元集」 (Character set) 或 charsetA Windows code page is typically referred to as a character set or charset. 字碼頁是用來提供不同 Windows 系統地區設定所使用的字元集和鍵盤配置的支援。Code pages are used to provide support for the character sets and keyboard layouts that are used by different Windows system locales.

Sort OrderSort Order

排序次序會指定如何排序資料值。Sort order specifies how data values are sorted. 這會影響資料比較的結果。This affects the results of data comparison. 資料是使用定序來排序,而且可以使用索引來最佳化。Data is sorted by using collations, and it can be optimized by using indexes.

Unicode 支援Unicode Support

Unicode 是將字碼指標對應到字元的標準用法。Unicode is a standard for mapping code points to characters. 由於 Unicode 主要設計為涵蓋世界上所有語言的字元,因此不需要使用不同的字碼頁來處理不同的字元集。Because it is designed to cover all the characters of all the languages of the world, there is no need for different code pages to handle different sets of characters. 如果您將可反映多種語言的字元資料儲存至 [SQL Server]SQL Server ( SQL Server 2005SQL Server 2005SQL Server 2017SQL Server 2017),則請使用 Unicode (UTF-16) 資料類型 (ncharnvarcharntext),而不要使用非 Unicode 資料類型 (charvarchartext)。If you store character data that reflects multiple languages in [SQL Server]SQL Server ( SQL Server 2005SQL Server 2005 through SQL Server 2017SQL Server 2017), use Unicode (UTF-16) data types (nchar, nvarchar, and ntext) instead of non-Unicode data types (char, varchar, and text). 或者,從 SQL Server 2019 預覽SQL Server 2019 preview 開始,如果使用啟用 UTF-8 的定序 (_UTF8),則先前非 Unicode 資料類型 (charvarchar) 會變成 Unicode (UTF-8) 資料類型。Alternatively, starting with SQL Server 2019 預覽SQL Server 2019 preview, if a UTF-8 enabled collation (_UTF8) is used, then previously non-Unicode data types (char and varchar) become Unicode (UTF-8) data types.

注意

SQL Server 2019 預覽SQL Server 2019 preview 不會變更先前現有 Unicode (UTF-16) 資料類型 (ncharnvarcharntext) 的行為。does not change the behavior of previously existing Unicode (UTF-16) data types (nchar, nvarchar, and ntext).

重要限制會與非 Unicode 資料類型相關聯。Significant limitations are associated with non-Unicode data types. 這是因為非 Unicode 電腦受限於使用單一字碼頁。This is because a non-Unicode computer is limited to use of a single code page. 透過使用 Unicode,您可能會發現效能獲得明顯改善,因為所需要的字碼頁轉換減少。You might experience performance gain by using Unicode because fewer code-page conversions are required. 您必須在資料庫、資料行或運算式層級個別選取 Unicode 定序,因為伺服器層級不支援這些定序。Unicode collations must be selected individually at the database, column, or expression level because they are not supported at the server level.

用戶端所使用的字碼頁是由作業系統設定決定。The code pages that a client uses are determined by the operating system settings. 若要在 Windows XP 作業系統上設定用戶端字碼頁,請使用 [控制台] 中的 [地區設定]To set client code pages on the Windows operating system, use Regional Settings in Control Panel.

當您將資料從伺服器移至用戶端時,舊版用戶端驅動程式可能無法辨識您的伺服器定序。When you move data from a server to a client, your server collation might not be recognized by older client drivers. 這種問題可能會發生在您將資料從 Unicode 伺服器移至非 Unicode 用戶端時。This can occur when you move data from a Unicode server to a non-Unicode client. 此時,最佳選項可能就是升級用戶端作業系統,以便更新基礎系統定序。Your best option might be to upgrade the client operating system so that the underlying system collations are updated. 如果用戶端已經安裝資料庫用戶端軟體,就可以考慮將服務更新套用至資料庫用戶端軟體。If the client has database client software installed, you might consider applying a service update to the database client software.

此外,您也可以嘗試針對伺服器上的資料使用不同的定序。You can also try to use a different collation for the data on the server. 您可以選擇將對應至用戶端字碼頁的定序。Choose a collation that maps to a code page on the client.

若要使用 SQL Server 2017SQL Server 2017 中提供的 UTF-16 定序來改善部分 Unicode 字元的搜尋和排序 (僅限 Windows 定序),您可以選取其中一個增補字元 (_SC) 定序或其中一個版本 140 定序。To use the UTF-16 collations available in SQL Server 2017SQL Server 2017 to improve searching and sorting of some Unicode characters (Windows collations only), you can select either one of the supplementary characters (_SC) collations or one of the version 140 collations.

若要使用 SQL Server 2019 預覽SQL Server 2019 preview 中提供的 UTF-8 定序以改善一些 Unicode 字元的排序和搜尋 (僅限 Windows 定序),您必須選取啟用 UTF-8 編碼的定序 (_UTF8)。To use the UTF-8 collations available in SQL Server 2019 預覽SQL Server 2019 preview to improve searching and sorting of some Unicode characters (Windows collations only), you must select UTF-8 encoding enabled collations(_UTF8).

  • UTF8 旗標可套用至:The UTF8 flag can be applied to:

    • 版本 90 定序Version 90 collations

      注意

      只有在增補字元 (_SC) 或區分變化選取器 (_VSS) 感知定序已存在於此版本時。Only when supplementary characters (_SC) or variation-selector-sensitive (_VSS) aware collations already exist in this version.

    • 版本 100 定序Version 100 collations

    • 版本 140 定序Version 140 collations

  • UTF8 旗標無法套用至:The UTF8 flag cannot be applied to:

    • 不支援增補字元 (_SC) 或區分變化選取器 (_VSS) 的 90 版定序Version 90 collations that don't support supplementary characters (_SC) or variation-selector-sensitive (_VSS)

    • BIN 或 BIN2 二進位定序The BIN or BIN2 binary collations

    • SQL* 定序The SQL* collations

若要評估與使用 Unicode 或非 Unicode 資料類型有關的議題,請測試自己的狀況,在您的環境中衡量效能差異。To evaluate issues that are related to using Unicode or non-Unicode data types, test your scenario to measure performance differences in your environment. 建議您將組織內系統上使用的定序標準化,並盡量部署 Unicode 伺服器和用戶端。It is a good practice to standardize the collation that is used on systems across your organization, and deploy Unicode servers and clients wherever possible.

在許多情況下, [SQL Server]SQL Server 會與其他伺服器或用戶端互動,而且您的組織可能會在應用程式與伺服器執行個體之間使用多重資料存取標準。In many situations, [SQL Server]SQL Server interacts with other servers or clients, and your organization might use multiple data access standards between applications and server instances. [SQL Server]SQL Server 用戶端是兩種主要類型中的其中一種:clients are one of two main types:

  • Unicode 用戶端 ,使用 OLE DB 和開放式資料庫連接 (ODBC) 3.7 版或更新版本。Unicode clients that use OLE DB and Open Database Connectivity (ODBC) version 3.7 or a later version.

  • 非 Unicode 用戶端 ,使用 DB-Library 和 ODBC 3.6 版或更早版本。Non-Unicode clients that use DB-Library and ODBC version 3.6 or an earlier version.

下表將提供搭配 Unicode 和非 Unicode 伺服器的各種組合來使用多國語言資料的相關資訊。The following table provides information about using multilingual data with various combinations of Unicode and non-Unicode servers.

[伺服器]Server 用戶端Client 好處或限制Benefits or Limitations
UnicodeUnicode UnicodeUnicode 因為 Unicode 資料會在整個系統中使用,所以這個狀況可提供最佳效能,並防止所擷取的資料遭到損毀。Because Unicode data is used throughout the system, this scenario provides the best performance and protection from corruption of retrieved data. 這是 ActiveX Data Objects (ADO)、OLE DB 和 ODBC 3.7 版或更新版本的情況。This is the situation with ActiveX Data Objects (ADO), OLE DB, and ODBC version 3.7 or a later version.
UnicodeUnicode 非 UnicodeNon-Unicode 在此狀況中,特別是執行新版作業系統的伺服器與執行舊版 [SQL Server]SQL Server或在舊版作業系統上執行的用戶端之間存在連接,當您將資料移至用戶端電腦時,可能會有一些限制或錯誤。In this scenario, especially with connections between a server that is running a newer operating system and a client that is running an older version of [SQL Server]SQL Server, or on an older operating system, there can be limitations or errors when you move data to a client computer. 伺服器上的 Unicode 資料會嘗試對應至非 Unicode 用戶端上的對應字碼頁,以便轉換資料。Unicode data on the server tries to map to a corresponding code page on the non-Unicode client to convert the data.
非 UnicodeNon-Unicode UnicodeUnicode 這不是使用多國語言資料的理想組態。This is not an ideal configuration for using multilingual data. 您無法將 Unicode 資料寫入非 Unicode 伺服器。You cannot write Unicode data to the non-Unicode server. 當資料傳送到在此伺服器的字碼頁以外的伺服器時,就可能發生問題。Problems are likely to occur when data is sent to servers that are outside the server's code page.
非 UnicodeNon-Unicode 非 UnicodeNon-Unicode 這是多國語言資料的限制狀況。This is a very limiting scenario for multilingual data. 您只能使用單一字碼頁。You can use only a single code page.

增補字元Supplementary Characters

[SQL Server]SQL Server 提供資料類型 (例如 Ncharnvarchar) 將 Unicode (UTF-16) 資料儲存至任何定序下方,以及提供資料類型 (例如 charvarchar) 將 Unicode (UTF-8) 資料儲存至啟用 UTF-8 的定序 (_UTF8) 下方。provides data types such as nchar and nvarchar to store Unicode (UTF-16) data under any collation, and data types such as char and varchar to store Unicode (UTF-8) data under UTF-8 enabled collations (_UTF8). 這些資料類型分別使用稱為 UTF-16UTF-8 的格式來編碼文字。These data types encode text in a format called UTF-16 and UTF-8 respectively. Unicode Consortium 會為每個字元配置唯一的字碼指標,其值介於 0x0000 到 0x10FFFF 的範圍。The Unicode Consortium allocates each character a unique codepoint, which is a value in the range 0x0000 to 0x10FFFF. 最常用的字元具有可在記憶體和磁碟上納入 8 位元或 16 位元單字的字碼指標值,但是字碼指標值大於 0xFFFF 的字元需要兩個到四個連續 8 位元單字 (UTF-8) 或兩個連續 16 位元單字 (UTF-16)。The most frequently used characters have codepoint values that fit into a 8-bit or 16-bit word in memory and on disk, but characters with codepoint values larger than 0xFFFF require two to four consecutive 8-bit words (UTF-8), or two consecutive 16-bit words (UTF-16). 這些字元稱為「增補字元」,而其他連續的 8 位元或 16 位元單字則稱為「代理字組」。These characters are called supplementary characters, and the additional consecutive 8-bit or 16-bit words are called surrogate pairs.

SQL Server 2012 (11.x)SQL Server 2012 (11.x) 中引進的全新系列增補字元 (_SC) 定序可以與 Ncharnvarcharsql_variant 資料類型搭配使用。Introduced in SQL Server 2012 (11.x)SQL Server 2012 (11.x), a new family of supplementary character (_SC) collations can be used with the data types nchar, nvarchar, and sql_variant. 例如: Latin1_General_100_CI_AS_SCJapanese_Bushu_Kakusu_100_CI_AS_SC(如果使用日文定序的話)。For example: Latin1_General_100_CI_AS_SC, or if using a Japanese collation, Japanese_Bushu_Kakusu_100_CI_AS_SC.

SQL Server 2019 預覽SQL Server 2019 preview 會將增補字元支援延伸到具有新啟用 UTF-8 定序 (_UTF8) 的資料類型 charvarcharextends supplementary character support to the data types char and varchar with the new UTF-8 enabled collations (_UTF8).

SQL Server 2014 (12.x)SQL Server 2014 (12.x) 開始,所有新定序都會自動支援增補字元。Starting in SQL Server 2014 (12.x)SQL Server 2014 (12.x), all new collations automatically support supplementary characters.

如果您使用增補字元:If you use supplementary characters:

  • 增補字元可用於 90 (含) 以上定序版本的排序及比較作業。Supplementary characters can be used in ordering and comparison operations in collation versions 90 or greater.

  • 所有版本 100 定序都支援含有增補字元的語言排序。All version 100 collations support linguistic sorting with supplementary characters.

  • 不支援在中繼資料內使用增補字元,例如資料庫物件的名稱。Supplementary characters are not supported for use in metadata, such as in names of database objects.

  • 無法啟用搭配使用定序與增補字元 (_SC) 的資料庫,進行 [SQL Server]SQL Server 複寫。Databases that use collations with supplementary characters (_SC), cannot be enabled for [SQL Server]SQL Server Replication. 這是因為針對複寫所建立的某些系統資料表和預存程序使用舊版 ntext 資料類型,其不支援增補字元。This is because some of the system tables and stored procedures that are created for replication, use the legacy ntext data type, which does not support supplementary characters.

  • SC 旗標可套用至:The SC flag can be applied to:

    • 版本 90 定序Version 90 collations

    • 版本 100 定序Version 100 collations

  • SC 旗標無法套用至:The SC flag cannot be applied to:

    • 80 版本的非版本控制 Windows 定序Version 80 non-versioned Windows collations

    • BIN 或 BIN2 二進位定序The BIN or BIN2 binary collations

    • SQL* 定序The SQL* collations

    • 版本 140 定序 (這些定序已支援增補字元,因此不需要 SC 旗標)Version 140 collations (these don't need the SC flag as they already support supplementary characters)

下表將比較當某些字串函式和字串運算子使用增補字元搭配或不搭配增補字元感知 (SCA) 定序時,這些函式和運算子的行為:The following table compares the behavior of some string functions and string operators when they use supplementary characters with and without a supplementary character-aware (SCA) collation:

字串函數或運算子String Function or Operator 搭配增補字元感知 (SCA) 定序With a Supplementary Character-Aware (SCA) Collation 不搭配 SCA 定序Without an SCA Collation
CHARINDEXCHARINDEX

LENLEN

PATINDEXPATINDEX
將 UTF-16 Surrogate 字組視為單一字碼指標。The UTF-16 surrogate pair is counted as a single codepoint. 將 UTF-16 Surrogate 字組視為兩個字碼指標。The UTF-16 surrogate pair is counted as two codepoints.
LEFTLEFT

REPLACEREPLACE

REVERSEREVERSE

RIGHTRIGHT

SUBSTRINGSUBSTRING

STUFFSTUFF
這些函數會將每個 Surrogate 字組視為單一字碼指標,並且如預期方式運作。These functions treat each surrogate pair as a single codepoint and work as expected. 這些函數可能會將 Surrogate 字組拆開,造成無法預期的結果。These functions may split any surrogate pairs and lead to unexpected results.
NCHARNCHAR 傳回對應至指定之 Unicode 字碼指標值 (在 0 到 0x10FFFF 的範圍內) 的字元。Returns the character corresponding to the specified Unicode codepoint value in the range 0 to 0x10FFFF. 如果指定的值位於 0 到 0xFFFF 的範圍內,就會傳回單一字元。If the value specified lies in the range 0 through 0xFFFF, one character is returned. 若為更高的值,則傳回對應的 Surrogate。For higher values, the corresponding surrogate is returned. 高於 0xFFFF 的值會傳回 NULL 而非對應的 Surrogate。A value higher than 0xFFFF returns NULL instead of the corresponding surrogate.
UNICODEUNICODE 傳回 UTF-16 字碼指標 (在 0 到 0x10FFFF 的範圍內)。Returns a UTF-16 codepoint in the range 0 through 0x10FFFF. 傳回 UCS-2 字碼指標 (在 0 到 0xFFFF 的範圍內)。Returns a UCS-2 codepoint in the range 0 through 0xFFFF.
符合單一萬用字元Match One Character Wildcard

萬用字元 - 不相符的字元Wildcard - Character(s) Not to Match
增補字元支援所有萬用字元作業。Supplementary characters are supported for all wildcard operations. 增補字元不支援這些萬用字元作業,Supplementary characters are not supported for these wildcard operations. 但支援其他萬用字元運算子。Other wildcard operators are supported.

GB18030 支援GB18030 Support

GB18030 是一種獨立標準,可供中華人民共和國進行中文字元的編碼。GB18030 is a separate standard used in the People's Republic of China for encoding Chinese characters. 在 GB18030 中,字元的長度可以是 1、2 或 4 個位元組。In GB18030, characters can be 1, 2, or 4 bytes in length. [SQL Server]SQL Server 可在 GB18030 編碼的字元從用戶端應用程式進入伺服器時加以辨識,並在轉換後以原生模式將其儲存為 Unicode 字元,藉以支援這種編碼的字元。provides support for GB18030-encoded characters by recognizing them when they enter the server from a client-side application and converting and storing them natively as Unicode characters. 當 GB18030 編碼的字元儲存在伺服器中後,任何後續作業都會將其視為 Unicode 字元。After they are stored in the server, they are treated as Unicode characters in any subsequent operations. 您可以使用任何中文定序,最好是使用最新的 100 版本。You can use any Chinese collation, preferably the latest 100 version. 所有 _100 層級定序都支援含有 GB18030 字元的語言排序。All _100 level collations support linguistic sorting with GB18030 characters. 如果資料包含增補字元 (Surrogate 字組),您就可以使用 SQL Server 2017SQL Server 2017 所提供的 SC 定序來改善搜尋和排序。If the data includes supplementary characters (surrogate pairs), you can use the SC collations available in SQL Server 2017SQL Server 2017 to improve searching and sorting.

複雜字集支援Complex Script Support

[SQL Server]SQL Server 可以支援輸入、儲存、變更和顯示複雜字集。can support inputting, storing, changing, and displaying complex scripts. 複雜字集包括下列類型:Complex scripts include the following types:

  • 包括由右至左和由左至右兩種文字之組合的字集,如阿拉伯文和英文文字的組合。Scripts that include the combination of both right-to-left and left-to-right text, such as a combination of Arabic and English text.
  • 字元的形狀會隨著位置或是否結合其他字元而不同的字集,例如,阿拉伯文、印度文和泰文字元。Scripts whose characters change shape depending on their position, or when combined with other characters, such as Arabic, Indic, and Thai characters.
  • 泰文之類的語言,因為單字之間不中斷,所以需要用內部字典來辨識單字。Languages such as Thai that require internal dictionaries to recognize words because there are no breaks between them.

[SQL Server]SQL Server 互動的資料庫應用程式必須使用支援複雜字集的控制項。Database applications that interact with [SQL Server]SQL Server must use controls that support complex scripts. Managed 程式碼中所建立的標準 Windows Form 控制項具有複雜字集的功能。Standard Windows form controls that are created in managed code are complex script-enabled.

SQL Server 2017 (14.x)SQL Server 2017 (14.x)Japanese Collations added in SQL Server 2017 (14.x)SQL Server 2017 (14.x)

SQL Server 2017 (14.x)SQL Server 2017 (14.x) 開始,支援新的日文定序系列,可使用不同選項 (_CS、_AS、_KS、_WS、_VSS) 的排列。Starting in SQL Server 2017 (14.x)SQL Server 2017 (14.x), new Japanese collation families are supported, with the permutations of various options (_CS, _AS, _KS, _WS, _VSS).

若要列出這些定序,您可以查詢 SQL Server Database EngineSQL Server Database EngineTo list these collations, you can query the SQL Server Database EngineSQL Server Database Engine:

SELECT Name, Description FROM fn_helpcollations()  
WHERE Name LIKE 'Japanese_Bushu_Kakusu_140%' OR Name LIKE 'Japanese_XJIS_140%'

所有新定序都內建增補字元支援,因此沒有且不需要 SC 旗標。All of the new collations have built-in support for supplementary characters, so none of the new collations have (or need) the SC flag.

資料庫引擎索引、記憶體最佳化資料表、資料行存放區索引和原生編譯的模組都支援這些定序。These collations are supported in Database Engine indexes, memory-optimized tables, columnstore indexes, and natively compiled modules.

工作Task 主題Topic
描述如何設定或變更 SQL Server 執行個體的定序。Describes how to set or change the collation of the instance of SQL Server. 設定或變更伺服器定序Set or Change the Server Collation
描述如何設定或變更使用者資料庫的定序。Describes how to set or change the collation of a user database. 設定或變更資料庫定序Set or Change the Database Collation
描述如何設定或變更資料庫中資料行的定序。Describes how to set or change the collation of a column in the database. 設定或變更資料行定序Set or Change the Column Collation
描述如何傳回伺服器、資料庫或資料行層級的定序資訊。Describes how to return collation information at the server, database, or column level. 檢視定序資訊View Collation Information
描述如何撰寫 Transact-SQL 陳述式,讓它們可以從某種語言移植至另一種語言,或更輕鬆地支援多種語言。Describes how to write Transact-SQL statements that are more portable from one language to another, or support multiple languages more easily. 撰寫國際通用的 Transact-SQL 陳述式Write International Transact-SQL Statements
描述如何變更錯誤訊息的語言,以及日期、時間和貨幣資料之使用和顯示方式的喜好設定。Describes how to change the language of error messages and preferences for how date, time, and currency data are used and displayed. 設定工作階段語言Set a Session Language

SQL Server 最佳做法定序變更 SQL Server Best Practices Collation Change
使用 Unicode 字元格式匯入或匯出資料 (SQL Server) Use Unicode Character Format to Import or Export Data (SQL Server)
SQL Server 最佳做法:移轉至 Unicode - 不再維護"SQL Server Best Practices Migration to Unicode" - No longer maintained
Unicode Consortium 網站Unicode Consortium Web site

另請參閱See Also

自主資料庫定序 Contained Database Collations
選擇建立全文檢索索引時的語言 Choose a Language When Creating a Full-Text Index
sys.fn_helpcollations (Transact-SQL)sys.fn_helpcollations (Transact-SQL)