nchar and nvarchar (Transact-SQL)
Character data types that are either fixed-length, nchar, or variable-length, nvarchar. Starting with SQL Server 2012 (11.x), when a Supplementary Character (SC) enabled collation is used, these data types store the full range of Unicode character data and use the UTF-16 character encoding. If a non-SC collation is specified, then these data types store only the subset of character data supported by the UCS-2 character encoding.
nchar [ ( n ) ]
Fixed-length string data. n defines the string length in byte-pairs and must be a value from 1 through 4,000. The storage size is two times n bytes. For UCS-2 encoding, the storage size is two times n bytes and the number of characters that can be stored is also n. For UTF-16 encoding, the storage size is still two times n bytes but the number of characters that can be stored may be smaller than n because Supplementary Characters use two byte-pairs (also called surrogate-pair). The ISO synonyms for nchar are national char and national character.
nvarchar [ ( n | max ) ]
Variable-length string data. n defines the string length in byte-pairs and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^30-1 characters (2 GB). The storage size is two times n bytes + 2 bytes. For UCS-2 encoding, the storage size is two times n bytes + 2 bytes and the number of characters that can be stored is also n. For UTF-16 encoding, the storage size is still two times n bytes + 2 bytes but the number of characters that can be stored may be smaller than n because Supplementary Characters use two byte-pairs (also called surrogate-pair). The ISO synonyms for nvarchar are national char varying and national character varying.
A common misconception is to think that NCHAR(n) and NVARCHAR(n), the n defines the number of characters. But in NCHAR(n) and NVARCHAR(n) the n defines the string length in byte-pairs (0-4,000). n never defines numbers of characters that can be stored. This is similar to the definition of CHAR(n) and VARCHAR(n).
The misconception happens because when using characters defined in the Unicode range 0-65,535, one character can be stored per each byte-pair. However, in higher Unicode ranges (65,536-1,114,111) one character may use two byte-pairs. For example, in a column defined as NCHAR(10), the Database Engine can store 10 characters that use one byte-pair (Unicode range 0-65,535), but less than 10 characters when using two byte-pairs (Unicode range 65,536-1,114,111). For more information about Unicode storage and character ranges, see Storage differences between UTF-8 and UTF-16.
When n is not specified in a data definition or variable declaration statement, the default length is 1. When n is not specified with the CAST function, the default length is 30.
If you use nchar or nvarchar, we recommend to:
- Use nchar when the sizes of the column data entries are consistent.
- Use nvarchar when the sizes of the column data entries vary considerably.
- Use nvarchar(max) when the sizes of the column data entries vary considerably, and the string length might exceed 4,000 byte-pairs.
sysname is a system-supplied user-defined data type that is functionally equivalent to nvarchar(128), except that it is not nullable. sysname is used to reference database object names.
Objects that use nchar or nvarchar are assigned the default collation of the database unless a specific collation is assigned using the COLLATE clause.
SET ANSI_PADDING is always ON for nchar and nvarchar. SET ANSI_PADDING OFF does not apply to the nchar or nvarchar data types.
Prefix a Unicode character string constants with the letter N to signal UCS-2 or UTF-16 input, depending on whether an SC collation is used or not. Without the N prefix, the string is converted to the default code page of the database that may not recognize certain characters. Starting with SQL Server 2019 preview, when a UTF-8 enabled collation is used, the default code page is capable of storing UNICODE UTF-8 character set.
When prefixing a string constant with the letter N, the implicit conversion will result in a UCS-2 or UTF-16 string if the constant to convert does not exceed the max length for the nvarchar string data type (4,000). Otherwise, the implicit conversion will result in a large-value nvarchar(max).
Each non-null varchar(max) or nvarchar(max) column requires 24 bytes of additional fixed allocation, which counts against the 8,060-byte row limit during a sort operation. These additional bytes can create an implicit limit to the number of non-null varchar(max) or nvarchar(max) columns in a table. No special error is provided when the table is created (beyond the usual warning that the maximum row size exceeds the allowed maximum of 8,060 bytes) or at the time of data insertion. This large row size can cause errors (such as error 512) that users may not anticipate during some normal operations. Two examples of operations are a clustered index key update, or sorts of the full column set.
Converting Character Data
For information about converting character data, see char and varchar (Transact-SQL).
ALTER TABLE (Transact-SQL)
CAST and CONVERT (Transact-SQL)
CREATE TABLE (Transact-SQL)
Data Types (Transact-SQL)
DECLARE @local_variable (Transact-SQL)
SET ANSI_PADDING (Transact-SQL)
SET @local_variable (Transact-SQL)
Collation and Unicode Support
Single-Byte and Multibyte Character Sets