2.2.1.1 Character Sequences

In all dialects prior to NT LAN Manager, all character sequences were encoded using the OEM character set (extended ASCII). The NT LAN Manager dialect introduced support for Unicode, the use of which is agreed upon during protocol negotiation and session setup. Once Unicode support has been negotiated, the use of Unicode characters is indicated on a per-message basis by setting the SMB_FLAGS2_UNICODE flag in the SMB_Header.Flags2 field. All Unicode characters MUST be in UTF-16LE encoding.
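As an illustration of the per-message indication, the following minimal sketch tests the SMB_FLAGS2_UNICODE bit in a raw header. The function name message_uses_unicode is hypothetical, and the sketch assumes the 32-byte SMB_Header layout defined elsewhere in this document, in which Flags2 is a little-endian USHORT at byte offset 10 and SMB_FLAGS2_UNICODE has the value 0x8000.

```python
import struct

# Assumed value of the Unicode bit in SMB_Header.Flags2.
SMB_FLAGS2_UNICODE = 0x8000

# Assumed byte offset of Flags2 within the 32-byte SMB header:
# Protocol (4) + Command (1) + Status (4) + Flags (1) = 10.
FLAGS2_OFFSET = 10


def message_uses_unicode(header: bytes) -> bool:
    """Return True if this message's strings are expected in UTF-16LE."""
    if len(header) < 32:
        raise ValueError("SMB header must be at least 32 bytes")
    (flags2,) = struct.unpack_from("<H", header, FLAGS2_OFFSET)
    return bool(flags2 & SMB_FLAGS2_UNICODE)
```

Because the flag is carried per message, a receiver would make this check for every message rather than caching the result for the connection.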

In CIFS, character sequences are transmitted over the wire as arrays of either UCHAR (for OEM characters) or WCHAR (for Unicode characters). Throughout this document, null-terminated character sequence fields that can be encoded in either Unicode or OEM characters (depending on the result of Unicode capability negotiation) are labeled as SMB_STRING fields.
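A null-terminated SMB_STRING field might be decoded along the following lines. This is a sketch only: read_smb_string is a hypothetical helper, the OEM code page is assumed to be cp437 purely for illustration (the actual code page depends on the systems involved), and any alignment padding that may precede a Unicode string is not handled.

```python
def read_smb_string(buf: bytes, offset: int, unicode: bool,
                    oem_codec: str = "cp437"):
    """Decode one null-terminated string starting at offset.

    Returns (text, next_offset), where next_offset points just past
    the terminator. oem_codec is an illustrative assumption.
    """
    if unicode:
        # WCHAR array: scan 16-bit units until a 0x0000 terminator.
        end = offset
        while True:
            if end + 2 > len(buf):
                raise ValueError("unterminated Unicode string")
            if buf[end:end + 2] == b"\x00\x00":
                break
            end += 2
        return buf[offset:end].decode("utf-16-le"), end + 2

    # UCHAR array: scan bytes until a single 0x00 terminator.
    # bytes.index raises ValueError if no terminator is present.
    end = buf.index(b"\x00", offset)
    return buf[offset:end].decode(oem_codec), end + 1
```

The unicode argument would normally come from the SMB_FLAGS2_UNICODE flag of the message being parsed.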

String fields that restrict character encoding to OEM characters only, even if Unicode support has been negotiated, are labeled as OEM_STRING. Examples of strings that are never passed in Unicode include:

  • The dialect strings in the SMB_COM_NEGOTIATE (section 2.2.4.52) command.

  • The service name string in the SMB_COM_TREE_CONNECT_ANDX (section 2.2.4.55) command.
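
To illustrate an OEM_STRING field, the sketch below serializes a list of dialect strings as single-byte, null-terminated sequences regardless of whether Unicode is in use. The helper name encode_dialects is hypothetical. The sketch assumes the dialect layout given in the SMB_COM_NEGOTIATE section, in which each dialect string is preceded by a buffer-format byte of 0x02, and it assumes dialect names contain only ASCII characters.

```python
def encode_dialects(dialects):
    """Serialize dialect names as OEM (single-byte) strings.

    Each entry is emitted as: 0x02 buffer-format byte (assumed layout),
    the dialect name, then a single 0x00 terminator. Unicode is never used.
    """
    out = bytearray()
    for name in dialects:
        out += b"\x02"
        out += name.encode("ascii")  # raises UnicodeEncodeError on non-ASCII
        out += b"\x00"
    return bytes(out)


payload = encode_dialects(["NT LM 0.12"])
```

Here payload holds 12 bytes: the format byte, the ten characters of the dialect name, and the terminator.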