ANSI Versus Unicode

There are currently two versions of the PST file format: ANSI and Unicode. The ANSI PST file format is the legacy format and SHOULD NOT be used to create new PST files. The Unicode PST file format is the currently used format.<3>

Although the nomenclature suggests that the two formats differ only in how internal strings are represented in the PST file, there are other significant differences between the ANSI and Unicode PST file formats. The most significant difference is the size of various core data elements that are used throughout the NDB layer. Specifically, the ANSI version uses 32-bit values to represent block IDs (BIDs) and absolute file offsets (IBs), whereas the Unicode version uses 64-bit values. Some other values that were represented using 32 bits have also been extended to use 64 bits. Those cases are discussed individually.

Because BIDs and IBs are used extensively throughout the NDB layer, the version-specific size differences affect most of the NDB data structures. ANSI and Unicode versions of the data structures are defined separately whenever there are material differences between the two versions.
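
As an illustration of how the width difference propagates into a data structure, the following minimal Python sketch decodes a BID followed by an IB under each version. It is not a definition taken from this document: the names ANSI_PAIR, UNICODE_PAIR, and unpack_bid_ib are hypothetical, and the layout (BID first, then IB, both little-endian and unsigned) is an assumption made for the example.

```python
import struct

# Assumed layout for illustration: a BID immediately followed by an IB,
# both stored as little-endian unsigned integers.
ANSI_PAIR = struct.Struct("<II")     # 32-bit BID, 32-bit IB
UNICODE_PAIR = struct.Struct("<QQ")  # 64-bit BID, 64-bit IB


def unpack_bid_ib(buf, is_unicode):
    """Return (bid, ib) decoded from the start of buf.

    is_unicode selects the 64-bit (Unicode) or 32-bit (ANSI) field widths.
    """
    layout = UNICODE_PAIR if is_unicode else ANSI_PAIR
    return layout.unpack_from(buf, 0)
```

Under these assumptions the same logical pair occupies ANSI_PAIR.size bytes in one version and UNICODE_PAIR.size bytes in the other, which is why any structure that embeds a BID or an IB needs a separate definition for each version.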