Extensible Wave-Format Descriptors

The following figure shows the data-format descriptor for a wave audio stream.

[Figure: diagram illustrating a wave-format descriptor]

As indicated in the figure, the amount of additional format information following the KSDATAFORMAT structure varies depending on the data format.

Audio systems use this type of format descriptor in several ways:

  • A format descriptor like the one shown in the preceding figure is passed as a call parameter to a miniport driver's NewStream method (for example, see IMiniportWaveCyclic::NewStream).

  • The ResultantFormat parameter of the IMiniport::DataRangeIntersection method points to a buffer into which the method writes a format descriptor like the one shown in the preceding figure.

  • The KSPROPERTY_PIN_DATAINTERSECTION get-property request retrieves a format descriptor like the one shown in the preceding figure.

  • The KSPROPERTY_PIN_PROPOSEDATAFORMAT set-property request accepts a format descriptor like the one shown in the preceding figure.

  • A similar format is used for the KsCreatePin function's Connect call parameter. This parameter points to the KSPIN_CONNECT structure at the beginning of a buffer that also contains a format descriptor. The format descriptor, which immediately follows the KSPIN_CONNECT structure, begins with a KSDATAFORMAT structure like the one shown in the preceding figure.

The format information that follows the KSDATAFORMAT structure should be either a WAVEFORMATEXTENSIBLE structure or a WAVEFORMATEX structure. WAVEFORMATEXTENSIBLE is an extended version of WAVEFORMATEX that can describe a broader range of formats than WAVEFORMATEX. WAVEFORMATEX is an extended version of the pre-WDM WAVEFORMAT structure. WAVEFORMAT is obsolete and is not supported by the WDM audio subsystem in any version of Microsoft Windows.

Similarly, the PCMWAVEFORMAT structure, an extended version of WAVEFORMAT, is obsolete, although the WDM audio subsystem provides limited support for it.

For information about WAVEFORMAT and PCMWAVEFORMAT, see the Microsoft Windows SDK documentation.

The four wave-format structures--WAVEFORMAT, PCMWAVEFORMAT, WAVEFORMATEX, and WAVEFORMATEXTENSIBLE--all begin with the same five members, starting with wFormatTag. The preceding figure shows these four structures superimposed on each other to highlight the portions of the structures that are identical. PCMWAVEFORMAT and WAVEFORMATEX extend WAVEFORMAT by adding a wBitsPerSample member, but WAVEFORMATEX also adds a cbSize member. WAVEFORMATEXTENSIBLE extends WAVEFORMATEX by adding three members, beginning with Samples.wValidBitsPerSample. (Samples is a union whose wSamplesPerBlock member is used instead of wValidBitsPerSample for some compressed formats.)

The wFormatTag member, which immediately follows the end of the KSDATAFORMAT structure in the buffer, specifies what kind of format information follows KSDATAFORMAT. The KMixer system driver supports only PCM formats that use one of the three format tags shown in the following table.

wFormatTag Value          Meaning

WAVE_FORMAT_PCM           Integer PCM data format specified by WAVEFORMATEX or PCMWAVEFORMAT.
WAVE_FORMAT_IEEE_FLOAT    Floating-point PCM data format specified by WAVEFORMATEX.
WAVE_FORMAT_EXTENSIBLE    Extended data format specified by WAVEFORMATEXTENSIBLE.

In fact, KMixer supports only a subset of the PCM formats that can be described by these tag values (and it supports no non-PCM formats). USB audio devices (see USBAudio Class System Driver) are restricted to this subset because all PCM-formatted USB audio streams pass through KMixer. (Some non-PCM USB audio streams can bypass KMixer; for more information, see USB Audio Support for Non-PCM Formats.) However, in Windows XP and earlier, DirectSound applications can overcome KMixer's restrictions by connecting directly to hardware pins on WaveCyclic and WavePci devices that support formats not supported by KMixer. For more information, see DirectSound Hardware Acceleration in WDM Audio.

Note the ambiguity in the meaning of the WAVE_FORMAT_PCM tag value in the preceding table--it can specify either a WAVEFORMATEX or PCMWAVEFORMAT structure. However, these two structures are nearly identical. The only difference is that WAVEFORMATEX contains a cbSize member and PCMWAVEFORMAT does not. According to the WAVEFORMATEX specification, cbSize is ignored if wFormatTag = WAVE_FORMAT_PCM (because cbSize is implicitly zero); cbSize is used for all other formats. Thus, in the case of a PCM format, PCMWAVEFORMAT and WAVEFORMATEX contain the same information and can be treated identically.
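The way cbSize is treated for the different tag values can be sketched in C. This is only an illustration: the PcmWaveFormatSketch and WaveFormatExSketch types are local stand-ins that mirror the byte-packed layouts declared in Mmreg.h, and the helper wave_format_extra_bytes is a hypothetical name, not a system API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Format tag values, as defined in Mmreg.h. */
#define WAVE_FORMAT_PCM        0x0001
#define WAVE_FORMAT_IEEE_FLOAT 0x0003
#define WAVE_FORMAT_EXTENSIBLE 0xFFFE

/* Local stand-ins for the byte-packed structures declared in Mmreg.h. */
#pragma pack(push, 1)
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
} PcmWaveFormatSketch;            /* the five shared members plus wBitsPerSample */

typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
} WaveFormatExSketch;             /* the same members plus cbSize */
#pragma pack(pop)

static_assert(sizeof(PcmWaveFormatSketch) == 16, "unexpected layout");
static_assert(sizeof(WaveFormatExSketch) == 18, "unexpected layout");
static_assert(offsetof(WaveFormatExSketch, cbSize) == sizeof(PcmWaveFormatSketch),
              "cbSize should be the only difference");

/* Returns the number of extra format bytes that follow the
 * WAVEFORMATEX-sized base, or -1 if the buffer is too small.
 * `available` is the number of bytes of format information in the buffer. */
long wave_format_extra_bytes(const void *info, size_t available)
{
    uint16_t tag, cb_size;

    if (info == NULL || available < sizeof(PcmWaveFormatSketch))
        return -1;

    memcpy(&tag, info, sizeof tag);      /* wFormatTag is always first */
    if (tag == WAVE_FORMAT_PCM)
        return 0;                        /* cbSize is ignored and may be absent */

    if (available < sizeof(WaveFormatExSketch))
        return -1;
    memcpy(&cb_size,
           (const uint8_t *)info + offsetof(WaveFormatExSketch, cbSize),
           sizeof cb_size);
    if (available - sizeof(WaveFormatExSketch) < cb_size)
        return -1;                       /* claimed extension does not fit */
    return (long)cb_size;
}
```

Because the PCM branch never reads cbSize, the same helper works whether the caller supplied a 16-byte PCMWAVEFORMAT or an 18-byte WAVEFORMATEX.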

WAVEFORMATEX can specify only a subset of the formats that WAVEFORMATEXTENSIBLE can specify. Unlike WAVEFORMATEX, WAVEFORMATEXTENSIBLE can do the following:

  1. Specify the number of bits per sample separately from the size of the sample container. For example, a 20-bit sample can be stored left-justified within a three-byte container. WAVEFORMATEX, which fails to distinguish the number of data bits per sample from the sample container size, is unable to describe such a format unambiguously.

  2. Assign specific speaker locations to audio channels in multichannel streams. WAVEFORMATEX lacks this capability and can adequately support only mono and (two-channel) stereo streams.

Any format that is described by WAVEFORMATEX can also be described by WAVEFORMATEXTENSIBLE. For information about converting a WAVEFORMATEX structure to WAVEFORMATEXTENSIBLE, see Converting Between Format Tags and Subformat GUIDs.

WAVEFORMATEX is sufficient for describing formats with sample sizes of 8 or 16 bits, but WAVEFORMATEXTENSIBLE is necessary to adequately describe formats with a sample precision of greater than 16 bits. Here are two examples:

  • A stream with a sample precision of 24 bits can use a 32-bit container size for efficient processing, but can be converted to use a 24-bit container to improve storage efficiency without loss of data.

  • When processing a stream with 24-bit sample data, a rendering device that provides only 20 bits of precision can use dithering to improve the fidelity of its output signal. Dithering, however, requires additional processing time, and if the original stream is accurate to only 20 bits, the additional processing is unnecessary.

In both of these examples, preserving signal quality while making the right tradeoff between processing and storage efficiency is possible only if both the sample precision and container size are known.
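As a rough illustration of the first example, the following C sketch repacks samples from 32-bit containers into 24-bit containers. It assumes little-endian PCM data in which the 24 valid bits are left-justified in each container, so that the least significant byte is padding. The function name repack_32_to_24 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copies `count` samples from 4-byte containers in `src` to 3-byte
 * containers in `dst` and returns the number of bytes written.
 * Assumes 24 valid bits, left-justified, little-endian byte order. */
size_t repack_32_to_24(const uint8_t *src, size_t count, uint8_t *dst)
{
    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < count; ++i) {
        /* Byte 0 of each container is the padding byte; skip it. */
        dst[3 * i + 0] = src[4 * i + 1];
        dst[3 * i + 1] = src[4 * i + 2];
        dst[3 * i + 2] = src[4 * i + 3];
    }
    return count * 3;
}
```

The conversion is lossless only because the valid-bit count (24) is known separately from the container size (32). With WAVEFORMATEX alone, the code could not tell whether the low byte carried data.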

If a simple format can be unambiguously described by either a WAVEFORMATEX or a WAVEFORMATEXTENSIBLE structure, an audio driver has the option of selecting either structure to describe the format. However, audio drivers have typically used WAVEFORMATEX to specify mono and (two-channel) stereo PCM formats with 8-bit or 16-bit samples, and some older applications might expect all audio drivers to use WAVEFORMATEX to specify these formats.

If a driver supports an audio format that can be unambiguously specified as either a WAVEFORMATEX or a WAVEFORMATEXTENSIBLE structure, the driver should recognize the format regardless of which of the two structures a client application or component uses to specify the format. For example, if an audio device supports a 44.1-kHz, 16-bit, stereo PCM format, the miniport driver's KSPROPERTY_PIN_PROPOSEDATAFORMAT property handler and its implementation of the NewStream method should accept that format regardless of whether the format is specified as a WAVEFORMATEX or a WAVEFORMATEXTENSIBLE structure.
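A format check that accepts both representations might look like the following C sketch. The structure types are local stand-ins for the byte-packed definitions in Mmreg.h and Ksmedia.h, and is_cd_stereo_pcm is a hypothetical helper, not part of any driver interface. A real handler would also validate the surrounding KSDATAFORMAT structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WAVE_FORMAT_PCM        0x0001   /* Mmreg.h */
#define WAVE_FORMAT_EXTENSIBLE 0xFFFE   /* Mmreg.h */
#define SPEAKER_FRONT_LEFT     0x1      /* Ksmedia.h */
#define SPEAKER_FRONT_RIGHT    0x2      /* Ksmedia.h */

#pragma pack(push, 1)
typedef struct { uint32_t Data1; uint16_t Data2, Data3; uint8_t Data4[8]; } GuidSketch;
typedef struct {
    uint16_t wFormatTag, nChannels;
    uint32_t nSamplesPerSec, nAvgBytesPerSec;
    uint16_t nBlockAlign, wBitsPerSample, cbSize;
} WaveFormatExSketch;
typedef struct {
    WaveFormatExSketch Format;
    union { uint16_t wValidBitsPerSample, wSamplesPerBlock, wReserved; } Samples;
    uint32_t dwChannelMask;
    GuidSketch SubFormat;
} WaveFormatExtensibleSketch;
#pragma pack(pop)

static_assert(sizeof(WaveFormatExtensibleSketch) == 40, "unexpected layout");

/* Value of KSDATAFORMAT_SUBTYPE_PCM. */
static const GuidSketch kSubtypePcm =
    {0x00000001, 0x0000, 0x0010, {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71}};

/* Returns 1 if `info` describes 44.1-kHz, 16-bit, stereo PCM in either
 * WAVEFORMATEX (or PCMWAVEFORMAT) or WAVEFORMATEXTENSIBLE form. */
int is_cd_stereo_pcm(const void *info, size_t size)
{
    WaveFormatExtensibleSketch f;

    /* A PCM descriptor may be as short as PCMWAVEFORMAT (no cbSize). */
    if (info == NULL || size < offsetof(WaveFormatExSketch, cbSize))
        return 0;

    memset(&f, 0, sizeof f);
    memcpy(&f, info, size < sizeof f ? size : sizeof f);

    if (f.Format.nChannels != 2 || f.Format.nSamplesPerSec != 44100 ||
        f.Format.wBitsPerSample != 16 || f.Format.nBlockAlign != 4)
        return 0;

    if (f.Format.wFormatTag == WAVE_FORMAT_PCM)
        return 1;
    if (f.Format.wFormatTag != WAVE_FORMAT_EXTENSIBLE)
        return 0;

    return size >= sizeof f &&
           f.Format.cbSize >= sizeof f - sizeof f.Format &&
           f.Samples.wValidBitsPerSample == 16 &&
           f.dwChannelMask == (SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT) &&
           memcmp(&f.SubFormat, &kSubtypePcm, sizeof kSubtypePcm) == 0;
}
```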

To simplify the processing of format data, drivers typically use WAVEFORMATEXTENSIBLE structures to internally represent formats. This approach might require the conversion of an input WAVEFORMATEX structure to an internal WAVEFORMATEXTENSIBLE representation, or the conversion of an internal WAVEFORMATEXTENSIBLE representation to an output WAVEFORMATEX structure.

When converting a format descriptor from WAVEFORMATEX to WAVEFORMATEXTENSIBLE, if the wFormatTag member of the WAVEFORMATEX structure is either WAVE_FORMAT_PCM or WAVE_FORMAT_IEEE_FLOAT, set the dwChannelMask member of the WAVEFORMATEXTENSIBLE structure to either SPEAKER_FRONT_CENTER (for a mono stream) or SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT (for a stereo stream). The SPEAKER_FRONT_XXX constants are defined in header file Ksmedia.h.
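The conversion described above can be sketched in C as follows. The structure types are local stand-ins for the byte-packed definitions in Mmreg.h and Ksmedia.h, and wfx_to_extensible is a hypothetical helper name. Two choices are assumptions of this sketch rather than requirements stated here: the valid-bit count is set equal to the container size, and inputs with more than two channels are rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WAVE_FORMAT_PCM        0x0001   /* Mmreg.h */
#define WAVE_FORMAT_IEEE_FLOAT 0x0003   /* Mmreg.h */
#define WAVE_FORMAT_EXTENSIBLE 0xFFFE   /* Mmreg.h */
#define SPEAKER_FRONT_LEFT     0x1      /* Ksmedia.h */
#define SPEAKER_FRONT_RIGHT    0x2      /* Ksmedia.h */
#define SPEAKER_FRONT_CENTER   0x4      /* Ksmedia.h */

#pragma pack(push, 1)
typedef struct { uint32_t Data1; uint16_t Data2, Data3; uint8_t Data4[8]; } GuidSketch;
typedef struct {
    uint16_t wFormatTag, nChannels;
    uint32_t nSamplesPerSec, nAvgBytesPerSec;
    uint16_t nBlockAlign, wBitsPerSample, cbSize;
} WaveFormatExSketch;
typedef struct {
    WaveFormatExSketch Format;
    union { uint16_t wValidBitsPerSample, wSamplesPerBlock, wReserved; } Samples;
    uint32_t dwChannelMask;
    GuidSketch SubFormat;
} WaveFormatExtensibleSketch;
#pragma pack(pop)

static_assert(sizeof(WaveFormatExtensibleSketch) == 40, "unexpected layout");

/* Fills *out from *in; returns 1 on success, 0 if the input is not
 * mono or stereo PCM (integer or floating-point). */
int wfx_to_extensible(const WaveFormatExSketch *in, WaveFormatExtensibleSketch *out)
{
    /* Values of KSDATAFORMAT_SUBTYPE_PCM and KSDATAFORMAT_SUBTYPE_IEEE_FLOAT. */
    static const GuidSketch pcm =
        {0x00000001, 0x0000, 0x0010, {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71}};
    static const GuidSketch ieee_float =
        {0x00000003, 0x0000, 0x0010, {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71}};

    if (in == NULL || out == NULL)
        return 0;

    const uint16_t tag = in->wFormatTag;
    const uint16_t channels = in->nChannels;
    if (tag != WAVE_FORMAT_PCM && tag != WAVE_FORMAT_IEEE_FLOAT)
        return 0;
    if (channels != 1 && channels != 2)
        return 0;

    out->Format = *in;
    out->Format.wFormatTag = WAVE_FORMAT_EXTENSIBLE;
    out->Format.cbSize = (uint16_t)(sizeof *out - sizeof out->Format);
    /* Assumption: WAVEFORMATEX carries no separate valid-bit count. */
    out->Samples.wValidBitsPerSample = out->Format.wBitsPerSample;
    out->dwChannelMask = (channels == 1)
        ? SPEAKER_FRONT_CENTER
        : (SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT);
    out->SubFormat = (tag == WAVE_FORMAT_PCM) ? pcm : ieee_float;
    return 1;
}
```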

In all Windows releases except for Windows 98 "Gold", KMixer supports a range of WAVEFORMATEXTENSIBLE PCM formats that have multiple channels and have up to 32 bits per sample.

The subset of WAVEFORMATEX PCM formats that KMixer supports differs between Windows releases, as shown in the following table.

Windows Release           Packed Sample Sizes       Number of Channels

Windows 98 "Gold"         8, 16, 24, and 32 bits
Windows 98 SE             8 and 16 bits only        Mono and stereo only
Windows 98 SE + hotfix    8, 16, 24, and 32 bits    Mono and stereo only
Windows 2000              8 and 16 bits only        Mono and stereo only
Windows Me                8, 16, 24, and 32 bits    Mono and stereo only
Windows XP and later      8 and 16 bits only        Mono and stereo only

In WAVEFORMATEXTENSIBLE, Format.wBitsPerSample is the container size, and Samples.wValidBitsPerSample is the number of valid data bits per sample. Containers are always byte-aligned in memory, and the container size must be specified as a multiple of eight bits.
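These constraints on the container size and the valid-bit count are easy to check in code. The C sketch below uses hypothetical helper names; rejecting a valid-bit count of zero is an assumption of the sketch. The block-alignment formula (channels multiplied by container bytes) is the usual rule for PCM formats.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the container size is a nonzero multiple of eight bits
 * and the valid-bit count fits inside the container. */
int sample_sizes_are_valid(uint16_t container_bits, uint16_t valid_bits)
{
    return container_bits != 0 &&
           container_bits % 8 == 0 &&
           valid_bits != 0 &&          /* assumption: zero is not meaningful */
           valid_bits <= container_bits;
}

/* Bytes per sample frame for a PCM stream: one container per channel. */
uint16_t pcm_block_align(uint16_t channels, uint16_t container_bits)
{
    assert(container_bits % 8 == 0);
    return (uint16_t)(channels * (container_bits / 8));
}
```

For example, a 20-bit sample in a three-byte container passes the check with a container size of 24 and a valid-bit count of 20, whereas a container size of 20 is rejected because it is not byte-aligned.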

Before the WAVEFORMATEXTENSIBLE structure was defined, vendors had to register each new wave format with Microsoft so that an official, 16-bit format tag could be assigned to the format. (The format tag is contained in the wFormatTag member of the WAVEFORMATEX structure.) A list of registered format tags appears in public header file Mmreg.h (for example, WAVE_FORMAT_MPEG).

With WAVEFORMATEXTENSIBLE, registering formats is no longer necessary. Vendors can independently assign GUIDs to their new formats as needed. (The format GUID is contained in the SubFormat member of WAVEFORMATEXTENSIBLE.) However, Microsoft lists some of the more popular format GUIDs in public header file Ksmedia.h (for example, KSDATAFORMAT_SUBTYPE_MPEG). Before defining a new format GUID, vendors should check the list of KSDATAFORMAT_SUBTYPE_XXX constants in Ksmedia.h to determine whether an appropriate GUID has already been defined for a particular format.

When using WAVEFORMATEXTENSIBLE, set wFormatTag to WAVE_FORMAT_EXTENSIBLE and SubFormat to the appropriate format GUID. For integer PCM formats, set SubFormat to KSDATAFORMAT_SUBTYPE_PCM. For PCM formats that encode sample values as floating-point numbers, set SubFormat to KSDATAFORMAT_SUBTYPE_IEEE_FLOAT. For either of these formats, set cbSize to sizeof(WAVEFORMATEXTENSIBLE)-sizeof(WAVEFORMATEX). For information about using WAVEFORMATEXTENSIBLE to describe non-PCM data formats, see Supporting Non-PCM Wave Formats.
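As a minimal sketch of these rules, the following C code fills in a descriptor for an integer PCM format. The structure types are local stand-ins for the byte-packed definitions in Mmreg.h and Ksmedia.h, and init_pcm_extensible is a hypothetical helper. The nBlockAlign and nAvgBytesPerSec computations follow the usual PCM conventions, which this topic does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WAVE_FORMAT_EXTENSIBLE 0xFFFE   /* Mmreg.h */

#pragma pack(push, 1)
typedef struct { uint32_t Data1; uint16_t Data2, Data3; uint8_t Data4[8]; } GuidSketch;
typedef struct {
    uint16_t wFormatTag, nChannels;
    uint32_t nSamplesPerSec, nAvgBytesPerSec;
    uint16_t nBlockAlign, wBitsPerSample, cbSize;
} WaveFormatExSketch;
typedef struct {
    WaveFormatExSketch Format;
    union { uint16_t wValidBitsPerSample, wSamplesPerBlock, wReserved; } Samples;
    uint32_t dwChannelMask;
    GuidSketch SubFormat;
} WaveFormatExtensibleSketch;
#pragma pack(pop)

static_assert(sizeof(WaveFormatExtensibleSketch) - sizeof(WaveFormatExSketch) == 22,
              "cbSize for PCM and IEEE-float formats should be 22");

/* Describes an integer PCM format with separate container and valid-bit sizes. */
void init_pcm_extensible(WaveFormatExtensibleSketch *out,
                         uint16_t channels, uint32_t sample_rate,
                         uint16_t container_bits, uint16_t valid_bits,
                         uint32_t channel_mask)
{
    /* Value of KSDATAFORMAT_SUBTYPE_PCM. */
    static const GuidSketch pcm =
        {0x00000001, 0x0000, 0x0010, {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71}};

    assert(out != NULL);
    assert(container_bits % 8 == 0 && valid_bits <= container_bits);

    memset(out, 0, sizeof *out);
    out->Format.wFormatTag      = WAVE_FORMAT_EXTENSIBLE;
    out->Format.nChannels       = channels;
    out->Format.nSamplesPerSec  = sample_rate;
    out->Format.nBlockAlign     = (uint16_t)(channels * (container_bits / 8));
    out->Format.nAvgBytesPerSec = sample_rate * out->Format.nBlockAlign;
    out->Format.wBitsPerSample  = container_bits;
    out->Format.cbSize          = (uint16_t)(sizeof *out - sizeof out->Format);
    out->Samples.wValidBitsPerSample = valid_bits;
    out->dwChannelMask          = channel_mask;
    out->SubFormat              = pcm;
}
```

Calling init_pcm_extensible(&fmt, 2, 48000, 24, 20, 0x3) would describe a 48-kHz stereo stream of 20-bit samples stored in three-byte containers, with the front-left and front-right speaker positions assigned.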