Multichannel and High-Resolution WAV Formats

[The feature associated with this page, DirectSound, is a legacy feature. It has been superseded by WASAPI and Audio Graphs. WASAPI and Audio Graphs have been optimized for Windows 10 and Windows 11. Microsoft strongly recommends that new code use WASAPI and Audio Graphs instead of DirectSound, when possible. Microsoft suggests that existing code that uses the legacy APIs be rewritten to use the new APIs if possible.]

On WDM drivers, DirectSound buffers support WAV formats that have more than two output channels, for speaker configurations such as 5.1, which has speakers at the front left, front center, front right, back left, and back right, plus a low-frequency enhancer. They also support formats with sample resolutions greater than 16 bits.

Such formats can be described by a WAVEFORMATEXTENSIBLE structure. This structure is an extension of WAVEFORMATEX that configures the extra bytes specified by WAVEFORMATEX.cbSize. A WAVEFORMATEXTENSIBLE structure can be cast to WAVEFORMATEX wherever the latter is expected, for example in the DSBUFFERDESC structure. DirectSound recognizes multichannel and high-resolution formats by the WAVE_FORMAT_EXTENSIBLE tag in WAVEFORMATEX.wFormatTag.
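As an illustration of the layout described above, the following sketch fills in a 5.1 PCM format. So that it compiles without the Windows SDK, it uses portable mirror types; the names SketchGuid, SketchWaveFormatEx, SketchWaveFormatExtensible, MakePcm51Format, and the k-prefixed constants are hypothetical and not part of DirectSound. The field names and numeric values follow the SDK definitions in mmreg.h and ksmedia.h. Real code would use the SDK types directly.

```cpp
#include <cassert>
#include <cstdint>

// Portable mirrors of the Windows SDK layouts so the sketch compiles anywhere.
// The "Sketch" type names are hypothetical; real code would include <mmreg.h>
// and use WAVEFORMATEX / WAVEFORMATEXTENSIBLE.
#pragma pack(push, 1)
struct SketchGuid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

struct SketchWaveFormatEx {
    std::uint16_t wFormatTag;
    std::uint16_t nChannels;
    std::uint32_t nSamplesPerSec;
    std::uint32_t nAvgBytesPerSec;
    std::uint16_t nBlockAlign;
    std::uint16_t wBitsPerSample;
    std::uint16_t cbSize;
};

struct SketchWaveFormatExtensible {
    SketchWaveFormatEx format;
    std::uint16_t wValidBitsPerSample;  // a member of the Samples union in the SDK
    std::uint32_t dwChannelMask;
    SketchGuid    subFormat;
};
#pragma pack(pop)

static_assert(sizeof(SketchWaveFormatEx) == 18, "same size as WAVEFORMATEX");
static_assert(sizeof(SketchWaveFormatExtensible) == 40, "same size as WAVEFORMATEXTENSIBLE");

constexpr std::uint16_t kWaveFormatExtensible = 0xFFFE;  // WAVE_FORMAT_EXTENSIBLE
// Front left, front right, front center, low frequency, back left, back right.
constexpr std::uint32_t kMask5Point1 = 0x1 | 0x2 | 0x4 | 0x8 | 0x10 | 0x20;
// Value of KSDATAFORMAT_SUBTYPE_PCM.
constexpr SketchGuid kSubtypePcm = {
    0x00000001, 0x0000, 0x0010, {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71}};

// Builds a 6-channel PCM description. containerBits is the storage size of one
// sample; validBits is how many of those bits carry audio.
SketchWaveFormatExtensible MakePcm51Format(std::uint32_t sampleRate,
                                           std::uint16_t containerBits,
                                           std::uint16_t validBits)
{
    assert(containerBits % 8 == 0 && validBits <= containerBits);

    SketchWaveFormatExtensible wfx{};
    wfx.format.wFormatTag      = kWaveFormatExtensible;
    wfx.format.nChannels       = 6;
    wfx.format.nSamplesPerSec  = sampleRate;
    wfx.format.wBitsPerSample  = containerBits;
    wfx.format.nBlockAlign     = static_cast<std::uint16_t>(6 * containerBits / 8);
    wfx.format.nAvgBytesPerSec = sampleRate * wfx.format.nBlockAlign;
    // cbSize counts only the bytes that follow the WAVEFORMATEX part.
    wfx.format.cbSize = static_cast<std::uint16_t>(
        sizeof(SketchWaveFormatExtensible) - sizeof(SketchWaveFormatEx));
    wfx.wValidBitsPerSample = validBits;
    wfx.dwChannelMask       = kMask5Point1;
    wfx.subFormat           = kSubtypePcm;
    return wfx;
}
```

With the SDK types, the address of the embedded WAVEFORMATEX member is what would be assigned to DSBUFFERDESC.lpwfxFormat.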

You do not have to use the WAVEFORMATEXTENSIBLE type definition in your application in order to play multichannel or high-resolution files. You need only parse the file header correctly into a WAVEFORMATEX structure that contains the extra bytes specified by cbSize.
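One way to do this parsing, sketched under some assumptions: the caller has already located the body of the file's "fmt " chunk and read it into memory, and the target is a byte buffer laid out as an 18-byte WAVEFORMATEX followed by cbSize extra bytes. The helper names BuildWaveFormatBlob and ReadU16LE are hypothetical. Plain 16-byte PCM chunks, which omit the cbSize field, are padded with a cbSize of zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kPcmWaveFormatSize = 16;  // fmt body without cbSize
constexpr std::size_t kWaveFormatExSize  = 18;  // sizeof(WAVEFORMATEX)

// Reads a little-endian 16-bit value, as stored in RIFF/WAV files.
inline std::uint16_t ReadU16LE(const std::uint8_t* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Copies the body of a "fmt " chunk into `blob` so that `blob` holds a
// WAVEFORMATEX image followed by exactly cbSize extra bytes.
// Returns false if the chunk is too short or truncated.
bool BuildWaveFormatBlob(const std::vector<std::uint8_t>& fmtChunk,
                         std::vector<std::uint8_t>& blob)
{
    if (fmtChunk.size() < kPcmWaveFormatSize) {
        return false;
    }

    std::uint16_t cbSize = 0;
    if (fmtChunk.size() >= kWaveFormatExSize) {
        cbSize = ReadU16LE(fmtChunk.data() + kPcmWaveFormatSize);
        if (fmtChunk.size() < kWaveFormatExSize + cbSize) {
            return false;  // header promises more extra bytes than the chunk has
        }
    }

    // First 16 bytes: wFormatTag through wBitsPerSample.
    blob.assign(fmtChunk.begin(), fmtChunk.begin() + kPcmWaveFormatSize);
    // cbSize, written explicitly so 16-byte PCM chunks get a zero here.
    blob.push_back(static_cast<std::uint8_t>(cbSize & 0xFF));
    blob.push_back(static_cast<std::uint8_t>(cbSize >> 8));
    // The extra bytes (for WAVE_FORMAT_EXTENSIBLE, 22 of them).
    if (cbSize > 0) {
        blob.insert(blob.end(),
                    fmtChunk.begin() + kWaveFormatExSize,
                    fmtChunk.begin() + kWaveFormatExSize + cbSize);
    }
    assert(blob.size() == kWaveFormatExSize + cbSize);
    return true;
}
```

On Windows, a pointer to the start of such a buffer could then be reinterpreted as a WAVEFORMATEX pointer and passed wherever that structure is expected, without the application ever naming WAVEFORMATEXTENSIBLE.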

DirectSound does not support effects or 3D processing on buffers in a multichannel format. An attempt to create a buffer with the DSBCAPS_CTRL3D or DSBCAPS_CTRLFX flag and a multichannel WAV format will fail.
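An application might check for this combination before attempting to create the buffer. The sketch below is a hypothetical helper, not a DirectSound call. It assumes that "multichannel" means more than two channels, and it redefines the two flag values locally (matching dsound.h) so that it compiles without the SDK.

```cpp
#include <cstdint>

// Local copies of the dsound.h flag values, so the sketch builds without the SDK.
constexpr std::uint32_t kDsbcapsCtrl3D = 0x00000010;  // DSBCAPS_CTRL3D
constexpr std::uint32_t kDsbcapsCtrlFx = 0x00000200;  // DSBCAPS_CTRLFX

// Returns false for the combination this section describes as failing:
// a multichannel format together with 3D or effects control.
// Assumption: "multichannel" is taken to mean more than two channels.
// This checks only that one rule; other DirectSound restrictions still apply.
bool FlagsAllowedForChannelCount(std::uint32_t dwFlags, std::uint16_t nChannels)
{
    const bool multichannel = nChannels > 2;
    const bool wantsFxOr3D  = (dwFlags & (kDsbcapsCtrl3D | kDsbcapsCtrlFx)) != 0;
    return !(multichannel && wantsFxOr3D);
}
```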

For more information on multichannel WAV formats, see Multiple Channel Audio Data and WAVE Files, available at www.microsoft.com.

DirectSound Mixing

If a system is configured for fewer physical speakers than the number of channels specified in a multichannel WAV file, the audio data is mixed appropriately and output to the existing speakers.

If a system is configured for more physical speakers than the number of channels specified in a multichannel WAV file, DirectSound will not mix audio to the additional channels. DirectSound connects to the output device using the current mix format and does not perform any speaker filling. For example, a 5.1 channel WAV being played on a system with a 7.1 speaker setup will have two silent speakers.

The same behavior applies when DirectSound is the source for some other audio system. For example, when playing 5.1 source audio using DirectSound and an APO designed to receive 7.1 data, DirectSound will send 7.1 data to the APO. DirectSound internally up-mixes the 5.1 data to a 7.1 signal, using an up-mix pattern that mimics the standard 5.1 input signal. As a result, the 5.1 speaker configuration is mapped directly to the corresponding 7.1 speakers, and the "side speakers" of the 7.1 configuration produce no sound because they are not included in the original 5.1 configuration.

If an application desires behavior other than the default, it should up-mix the signal itself before submitting data to DirectSound. There is no way to override the default behavior. Because audio rendering technologies that are not based on DirectSound, such as WaveOut or WASAPI, do not exhibit this limitation, it may also be possible to use one of these technologies instead of DirectSound.
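As a rough illustration of application-side up-mixing, the sketch below expands interleaved 16-bit 5.1 frames to 7.1 frames before they would be written to a buffer. The function name UpmixBack51To71 and the policy of feeding each side speaker a half-gain copy of the matching back channel are assumptions chosen for the example, not behavior defined by DirectSound. The channel order assumed is front left, front right, front center, low frequency, back left, back right, then side left and side right for 7.1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands interleaved 16-bit 5.1 audio (6 samples per frame) to 7.1
// (8 samples per frame). The first six channels are copied unchanged.
// Hypothetical policy: each side speaker gets the matching back channel
// at half gain, so the side speakers are no longer silent.
std::vector<std::int16_t> UpmixBack51To71(const std::vector<std::int16_t>& in51)
{
    assert(in51.size() % 6 == 0);
    const std::size_t frames = in51.size() / 6;

    std::vector<std::int16_t> out71(frames * 8);
    for (std::size_t f = 0; f < frames; ++f) {
        const std::int16_t* src = &in51[f * 6];
        std::int16_t*       dst = &out71[f * 8];

        for (std::size_t c = 0; c < 6; ++c) {
            dst[c] = src[c];  // FL, FR, FC, LFE, BL, BR
        }
        dst[6] = static_cast<std::int16_t>(src[4] / 2);  // side left  from back left
        dst[7] = static_cast<std::int16_t>(src[5] / 2);  // side right from back right
    }
    return out71;
}
```

Writing zeros to the two side channels instead would reproduce the default mapping described above. The resulting data would be submitted with an 8-channel format description.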