AAC Decoder

The Microsoft Media Foundation AAC decoder is a Media Foundation Transform that decodes the following Advanced Audio Coding (AAC) and High Efficiency AAC (HE-AAC) profiles:

  • MPEG-2 AAC Low Complexity (LC) profile (multichannel).
  • MPEG-4 HE-AAC v1 (multichannel) with AAC-LC core.
  • MPEG-4 HE-AAC v2 (stereo) with AAC-LC core.

The AAC decoder supports both raw AAC streams with no headers and AAC in an audio data transport stream (ADTS).

Starting in Windows 8, the AAC decoder also supports decoding MPEG-4 audio transport streams with a multiplex layer (LATM) and synchronization layer (LOAS). It can also convert an LATM/LOAS stream to ADTS.

Class Identifier

The class identifier (CLSID) of the AAC decoder is CLSID_CMSAACDecMFT, defined in the header file wmcodecdsp.h.

Media Types

The AAC decoder supports the following media types.

Input Types

The AAC decoder supports the following audio subtypes:

Subtype Description Header
MFAudioFormat_AAC Raw AAC or ADTS AAC.
For this subtype, the media type gives the sample rate and number of channels prior to the application of spectral band replication (SBR) and parametric stereo (PS) tools, if present. The effect of the SBR tool is to double the decoded sample rate relative to the core AAC-LC sample rate. The effect of the PS tool is to decode stereo from a mono-channel core AAC-LC stream.
This subtype is equivalent to MEDIASUBTYPE_MPEG_HEAAC, defined in wmcodecdsp.h. See Audio Subtype GUIDs.
The MPEG-4 File Source and the ADTS Parser output this subtype.
mfapi.h
MEDIASUBTYPE_RAW_AAC1 Raw AAC.
This subtype is used for AAC contained in an AVI file with the audio format tag equal to WAVE_FORMAT_RAW_AAC1 (0x00FF).
For this subtype, the media type gives the sample rate and number of channels after the SBR and PS tools are applied, if present.
wmcodecdsp.h
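The difference between the two subtypes comes down to whether the media type describes the core AAC-LC stream or the decoded output. As a minimal sketch of that relationship, the helper below derives the decoded format from the core format. The names AudioFormat and DecodedFormatFromCore are hypothetical and are not part of any Media Foundation header.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: sample rate and channel count of a stream.
struct AudioFormat {
    std::uint32_t samplesPerSecond;
    std::uint32_t channels;
};

// Derives the decoded (post-SBR/PS) format from the core AAC-LC format,
// which is what an MFAudioFormat_AAC media type describes.
inline AudioFormat DecodedFormatFromCore(AudioFormat core, bool hasSbr, bool hasPs) {
    // The PS tool decodes stereo from a mono core stream only.
    assert(!hasPs || core.channels == 1);
    AudioFormat decoded = core;
    if (hasSbr) {
        decoded.samplesPerSecond *= 2;  // SBR doubles the core sample rate
    }
    if (hasPs) {
        decoded.channels = 2;           // PS produces stereo from mono
    }
    return decoded;
}
```

For example, a 24-kHz mono core stream with both SBR and PS decodes to 48-kHz stereo. Under MEDIASUBTYPE_RAW_AAC1, the media type would carry the decoded values directly.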

To configure the AAC decoder, set the following attributes on the input media type.

Attribute Description Remarks
MF_MT_MAJOR_TYPE Major type. Must be MFMediaType_Audio.
MF_MT_SUBTYPE Audio subtype. Refer to the previous description for details.
MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION Audio profile and level.
Optional. Applies only to MFAudioFormat_AAC.
The value of this attribute is the audioProfileLevelIndication field, as defined by ISO/IEC 14496-3.
If unknown, set to zero or 0xFE ("no audio profile specified").
MF_MT_AAC_PAYLOAD_TYPE Payload type.
Applies only to MFAudioFormat_AAC. The decoder supports the following payload types:
  • 0: Raw AAC. The stream contains raw_data_block() elements only, as defined by MPEG-2.
  • 1: ADTS. The stream contains an adts_sequence(), as defined by MPEG-2. Only one raw_data_block() per adts_frame() is allowed.
  • 3: Audio transport stream with a synchronization layer (LOAS) and a multiplex layer (LATM). Of the three types of LOAS, only AudioSyncStream is supported. The multiplex layer is AudioMuxElement, restricted to one audio program and one layer.
MF_MT_AAC_PAYLOAD_TYPE is optional. If this attribute is not specified, the default value 0 is used, which specifies the stream contains raw_data_block elements only.
MF_MT_AUDIO_BITS_PER_SAMPLE Desired bit depth of the decoded PCM audio.
MF_MT_AUDIO_CHANNEL_MASK Specifies the assignment of audio channels to speaker positions. Optional. For more information, see Format Constraints.
MF_MT_AUDIO_NUM_CHANNELS Number of channels, including the low frequency (LFE) channel, if present.
The interpretation of this value depends on the media subtype, as described previously.
MF_MT_AUDIO_SAMPLES_PER_SECOND Sample rate, in samples per second.
The interpretation of this value depends on the media subtype, as described previously.
MF_MT_USER_DATA Additional format information. The value of this attribute depends on the subtype.
  • MFAudioFormat_AAC: Contains the portion of the HEAACWAVEINFO structure that appears after the WAVEFORMATEX structure (that is, after the wfx member). This is followed by the AudioSpecificConfig() data, as defined by ISO/IEC 14496-3.
  • MEDIASUBTYPE_RAW_AAC1: Contains the AudioSpecificConfig() data. This data must appear; otherwise, the decoder will reject the media type.
The length of the AudioSpecificConfig() data is 2 bytes for AAC-LC or HE-AAC with implicit signaling of SBR/PS. It is more than 2 bytes for HE-AAC with explicit signaling of SBR/PS.
The value of audioObjectType as defined in AudioSpecificConfig() must be 2, indicating AAC-LC. The value of extensionAudioObjectType must be 5 for SBR or 29 for PS.
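To check the audioObjectType constraint before handing a media type to the decoder, an application can read the leading fields of AudioSpecificConfig(). The following is a rough sketch that handles only the compact two-byte layout (5-bit audioObjectType, 4-bit samplingFrequencyIndex, 4-bit channelConfiguration). The names AscHeader, ParseAscHeader, and IsAacLcCore are hypothetical, and the sketch rejects the escape encodings (audioObjectType 31 and samplingFrequencyIndex 15) rather than parsing them.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical holder for the leading AudioSpecificConfig() fields.
struct AscHeader {
    unsigned audioObjectType;
    unsigned samplingFrequencyIndex;
    unsigned channelConfiguration;
};

// Parses the first two bytes of AudioSpecificConfig().
// Returns false for short input or for escape encodings this sketch
// does not handle.
inline bool ParseAscHeader(const std::uint8_t* data, std::size_t size, AscHeader* out) {
    if (data == nullptr || out == nullptr || size < 2) {
        return false;
    }
    const unsigned bits = (static_cast<unsigned>(data[0]) << 8) | data[1];
    out->audioObjectType = bits >> 11;                // top 5 bits
    out->samplingFrequencyIndex = (bits >> 7) & 0xF;  // next 4 bits
    out->channelConfiguration = (bits >> 3) & 0xF;    // next 4 bits
    if (out->audioObjectType == 31 || out->samplingFrequencyIndex == 0xF) {
        return false;  // escape values: longer layout, not parsed here
    }
    return true;
}

// The decoder requires an AAC-LC core, which is audioObjectType 2.
inline bool IsAacLcCore(const AscHeader& header) {
    return header.audioObjectType == 2;
}
```

Detecting explicit SBR/PS signaling (extensionAudioObjectType) requires parsing the bytes that follow and is left out of this sketch.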

Output Types

The decoder supports the following output types:

Subtype Description
MFAudioFormat_Float IEEE floating-point audio.
MFAudioFormat_PCM 16-bit PCM audio.
MFAudioFormat_AAC Requires Windows 8.
This output type can be used to convert an AAC stream in the LOAS/LATM format to ADTS format.
To convert an LOAS/LATM stream to an ADTS stream, set the input type to MFAudioFormat_AAC with payload type 3 (LOAS). Then set the output type to MFAudioFormat_AAC with payload type 1 (ADTS). The decoder will reformat the container without decoding the bitstream.
Note: The decoder does not register MFAudioFormat_AAC as an output type. However, if the application sets the input type as described, the IMFTransform::GetOutputAvailableType method returns MFAudioFormat_AAC in the list of available output types.

If the input stream contains more than two channels, the AAC decoder provides two options for the output format:

  • The same channel configuration as the input type.
  • Stereo fold-down.

Format Constraints

The decoded audio sampling rate must be one of the following, after SBR is applied (if present):

  • 8 kHz
  • 11.025 kHz
  • 12 kHz
  • 16 kHz
  • 22.05 kHz
  • 24 kHz
  • 32 kHz
  • 44.1 kHz
  • 48 kHz

Sampling rates above 48 kHz are not supported.
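An application might validate the post-SBR sample rate before negotiating types. The sketch below hard-codes the list above. The function name IsSupportedDecodedSampleRate is hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Returns true if the decoded (post-SBR) sample rate, in Hz, is one of
// the rates listed as supported by the AAC decoder.
inline bool IsSupportedDecodedSampleRate(std::uint32_t hz) {
    static constexpr std::array<std::uint32_t, 9> kRates = {
        8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000};
    return std::find(kRates.begin(), kRates.end(), hz) != kRates.end();
}
```

Because the check applies after SBR, an HE-AAC stream with a 24-kHz core passes (48 kHz decoded), while one with a 48-kHz core would not (96 kHz decoded).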

The decoder supports up to 6 audio channels. For each speaker configuration, the decoder expects the AAC syntactic elements to appear in a certain order. The following table lists the supported speaker configurations. The third column of the table lists the expected syntactic elements and their order, using the following notation:

  • <SCE1>: The single_channel_element (SCE) associated with the front center speaker.
  • <SCE2>: The SCE associated with the back center speaker.
  • <CPE1>: The channel_pair_element (CPE) associated with the front speakers.
  • <CPE2>: The CPE associated with the back (or side) speakers.
  • <LFE>: The lfe_channel_element (LFE).

For more information about these syntactic elements, refer to ISO/IEC 13818-7.

Configuration Channel Mask AAC Syntactic Elements
Mono SPEAKER_FRONT_CENTER <SCE1>
Stereo or dual mono SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT <CPE1>
2/1 SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | SPEAKER_BACK_CENTER <CPE1><SCE2>
2/2 SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | SPEAKER_BACK_LEFT | SPEAKER_BACK_RIGHT <CPE1><CPE2>
3/0 SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | SPEAKER_FRONT_CENTER <SCE1><CPE1>
3/1 SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | SPEAKER_FRONT_CENTER | SPEAKER_BACK_CENTER <SCE1><CPE1><SCE2>
3/2 SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | SPEAKER_FRONT_CENTER | SPEAKER_BACK_LEFT | SPEAKER_BACK_RIGHT <SCE1><CPE1><CPE2>
3/2 + LFE SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | SPEAKER_FRONT_CENTER | SPEAKER_LOW_FREQUENCY | SPEAKER_BACK_LEFT | SPEAKER_BACK_RIGHT <SCE1><CPE1><CPE2><LFE>
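The channel masks in the table can be expressed as a small lookup. The sketch below repeats the SPEAKER_* bit values from ksmedia.h as local constants so that it compiles without the Windows SDK. The SpeakerConfig enumeration and the ChannelMaskFor function are hypothetical names introduced for illustration.

```cpp
#include <cstdint>

// Speaker position bits, with the same values as the SPEAKER_* constants
// in ksmedia.h (repeated here so the sketch has no SDK dependency).
constexpr std::uint32_t kFrontLeft    = 0x1;
constexpr std::uint32_t kFrontRight   = 0x2;
constexpr std::uint32_t kFrontCenter  = 0x4;
constexpr std::uint32_t kLowFrequency = 0x8;
constexpr std::uint32_t kBackLeft     = 0x10;
constexpr std::uint32_t kBackRight    = 0x20;
constexpr std::uint32_t kBackCenter   = 0x100;

// Hypothetical enumeration of the supported speaker configurations.
enum class SpeakerConfig {
    Mono, Stereo, Config2_1, Config2_2, Config3_0, Config3_1, Config3_2, Config3_2_Lfe
};

// Returns the channel mask listed in the table for each configuration.
inline std::uint32_t ChannelMaskFor(SpeakerConfig config) {
    const std::uint32_t front = kFrontLeft | kFrontRight;
    const std::uint32_t back = kBackLeft | kBackRight;
    switch (config) {
        case SpeakerConfig::Mono:          return kFrontCenter;
        case SpeakerConfig::Stereo:        return front;
        case SpeakerConfig::Config2_1:     return front | kBackCenter;
        case SpeakerConfig::Config2_2:     return front | back;
        case SpeakerConfig::Config3_0:     return front | kFrontCenter;
        case SpeakerConfig::Config3_1:     return front | kFrontCenter | kBackCenter;
        case SpeakerConfig::Config3_2:     return front | kFrontCenter | back;
        case SpeakerConfig::Config3_2_Lfe: return front | kFrontCenter | kLowFrequency | back;
    }
    return 0;  // unreachable for valid enumerators
}
```

The 3/2 + LFE entry evaluates to 0x3f, which is the conventional 5.1 mask.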

For raw AAC, each input sample must contain exactly one full AAC compressed frame.

For ADTS, each input sample can contain multiple audio frames, as well as partial frames; that is, frames can span sample boundaries. Each ADTS header must be followed by one AAC frame.
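When an application needs to locate frame boundaries in an ADTS stream itself (for example, to inspect what it feeds the decoder), the fixed 7-byte ADTS header gives the frame length and the raw data block count. The following is a minimal sketch based on the ADTS header layout in ISO/IEC 13818-7. The names AdtsHeaderInfo and ParseAdtsHeader are hypothetical, and the sketch checks only the 12-bit syncword, not the remaining header fields or the optional CRC.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical summary of the ADTS header fields relevant here.
struct AdtsHeaderInfo {
    std::size_t frameLength;  // aac_frame_length: header plus payload, in bytes
    unsigned rawDataBlocks;   // number_of_raw_data_blocks_in_frame + 1
};

// Parses the fixed 7-byte part of an ADTS header.
// Returns false if the buffer is too short, the syncword is missing,
// or the frame length is smaller than the header itself.
inline bool ParseAdtsHeader(const std::uint8_t* p, std::size_t size, AdtsHeaderInfo* out) {
    if (p == nullptr || out == nullptr || size < 7) {
        return false;
    }
    if (p[0] != 0xFF || (p[1] & 0xF0) != 0xF0) {
        return false;  // 12-bit syncword 0xFFF not found
    }
    // aac_frame_length is 13 bits spread across bytes 3, 4, and 5.
    out->frameLength = (static_cast<std::size_t>(p[3] & 0x03) << 11) |
                       (static_cast<std::size_t>(p[4]) << 3) |
                       (static_cast<std::size_t>(p[5]) >> 5);
    // The low 2 bits of byte 6 hold the raw data block count minus one.
    out->rawDataBlocks = (p[6] & 0x03u) + 1;
    return out->frameLength >= 7;
}
```

A frame with rawDataBlocks greater than 1 would violate the one-block-per-frame constraint described for payload type 1. Because frames can span sample boundaries, a caller has to be prepared for frameLength to exceed the bytes remaining in the current sample.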

The AAC decoder does not support any of the following:

  • Main profile, Sample-Rate Scalable (SRS) profile, or Long Term Prediction (LTP) profile.
  • Audio data interchange format (ADIF).
  • LATM/LOAS transport streams, prior to Windows 8.
  • Coupling channel elements (CCEs). The decoder will skip audio frames with CCEs.
  • AAC-LC with a 960-sample frame size. Only 1024-sample frames are supported.

Transform Attributes

The AAC decoder implements the IMFTransform::GetAttributes method. Applications can use this method to get or set the following attributes.

Attribute Description
CODECAPI_AVDecAudioDualMono Specifies whether 2-channel audio is encoded as stereo or dual mono. Treat as read-only.
CODECAPI_AVDecAudioDualMonoReproMode Specifies how the decoder reproduces dual mono audio. The default value is eAVDecAudioDualMonoReproMode_LEFT_MONO: Output Ch1 to the left and right speakers.
Applications can set this property to change the default behavior.
MFT_SUPPORT_DYNAMIC_FORMAT_CHANGE The AAC decoder does not handle dynamic format changes, and must be flushed or drained before a new input media type is set. Treat this attribute as read-only.
Note: In Windows 7, the decoder incorrectly reports a value of TRUE for this attribute. In Windows 8, the decoder reports FALSE, which is the correct value.

Example Media Types

Here is an example of the input media type needed for a 6-channel, 48-kHz AAC-LC stream, using a raw AAC payload:

Attribute Value
MF_MT_MAJOR_TYPE MFMediaType_Audio
MF_MT_SUBTYPE MFAudioFormat_AAC
MF_MT_AUDIO_SAMPLES_PER_SECOND 48000
MF_MT_AUDIO_NUM_CHANNELS 6
MF_MT_AAC_PAYLOAD_TYPE 0
MF_MT_USER_DATA {0x00, 0x00, 0x2a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x11, 0xb0}
MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION 0x2a (optional)

The first 12 bytes of MF_MT_USER_DATA correspond to the following HEAACWAVEINFO structure members:

  • wPayloadType = 0 (raw AAC)
  • wAudioProfileLevelIndication = 0x2a (AAC Profile, Level 4)
  • wStructType = 0
  • wReserved1 = 0
  • dwReserved2 = 0

The last two bytes of MF_MT_USER_DATA contain the value of AudioSpecificConfig(), as defined by MPEG-4.

  • AudioSpecificConfig.audioObjectType = 2 (AAC LC) (5 bits)
  • AudioSpecificConfig.samplingFrequencyIndex = 3 (4 bits)
  • AudioSpecificConfig.channelConfiguration = 6 (4 bits)
  • GASpecificConfig.frameLengthFlag = 0 (1 bit)
  • GASpecificConfig.dependsOnCoreCoder = 0 (1 bit)
  • GASpecificConfig.extensionFlag = 0 (1 bit)
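The byte layout above can be reproduced with a few lines of bit packing. The sketch below builds the 14-byte blob for an AAC-LC stream: 12 little-endian bytes for the HEAACWAVEINFO members that follow wfx (including the reserved members wReserved1 and dwReserved2, set to zero), then the 2-byte AudioSpecificConfig(). The function name MakeAacLcUserData is hypothetical, and the sketch assumes the three GASpecificConfig flags are all zero.

```cpp
#include <cstdint>
#include <vector>

// Builds an MF_MT_USER_DATA blob for AAC-LC with a 2-byte
// AudioSpecificConfig(). Assumes frameLengthFlag, dependsOnCoreCoder,
// and extensionFlag are all 0.
inline std::vector<std::uint8_t> MakeAacLcUserData(std::uint16_t payloadType,
                                                   std::uint16_t profileLevelIndication,
                                                   unsigned samplingFrequencyIndex,
                                                   unsigned channelConfiguration) {
    std::vector<std::uint8_t> blob;
    // HEAACWAVEINFO members are stored in little-endian byte order.
    auto putLe16 = [&blob](std::uint16_t value) {
        blob.push_back(static_cast<std::uint8_t>(value & 0xFF));
        blob.push_back(static_cast<std::uint8_t>(value >> 8));
    };
    putLe16(payloadType);             // wPayloadType
    putLe16(profileLevelIndication);  // wAudioProfileLevelIndication
    putLe16(0);                       // wStructType
    putLe16(0);                       // wReserved1
    putLe16(0);                       // dwReserved2, low word
    putLe16(0);                       // dwReserved2, high word

    // AudioSpecificConfig() is a big-endian bit field:
    // 5 bits audioObjectType (2 = AAC-LC), 4 bits samplingFrequencyIndex,
    // 4 bits channelConfiguration, 3 GASpecificConfig flag bits (all 0).
    const unsigned asc = (2u << 11) |
                         ((samplingFrequencyIndex & 0xFu) << 7) |
                         ((channelConfiguration & 0xFu) << 3);
    blob.push_back(static_cast<std::uint8_t>(asc >> 8));
    blob.push_back(static_cast<std::uint8_t>(asc & 0xFF));
    return blob;
}
```

Calling MakeAacLcUserData(0, 0x2a, 3, 6) yields the 14 bytes shown for MF_MT_USER_DATA in the example, ending in 0x11, 0xb0.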

Given this input type, use the following output media type to get 6-channel, 32-bit floating-point PCM audio from the decoder:

Attribute Value
MF_MT_MAJOR_TYPE MFMediaType_Audio
MF_MT_SUBTYPE MFAudioFormat_Float
MF_MT_AUDIO_BITS_PER_SAMPLE 32
MF_MT_AUDIO_SAMPLES_PER_SECOND 48000
MF_MT_AUDIO_NUM_CHANNELS 6
MF_MT_AUDIO_AVG_BYTES_PER_SECOND 1152000 (optional)
MF_MT_AUDIO_BLOCK_ALIGNMENT 24 (optional)
MF_MT_AUDIO_CHANNEL_MASK 0x3f (optional)
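The optional block alignment and average byte rate in this table follow from the other values. As a short sketch of the arithmetic, with the hypothetical names PcmLayout and ComputePcmLayout:

```cpp
#include <cstdint>

// Hypothetical holder for the two derived attributes.
struct PcmLayout {
    std::uint32_t blockAlignment;     // bytes per sample frame (all channels)
    std::uint32_t avgBytesPerSecond;  // blockAlignment * sample rate
};

// Derives the values for MF_MT_AUDIO_BLOCK_ALIGNMENT and
// MF_MT_AUDIO_AVG_BYTES_PER_SECOND for uncompressed audio.
inline PcmLayout ComputePcmLayout(std::uint32_t channels,
                                  std::uint32_t bitsPerSample,
                                  std::uint32_t samplesPerSecond) {
    const std::uint32_t blockAlignment = channels * (bitsPerSample / 8);
    return PcmLayout{blockAlignment, blockAlignment * samplesPerSecond};
}
```

For 6 channels of 32-bit samples at 48 kHz, this gives a block alignment of 24 bytes and 1152000 bytes per second, matching the table.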

If the Platform Update Supplement for Windows Vista is installed, the AAC decoder is available on Windows Vista, but only through the Source Reader.

Requirements

Requirement Value
Minimum supported client Windows 7 [desktop apps only]
Minimum supported server Windows Server 2008 R2 [desktop apps only]
DLL Msmpeg2adec.dll on Windows 7; MSAudDecMFT.dll on Windows 8

See also

Codec Objects

AAC Media Types

Audio Media Types

Microsoft MPEG-1/DD/AAC Audio Decoder

MPEG-4 Support in Media Foundation

Supported Media Formats in Media Foundation