Windows 8.1 Audio streaming – Part 1: Audio categories

Applications create audio streams for a variety of reasons. For example:

  • Games might have soft music playing in the background, as well as loud game effects (e.g. when a user fires a bullet)

  • Media players play audio and video files

  • Communication applications (e.g. Skype, Lync) allow users to talk to each other remotely

  • Sound recorders record audio from other applications or from the environment


In order to inform the system about the usage of an audio stream, applications have the option to tag the stream with a specific “audio category”. Applications can set the audio category, using any of the audio APIs, just after creating the audio stream.

Windows 8.1 supports the following 8 audio categories:

  1. ForegroundOnlyMedia: Games or other sounds that are designed to work only in the foreground and that mute existing background media sounds. Examples include audio needed for a game (dancing games, music games) and feature films (designed to pause when they go to the background)

  2. BackgroundCapableMedia: Audio that needs to continue playing in the background. Examples include local media playback, local playlists, streaming radio, streaming playlists, music videos, and streaming audio/radio (e.g. YouTube, Netflix, etc.)

  3. Communications: Streaming communication audio, such as Voice over IP (VoIP), real-time chat, or other types of phone calls

  4. Alerts: Looping or longer-running alert sounds, such as alarms, ringtones, ringing notifications, and other sounds that need to lower the volume of existing audio

  5. SoundEffects: Sounds designed to mix with existing audio, such as beeps, dings, and other brief sounds

  6. GameEffects: Game sound effects designed to mix with existing audio, such as balls bouncing, engine sounds, characters talking, and all other non-music sounds

  7. GameMedia: Background music played by a game

  8. Other: Default stream category used for uncategorized streams
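
For readers who prefer to see the list as code, here is a minimal sketch of the eight categories as a C++ enumeration. It is a portable stand-in so that it compiles without the Windows SDK: the names `StreamCategory` and `describe` are hypothetical and exist only for illustration, and the summaries are paraphrased from the list above. In real WASAPI code you would use the `AUDIO_STREAM_CATEGORY` values (for example `AudioCategory_Communications`) instead.

```cpp
#include <string>

// Hypothetical stand-in for the eight categories listed above, so the sketch
// compiles without the Windows SDK. In WASAPI the real type is
// AUDIO_STREAM_CATEGORY, with values such as AudioCategory_Communications.
enum class StreamCategory {
    ForegroundOnlyMedia,
    BackgroundCapableMedia,
    Communications,
    Alerts,
    SoundEffects,
    GameEffects,
    GameMedia,
    Other
};

// Returns a short summary of each category, paraphrased from the list above.
std::string describe(StreamCategory category) {
    switch (category) {
        case StreamCategory::ForegroundOnlyMedia:
            return "foreground-only media that mutes background media";
        case StreamCategory::BackgroundCapableMedia:
            return "media that keeps playing in the background";
        case StreamCategory::Communications:
            return "VoIP, real-time chat and other calls";
        case StreamCategory::Alerts:
            return "looping or longer-running alert sounds";
        case StreamCategory::SoundEffects:
            return "brief sounds that mix with existing audio";
        case StreamCategory::GameEffects:
            return "non-music game sounds that mix with existing audio";
        case StreamCategory::GameMedia:
            return "background music played by a game";
        case StreamCategory::Other:
            return "default category for uncategorized streams";
    }
    return "unknown category";  // only reached for out-of-range values
}
```

For example, `describe(StreamCategory::GameMedia)` returns the string "background music played by a game".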

This information allows the audio stack to optimize the user experience. More specifically, the stream audio category defines several characteristics of the way that the system interacts with the audio stream, such as:

  1. Audio routing

    • Which endpoint should we use to play or capture sounds related to that stream?

    • Example: When a user connects or disconnects a device (e.g. headset, Miracast, Bluetooth, USB, etc.), should a particular audio stream switch to it or not?

  2. Volume

    • When a user starts a new stream, do we need to attenuate/mute any of the streams that are already playing or capturing audio?

    • Example: When a user starts speaking on Skype, is it OK if a YouTube video keeps playing in the background at full volume?

  3. Audio processing

    • What types of audio effects should the system apply to each audio stream?

    • Example: Communication streams might need to reduce unwanted ambient noise by enabling Acoustic Echo Cancellation and Noise Suppression, and also to capture the user’s voice clearly by enabling Beam Forming and Automatic Gain Control.

  4. Power savings (via H/W offload)

    • Should the processing of the stream be offloaded to the H/W, so that we can lower power consumption of the device?

    • Example: When a user watches a movie, the system has the option to prefetch big audio buffers and send them to the H/W for processing, in order to avoid waking up the CPU very often

  5. Background Audio Playback

    • When the application goes into the background, should audio keep playing?

    • Example: When a user is playing a game and minimizes the window in order to open a new tab in Internet Explorer, should the audio from the game still be audible?


In the next few posts I will dive deeper into each of the 5 policies mentioned above.


Code samples: How to set an audio category

1. XAML: MediaElement

<MediaElement x:Name="mediaplayer"  AudioCategory="Communications" />


2. HTML: Audio Tag

<audio msAudioCategory="Communications">


3. C++: WASAPI

// Instantiate the interface to set the audio categories

IAudioClient2 *pAudioClient2 = NULL;

pMMDevice->Activate(__uuidof(IAudioClient2), CLSCTX_ALL, NULL, (void **)&pAudioClient2);


// Set the category of the stream to be Communications

AudioClientProperties props = {};

props.cbSize = sizeof(props);

props.eCategory = AudioCategory_Communications;

// Apply the properties to the stream

pAudioClient2->SetClientProperties(&props);