Audio/Video Capture in Media Foundation

Microsoft Media Foundation supports audio and video capture. Video capture devices are supported through the UVC class driver and must be compatible with UVC 1.1. Audio capture devices are supported through the Windows Audio Session API (WASAPI).

A capture device is represented in Media Foundation by a media source object, which exposes the IMFMediaSource interface. In most cases, the application will not use this interface directly, but will use a higher-level API such as the Source Reader to control the capture device.

Enumerate capture devices

To enumerate the capture devices on the system, perform the following steps:

  1. Call the MFCreateAttributes function to create an attribute store.

  2. Set the MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE attribute to one of the following values:

    Value                                            Description
    MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_AUDCAP_GUID   Enumerate audio capture devices.
    MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID   Enumerate video capture devices.

  3. Call the MFEnumDeviceSources function. This function allocates an array of IMFActivate pointers. Each pointer represents an activation object for one device on the system.

  4. Call the IMFActivate::ActivateObject method to create an instance of the media source from one of the activation objects.

The following example creates a media source for the first video capture device in the enumeration list:

HRESULT CreateVideoCaptureDevice(IMFMediaSource **ppSource)
{
    *ppSource = NULL;

    UINT32 count = 0;

    IMFAttributes *pConfig = NULL;
    IMFActivate **ppDevices = NULL;

    // Create an attribute store to hold the search criteria.
    HRESULT hr = MFCreateAttributes(&pConfig, 1);

    // Request video capture devices.
    if (SUCCEEDED(hr))
    {
        hr = pConfig->SetGUID(
            MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE, 
            MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID
            );
    }

    // Enumerate the devices.
    if (SUCCEEDED(hr))
    {
        hr = MFEnumDeviceSources(pConfig, &ppDevices, &count);
    }

    // Create a media source for the first device in the list.
    if (SUCCEEDED(hr))
    {
        if (count > 0)
        {
            hr = ppDevices[0]->ActivateObject(IID_PPV_ARGS(ppSource));
        }
        else
        {
            hr = MF_E_NOT_FOUND;
        }
    }

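    // Release every activation object, then free the array itself.
    for (DWORD i = 0; i < count; i++)
    {
        ppDevices[i]->Release();
    }
    CoTaskMemFree(ppDevices);

    // Release the attribute store so that it is not leaked.
    if (pConfig)
    {
        pConfig->Release();
    }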
    return hr;
}

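You can query the activation objects for various attributes, including the display name (MF_DEVSOURCE_ATTRIBUTE_FRIENDLY_NAME), the symbolic link of a video capture device (MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_SYMBOLIC_LINK), and the endpoint ID of an audio capture device (MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_AUDCAP_ENDPOINT_ID).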

The following example takes an array of IMFActivate pointers and prints the display name of each device to the debug window:

void DebugShowDeviceNames(IMFActivate **ppDevices, UINT count)
{
    for (DWORD i = 0; i < count; i++)
    {
        HRESULT hr = S_OK;
        WCHAR *szFriendlyName = NULL;
    
        // Try to get the display name.
        UINT32 cchName;
        hr = ppDevices[i]->GetAllocatedString(
            MF_DEVSOURCE_ATTRIBUTE_FRIENDLY_NAME,
            &szFriendlyName, &cchName);

        if (SUCCEEDED(hr))
        {
            OutputDebugString(szFriendlyName);
            OutputDebugString(L"\n");
        }
        CoTaskMemFree(szFriendlyName);
    }
}

If you already know the symbolic link for a video device, there is another way to create the media source for the device:

  1. Call MFCreateAttributes to create an attribute store.
  2. Set the MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE attribute to MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID.
  3. Set the MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_SYMBOLIC_LINK attribute to the symbolic link.
  4. Call either the MFCreateDeviceSource or MFCreateDeviceSourceActivate function. The former returns an IMFMediaSource pointer. The latter returns an IMFActivate pointer to an activation object. You can use the activation object to create the source. (An activation object can be marshaled to another process, so it is useful if you want to create the source in another process. For more information, see Activation Objects.)

The following example takes the symbolic link of a video device and creates a media source.

HRESULT CreateVideoCaptureDevice(PCWSTR pszSymbolicLink, IMFMediaSource **ppSource)
{
    *ppSource = NULL;

    IMFAttributes *pAttributes = NULL;

    // Create an attribute store with room for two attributes.
    HRESULT hr = MFCreateAttributes(&pAttributes, 2);

    // Set the device type to video.
    if (SUCCEEDED(hr))
    {
        hr = pAttributes->SetGUID(
            MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
            MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID
            );
    }

    // Set the symbolic link.
    if (SUCCEEDED(hr))
    {
        hr = pAttributes->SetString(
            MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_SYMBOLIC_LINK,
            pszSymbolicLink
            );
    }

    // Create the media source from the attributes.
    if (SUCCEEDED(hr))
    {
        hr = MFCreateDeviceSource(pAttributes, ppSource);
    }

    // Release the attribute store; it is no longer needed.
    if (pAttributes)
    {
        pAttributes->Release();
    }
    return hr;
}

There is an equivalent way to create an audio device from the audio endpoint ID:

  1. Call MFCreateAttributes to create an attribute store.
  2. Set the MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE attribute to MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_AUDCAP_GUID.
  3. Set the MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_AUDCAP_ENDPOINT_ID attribute to the endpoint ID.
  4. Call either the MFCreateDeviceSource or MFCreateDeviceSourceActivate function.

The following example takes an audio endpoint ID and creates a media source.

HRESULT CreateAudioCaptureDevice(PCWSTR pszEndPointID, IMFMediaSource **ppSource)
{
    *ppSource = NULL;

    IMFAttributes *pAttributes = NULL;

    // Create an attribute store with room for two attributes.
    HRESULT hr = MFCreateAttributes(&pAttributes, 2);

    // Set the device type to audio.
    if (SUCCEEDED(hr))
    {
        hr = pAttributes->SetGUID(
            MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
            MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_AUDCAP_GUID
            );
    }

    // Set the endpoint ID.
    if (SUCCEEDED(hr))
    {
        hr = pAttributes->SetString(
            MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_AUDCAP_ENDPOINT_ID,
            pszEndPointID
            );
    }

    // Create the media source from the attributes.
    if (SUCCEEDED(hr))
    {
        hr = MFCreateDeviceSource(pAttributes, ppSource);
    }

    // Release the attribute store; it is no longer needed.
    if (pAttributes)
    {
        pAttributes->Release();
    }
    return hr;
}

Use a capture device

After you create the media source for a capture device, use the Source Reader to get data from the device. The Source Reader delivers media samples that contain the captured audio data or video frames. The next step depends on your application scenario:

  • Video preview: Use Microsoft Direct3D or Direct2D to display the video.
  • File capture: Use the Sink Writer to encode the file.
  • Audio preview: Use WASAPI.

If you want to combine audio capture with video capture, use the aggregate media source. The aggregate media source contains a collection of media sources and combines all of their streams into a single media source object. To create an instance of the aggregate media source, call the MFCreateAggregateSource function.

Shut down the capture device

When the capture device is no longer needed, you must shut down the device by calling Shutdown on the IMFMediaSource object you obtained by calling MFCreateDeviceSource or IMFActivate::ActivateObject. Failure to call Shutdown can result in memory leaks because the system may keep a reference to IMFMediaSource resources until Shutdown is called.

if (g_pSource)
{
    g_pSource->Shutdown();
    g_pSource->Release();
    g_pSource = NULL;
}

If you allocated a string containing the symbolic link to a capture device, you should free that memory as well.

    CoTaskMemFree(g_pwszSymbolicLink);
    g_pwszSymbolicLink = NULL;

    g_cchSymbolicLink = 0;
