Media Encoder Standard schema

This topic describes some of the elements and types of the XML schema on which Media Encoder Standard presets are based. It explains the elements and their valid values. The full schema will be published at a later date.

Preset (root element)

Defines an encoding preset.

Elements

| Name | Type | Description |
| --- | --- | --- |
| Encoding | Encoding | Root element; indicates that the input sources are to be encoded. |
| Outputs | Outputs | Collection of desired output files. |

Attributes

| Name | Type | Description |
| --- | --- | --- |
| Version (required) | xs:decimal | The preset version. The following restrictions apply: xs:fractionDigits value="1" and xs:minInclusive value="1". For example, version="1.0". |
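The two restrictions on Version can be checked before a preset is submitted. The following is a minimal sketch, not part of the schema or of any Azure SDK: the function name `is_valid_preset_version` is hypothetical, and the check assumes that trailing zeros (as in "1.00") do not count as fraction digits.

```python
from decimal import Decimal, InvalidOperation


def is_valid_preset_version(text: str) -> bool:
    """Return True if text meets the fractionDigits and minInclusive restrictions."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return False  # not a decimal number at all
    if not value.is_finite():
        return False  # rejects NaN and Infinity
    # xs:fractionDigits value="1": at most one significant digit after the point.
    # normalize() drops trailing zeros, so "1.00" counts as zero fraction digits.
    fraction_digits = max(0, -value.normalize().as_tuple().exponent)
    # xs:minInclusive value="1": the value must be 1 or greater.
    return fraction_digits <= 1 and value >= 1


print(is_valid_preset_version("1.0"))   # True
print(is_valid_preset_version("1.25"))  # False: two fraction digits
print(is_valid_preset_version("0.9"))   # False: below the minimum
```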

Encoding

Contains a sequence of the following elements.

Elements

| Name | Type | Description |
| --- | --- | --- |
| H264Video | H264Video | Settings for H.264 encoding of video. |
| AACAudio | AACAudio | Settings for AAC encoding of audio. |
| BmpImage | BmpImage | Settings for BMP images. |
| PngImage | PngImage | Settings for PNG images. |
| JpgImage | JpgImage | Settings for JPG images. |

H264Video

Elements

| Name | Type | Description |
| --- | --- | --- |
| TwoPass (minOccurs="0") | xs:boolean | Currently, only one-pass encoding is supported. |
| KeyFrameInterval (minOccurs="0", default="00:00:02") | xs:time | Determines the fixed spacing between IDR frames, in seconds. Also referred to as the GOP duration. See SceneChangeDetection for controlling whether the encoder can deviate from this value. |
| SceneChangeDetection (minOccurs="0", default="false") | xs:boolean | If set to true, the encoder attempts to detect scene changes in the video and inserts an IDR frame. |
| Complexity (minOccurs="0", default="Balanced") | xs:string | Controls the trade-off between encode speed and video quality. One of the following values: Speed, Balanced, or Quality. |
| SyncMode (minOccurs="0") | | This feature will be exposed in a future release. |
| H264Layers (minOccurs="0") | H264Layers | Collection of output video layers. |

Attributes

| Name | Type | Description |
| --- | --- | --- |
| Condition | xs:string | When the input has no video, you may want to force the encoder to insert a monochrome video track. To do that, use Condition="InsertBlackIfNoVideoBottomLayerOnly" (to insert a video at only the lowest bitrate) or Condition="InsertBlackIfNoVideo" (to insert a video at all output bitrates). For more information, see this topic. |

H264Layers

By default, if you send the encoder an input that contains only audio and no video, the output asset contains files with audio data only. Some players may not be able to handle such output streams. In that scenario, you can set the Condition attribute of H264Video to "InsertBlackIfNoVideo" to force the encoder to add a video track to the output. For more information, see this topic.

Elements

| Name | Type | Description |
| --- | --- | --- |
| H264Layer (minOccurs="0", maxOccurs="unbounded") | H264Layer | A collection of H264 layers. |
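The nesting described so far (Preset, Encoding, H264Video, H264Layers, H264Layer) can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustration only: the helper name `build_preset` and the sample bitrates are hypothetical, the XML namespace of real presets is omitted because it is not given in this topic, and the child elements Bitrate, Width, and Height are the ones documented in the H264Layer section.

```python
import xml.etree.ElementTree as ET


def build_preset(layers):
    """Build a Preset element with one H264Layer per (bitrate_kbps, width, height)."""
    # Assumption: no XML namespace is set; real presets may require one.
    preset = ET.Element("Preset", {"Version": "1.0"})
    encoding = ET.SubElement(preset, "Encoding")
    video = ET.SubElement(encoding, "H264Video")
    ET.SubElement(video, "KeyFrameInterval").text = "00:00:02"
    h264_layers = ET.SubElement(video, "H264Layers")
    for bitrate, width, height in layers:
        layer = ET.SubElement(h264_layers, "H264Layer")
        ET.SubElement(layer, "Bitrate").text = str(bitrate)
        ET.SubElement(layer, "Width").text = str(width)
        ET.SubElement(layer, "Height").text = str(height)
    return preset


# Two illustrative layers; the numbers are sample values, not recommendations.
preset = build_preset([(3400, 1280, 720), (1000, 640, 360)])
xml_text = ET.tostring(preset, encoding="unicode")
print(xml_text)
```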

H264Layer

Note

Video limits are based on the values described in the H264 Levels table.

Elements

| Name | Type | Description |
| --- | --- | --- |
| Profile (minOccurs="0", default="Auto") | xs:string | One of the following values: Auto, Baseline, Main, High. |
| Level (minOccurs="0", default="Auto") | xs:string | |
| Bitrate (minOccurs="0") | xs:int | The bitrate used for this video layer, specified in kbps. |
| MaxBitrate (minOccurs="0") | xs:int | The maximum bitrate used for this video layer, specified in kbps. |
| BufferWindow (minOccurs="0", default="00:00:05") | xs:time | Length of the video buffer. |
| Width (minOccurs="0") | xs:int | Width of the output video frame, in pixels. Currently, you must specify both Width and Height, and both must be even numbers. |
| Height (minOccurs="0") | xs:int | Height of the output video frame, in pixels. Currently, you must specify both Width and Height, and both must be even numbers. |
| BFrames (minOccurs="0") | xs:int | Number of B frames between reference frames. |
| ReferenceFrames (minOccurs="0", default="3") | xs:int | Number of reference frames in a GOP. |
| EntropyMode (minOccurs="0", default="Cabac") | xs:string | One of the following values: Cabac or Cavlc. |
| FrameRate (minOccurs="0") | rational number | Determines the frame rate of the output video. Use the default of "0/1" to let the encoder use the same frame rate as the input video. Allowed values are expected to be common video frame rates, such as 12/1 (12 fps), 15/1 (15 fps), 24/1 (24 fps), 24000/1001 (23.976 fps), 25/1 (25 fps), 30/1 (30 fps), and 30000/1001 (29.97 fps). However, any valid rational is allowed; for example, 1/1 is 1 fps and is valid. Note: if you are creating a custom preset for multiple-bitrate encoding, all layers of the preset must use the same FrameRate value. |
| AdaptiveBFrame (minOccurs="0") | xs:boolean | Copy from Azure media encoder |
| Slices (minOccurs="0", default="0") | xs:int | Determines how many slices a frame is divided into. Using the default is recommended. |
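Several of the H264Layer rules can be checked mechanically: Width and Height must be even, and every layer of a multiple-bitrate preset must share one FrameRate. The sketch below is one possible pre-flight check, not the encoder's own validation; the function names and the dictionary layout for a layer are assumptions made for this example.

```python
from fractions import Fraction


def parse_frame_rate(text):
    """Parse "numerator/denominator"; return None for "0/1" (use the input's rate)."""
    numerator, denominator = (int(part) for part in text.split("/"))
    if numerator < 0 or denominator <= 0:
        raise ValueError(f"not a valid frame rate: {text!r}")
    if numerator == 0:
        return None
    return Fraction(numerator, denominator)


def check_layers(layers):
    """Return a list of problems found in a list of layer dictionaries."""
    problems = []
    for index, layer in enumerate(layers):
        for key in ("Width", "Height"):
            if layer[key] % 2 != 0:
                problems.append(f"layer {index}: {key} must be an even number")
    # Equal rationals such as 30/1 and 60/2 compare equal as Fractions.
    rates = {parse_frame_rate(layer["FrameRate"]) for layer in layers}
    if len(rates) > 1:
        problems.append("all layers must use the same FrameRate")
    return problems


layers = [
    {"Width": 1280, "Height": 720, "FrameRate": "30000/1001"},
    {"Width": 641, "Height": 360, "FrameRate": "30/1"},
]
for problem in check_layers(layers):
    print(problem)
```

With the sample input, the check reports the odd Width of the second layer and the mismatched frame rates.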

AACAudio

Contains a sequence of the following elements and groups.

For more information about AAC, see AAC.

Elements

| Name | Type | Description |
| --- | --- | --- |
| Profile (minOccurs="0", default="AACLC") | xs:string | One of the following values: AACLC, HEAACV1, or HEAACV2. |

Attributes

| Name | Type | Description |
| --- | --- | --- |
| Condition | xs:string | To force the encoder to produce an asset that contains a silent audio track when the input has no audio, specify the "InsertSilenceIfNoAudio" value. By default, if you send the encoder an input that contains only video and no audio, the output asset contains files with video data only. Some players may not be able to handle such output streams. You can use this setting to force the encoder to add a silent audio track to the output in that scenario. |

Groups

| Reference | Description |
| --- | --- |
| AudioGroup (minOccurs="0") | See the description of AudioGroup for the number of channels, sampling rate, and bit rate that can be set for each profile. |

AudioGroup

For details about which values are valid for each profile, see the "Audio codec details" table that follows.

Elements

| Name | Type | Description |
| --- | --- | --- |
| Channels (minOccurs="0") | xs:int | The number of audio channels encoded. The following are valid options: 1, 2, 5, 6, 8. Default: 2. |
| SamplingRate (minOccurs="0") | xs:int | The audio sampling rate, specified in Hz. |
| Bitrate (minOccurs="0") | xs:int | The bitrate used when encoding the audio, specified in kbps. |

Audio codec details

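Each audio codec is listed with its details: the channel count, followed by the sampling rate (Hz) : allowed bitrate range (kbps).

AACLC

1: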

- 11025 : 8 <= bitrate < 16

- 12000 : 8 <= bitrate < 16

- 16000 : 8 <= bitrate <32

- 22050 : 24 <= bitrate < 32

- 24000 : 24 <= bitrate < 32

- 32000 : 32 <= bitrate <= 192

- 44100 : 56 <= bitrate <= 288

- 48000 : 56 <= bitrate <= 288

- 88200 : 128 <= bitrate <= 288

- 96000 : 128 <= bitrate <= 288

2:

- 11025 : 16 <= bitrate < 24

- 12000 : 16 <= bitrate < 24

- 16000 : 16 <= bitrate < 40

- 22050 : 32 <= bitrate < 40

- 24000 : 32 <= bitrate < 40

- 32000 : 40 <= bitrate <= 384

- 44100 : 96 <= bitrate <= 576

- 48000 : 96 <= bitrate <= 576

- 88200 : 256 <= bitrate <= 576

- 96000 : 256 <= bitrate <= 576

5/6:

- 32000 : 160 <= bitrate <= 896

- 44100 : 240 <= bitrate <= 1024

- 48000 : 240 <= bitrate <= 1024

- 88200 : 640 <= bitrate <= 1024

- 96000 : 640 <= bitrate <= 1024

8:

- 32000 : 224 <= bitrate <= 1024

- 44100 : 384 <= bitrate <= 1024

- 48000 : 384 <= bitrate <= 1024

- 88200 : 896 <= bitrate <= 1024

- 96000 : 896 <= bitrate <= 1024
HEAACV1

1:

- 22050 : bitrate = 8

- 24000 : 8 <= bitrate <= 10

- 32000 : 12 <= bitrate <= 64

- 44100 : 20 <= bitrate <= 64

- 48000 : 20 <= bitrate <= 64

- 88200 : bitrate = 64

2:

- 32000 : 16 <= bitrate <= 128

- 44100 : 16 <= bitrate <= 128

- 48000 : 16 <= bitrate <= 128

- 88200 : 96 <= bitrate <= 128

- 96000 : 96 <= bitrate <= 128

5/6:

- 32000 : 64 <= bitrate <= 320

- 44100 : 64 <= bitrate <= 320

- 48000 : 64 <= bitrate <= 320

- 88200 : 256 <= bitrate <= 320

- 96000 : 256 <= bitrate <= 320

8:

- 32000 : 96 <= bitrate <= 448

- 44100 : 96 <= bitrate <= 448

- 48000 : 96 <= bitrate <= 448

- 88200 : 384 <= bitrate <= 448

- 96000 : 384 <= bitrate <= 448
HEAACV2

2:

- 22050 : 8 <= bitrate <= 10

- 24000 : 8 <= bitrate <= 10

- 32000 : 12 <= bitrate <= 64

- 44100 : 20 <= bitrate <= 64

- 48000 : 20 <= bitrate <= 64

- 88200 : bitrate = 64
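The table above lends itself to a lookup keyed by profile, channel count, and sampling rate. The sketch below transcribes only two of the groups (AACLC with 2 channels and HEAACV2 with 2 channels) to stay short; the names `AAC_LIMITS` and `is_valid_audio` are hypothetical, and a result of False for any other combination means only that it is missing from this partial transcription.

```python
# (profile, channels) -> {sampling rate in Hz: (min kbps, max kbps, max is inclusive)}
AAC_LIMITS = {
    ("AACLC", 2): {
        11025: (16, 24, False),
        12000: (16, 24, False),
        16000: (16, 40, False),
        22050: (32, 40, False),
        24000: (32, 40, False),
        32000: (40, 384, True),
        44100: (96, 576, True),
        48000: (96, 576, True),
        88200: (256, 576, True),
        96000: (256, 576, True),
    },
    ("HEAACV2", 2): {
        22050: (8, 10, True),
        24000: (8, 10, True),
        32000: (12, 64, True),
        44100: (20, 64, True),
        48000: (20, 64, True),
        88200: (64, 64, True),
    },
}


def is_valid_audio(profile, channels, sampling_rate, bitrate):
    """Check one Channels/SamplingRate/Bitrate combination against the partial table."""
    limits = AAC_LIMITS.get((profile, channels), {}).get(sampling_rate)
    if limits is None:
        return False  # combination not present in this partial transcription
    low, high, high_inclusive = limits
    if high_inclusive:
        return low <= bitrate <= high
    return low <= bitrate < high


print(is_valid_audio("AACLC", 2, 48000, 128))  # True
print(is_valid_audio("AACLC", 2, 16000, 40))   # False: upper bound is exclusive
```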

Clip

Attributes

| Name | Type | Description |
| --- | --- | --- |
| StartTime | xs:duration | Specifies the start time of a presentation. The value of StartTime needs to match the absolute timestamps of the input video. For example, if the first frame of the input video has a timestamp of 12:00:10.000, then StartTime should be 12:00:10.000 or greater. |
| Duration | xs:duration | Specifies the duration of a presentation (for example, the appearance of an overlay in the video). |

Output

Attributes

| Name | Type | Description |
| --- | --- | --- |
| FileName | xs:string | The name of the output file. |

You can use the macros described in the following table to build the output file names. For example:

"Outputs": [ { "FileName": "{Basename}{Resolution}{Bitrate}.mp4", "Format": { "Type": "MP4Format" } } ]

Macros

| Macro | Description |
| --- | --- |
| {Basename} | If you are doing VoD encoding, {Basename} is the first 32 characters of the AssetFile.Name property of the primary file in the input asset. If the input asset is a live archive, {Basename} is derived from the trackName attributes in the server manifest. If you are submitting a subclip job using the TopBitrate, as in: "TopBitrate", and the output file contains video, {Basename} is the first 32 characters of the trackName of the video layer with the highest bitrate. If instead you are submitting a subclip job using all of the input bitrates, such as "*", and the output file contains video, {Basename} is the first 32 characters of the trackName of the corresponding video layer. |
| {Codec} | Maps to "H264" for video and "AAC" for audio. |
| {Bitrate} | The target video bitrate if the output file contains video and audio, or the target audio bitrate if the output file contains audio only. The value used is the bitrate in kbps. |
| {Channel} | Audio channel count, if the file contains audio. |
| {Width} | Width of the video, in pixels, in the output file, if the file contains video. |
| {Height} | Height of the video, in pixels, in the output file, if the file contains video. |
| {Extension} | Inherits from the "Type" property for the output file. The output file name will have one of the following extensions: "mp4", "ts", "jpg", "png", or "bmp". |
| {Index} | Mandatory for thumbnails. Should be present only once. |
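To make the macro substitution concrete, the following sketch expands a FileName template the way the table describes. It only illustrates the idea and is not the encoder's implementation: `expand_file_name`, the template string, and the sample values are all hypothetical, and the 32-character truncation of {Basename} is applied to the whole AssetFile.Name, as the table states.

```python
import re


def expand_file_name(template, values):
    """Replace each {Macro} in template with values["Macro"].

    Raises KeyError if the template uses a macro that has no value.
    """
    def substitute(match):
        return str(values[match.group(1)])

    return re.sub(r"\{(\w+)\}", substitute, template)


asset_file_name = "interview"  # hypothetical AssetFile.Name of the primary file
values = {
    "Basename": asset_file_name[:32],  # first 32 characters only
    "Width": 1280,
    "Height": 720,
    "Bitrate": 3400,  # kbps
    "Extension": "mp4",
}
name = expand_file_name("{Basename}_{Width}x{Height}_{Bitrate}.{Extension}", values)
print(name)  # interview_1280x720_3400.mp4
```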

Video (complex type inherits from Codec)

Attributes

| Name | Type | Description |
| --- | --- | --- |
| Start | xs:string | |
| Step | xs:string | |
| Range | xs:string | |
| PreserveResolutionAfterRotation | xs:boolean | For a detailed explanation, see the following section: PreserveResolutionAfterRotation. |

PreserveResolutionAfterRotation

It is recommended to use the PreserveResolutionAfterRotation flag in combination with resolution values expressed in percentage terms (Width=”100%” , Height = “100%”).

By default, the encode resolution settings (Width, Height) in the Media Encoder Standard (MES) presets are targeted at videos with 0 degree rotation. For example, if your input video is 1280x720 with zero degree rotation, then the default presets ensure that the output has the same resolution. See picture below.

MESRoation1

However, this means that if the input video has been captured with non-zero rotation (for example, on a smartphone or tablet held vertically), then MES by default applies the encode resolution settings (Width, Height) to the input video, and then compensates for the rotation. For example, see the picture below. The preset uses Width = “100%”, Height = “100%”, which MES interprets as requiring the output to be 1280 pixels wide and 720 pixels tall. After rotating the video, it then shrinks the picture to fit into that window, leading to pillar-box areas on the left and right.

MESRoation2

If the above is not the desired behavior, then you can make use of the PreserveResolutionAfterRotation flag and set it to “true” (default is “false”). So if your preset has Width = “100%”, Height = “100%” and PreserveResolutionAfterRotation set to “true”, an input video which is 1280 pixels wide and 720 pixels tall with 90 degree rotation will produce an output with zero degree rotation, but 720 pixels wide and 1280 pixels tall. See the picture below.

MESRoation3
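The effect of the flag on the output frame size, for a preset with Width = "100%" and Height = "100%", can be summarized in a few lines. This is a sketch of the behavior described above, not MES code; the function name is hypothetical, and treating 270 degrees the same as 90 degrees is an assumption, because only the 90 degree case is described.

```python
def output_frame_size(width, height, rotation_degrees, preserve_resolution_after_rotation=False):
    """Return (width, height) of the output for Width="100%", Height="100%"."""
    # Assumption: 90 and 270 degree rotations both turn the picture on its side.
    rotated_sideways = rotation_degrees % 180 == 90
    if preserve_resolution_after_rotation and rotated_sideways:
        # The output keeps the rotated picture's full resolution.
        return height, width
    # Default: the frame keeps the stored size; a rotated picture is shrunk
    # to fit, which produces the pillar-box areas described above.
    return width, height


print(output_frame_size(1280, 720, 0))         # (1280, 720)
print(output_frame_size(1280, 720, 90))        # (1280, 720), pillar-boxed
print(output_frame_size(1280, 720, 90, True))  # (720, 1280)
```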

FormatGroup (group)

Elements

| Name | Type | Description |
| --- | --- | --- |
| BmpFormat | BmpFormat | |
| PngFormat | PngFormat | |
| JpgFormat | JpgFormat | |

BmpLayer

Elements

| Name | Type | Description |
| --- | --- | --- |
| Width (minOccurs="0") | xs:int | |
| Height (minOccurs="0") | xs:int | |

Attributes

| Name | Type | Description |
| --- | --- | --- |
| Condition | xs:string | |

PngLayer

Elements

| Name | Type | Description |
| --- | --- | --- |
| Width (minOccurs="0") | xs:int | |
| Height (minOccurs="0") | xs:int | |

Attributes

| Name | Type | Description |
| --- | --- | --- |
| Condition | xs:string | |

JpgLayer

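Elements

| Name | Type | Description |
| --- | --- | --- |
| Width (minOccurs="0") | xs:int | |
| Height (minOccurs="0") | xs:int | |
| Quality (minOccurs="0") | xs:int | Valid values: 1 (worst) to 100 (best). |

Attributes

| Name | Type | Description |
| --- | --- | --- |
| Condition | xs:string | |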

PngLayers

Elements

| Name | Type | Description |
| --- | --- | --- |
| PngLayer (minOccurs="0", maxOccurs="unbounded") | PngLayer | |

BmpLayers

Elements

| Name | Type | Description |
| --- | --- | --- |
| BmpLayer (minOccurs="0", maxOccurs="unbounded") | BmpLayer | |

JpgLayers

Elements

| Name | Type | Description |
| --- | --- | --- |
| JpgLayer (minOccurs="0", maxOccurs="unbounded") | JpgLayer | |

BmpImage (complex type inherits from Video)

Elements

| Name | Type | Description |
| --- | --- | --- |
| BmpLayers (minOccurs="0") | BmpLayers | BMP layers |

JpgImage (complex type inherits from Video)

Elements

| Name | Type | Description |
| --- | --- | --- |
| JpgLayers (minOccurs="0") | JpgLayers | JPG layers |

PngImage (complex type inherits from Video)

Elements

| Name | Type | Description |
| --- | --- | --- |
| PngLayers (minOccurs="0") | PngLayers | PNG layers |

Examples

For examples of XML presets that are built on this schema, see Task Presets for MES (Media Encoder Standard).
