Use Phonemes and Visemes

A phoneme is the smallest unit of speech sound that distinguishes one word from another. It is typically represented by a single letter of an alphabet (t) or by two letters in combination (th).

Just as a phoneme is a basic acoustic unit of speech, a viseme is the basic visual unit of speech that represents the position of the mouth and face when pronouncing a phoneme. An application can use viseme information to guide facial animation as a way to enhance understanding of speech by giving visual cues of the words being spoken.

The Text-to-Speech (TTS) engine generates PhonemeReached and VisemeReached events when phoneme or viseme boundaries are crossed. To use either event, an application must register for event notification and provide a handler for the event. For more information about registering for events, see Use Speech Synthesis Events.

Important

To receive timely event notification, use one of the asynchronous SpeakAsync() methods.
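The registration and asynchronous speaking described above might be wired together as in the following minimal sketch. It assumes a console application that references System.Speech and has a default audio device; the class name PhonemeVisemeDemo and the spoken text are placeholders, and the using directive may need to be adjusted if your project targets a different speech namespace.

```csharp
using System;
using System.Speech.Synthesis; // Assumption: the System.Speech assembly is referenced.

class PhonemeVisemeDemo // Hypothetical class name, for illustration only.
{
    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            // Send synthesized audio to the default audio device.
            synth.SetOutputToDefaultAudioDevice();

            // Register handlers before speaking so that no events are missed.
            synth.PhonemeReached += (sender, e) =>
                Console.WriteLine("Phoneme reached: " + e.Phoneme);
            synth.VisemeReached += (sender, e) =>
                Console.WriteLine("Viseme reached: " + e.Viseme);

            // SpeakAsync returns immediately; events are raised while the audio plays.
            synth.SpeakAsync("Hello world");

            // Keep the process alive until the user decides to exit.
            Console.WriteLine("Press any key to exit...");
            Console.ReadKey();
        }
    }
}
```

The handlers here are written inline as lambdas for brevity; named methods such as the ones shown in the following sections can be attached in the same way.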

Using Phonemes

The following code example shows a simple handler for the PhonemeReached event. The second parameter is of type PhonemeReachedEventArgs. The Phoneme property on this class indicates the specific phoneme that was reached at the time the PhonemeReached event was raised. This example writes a string to a form, provided that a check box is selected.

void synth_PhonemeReached(object sender, PhonemeReachedEventArgs e)
{
  if (checkBoxPhonemes.Checked)
  {
    AddEventText("Phoneme reached: " + e.Phoneme);
  }
}

Using Visemes

The following code example shows a simple handler for the VisemeReached event. The second parameter is of type VisemeReachedEventArgs. The Viseme property on this class indicates the specific viseme that was reached at the time the VisemeReached event was raised. This example writes a string to a form, provided that a check box is selected.

void synth_VisemeReached(object sender, VisemeReachedEventArgs e)
{
  if (checkBoxVisemes.Checked)
  {
    AddEventText("Viseme reached: " + e.Viseme);
  }
}
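
To illustrate how viseme information could guide facial animation, the following sketch maps a viseme value to a coarse mouth-shape name. The MouthShapes class, the shape names, and the fallback value are all hypothetical. The sketch also assumes that the Viseme property is an integer following the SAPI viseme numbering (for example, 0 for silence and 21 for p, b, m); verify this against the engine you are using.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: translates viseme values into mouth-shape names
// that an animation layer could consume.
static class MouthShapes
{
    // Assumption: keys follow the SAPI viseme numbering.
    // Only a few entries are listed; a real table would cover every viseme.
    static readonly Dictionary<int, string> Shapes = new Dictionary<int, string>
    {
        { 0, "closed" },        // silence
        { 7, "rounded" },       // w, uw
        { 18, "teeth-on-lip" }, // f, v
        { 21, "closed" },       // p, b, m
    };

    // Returns the shape for a viseme, or a neutral fallback for unmapped values.
    public static string ForViseme(int viseme)
    {
        string shape;
        return Shapes.TryGetValue(viseme, out shape) ? shape : "neutral-open";
    }
}
```

A VisemeReached handler could then pass MouthShapes.ForViseme(e.Viseme) to whatever code draws the face, instead of writing the raw value to the form.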

See Also

Concepts

Use Speech Synthesis Events