TtsEngineSsml.Speak(TextFragment[], IntPtr, ITtsEngineSite) Method


Renders the specified TextFragment array in the specified output format.

 abstract void Speak(cli::array <System::Speech::Synthesis::TtsEngine::TextFragment ^> ^ fragment, IntPtr waveHeader, System::Speech::Synthesis::TtsEngine::ITtsEngineSite ^ site);
public abstract void Speak (System.Speech.Synthesis.TtsEngine.TextFragment[] fragment, IntPtr waveHeader, System.Speech.Synthesis.TtsEngine.ITtsEngineSite site);
abstract member Speak : System.Speech.Synthesis.TtsEngine.TextFragment[] * nativeint * System.Speech.Synthesis.TtsEngine.ITtsEngineSite -> unit
Public MustOverride Sub Speak (fragment As TextFragment(), waveHeader As IntPtr, site As ITtsEngineSite)



fragment TextFragment[]
An array of TextFragment instances containing the text to be rendered into speech.

waveHeader IntPtr
An IntPtr pointing to a structure containing the audio output format.

site ITtsEngineSite
A reference to an ITtsEngineSite interface passed in by the platform infrastructure to allow access to the infrastructure resources.


The example below is part of a custom speech synthesis implementation that inherits from TtsEngineSsml and uses TextFragment, SpeechEventInfo, FragmentState, and TtsEventId.

The implementation of Speak:

  1. Receives an array of TextFragment instances and creates a new array of TextFragment instances to be passed to the Speak method on an underlying synthesis engine.

  2. If the TtsEngineAction enumeration value found in the Action property of the FragmentState returned by the State property of each TextFragment instance is Speak, the implementation:

    • Translates Americanisms to Britishisms in the text to be spoken.

    • If the EventInterest property on the ITtsEngineSite interface provided to the implementation supports the WordBoundary event type, uses a SpeechEventInfo instance to create an event that drives a synthesizer progress meter.

  3. Calls a speech rendering engine with the modified TextFragment array.

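private const int WordBoundaryFlag = 1 << (int)TtsEventId.WordBoundary;
private readonly char[] spaces = new char[] { ' ', '\t', '\r', '\n' };

internal struct UsVsUk
{
  internal string UK;
  internal string US;
}

// TransList (a collection of UsVsUk terms) and _baseSynthesize (the underlying
// engine) are fields defined elsewhere in the class.
override public void Speak (TextFragment [] frags, IntPtr wfx, ITtsEngineSite site)
{
  TextFragment [] newFrags = new TextFragment[frags.Length];

  for (int i = 0; i < frags.Length; i++) {
    // TextFragment is a reference type, so each element must be created.
    newFrags[i] = new TextFragment();
    newFrags[i].State = frags[i].State;
    // Keep only the portion of the text that this fragment covers.
    newFrags[i].TextToSpeak = frags[i].TextToSpeak.Substring(frags[i].TextOffset,
                               frags[i].TextLength);
    newFrags[i].TextLength = newFrags[i].TextToSpeak.Length;
    newFrags[i].TextOffset = 0;
    if (newFrags[i].State.Action == TtsEngineAction.Speak) {
      // US to UK conversion; Replace returns a new string, so assign the result.
      foreach (UsVsUk term in TransList) {
        newFrags[i].TextToSpeak = newFrags[i].TextToSpeak.Replace(term.US, term.UK);
      }
      newFrags[i].TextLength = newFrags[i].TextToSpeak.Length;
      // Generate progress meter events if supported.
      if ((site.EventInterest & WordBoundaryFlag) != 0) {
        string[] subs = newFrags[i].TextToSpeak.Split(spaces);
        // Declared outside the loop so the offset accumulates across words.
        int offset = newFrags[i].TextOffset;
        foreach (string s in subs) {
          if (s.Trim().Length > 0) {
            SpeechEventInfo[] events = new SpeechEventInfo[1];
            events[0] = new SpeechEventInfo((Int16)TtsEventId.WordBoundary,
                     (Int16)EventParameterType.Undefined,
                     s.Length, new IntPtr(offset));
            site.AddEvents(events, 1);
          }
          // Advance past the word and the single separator that followed it.
          offset += s.Length + 1;
        }
      }
    }
  }

  _baseSynthesize.Speak(newFrags, wfx, site);
}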
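
The word-boundary bookkeeping in the example above can be checked in isolation, without a speech engine. The following is a minimal sketch, assuming a hypothetical helper class WordBoundaries with a method WordSpans (neither is part of System.Speech). It returns the offset and length of each word, which are the two values the example passes to SpeechEventInfo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of System.Speech.
public static class WordBoundaries
{
    private static readonly char[] Spaces = { ' ', '\t', '\r', '\n' };

    // Returns the (offset, length) of every non-empty word in text.
    public static List<(int Offset, int Length)> WordSpans(string text)
    {
        var spans = new List<(int Offset, int Length)>();
        int offset = 0;
        foreach (string s in text.Split(Spaces))
        {
            if (s.Length > 0)
            {
                spans.Add((offset, s.Length));
            }
            // Split consumes exactly one separator character between items,
            // so each item advances the offset by its length plus one.
            offset += s.Length + 1;
        }
        return spans;
    }
}
```

Keeping the offset outside the loop and counting the separator matters when the text contains consecutive spaces: Split then yields empty items, which emit no span but still advance the offset.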


The structure passed as waveHeader should be compatible with the WAVEFORMATEX structure available under SAPI.

The struct must provide functionality equivalent to:

internal struct WaveFormat
{
    public Int16 FormatTag;
    public Int16 Channels;
    public int SamplesPerSec;
    public int AvgBytesPerSec;
    public Int16 BlockAlign;
    public Int16 BitsPerSample;
    public Int16 Size;
}
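
One way to read the structure that waveHeader points to is to marshal it into a managed struct. The following is a minimal sketch, not the documented approach. It assumes a Pack = 2 sequential layout so that the managed struct is 18 bytes, the size of the native WAVEFORMATEX, and WaveFormatReader is a hypothetical helper name. The struct is declared public here only so that the sketch is self-contained.

```csharp
using System;
using System.Runtime.InteropServices;

// Assumption: Pack = 2 keeps the marshaled size at 18 bytes, the size of
// the native WAVEFORMATEX, so no bytes past the header are read.
[StructLayout(LayoutKind.Sequential, Pack = 2)]
public struct WaveFormat
{
    public Int16 FormatTag;
    public Int16 Channels;
    public int SamplesPerSec;
    public int AvgBytesPerSec;
    public Int16 BlockAlign;
    public Int16 BitsPerSample;
    public Int16 Size;
}

// Hypothetical helper for illustration only.
public static class WaveFormatReader
{
    // Copies the unmanaged header into a managed WaveFormat value.
    public static WaveFormat FromPointer(IntPtr waveHeader)
    {
        if (waveHeader == IntPtr.Zero)
        {
            throw new ArgumentNullException(nameof(waveHeader));
        }
        return Marshal.PtrToStructure<WaveFormat>(waveHeader);
    }
}
```

A Speak implementation could call WaveFormatReader.FromPointer(waveHeader) to inspect values such as SamplesPerSec before producing audio.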

Notes to Implementers

Custom speech synthesizer implementations using TtsEngineSsml and Speak(TextFragment[], IntPtr, ITtsEngineSite) work as filters or intermediaries between synthesizer applications, which are constructed using the platform infrastructure through the members of the System.Speech.Synthesis namespace, and the underlying system speech synthesis engines.

A Speak(TextFragment[], IntPtr, ITtsEngineSite) implementation:

  1. Traps or modifies aspects of the incoming TextFragment objects.

  2. Generates any necessary events using the site reference to an ITtsEngineSite instance.

  3. Generates the actual synthesized speech.

Generation of speech is most typically done by calling Speak on one of the speech rendering engines provided by the operating system.

If one of the available speech rendering engines is not used, an object inheriting from TtsEngineSsml must create its own speech rendering engine.

Access to the Speak method on one of these engines is obtained using the registry and reflection.

When you inherit from TtsEngineSsml, you must override the following members: TtsEngineSsml(String), AddLexicon(Uri, String, ITtsEngineSite), RemoveLexicon(Uri, ITtsEngineSite), GetOutputFormat(SpeakOutputFormat, IntPtr), and Speak(TextFragment[], IntPtr, ITtsEngineSite).
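
The required overrides can be sketched as an empty skeleton. This is a minimal sketch, assuming a hypothetical class name PassThroughEngine. The method bodies are placeholders rather than a working engine, and the code compiles only where the System.Speech assembly is available.

```csharp
using System;
using System.Speech.Synthesis.TtsEngine;

// Hypothetical skeleton for illustration; bodies are placeholders.
public class PassThroughEngine : TtsEngineSsml
{
    public PassThroughEngine(string registryKey) : base(registryKey)
    {
    }

    public override void AddLexicon(Uri uri, string mediaType, ITtsEngineSite site)
    {
        // Placeholder: a real engine would load the lexicon here.
    }

    public override void RemoveLexicon(Uri uri, ITtsEngineSite site)
    {
        // Placeholder: a real engine would unload the lexicon here.
    }

    public override IntPtr GetOutputFormat(SpeakOutputFormat speakOutputFormat,
                                           IntPtr targetWaveFormat)
    {
        // Placeholder: a real engine returns a pointer to its output format.
        throw new NotImplementedException();
    }

    public override void Speak(TextFragment[] fragment, IntPtr waveHeader,
                               ITtsEngineSite site)
    {
        // Placeholder: filter the fragments, raise events through site,
        // then render speech or delegate to an underlying engine.
        throw new NotImplementedException();
    }
}
```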
