TTSApp for Visual Basic (SAPI 5.4)

Microsoft Speech API 5.4

TTSApp is an example of a text-to-speech (TTS) enabled application. This sample is intended to demonstrate many of the features of SAPI 5 in a single coherent application. It is not a full-featured TTS-enabled application, although the foundations of many of the options are present.

Using TTSApp, you can hear the audio output that the TTS process produces for text entered in the main window. Alternatively, you can open a file and TTSApp will speak the contents of that file.

Each word is highlighted in the text window to indicate the current TTS processing position. Features include:

Feature             Description
SAPI5 TTSApp        The main display window of the TTSApp sample application.
Text window         TTSApp speaks the text contained in this window using TTS.
Speak               Initiates the TTS process.
Voices              Selects the voice for the audio output.
Rate                Selects the rate of speech.
Volume              Selects the volume level of the audio output stream.
Pause               Pauses the TTSApp text phrase speaking process.
Stop                Stops the TTSApp text phrase speaking process.
Format              Selects the audio format.
Audio Output        Selects the output device.
Skip                Specifies the number of sentences to skip in the phrase speaking process.
Reset               Resets TTSApp to its original configuration settings.
Show Events         Displays all TTSApp SAPI events.
IsXML               Specifies that XML tags in the text are parsed and acted on rather than spoken as literal text.
PersistXML          Retains XML changes from one speaking attempt to another.
IsFileName          Interprets the text as a file name or file path rather than as text.
FlagsAsync          Speaks the text asynchronously. Asynchronous speaking allows SAPI to process other events while speech is in progress.
PurgeBeforeSpeak    Purges speech that is still in progress so that a new speak request can start on the same object.
NLPSpeakPunc        Speaks punctuation as text rather than as grammatical entities.
Mouth Position      Displays mouth shapes for phrase elements as they are spoken.
Open Text File      Opens a text file for display in the text box.
Speak Wave File     Opens a wav file to speak.
Save To Wave File   Saves the spoken content to a wave file.

  • SAPI5 TTSApp main window
    Use the main TTSApp window to select the configuration settings that affect the TTS process. The elements of the window are listed above and described individually below.

  • Text window
    TTSApp speaks the text content of this window. All text entered in this window is processed and spoken by a TTSApp voice. By default, the text content of this window is, "Enter the text you wish spoken here."

  • Speak
    Click Speak to initiate the text-to-speech process.

  • Voice
    Select a voice using the drop-down list. TTSApp uses the selected voice when speaking a wav file or the contents of the text window.

  • Rate
    Move the slide control to the right to increase the speech rate, and to the left to decrease the speech rate. The Rate level determines the number of text units spoken per minute.

  • Volume
    Move the slide control to the right to increase the volume level, and to the left to decrease the volume level.

  • Pause
    Click Pause to interrupt the TTS process.

  • Stop
    Click Stop to stop the TTS process.

  • Format
    Use the drop-down list to select one of the following format rates.

Selectable format rates
Sample rate   Formats
8kHz          8 Bit Mono, 8 Bit Stereo, 16 Bit Mono, 16 Bit Stereo
11kHz         8 Bit Mono, 8 Bit Stereo, 16 Bit Mono, 16 Bit Stereo
12kHz         8 Bit Mono, 8 Bit Stereo, 16 Bit Mono, 16 Bit Stereo
16kHz         8 Bit Mono, 8 Bit Stereo, 16 Bit Mono, 16 Bit Stereo
22kHz         8 Bit Mono, 8 Bit Stereo, 16 Bit Mono, 16 Bit Stereo
24kHz         8 Bit Mono, 8 Bit Stereo, 16 Bit Mono, 16 Bit Stereo
32kHz         8 Bit Mono, 8 Bit Stereo, 16 Bit Mono, 16 Bit Stereo
44kHz         8 Bit Mono, 8 Bit Stereo, 16 Bit Mono, 16 Bit Stereo
48kHz         8 Bit Mono, 8 Bit Stereo, 16 Bit Mono, 16 Bit Stereo
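
The 36 selectable formats are every combination of nine sample rates, two bit depths, and two channel layouts. As a rough illustration, the Python sketch below enumerates those combinations and pairs each with a numeric value. The names and values follow the SAPI SpeechAudioFormatType enumeration (SAFT8kHz8BitMono = 4 through SAFT48kHz16BitStereo = 39). The helper name build_format_table is hypothetical, and whether TTSApp fills its drop-down list this way is an assumption.

```python
from itertools import product

# Sample rates, bit depths, and channel layouts from the format table above.
SAMPLE_RATES_KHZ = [8, 11, 12, 16, 22, 24, 32, 44, 48]
BIT_DEPTHS = [8, 16]
CHANNELS = ["Mono", "Stereo"]

# SAFT8kHz8BitMono is 4 in the SAPI SpeechAudioFormatType enumeration;
# the remaining PCM formats follow consecutively in the same order.
FIRST_FORMAT_VALUE = 4


def build_format_table():
    """Hypothetical helper: map each SAFT name to its enumeration value."""
    table = {}
    combos = product(SAMPLE_RATES_KHZ, BIT_DEPTHS, CHANNELS)
    for offset, (khz, bits, channels) in enumerate(combos):
        name = f"SAFT{khz}kHz{bits}Bit{channels}"
        table[name] = FIRST_FORMAT_VALUE + offset
    return table


if __name__ == "__main__":
    formats = build_format_table()
    print(len(formats))                   # 36 selectable formats
    print(formats["SAFT22kHz16BitMono"])  # 22
```

Building the list from the three dimensions keeps the names and values in step, which is easier to check than a hand-typed list of 36 entries.
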
  • Audio Output
    Use the drop-down list to select the output device. In most cases, only one device is available, and it represents the computer's sound card.

  • Skip
    Use the spin box to select the number of sentences to skip. Skip functions only while text is being spoken.

  • Reset
    Click Reset to reset TTSApp to its original configuration state.

  • Show Events
    Select Show Events to display SAPI-related events in the event display window as the input text is processed by TTSApp.

  • IsXML
    Select IsXML to have TTSApp parse the XML tags in the text and act on them rather than speak them. When this option is cleared, the tags are read aloud as literal text.

    For example, if the IsXML option is selected, a SILENCE tag pauses speech for the number of milliseconds that the tag specifies.

    IsXML selected?   XML tag                    Result
    Yes               <SILENCE MSEC = "3000"/>   The application inserts 3000 milliseconds of silence.
    No                <SILENCE MSEC = "3000"/>   The application speaks the phrase, "less than silence msec equals quote three thousand quote slash greater than."

  • PersistXML
    Select PersistXML to retain XML changes from one speaking attempt to another. By default, this option is not selected. This means that XML changes are not retained and each speaking attempt begins with the default values for the engine. To see the difference, insert the following line in the text box: "Enter text <rate speed = "7"/> you wish spoken here" and select the PersistXML and IsXML boxes. The first speaking attempt behaves as expected: the last part of the sentence is read more quickly than the first part. The second speaking attempt, however, begins at the rate at which the previous attempt ended. In addition, the sentence gets progressively faster each time the XML rate tag is encountered. Clearing the box after the second speaking attempt does not revert the rate to the default, because the engine has already been changed for that session.

  • IsFileName
    Select IsFileName to interpret the text as a file name or file path rather than as text. For example, in the case of a standard SAPI install, select IsFileName and paste the following line into the text box: C:\Program Files\Microsoft Speech SDK 5.4\Samples\CPP\Engines\TTS\MkVoice\enter.wav. Click Speak and the contents of the wav file are played. In this example, the wav file speaks "enter". If IsFileName is cleared, the application speaks the contents of the text box as "c colon backslash program files..."

  • FlagsAsync
    Select FlagsAsync to speak the text asynchronously. Asynchronous speaking allows SAPI to process other events while speech is in progress. In contrast, synchronous speaking does not. For example, with FlagsAsync selected, each word is highlighted as it is being spoken. If FlagsAsync is not selected, the text is still spoken; however, the words are not highlighted until all of the text has been spoken. At that time, each word is highlighted in turn. Highlighting may occur quickly because the events were queued but not processed until the speech had finished.

  • PurgeBeforeSpeak
    Select PurgeBeforeSpeak to interrupt the speech attempt. With FlagsAsync selected, PurgeBeforeSpeak stops and discards the speech in progress and allows a new speak request to be queued. For instance, select both PurgeBeforeSpeak and FlagsAsync and speak the text. Before the sentence is complete, click Speak again. The voice stops and the sentence is restarted from the beginning. The previous speech has been purged and a new one started.

  • NLPSpeakPunc
    Select NLPSpeakPunc to speak punctuation as text rather than as grammatical entities. Paste the following sentence into the text box: I like coffee! With NLPSpeakPunc selected, speak the sentence. Rather than ending after the word coffee, the voice reads the exclamation point as the phrase "exclamation point."

  • Mouth Position
    The mouth position displays the various mouth shapes and positions as TTSApp processes the input text stream.

  • Open Text File
    From the File menu, select Open Text File to open a text file and display it in the text box rather than typing or pasting the content manually. XML files may also be opened and displayed. Other file types can be opened, but since the text box supports only plain text, the contents may not display or speak predictably.

  • Speak Wave File
    From the File menu, select Speak Wave File to speak the contents of a wav file. Use the standard file dialog box to select the wav file. Once chosen, the file speaks automatically.

  • Save To Wave File
    From the File menu, select Save To Wave File to save the spoken output to a wave file. Use the standard file dialog box to select the file. Once the file is chosen, the contents of the text box are spoken automatically and the file is saved. The speech is sent directly to the file, so no audible speech is heard. The file may be played back using Speak Wave File.
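
The six check boxes described above (IsXML, PersistXML, IsFileName, FlagsAsync, PurgeBeforeSpeak, and NLPSpeakPunc) share their names with members of the SAPI SpeechVoiceSpeakFlags enumeration, which are bit flags that can be combined into a single value for a speak call. The Python sketch below shows one way such a combination could be computed. The numeric values are those of SpeechVoiceSpeakFlags. The class name, the mapping, and the helper flags_from_checkboxes are hypothetical, and the assumption that TTSApp combines its check boxes in exactly this way is not confirmed by this page.

```python
from enum import IntFlag


class SpeakFlags(IntFlag):
    """Values mirror the SAPI SpeechVoiceSpeakFlags enumeration."""
    DEFAULT = 0
    FLAGS_ASYNC = 1
    PURGE_BEFORE_SPEAK = 2
    IS_FILENAME = 4
    IS_XML = 8
    IS_NOT_XML = 16  # part of the enumeration, but not a TTSApp check box
    PERSIST_XML = 32
    NLP_SPEAK_PUNC = 64


# Hypothetical mapping from TTSApp check-box captions to flag values.
CHECKBOX_FLAGS = {
    "IsXML": SpeakFlags.IS_XML,
    "PersistXML": SpeakFlags.PERSIST_XML,
    "IsFileName": SpeakFlags.IS_FILENAME,
    "FlagsAsync": SpeakFlags.FLAGS_ASYNC,
    "PurgeBeforeSpeak": SpeakFlags.PURGE_BEFORE_SPEAK,
    "NLPSpeakPunc": SpeakFlags.NLP_SPEAK_PUNC,
}


def flags_from_checkboxes(checked):
    """Combine the flags for every selected check box with bitwise OR."""
    flags = SpeakFlags.DEFAULT
    for caption in checked:
        flags |= CHECKBOX_FLAGS[caption]
    return flags


if __name__ == "__main__":
    # FlagsAsync (1) combined with IsXML (8)
    print(int(flags_from_checkboxes(["FlagsAsync", "IsXML"])))  # prints 9
```

Because each flag occupies its own bit, any subset of the check boxes maps to a distinct integer, and an individual option can be tested later with a bitwise AND.
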