TTSApp for Visual Basic (SAPI 5.4)
Microsoft Speech API 5.4
TTSApp is an example of a text-to-speech (TTS) enabled application. This sample application is intended to demonstrate many of the features of SAPI 5 in a single coherent application. It is not a full-featured TTS-enabled application, although the foundations of many of the options are present.
Using TTSApp you can hear the resulting audio output from the TTS process for text entered in the main window. Alternatively, you can open a file and TTSApp will speak the contents of that file.
Each word is highlighted in the text window to indicate the current TTS processing position. Features include:
|SAPI5 TTSApp||The main display window of the TTSApp sample application.|
|Text window||TTSApp speaks the text contained in this window using TTS.|
|Speak||Initiates the TTS process.|
|Voices||Selects the voice for the audio output.|
|Rate||Selects the rate of speech.|
|Volume||Selects the volume level of the audio output stream.|
|Pause||Pauses the TTSApp text phrase speaking process.|
|Stop||Stops the TTSApp text phrase speaking process.|
|Format||Selects the audio format.|
|Audio Output||Selects the output device.|
|Skip||Specifies the number of sentences to skip in the phrase speaking process.|
|Reset||Resets TTSApp to its original configuration setting.|
|Show Events||Displays all TTSApp SAPI events.|
|IsXML||Specifies that XML tags in the text are parsed and interpreted as markup rather than spoken as literal text.|
|PersistXML||Retains XML changes from one speaking attempt to another.|
|IsFileName||Interprets the text as a file name or file path rather than as text.|
|FlagsAsync||Speaks the text asynchronously. Asynchronous speaking allows SAPI to process other events while the text is being spoken.|
|PurgeBeforeSpeak||Purges the current and pending speech requests before speaking, so that a new request can start immediately on the same voice object.|
|NLPSpeakPunc||Speaks punctuation as text rather than as grammatical entities.|
|Mouth Position||Displays mouth shapes for phrase elements as they are spoken.|
|Open Text File||Opens a text file for display in the text box.|
|Speak Wave File||Opens a wav file to speak.|
|Save To Wave File||Saves the spoken content to a wave file.|
SAPI5 TTSApp main window.
Use the main TTSApp window to select the configuration settings that affect the TTS process. The elements of TTSApp are listed above and described below.
TTSApp speaks the text contained in the text window. All text entered in this window is processed and spoken by a TTSApp voice. By default, the window contains the text, "Enter the text you wish spoken here."
Click Speak to initiate the text-to-speech process.
Select a voice using the Voices drop-down list. TTSApp uses the selected voice when speaking a wav file or the contents of the text window.
Move the Rate slide control to the right to increase the speech rate, and to the left to decrease the speech rate. The Rate level determines the number of text units spoken per minute.
Move the Volume slide control to the right to increase the volume level, and to the left to decrease the volume level.
Click Pause to interrupt the TTS process.
Click Stop to stop the TTS process.
Use the Format drop-down list to select the audio format of the output.
Use the Audio Output drop-down list to select the output device. In most cases only one device is available, and it represents the computer's sound card.
Use the Skip spin box to select the number of sentences to skip. Skip functions only while text is being spoken.
Click Reset to reset TTSApp to its original configuration state.
Select Show Events to display SAPI related events in the event display window as the input text is processed by TTSApp.
Select IsXML to have TTSApp parse and interpret XML tags in the input text rather than speak them as literal text.
For example, with the IsXML option selected, a SILENCE tag pauses the speech for the number of milliseconds specified in its MSEC attribute.
|IsXML selected?||XML tag||Result|
|Yes||<SILENCE MSEC = "3000"/>||The application inserts 3000 milliseconds of silence.|
|No||<SILENCE MSEC = "3000"/>||The application speaks the phrase, "less than silence msec equals quote three thousand quote slash greater than."|
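The SILENCE example can be reproduced with any text that embeds the tag. The following is a minimal Python sketch, not part of the Visual Basic sample, that builds such text with the standard library only. The helper names `silence_tag` and `with_pause` are hypothetical and exist only for illustration. The resulting string is what would be pasted into the text window with IsXML selected.

```python
from xml.sax.saxutils import quoteattr


def silence_tag(msec):
    """Return a SILENCE tag that requests a pause of `msec` milliseconds.

    Hypothetical helper for illustration; it only formats text.
    """
    if msec < 0:
        raise ValueError("msec must not be negative")
    # quoteattr adds the surrounding double quotes for the attribute value.
    return "<silence msec=%s/>" % quoteattr(str(msec))


def with_pause(before, after, msec):
    """Join two text fragments with a SILENCE tag between them."""
    return "%s %s %s" % (before, silence_tag(msec), after)


if __name__ == "__main__":
    # With IsXML selected, the tag is interpreted as a pause.
    # With IsXML cleared, the same characters are read aloud literally.
    print(with_pause("Enter the text", "you wish spoken here.", 3000))
```

Building the tag in one place keeps the attribute quoting consistent, which matters because a malformed tag is likely to be spoken or rejected rather than interpreted.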
Select PersistXML to retain XML changes from one speaking attempt to the next. By default, this option is not selected, so XML changes are not retained and each speaking attempt begins with the default values for the engine. To see the effect, insert the following line in the text box: "Enter text <rate speed = "7"/> you wish spoken here" and select both the PersistXML and IsXML boxes. The first speaking attempt behaves as expected: the last part of the sentence is read more quickly than the first part. The difference is that the second speaking attempt begins at the rate at which the previous attempt ended. In addition, the sentence gets progressively faster each time the XML rate tag is encountered. Clearing the box after the second speaking attempt does not revert the rate to the default, because the engine has already been changed for that session.
Select IsFileName to interpret the text as a file name or file path rather than as text. For example, in the case of a standard SAPI install, select IsFileName and paste the following line into the text box: C:\Program Files\Microsoft Speech SDK 5.4\Samples\CPP\Engines\TTS\MkVoice\enter.wav. Click Speak and the content of the wav file is played. In this example, the wav file speaks "enter". If IsFileName is cleared, the application speaks the contents of the edit box as "c colon backslash program files..."
Select FlagsAsync to speak the text asynchronously. Asynchronous speaking allows SAPI to process other events while the text is being spoken; synchronous speaking does not. For example, with FlagsAsync selected, each word is highlighted as it is spoken. If FlagsAsync is not selected, the text is still spoken, but the words are not highlighted until all of the text has been spoken. At that point, each word is highlighted in turn. The highlighting may run quickly because the events were queued but not processed until the speech had finished.
Select PurgeBeforeSpeak to interrupt the current speech attempt. With FlagsAsync selected, PurgeBeforeSpeak purges the current and pending speech requests and allows a new request to be queued. For instance, select both PurgeBeforeSpeak and FlagsAsync and speak the text. Before the sentence is complete, click Speak again. The voice stops and the sentence restarts from the beginning, because the previous speech request was purged and a new one was queued.
Select NLPSpeakPunc to speak punctuation as text rather than as grammatical entities. Paste the following sentence into the text box: I like coffee! With NLPSpeakPunc selected, speak the sentence. Rather than ending after the word coffee, the voice reads the exclamation point as the phrase "exclamation point."
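The check boxes described above (IsXML, PersistXML, IsFileName, FlagsAsync, PurgeBeforeSpeak, and NLPSpeakPunc) correspond to members of the SAPI SpeechVoiceSpeakFlags enumeration, which are combined into a single bitmask for a speak request. The following Python sketch, which is not part of the Visual Basic sample, shows that combination. The numeric values mirror the enumeration. The `CHECKBOX_FLAGS` mapping and the `combine_flags` helper are hypothetical names used only for illustration, and how TTSApp itself wires its check boxes is an assumption.

```python
from enum import IntFlag


class SpeakFlags(IntFlag):
    """Subset of SpeechVoiceSpeakFlags; values mirror the SAPI enumeration."""
    SVSFDefault = 0
    SVSFlagsAsync = 1
    SVSFPurgeBeforeSpeak = 2
    SVSFIsFilename = 4
    SVSFIsXML = 8
    SVSFPersistXML = 32
    SVSFNLPSpeakPunc = 64


# Hypothetical mapping from TTSApp check box labels to flag members.
CHECKBOX_FLAGS = {
    "FlagsAsync": SpeakFlags.SVSFlagsAsync,
    "PurgeBeforeSpeak": SpeakFlags.SVSFPurgeBeforeSpeak,
    "IsFileName": SpeakFlags.SVSFIsFilename,
    "IsXML": SpeakFlags.SVSFIsXML,
    "PersistXML": SpeakFlags.SVSFPersistXML,
    "NLPSpeakPunc": SpeakFlags.SVSFNLPSpeakPunc,
}


def combine_flags(checked):
    """Return the bitmask for the check box labels listed in `checked`."""
    flags = SpeakFlags.SVSFDefault
    for label in checked:
        flags |= CHECKBOX_FLAGS[label]  # bitwise OR adds each selected flag
    return flags


if __name__ == "__main__":
    # The PurgeBeforeSpeak walkthrough selects these two boxes together.
    print(int(combine_flags(["FlagsAsync", "PurgeBeforeSpeak"])))  # 3
```

Because each flag occupies its own bit, any combination of check boxes maps to a unique integer, and clearing every box yields the default value of 0.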
The Mouth Position display shows the various mouth shapes and positions as TTSApp processes the input text stream.
Open Text File
From the File menu, select Open Text File to open a text file and display it in the text box rather than typing or pasting the content manually. XML files may also be opened and displayed. Other file types can be opened, but because the text box supports only plain text, the contents may not display or speak predictably.
Speak Wave File
From the File menu, select Speak Wave File to speak the contents of a wav file. Use the standard file dialog box to select the wav file. Once a file is chosen, it is spoken automatically.
Save To Wave File
From the File menu, select Save To Wave File to save the spoken content to a wave file. Use the standard file dialog box to select the file. Once a file is chosen, the contents of the text box are spoken automatically and the file is saved. The spoken output is sent directly to the file, so no audible speech is heard. The file may be played back using Speak Wave File.