Web Speech API: Speech Synthesis

The Web Speech API is a JavaScript API made up of two parts: speech recognition and speech synthesis (or text to speech). At this time Microsoft Edge supports only speech synthesis. Speech synthesis involves the conversion of text to speech that a user hears through their speakers.
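
Because support differs between browsers, it can help to check for the API before using it. The following is a minimal sketch of such a check; hasSpeechSynthesis is a hypothetical helper name and is not part of the Web Speech API.

```javascript
// Hypothetical helper: returns true when the given global object exposes
// both speechSynthesis and the SpeechSynthesisUtterance constructor.
function hasSpeechSynthesis(globalObject) {
  return globalObject !== null &&
    typeof globalObject === "object" &&
    "speechSynthesis" in globalObject &&
    typeof globalObject.SpeechSynthesisUtterance === "function";
}

// In a browser, pass window. Outside a browser this block is skipped.
if (typeof window !== "undefined") {
  console.log("Speech synthesis supported: " + hasSpeechSynthesis(window));
}
```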

The web speech synthesis API has four objects that control the text to speech: SpeechSynthesis, SpeechSynthesisUtterance, SpeechSynthesisEvent, and SpeechSynthesisVoice.

Take a look at the demo of a speech synthesizer on Test Drive.

Making your website talk

You can make your website talk with only one line of code:

window.speechSynthesis.speak(new SpeechSynthesisUtterance("Hello world!"));

In this example, you create a new instance of the SpeechSynthesisUtterance object and pass it the text to synthesize: "Hello world!" The text is then spoken using the system’s default voice.

The SpeechSynthesis object controls the text to speech output and holds a queue of utterances (the text to be spoken). Calling speechSynthesis.speak adds an utterance to the end of that queue. If the queue is otherwise empty, the utterance begins speaking right away; if not, it is spoken after the utterances ahead of it have finished.
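
The queue behavior can be sketched as follows. This is an illustration rather than a definitive implementation: queueUtterances is a hypothetical helper, and it takes the synthesizer and the utterance constructor as parameters (assumed to behave like window.speechSynthesis and SpeechSynthesisUtterance) so the logic can also be exercised outside a browser.

```javascript
// Hypothetical helper: queues one utterance per string, in order.
// synth is assumed to behave like window.speechSynthesis and
// Utterance like the SpeechSynthesisUtterance constructor.
function queueUtterances(texts, synth, Utterance) {
  return texts.map(function (text, index) {
    var utterance = new Utterance(text);

    // The end event fires when this utterance has finished speaking.
    utterance.onend = function () {
      console.log("Finished utterance " + (index + 1) + " of " + texts.length);
    };

    // speak() appends to the queue; it does not interrupt the current utterance.
    synth.speak(utterance);
    return utterance;
  });
}

// Browser usage; skipped where the API is unavailable.
if (typeof window !== "undefined" && "speechSynthesis" in window) {
  queueUtterances(
    ["This is spoken first.", "This is spoken second."],
    window.speechSynthesis,
    window.SpeechSynthesisUtterance
  );
}
```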

The text to be synthesized is set through the utterance's text property (SpeechSynthesisUtterance.text). It can be plain text, like "Hello world!", or a well-formed Speech Synthesis Markup Language (SSML) document.

To accept text from an input, use the following code to create an <input> and a <button> that calls the speakText() function:

Text: <input type="text" id="textInputBox" />
<button onclick="speakText()">Speak</button>

function speakText() {
    var textInput = document.getElementById("textInputBox").value;
    window.speechSynthesis.speak(new SpeechSynthesisUtterance(textInput));
}

You can also call pause, resume, and cancel to control the playback of the utterances. Canceling stops the current utterance and clears the queue, while pausing and then resuming continues speaking from where the utterance left off. The buttons below resume, pause, and cancel playback.

<button onclick="speechSynthesis.resume()">Resume</button>
<button onclick="speechSynthesis.pause()">Pause</button>
<button onclick="speechSynthesis.cancel()">Cancel</button>
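
As an alternative to separate buttons, a single control could switch between pausing and resuming by reading the paused and speaking properties of the SpeechSynthesis object. The sketch below assumes a hypothetical helper named togglePlayback; the synth parameter is assumed to behave like window.speechSynthesis.

```javascript
// Hypothetical helper: pauses when speech is playing, resumes when it is
// paused, and does nothing when the synthesizer is idle.
function togglePlayback(synth) {
  // Check paused first, because speaking can remain true while paused.
  if (synth.paused) {
    synth.resume();
    return "resumed";
  }
  if (synth.speaking) {
    synth.pause();
    return "paused";
  }
  return "idle";
}
```

In a page, this could be wired to one button, for example with onclick="togglePlayback(speechSynthesis)".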

Changing the sound of the utterance

The SpeechSynthesisUtterance object exposes properties that control how the speech sounds, such as voice, volume, rate, and pitch.

var myUtterance = new SpeechSynthesisUtterance();

myUtterance.text = "Hello world!";
myUtterance.pitch = 1;    // accepted values: 0-2 inclusive, default value: 1
myUtterance.rate = 1.5;   // accepted values: 0.1-10 inclusive, default value: 1
myUtterance.volume = 0.5; // accepted values: 0-1 inclusive, default value: 1

speechSynthesis.speak(myUtterance);

Note: In Microsoft Edge, the value for pitch will always be 1.0.

Alternatively, you can allow users to input values for the volume, rate, and pitch properties using the following lines of code:

Rate: <input type="number" id="numRate" />

// get the value from id="numRate"; input values are strings, so convert to a number
var nRate = Number(document.getElementById("numRate").value);

// set the rate of the utterance to nRate
myUtterance.rate = nRate;
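
User input can be empty or fall outside the accepted ranges listed earlier, so you may want to validate it before assigning it to the utterance. The following is one possible sketch; clampSetting is a hypothetical helper name, and the fallback argument is an assumption about what to use when the input is not a number.

```javascript
// Hypothetical helper: converts raw input to a number and keeps it inside
// the range [min, max]. Returns fallback when the input is not numeric.
function clampSetting(rawValue, min, max, fallback) {
  var value = parseFloat(rawValue);
  if (isNaN(value)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, value));
}

// Browser usage with the numRate input; skipped when the element is absent.
if (typeof document !== "undefined" && document.getElementById("numRate")) {
  var safeRate = clampSetting(document.getElementById("numRate").value, 0.1, 10, 1);
  console.log("Rate to use: " + safeRate);
}
```

The same helper could be used for pitch (0 to 2) and volume (0 to 1) by passing the matching limits.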

Voices

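The Web Speech API allows you to change the speaking voice. To check which voices are available in your browser, run the following code in your browser's console. In some browsers the voice list loads asynchronously, so getVoices can return an empty array until the voiceschanged event has fired.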

speechSynthesis.getVoices().forEach(function (voice) {
   console.log(voice.name);
});
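
Besides name, each SpeechSynthesisVoice also has a lang property (a language tag such as "en-US") and a default flag. The sketch below shows one way to choose a voice for a given language; pickVoice is a hypothetical helper, and preferring the default-flagged voice is an assumption about what most pages would want.

```javascript
// Hypothetical helper: returns a voice whose language tag starts with
// langPrefix (case-insensitive), preferring one flagged as default.
// Returns null when no voice matches.
function pickVoice(voices, langPrefix) {
  var prefix = langPrefix.toLowerCase();
  var matches = voices.filter(function (voice) {
    return voice.lang.toLowerCase().indexOf(prefix) === 0;
  });

  for (var i = 0; i < matches.length; i++) {
    if (matches[i].default) {
      return matches[i];
    }
  }
  return matches.length > 0 ? matches[0] : null;
}

// Browser usage; skipped where the API is unavailable.
if (typeof speechSynthesis !== "undefined") {
  var englishVoice = pickVoice(speechSynthesis.getVoices(), "en");
  console.log(englishVoice ? englishVoice.name : "No matching voice found");
}
```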

The example below populates the available voices into a drop-down list and changes the speaking voice to the one selected in the drop-down list. Create a <select> element with id="ddlVoices" and use the following code to create the drop-down.

function voicesChangedHandler() {
  // load select element "ddlVoices" with available voice options
  var select = document.getElementById("ddlVoices");
  var voices = speechSynthesis.getVoices();

  // remove existing options so repeated calls do not add duplicates
  select.innerHTML = "";

  for (var i = 0; i < voices.length; i++) {
    // create an option element with voice name
    var option = document.createElement("option");
    option.textContent = voices[i].name;
    option.value = i;

    // add voice as option to select element
    select.appendChild(option);
  }
}

// populate drop-down list for the first time
voicesChangedHandler();
// update drop-down list whenever the voices change 
speechSynthesis.onvoiceschanged = voicesChangedHandler;

// set the utterance's voice to the one selected in the drop-down list
var selectedIndex = parseInt(document.getElementById("ddlVoices").value, 10);
myUtterance.voice = speechSynthesis.getVoices()[selectedIndex];

See the full example of a speech synthesizer on Test Drive.

API Reference

Web Speech API

Demos

Speech Synthesis API Demo

Introducing the Speech Synthesis API in Microsoft Edge

Specification

Web Speech API