Lesson 3 -- Using Menus Instead of Forms

Article
03/26/2013

As we have noted, there are two types of dialogs: <form> elements and <menu> elements. The VoiceXML document we have developed in Lessons 1 and 2 uses only <form> elements as dialogs. We will continue to use this type of dialog throughout this tutorial because <form> elements are more versatile than <menu> elements.

You should still know about menus, the other type of dialog, so Lesson 3 shows how we could replace the main form with a <menu> element that has the same functionality.

This lesson is a detour in the tutorial, covering an important element that we will not use.

Menus

Menus are discussed in detail in section 2.2 of the W3C's VoiceXML 2.0 specification, found at http://www.w3.org/TR/2004/REC-voicexml20-20040316/.

Here is a menu that could replace the main form we have been using up to now:

   <menu id="main">
       <prompt>
            Welcome to Contoso Travel.
            Say new reservation or press 1.
            Say change reservation or press 2.
            Say restaurant recommendation or press 3.
       </prompt>
       <choice dtmf="1" next="#new_reservation">
          new reservation
       </choice>
       <choice dtmf="2" next="change-reservation.vxml">
          change reservation
       </choice>
       <choice dtmf="3" next="restaurant.vxml" accept="approximate">
          restaurant recommendation
       </choice>
   </menu>

This <menu> dialog is functionally equivalent to the <form> dialog that we used in Lesson 2, and it is much simpler. In fact, like menus in general, it is too simple. Forms can do the same thing and, although they are a bit more complicated, they are much more flexible. One simple but important difference is that the grammars used in forms can handle a wider variety of caller inputs than a menu can. In Lesson 4, we will modify the main_selection grammar so that it can handle inexact inputs. In the menu above, the caller must say "new reservation" exactly, no more and no less. With the modified grammar of Lesson 4, the caller could say "a new reservation," "I'd like a new reservation," and a variety of other things and still get a match.

Looking at the code for the <menu> element shown above, you can see that it uses a <prompt> element, as before, but there is no <grammar> element (nor is there a reference to an external grammar) and there is no <filled> element.

Within the <menu> element, we have <choice> elements.

The <choice> element has a number of attributes that are described in https://msdn.microsoft.com/en-us/library/ff929017.aspx. In particular, the next attribute names the document or dialog to which the menu choice should transition. This takes the place of a <goto> element in the <filled> element of a form.

The dtmf attribute assigns the DTMF key to press for this choice.

The content of the <choice> element is the word or phrase that must be matched for that particular choice. This is the menu's alternative to an item in a form's grammar.
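For comparison, here is a rough sketch of how a form could handle two of these choices with a <grammar> element and a <filled> element. It is not the form from Lesson 2: the field name selection and the inline grammar are assumptions made for illustration, DTMF handling is left out, and the sketch assumes that a grammar without semantic interpretation tags sets the field to the recognized text.

```xml
<form id="main">
    <!-- "selection" is a hypothetical field name used only in this sketch -->
    <field name="selection">
        <prompt>
            Say new reservation or change reservation.
        </prompt>
        <!-- Each <item> plays the role of a <choice> element's content -->
        <grammar mode="voice" xml:lang="en-US" version="1.0" root="main_selection">
            <rule id="main_selection">
                <one-of>
                    <item>new reservation</item>
                    <item>change reservation</item>
                </one-of>
            </rule>
        </grammar>
        <!-- Each <goto> plays the role of a <choice> element's next attribute -->
        <filled>
            <!-- Assumes the field holds the recognized text -->
            <if cond="selection == 'new reservation'">
                <goto next="#new_reservation"/>
            <else/>
                <goto next="change-reservation.vxml"/>
            </if>
        </filled>
    </field>
</form>
```

The menu expresses the same routing in one <choice> element per option, which is where its brevity comes from.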

Another attribute of the <choice> element that we use is the accept attribute. The default for the accept attribute is exact, which requires that the caller speak the exact words that are the <choice> element's content. When accept="approximate", the caller may speak a subphrase of the words that are the <choice> element's content. In the menu above, the first two choices must be exact because the menu must distinguish between "new reservation" and "change reservation." We use accept="approximate" in the third choice so that the caller can say "restaurant recommendation" or simply "restaurant" or "recommendation."
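The VoiceXML 2.0 specification also defines an accept attribute on the <menu> element itself, which sets the default for all of its choices; an individual <choice> can then override it. As a sketch only (we do not use this form in the Contoso Travel application), the following menu should match the same phrases as the first menu in this lesson, but it states the default once and overrides it where exact matching is needed:

```xml
<!-- Sketch: approximate matching as the menu-wide default -->
<menu id="main" accept="approximate">
    <prompt>
        Welcome to Contoso Travel.
        Say new reservation or press 1.
        Say change reservation or press 2.
        Say restaurant recommendation or press 3.
    </prompt>
    <!-- These two choices share the word "reservation", so they stay exact -->
    <choice dtmf="1" next="#new_reservation" accept="exact">
        new reservation
    </choice>
    <choice dtmf="2" next="change-reservation.vxml" accept="exact">
        change reservation
    </choice>
    <!-- Inherits accept="approximate" from the menu -->
    <choice dtmf="3" next="restaurant.vxml">
        restaurant recommendation
    </choice>
</menu>
```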

VoiceXML has a property named inputmodes that determines whether the <choice> element accepts voice, DTMF, or both. The inputmodes property can have one of three values:

dtmf (accepts DTMF only).
dtmf voice (accepts both voice and DTMF).
voice (accepts voice only).

Our menu above accepts both voice and DTMF because dtmf voice is the default value of inputmodes and we have not specified a different value for this property.

Important

The inputmodes property also determines whether the <form> element accepts voice, DTMF, or both.
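As an illustration of overriding the default, the sketch below restricts a menu to DTMF input by placing a <property> element inside it. The prompt wording is our own and is changed only so that it no longer invites spoken input.

```xml
<menu id="main" dtmf="true">
    <!-- Accept key presses only; spoken input is not recognized in this menu -->
    <property name="inputmodes" value="dtmf"/>
    <prompt>
        Welcome to Contoso Travel.
        For a new reservation, press 1.
        To change a reservation, press 2.
        For a restaurant recommendation, press 3.
    </prompt>
    <choice next="#new_reservation">
        new reservation
    </choice>
    <choice next="change-reservation.vxml">
        change reservation
    </choice>
    <choice next="restaurant.vxml">
        restaurant recommendation
    </choice>
</menu>
```

Because the property is set inside the <menu> element, it applies to this dialog only; other dialogs keep the default of dtmf voice.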

Automatic DTMF

The <menu> element has an optional attribute named dtmf. When you set the dtmf attribute to true, sequential DTMF digits are automatically assigned to each of the first nine choices that have not specified their DTMF values otherwise. The first <choice> is assigned dtmf="1", the second is assigned dtmf="2", and so on.

We can revise the main menu above to use this feature:

   <menu id="main" dtmf="true">
       <prompt>
            Welcome to Contoso Travel.
            Say new reservation or press 1.
            Say change reservation or press 2.
            Say restaurant recommendation or press 3.
       </prompt>
       <choice next="#new_reservation">
          new reservation
       </choice>
       <choice next="change-reservation.vxml">
          change reservation
       </choice>
       <choice next="restaurant.vxml" accept="approximate">
          restaurant recommendation
       </choice>
   </menu>

In this version of the menu, DTMF digits 1, 2, and 3 are automatically assigned to the choices whose content is "new reservation", "change reservation", and "restaurant recommendation", respectively.
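Automatic and explicit assignments can be combined, because only choices without their own dtmf value receive an automatic digit. The sketch below adds a fourth choice that keeps an explicit key. The operator choice and the #operator dialog are hypothetical and are not part of the Contoso Travel application.

```xml
<menu id="main" dtmf="true">
    <prompt>
        Welcome to Contoso Travel.
        Say new reservation or press 1.
        Say change reservation or press 2.
        Say restaurant recommendation or press 3.
        Say operator or press 0.
    </prompt>
    <!-- No dtmf attribute: these are assigned 1, 2, and 3 in document order -->
    <choice next="#new_reservation">
        new reservation
    </choice>
    <choice next="change-reservation.vxml">
        change reservation
    </choice>
    <choice next="restaurant.vxml" accept="approximate">
        restaurant recommendation
    </choice>
    <!-- Explicit dtmf value: kept as written (hypothetical dialog) -->
    <choice dtmf="0" next="#operator">
        operator
    </choice>
</menu>
```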

This version of the main menu, like the one earlier in this lesson, can respond to voice and DTMF in exactly the way the main form could.

Using the <enumerate> element

You will have noted that the use of a <menu> element in place of a <form> element eliminated the need to include <grammar> and <filled> elements. It did not eliminate the need for a <prompt> element, but menus do have a mechanism for simplifying prompts.

Here is the prompt we are using in the menu:

       <prompt>
            Welcome to Contoso Travel.
            Say new reservation or press 1.
            Say change reservation or press 2.
            Say restaurant recommendation or press 3.
       </prompt>

We can use the <enumerate> element to simplify this prompt. There are two ways to do this: we can use the <enumerate> element with or without content.

Using <enumerate> without content

We can write the prompt like this:

       <prompt>
            Welcome to Contoso Travel. Choose one of <enumerate/>
       </prompt>

The empty <enumerate/> element causes TTS to speak the content of each choice in order.

This will generate the following TTS for the prompt: "Welcome to Contoso Travel. Choose one of new reservation change reservation restaurant recommendation."

Using <enumerate/> without content gives a poor user experience in our application because the prompt never tells the caller that DTMF input is available as an alternative to voice commands. The DTMF choices still work, but the caller has no way of learning about them.

Using <enumerate> with content

An alternative way to use <enumerate> prompts the user for DTMF:

       <prompt>
            Welcome to Contoso Travel.
            <enumerate>
               for <value expr="_prompt"/>, say <value expr="_prompt"/>
                                      or press <value expr="_dtmf"/>
            </enumerate>
       </prompt>

Here _prompt and _dtmf are special VoiceXML variables that contain the <choice> element's grammar value (its content) and DTMF value, respectively.

This will generate the following TTS for the prompt: "Welcome to Contoso Travel for new reservation say new reservation or press 1 for change reservation say change reservation or press 2 for restaurant recommendation say restaurant recommendation or press 3."

This is what we want to convey.
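Putting the pieces of this lesson together, a complete menu that uses automatic DTMF assignment and <enumerate> with content might look like the sketch below. It uses only the elements and attributes introduced above; the period after the DTMF value is our own addition, intended to separate the repeated phrases.

```xml
<menu id="main" dtmf="true">
    <prompt>
        Welcome to Contoso Travel.
        <!-- The content below is repeated once for each <choice> element -->
        <enumerate>
            For <value expr="_prompt"/>, say <value expr="_prompt"/>
            or press <value expr="_dtmf"/>.
        </enumerate>
    </prompt>
    <!-- With dtmf="true", these choices are assigned 1, 2, and 3 -->
    <choice next="#new_reservation">
        new reservation
    </choice>
    <choice next="change-reservation.vxml">
        change reservation
    </choice>
    <choice next="restaurant.vxml" accept="approximate">
        restaurant recommendation
    </choice>
</menu>
```

With this arrangement, adding or reordering a <choice> element updates both the spoken prompt and the DTMF assignments without any further edits to the prompt.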

Copy the app-root.vxml code at the end of Lesson 2.

Replace the main form in app-root.vxml with this menu:

   <menu id="main" dtmf="true">
       <prompt>
            Welcome to Contoso Travel.
            Say new reservation or press 1.
            Say change reservation or press 2.
            Say restaurant recommendation or press 3.
       </prompt>
       <choice next="#new_reservation">
          new reservation
       </choice>
       <choice next="change-reservation.vxml">
          change reservation
       </choice>
       <choice next="restaurant.vxml" accept="approximate">
          restaurant recommendation
       </choice>
   </menu>

Copy the revised app-root.vxml code and paste it into the Tellme Studio Scratchpad window.
Click the Update button.
When the code is compiled (when you see "Done" in the status bar), call the number listed at the top of the Scratchpad and respond to the prompt.

You should see that you can still respond in exactly the same way as you could with the main form.

What's next?

This lesson has introduced you to the <menu> element, an important VoiceXML element, even though we will not use it in any of our Contoso Travel application's documents.

In Lesson 4, we will return to our app-root.vxml document and make some improvements. Specifically, we will alter the grammars so that they accept a wider range of spoken input and alter the prompts to improve the way the TTS sounds.
