Understanding Automatic Speech Recognition Directory Lookups

Microsoft Exchange Server 2007 will reach end of support on April 11, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.

 

Applies to: Exchange Server 2007, Exchange Server 2007 SP1, Exchange Server 2007 SP2, Exchange Server 2007 SP3

Microsoft Exchange Server 2007 Unified Messaging offers a voice user interface (VUI) that uses Automatic Speech Recognition (ASR). This is the telephone interface that callers use to navigate the menu systems and access their mailbox by using speech inputs. ASR enables callers to use speech inputs instead of dual tone multi-frequency (DTMF) inputs, also known as touchtone inputs, to navigate the UM auto attendant menus or, for UM-enabled users, to access their mailbox. This topic discusses how ASR is used in Exchange Server 2007 Unified Messaging and how grammar files are used with ASR.

Note

ASR for directory lookups and searches is currently available only in English for Outlook Voice Access users and for calls to UM auto attendants. However, support for ASR in other languages is planned for a future release.

Overview of Grammar Files

A speech grammar file contains the words and phrases that the speech engine tries to recognize while the grammar file is in use. Grammar files define things such as the commands that are available to users while they review their mail or calendar, or the names of people that the speech engine recognizes when a caller searches the directory. Speech grammar files are first generated as files that have a .grxml extension. Before they are loaded into the speech engine, they are compiled into a form that has a .cfg extension. The compiled .cfg form is loaded directly into the memory of the Microsoft Exchange Speech Engine service, so no .cfg file is created and saved to disk. Figure 1 illustrates how the grammar files are used by callers.

Figure 1   Grammar file overview

Note

  If you want to locate the .grxml file that corresponds to a .cfg file, look in the event log for events that have the IDs 1040 or 1041. The event will show which .grxml file was used to produce a particular .cfg file.

Default Grammar Files

When the Unified Messaging server role is installed, many files are copied to the server. These files include the default grammar files that ASR uses to enable the voice user interface (VUI). By default, these grammar files are installed in the \UnifiedMessaging\grammars\<language> folder. When the Unified Messaging server uses these grammar files, the Microsoft Exchange Speech Engine service loads them and compiles them into .cfg form.

The default grammar files include the following files:

  • Calendar.grxml

  • Common.grxml

  • Contacts.grxml

  • Email.grxml

  • Mainmenu.grxml

Custom Grammar Files

Several custom grammar files are created when the Unified Messaging server role is installed. They are created again when you create Unified Messaging objects in the Active Directory directory service and the Microsoft Exchange Unified Messaging service runs grammar generation at its scheduled time, one time each day. These grammar files contain the names of users and other objects in Active Directory, for example, distribution lists. Each name is accompanied by additional data, for example, an e-mail alias, which enables the name to be associated with a unique object.
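As an illustration of how a spoken name can be tied to a unique object, the following Python sketch builds a small grammar in the XML form defined by the W3C Speech Recognition Grammar Specification (SRGS). The element layout, the rule name "names", and the sample alias are assumptions made for this example. The files that Unified Messaging actually generates may be structured differently.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_name_grammar(entries, lang="en-US"):
    """Return SRGS XML text with one <item> per (spoken name, alias) pair.

    The alias is stored in a <tag> element so that a recognized name can be
    mapped back to a single directory object (an assumption for this sketch).
    """
    # Serialize SRGS elements without a namespace prefix.
    ET.register_namespace("", SRGS_NS)
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {"version": "1.0", "root": "names", f"{{{XML_NS}}}lang": lang},
    )
    rule = ET.SubElement(grammar, f"{{{SRGS_NS}}}rule", {"id": "names"})
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for spoken_name, alias in entries:
        item = ET.SubElement(one_of, f"{{{SRGS_NS}}}item")
        item.text = spoken_name
        tag = ET.SubElement(item, f"{{{SRGS_NS}}}tag")
        tag.text = alias
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    # "kweku" is a hypothetical e-mail alias.
    print(build_name_grammar([("Kweku Ako-Adjei", "kweku")]))
```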

The following grammar files are created when the Microsoft Exchange Unified Messaging service runs grammar generation at the scheduled time:

  • Gal.grxml

  • <DialPlanGUID>.grxml

  • <AddressListGUID>.grxml

  • DistributionList.grxml

    Note

    UM-enabled users may not be immediately available to callers. You must either wait for the next scheduled grammar generation to occur or manually run galgrammargenerator.exe to include the UM-enabled user's name in a grammar file.

When the Unified Messaging server is creating a speech grammar file, it will examine many directory objects to determine which names should be added to the speech grammar file. The types of objects that it will process are based on the scope of the grammar that is being created. However, for all these objects, Unified Messaging will not add the object to the grammar if the object is hidden from the Exchange 2007 address lists or the msExchHideFromAddressLists attribute is set to true for the object.

  • For the global address list (GAL) grammar file, Unified Messaging will consider the following:

    • Mail-enabled users

    • Mail-enabled contacts

  • For dial plan grammar files, Unified Messaging will consider the following:

    • UM-enabled users in the specified dial plan

  • For the distribution list grammar file, Unified Messaging will consider the following:

    • Distribution lists that are visible in address lists
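The selection rules above can be summarized in a short Python sketch. The Recipient class, its field names, and the scope labels are hypothetical and exist only for this illustration. The sketch models the rules as described in this topic, not the actual Unified Messaging implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recipient:
    name: str
    kind: str                            # "user", "contact", or "distribution_list"
    mail_enabled: bool = True
    um_dial_plan: Optional[str] = None   # set only for UM-enabled users
    hidden: bool = False                 # mirrors msExchHideFromAddressLists


def names_for_scope(recipients, scope, dial_plan=None):
    """Return the names that would be considered for a grammar of the given scope."""
    selected = []
    for r in recipients:
        # Hidden objects are never added, regardless of scope.
        if r.hidden:
            continue
        if scope == "gal":
            include = r.mail_enabled and r.kind in ("user", "contact")
        elif scope == "dial_plan":
            include = (r.kind == "user"
                       and r.um_dial_plan is not None
                       and r.um_dial_plan == dial_plan)
        elif scope == "distribution_list":
            include = r.kind == "distribution_list"
        else:
            raise ValueError(f"unknown scope: {scope}")
        if include:
            selected.append(r.name)
    return selected
```

For example, a hidden mail-enabled user is skipped for every scope, and a mail-enabled contact appears only in the GAL scope.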

A default GAL is created when the Mailbox server role is installed on a computer that is running Exchange 2007. When the Unified Messaging server role is installed, it creates a grammar file for the GAL that is based on the speech grammar filters that are configured. If you create custom address lists or distribution lists in your Exchange 2007 organization, additional grammar files will be created for each custom address list or distribution list that you create.

If you create an address list that contains, for example, all recipients in a particular department, and you later add a new user to that department, the new recipient is not included as a member of the address list until you run the Update-AddressList cmdlet. Whenever the membership of an address list changes in this way, run the Update-AddressList cmdlet before Unified Messaging speech grammar generation occurs. This ensures that the generated or updated grammar contains all the recipients that are currently in the address list. The Update-AddressList cmdlet includes each recipient in every address list that the recipient is a member of.

If a UM-enabled user has not been stamped as a member of an address list when grammar generation occurs, either on the defined schedule or manually when you run galgrammargenerator.exe, the user is not added to the grammar for that address list. Therefore, their name will not be available when the directory is searched.

Note

For a grammar file to be generated for a distribution list, the distribution list must not be hidden.

When you first create a UM dial plan, no grammar files are created. However, when a Unified Messaging server joins a dial plan for the first time, a single grammar file for the UM dial plan is created in the appropriate language folder. The UM dial plan speech grammar file is then filtered to include only UM-enabled users who are associated with the dial plan. Each grammar file for these objects is named after the GUID of the object that it represents, for example, 2da514a1-06f4-44a1-9ce5-610854f7d2ee.grxml, with a corresponding .cfg name after it is compiled.

When the grammar files for UM dial plans, the GAL, address lists, and distribution lists are created, they are created in a language-specific folder on the local Unified Messaging server. The language folder that is used is selected based on the default language that is configured on the UM dial plan. For example, if the default language on the dial plan is set to US-English (en-US), a grammar file will be created in the \UnifiedMessaging\grammars\en folder. After the grammar file has been created, it will be updated according to the schedule that is configured on the Unified Messaging server.

Grammar Generation

Frequently, the default grammar generation schedule will fit your needs. However, there will be times when you must manually generate grammar files or update existing grammar files before the scheduled grammar generation task runs. There may also be times when you will want to change the default grammar generation schedule.

Grammar generation occurs in the following situations:

  • When the Unified Messaging server is added to a UM dial plan, and daily after that at a scheduled interval.

  • When you run the galgrammargenerator.exe command to manually update or create grammar files.

The grammar file that is created is then updated when the scheduled grammar generation task runs. To display the default grammar generation schedule for a Unified Messaging server, use the following Exchange Management Shell cmdlet:

(Get-UMServer $env:COMPUTERNAME).GrammarGenerationSchedule

For more information about the Get-UMServer cmdlet, see Get-UMServer.

By default, grammar generation occurs daily at the time that is specified by the GrammarGenerationSchedule parameter of the Unified Messaging server, which is defined so that grammar generation starts at 2:00 A.M. each day. You can change the grammar generation schedule only by using the Set-UMServer cmdlet in the Exchange Management Shell. There is no graphical user interface for controlling the schedule. For more information about how to change the grammar generation schedule by using the Set-UMServer cmdlet, see Set-UMServer.

The default 2:00 A.M. start time is local time on the Unified Messaging server. After grammar generation starts, it runs until it is completed, regardless of whether it finishes before the scheduled end time of the active period. Grammar generation will not start while another grammar generation is running. Although you can configure additional scheduled times, grammar generation will not run within one hour of a previously scheduled grammar generation period. Because grammar generation uses significant system resources, we recommend that you configure all grammar generation schedules so that grammar generation occurs during off-peak hours. You can also stagger the grammar generation schedules on multiple Unified Messaging servers, for example, Umserver1 starts at 2:00 A.M., Umserver2 starts at 2:30 A.M., and Umserver3 starts at 3:00 A.M. This helps minimize the effect of grammar generation on the Active Directory domain controllers.
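The staggering arithmetic can be sketched in a few lines of Python. The function name and the 30-minute default step are assumptions that follow the example in the preceding paragraph. The sketch only computes start times and does not produce a value for the GrammarGenerationSchedule parameter.

```python
from datetime import datetime, timedelta


def staggered_starts(servers, first_start="02:00", step_minutes=30):
    """Assign each server a start time, offset by a fixed step from the first."""
    base = datetime.strptime(first_start, "%H:%M")
    return {
        server: (base + timedelta(minutes=index * step_minutes)).strftime("%H:%M")
        for index, server in enumerate(servers)
    }


if __name__ == "__main__":
    for server, start in staggered_starts(["Umserver1", "Umserver2", "Umserver3"]).items():
        print(server, start)
```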

Note

A log file that is named UMSpeechGrammar.log will be created in the %ExchangeRoot%\UnifiedMessaging\temp folder. This log file contains information about all grammar files that are created or updated on a Unified Messaging server. This file will be overwritten every time that scheduled grammar generation runs.

In the following circumstances, you can wait for the next scheduled grammar generation for the changes to be reflected, or you can force an update by using the galgrammargenerator.exe command.

  • When you complete a new installation of the Unified Messaging server role and enable users for Unified Messaging

  • When a UM dial plan, UM auto attendant, custom address list, or custom distribution list is created

  • When you create UM-enabled users

  • If you change a UM dial plan or UM auto attendant

Note

When an Outlook Voice Access user tries to locate a UM-enabled user by using the directory search feature with Automatic Speech Recognition (ASR) immediately after you have completed a new installation of the Unified Messaging server role and enabled users for UM, the caller will hear a system prompt that says, "I am sorry I could not help." Then they will be disconnected. This occurs because a grammar file for the global address list (GAL) has not been generated. Use the galgrammargenerator.exe command to create the required grammar file for the GAL.

For example, when you first enable users for UM, those users will not be available to callers who use ASR to perform a directory search until the scheduled grammar generation task runs. To make sure that recently UM-enabled users are visible to callers, run the galgrammargenerator.exe program. This forces the .grxml files to be created or updated and the appropriate .cfg forms to be compiled so that callers can use ASR to move through the menu systems or locate users.

Galgrammargenerator.exe is also useful when a Unified Messaging server has joined a dial plan and one or more speech-enabled auto attendants are associated with the dial plan. By default, callers who call into a speech-enabled auto attendant can reach only UM-enabled users who are associated with the dial plan. Before callers can be transferred to UM-enabled users by using voice inputs, a grammar file must be generated. The grammar file is not generated automatically when the server joins a dial plan. Instead, it is generated the next time grammar generation is scheduled. Grammar generation occurs according to the default schedule, at 2:00 A.M. local time each day, unless the schedule has been changed.

If you want UM-enabled users to be available from a directory search from the speech-enabled auto attendant immediately after you create the auto attendant, you must generate the required grammar file for the auto attendant by using galgrammargenerator.exe with the -d option.
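If you script this step, the command line can be assembled programmatically. The following Python sketch only builds the argument list for the -d option. The function name and the dial plan name MyDialPlan are hypothetical, and the sketch assumes that galgrammargenerator.exe can be found on the path of the Unified Messaging server.

```python
def dial_plan_grammar_command(dial_plan_name):
    """Build the argument list that requests a grammar file for one UM dial plan."""
    if not dial_plan_name.strip():
        raise ValueError("a dial plan name is required")
    # -d <dialplan> limits the grammar to UM-enabled users in that dial plan.
    return ["galgrammargenerator.exe", "-d", dial_plan_name]


if __name__ == "__main__":
    # MyDialPlan is a placeholder name.
    print(dial_plan_grammar_command("MyDialPlan"))
```

On the Unified Messaging server, the returned list could be passed to subprocess.run to start the program.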

A grammar file is not required for auto attendants that are not speech-enabled. This is because a DTMF map is added to Active Directory for each user when they are enabled for UM. DTMF maps enable callers to enter the digits that correspond to the letters of the user's name or e-mail alias on a telephone keypad.

However, a DTMF map will not automatically be created for users who are not UM-enabled. By using galgrammargenerator.exe with the -u option, you can generate a DTMF map for all users who are mail-enabled but not UM-enabled. This lets users who are mail-enabled but not UM-enabled be reached from the auto attendant when their name or e-mail alias is entered by a caller by using DTMF inputs. For more information about the DTMF interface, see Understanding the DTMF Interface.
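To make the idea of a DTMF map concrete, the following Python sketch converts a name or e-mail alias to the digits of a standard telephone keypad. The function name is hypothetical, and the handling of characters that have no keypad equivalent (they are skipped here) is an assumption. The sketch does not reproduce the format in which Unified Messaging stores DTMF maps.

```python
# Standard telephone keypad letter groups.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def dtmf_digits(text):
    """Return the keypad digits that a caller would press to spell the text."""
    digits = []
    for ch in text.lower():
        if ch in "0123456789":
            digits.append(ch)
        elif ch in LETTER_TO_DIGIT:
            digits.append(LETTER_TO_DIGIT[ch])
        # Spaces, hyphens, and other characters are skipped (assumption).
    return "".join(digits)


if __name__ == "__main__":
    print(dtmf_digits("Kweku"))      # 59358
    print(dtmf_digits("Ako-Adjei"))  # 25623534
```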

The following table lists the switches and descriptions for the switches for the galgrammargenerator.exe program.

Galgrammargenerator.exe and the switches

Switch Description

-d <dialplan>

Creates a grammar file that contains the names of UM-enabled users only in the specified UM dial plan.

-g

Generates the grammar file.

-l

Generates a grammar file for a distribution list.

-o

Generates a log file. The path can be an absolute path, for example, C:\Logfiles. By default, the Unified Messaging server will also automatically create a log file in the \UnifiedMessaging\Temp folder.

-p

Preloads all generated grammars into the Microsoft Speech Server platform.

-s <UMserver>

Creates a grammar file for each UM dial plan to which the specified Unified Messaging server belongs.

-u

Creates or updates DTMF maps for both users who are enabled for UM and users who are not enabled for UM.

Note

If a mailbox-enabled user or a mail-enabled contact has a character in their e-mail alias that is not valid and you run the galgrammargenerator.exe -u command to create a DTMF map for users, the command will not complete successfully and Unified Messaging will report an error. To ensure that all mailbox users and mail-enabled contacts have no characters in their e-mail addresses that are not valid, use the Get-User cmdlet to view all users. The Get-User cmdlet performs a validation check on the user attributes. If any field has a character that is not valid, an error is generated that identifies the recipient and the field that contains the character.

-x

Defines the speech filter list that is used in XML format.

Note

The default speech grammar filter list (SpeechGrammarFilterList.xml) is installed in the %ExchangeRoot%\bin folder on each server that has the Unified Messaging server role installed. The contents of the speech filter list file must be the same on each Unified Messaging server. The speech grammar filter list contains several rules that specify input patterns against which display names are matched and output patterns that define transformations of the matched name. If a name matches a pattern, it is replaced in the speech grammar by the name or names that are generated from the associated output pattern or patterns. If a name does not match a pattern, it is passed through unchanged to the speech grammar. Names are rejected from insertion in the speech grammar if they have two or more distinct ways of being said. We recommend that you do not manually modify the SpeechGrammarFilterList.xml file.
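The match-and-transform behavior of a filter list can be sketched with regular expressions in Python. The two rules below are hypothetical and are not taken from SpeechGrammarFilterList.xml. The first-match-wins order is also an assumption made for this illustration.

```python
import re

# Each rule pairs an input pattern with one or more output templates.
RULES = [
    # "Last, First" becomes "First Last".
    (re.compile(r"^(?P<last>[^,]+),\s*(?P<first>.+)$"), [r"\g<first> \g<last>"]),
    # "Name (Department)" becomes "Name".
    (re.compile(r"^(?P<name>.+?)\s*\(.*\)$"), [r"\g<name>"]),
]


def apply_filters(display_name, rules=RULES):
    """Return the name or names to insert into the grammar for a display name."""
    for pattern, templates in rules:
        match = pattern.match(display_name)
        if match:
            # Replace the name with the output of every associated template.
            return [match.expand(template) for template in templates]
    # No rule matched, so the name passes through unchanged.
    return [display_name]
```

With these sample rules, "Ako-Adjei, Kweku" is rewritten to "Kweku Ako-Adjei", and a name that matches no rule is returned as-is.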

Customizing Grammar Files

Currently, ASR is available only in English, with prerecorded prompts and text-to-speech support for English. Although ASR support is included in the English language pack, speech recognition will sometimes have difficulty locating the correct UM-enabled user because the user has a name that is difficult to pronounce, the caller's speech is matched against the wrong name, or the caller speaks a form of the user's name that differs from the name in the speech grammar. Adding another UM language pack will not resolve this problem.

Unified Messaging uses two Active Directory attributes to generate names to use with ASR grammar files: Display name (displayName) and Phonetic display name (msDS-PhoneticName). By default, Unified Messaging uses the displayName attribute to recognize the name of a user when a caller speaks their name. This works well if the user's name is easy to pronounce. However, in some cases, users have names that are difficult to pronounce. To help Unified Messaging find users whose names are difficult to pronounce, we recommend that you configure the Unified Messaging system by supplying a phonetic display name for users who have names that ASR has trouble recognizing. However, to supply a phonetic display name, you must predict how the speech engine would perceive a certain spelling of a name to provide an accurate pronunciation for the phonetic name.

Note

By default, the Unified Messaging server will try to insert both the phonetic display name, if one exists, and the display name into the speech grammar file.

For example, the display name "Kweku Ako-Adjei" could be given a phonetic display name of "Quaykoo Akoo Oddjay", and UM would insert that into the speech grammar file. The drawback to creating phonetic names for users is that it is difficult to do on a large scale. It would be very time-consuming to create and test phonetic display names for every user whose name is not correctly recognized by ASR, especially in large enterprise environments.

To add or change the phonetic display name for a UM-enabled user, you must use ADSI Edit (AdsiEdit.msc) or the Set-User Exchange Management Shell cmdlet. You cannot use Active Directory Users and Computers or the Exchange Management Console to change a user's phonetic display name. For more information about how to change a phonetic display name by using the Set-User cmdlet, see Set-User.

The PhoneticDisplayName parameter specifies a phonetic pronunciation for the display name. The display name is specified by using the DisplayName parameter. If the display name is not easy for the Unified Messaging server to pronounce or recognize, you can use the PhoneticDisplayName parameter to specify a phonetic version. If you specify a value, it is used by ASR to recognize the user's name and by the Text to Speech (TTS) engine to pronounce the user's name. If you do not specify a value, the Unified Messaging server uses the DisplayName parameter. The maximum length of this parameter value is 255 characters.
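The fallback between the two names can be expressed as a small Python sketch. The function names are hypothetical. The sketch models only the behavior described in this topic: the 255-character limit, the insertion of both names into the speech grammar, and the use of the display name when no phonetic display name is set.

```python
MAX_PHONETIC_LENGTH = 255


def check_phonetic_name(value):
    """Reject a phonetic display name that exceeds the documented maximum length."""
    if len(value) > MAX_PHONETIC_LENGTH:
        raise ValueError("phonetic display name is longer than 255 characters")
    return value


def names_for_grammar(display_name, phonetic_name=None):
    """Return the names that would be offered to the speech grammar for one user."""
    names = []
    if phonetic_name:
        names.append(check_phonetic_name(phonetic_name))
    names.append(display_name)
    return names


def name_for_tts(display_name, phonetic_name=None):
    """Return the name that Text to Speech would pronounce."""
    return check_phonetic_name(phonetic_name) if phonetic_name else display_name
```

For the user in the earlier example, names_for_grammar("Kweku Ako-Adjei", "Quaykoo Akoo Oddjay") returns both names, and name_for_tts returns the phonetic form.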

For more information about ADSI Edit, see Adsiedit Overview.

For More Information

For more information about how to update the speech grammar files that are used with ASR, see How to Update the Speech Grammar Files.

For more information about Unified Messaging dial plans, see Understanding Unified Messaging Dial Plans.

For more information about Unified Messaging auto attendants, see Understanding Unified Messaging Auto Attendants.