Control PowerPoint 2000 with Voice Commands, MIND, July 1999

This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

This article assumes you're familiar with Microsoft PowerPoint and Visual Basic for Applications.

Control PowerPoint 2000 with Voice Commands
Ed Hess

You've probably used PowerPoint to create business presentations. Wouldn't it be great to give the show a little zip with a hands-free presentation? The Microsoft Speech API can help.

I needed to do a presentation on speech recognition and thought that it would be helpful to navigate through the presentation using speech commands. This method is much more powerful than simply using a wireless mouse because I can gain access to the PowerPoint® object model and use an unlimited number of speech commands to manipulate the slides. I also wanted to be able to walk around without being tethered to my notebook. I did some investigation into wireless microphones and came up with a good one—the Shure Wireless TCHS.

    If you want to follow along, you'll need the following Microsoft® software: the Direct Speech Recognition control, the Direct Speech Synthesis control, and PowerPoint 97 or PowerPoint 2000. The redistributable ActiveX® components are currently available under license as part of the Microsoft Speech API (SAPI) SDK, which is available at http://www.microsoft.com/Mind/0799/PPT2000/ppt2000.htm.

    Rest assured, there is nothing unique to PowerPoint in the Visual Basic® for Applications (VBA) code that is provided with this article. All of the code can reside in Microsoft Excel, Word, a Visual Basic-based program, an HTML Web page, or any other related container.

    PowerPoint is supported by VBA with a development environment similar to the Visual Basic IDE. You start the Visual Basic Editor from the Tools menu as shown in Figure 1, or by using the keyboard shortcut Alt+F11. The project window and the properties window will appear inside the editor's work area.

Figure 1: Accessing the Visual Basic Editor

    Enter a name for your project in the Name field of the Properties window. To reference the Direct Speech Recognition control and the Direct Text to Speech control, open the Tools menu in the Visual Basic Editor and select References. The Available References listbox will appear (see Figure 2). Check the boxes for those two controls and you will be able to reference the speech objects in your project (assuming that you have already installed the SAPI SDK).

Figure 2: Referencing Speech Controls

The Speech Class Module

    Since I'm not going to be developing a form-based application, I need a class module to intercept events for the speech objects. In VBA and Visual Basic, a variable declared WithEvents must live in an object module such as a class module or a form, so an object created at runtime needs a class module to handle its events. Create the class module from the Visual Basic Editor by selecting Insert | Class Module from the menu. A class module window will be added to your work area and the Properties window will show the information about the new module. Go to the Name field and change the name to SpeechClass in order to follow the rest of my code. The class module is shown in Figure 3.

Figure 3: The SpeechClass Class Module

    The SpeechClass class module defines two objects: DirectSR and DirectSS. Both objects are defined using the WithEvents clause to intercept events. The first event defined below the objects is called when the speaker finishes a phrase. The second event handles the action when the text-to-speech engine is finished speaking and reactivates the speech recognition engine. I'll return to the DirectSR event later when I develop the code module and initialization subroutine.

    I did not include any code to perform an action when the engine doesn't recognize something that has been spoken; in that situation the event receives a blank parsed string. You might want to have the event do something when it receives this blank string, since things can get pretty annoying when the program silently ignores a phrase.

    Notice that the speech recognition control needs to be deactivated before the text-to-speech engine begins speaking. Most sound cards cannot multiplex recording and playing, so you must disable the listening state of the speech recognition software before attempting to generate sound, and vice versa. Otherwise, you will get an error when the text-to-speech engine tries to speak.
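
    One way to act on the blank string is to have the text-to-speech engine ask for a repeat. The function below is only a sketch: UnrecognizedPrompt and its awake flag are hypothetical names that do not appear in the article's code, and the wording of the prompt is arbitrary. The awake flag is meant to be False whenever the program should ignore ordinary conversation, so that it does not ask for a repeat after every sentence.

```vba
Option Explicit

' Hypothetical helper, not part of the article's SpeechClass.
' parsed: the parsed string handed to the phrase-finished event handler.
' awake:  True while navigation commands are being accepted.
' Returns the text to speak, or "" when nothing should be said.
Public Function UnrecognizedPrompt(ByVal parsed As String, _
                                   ByVal awake As Boolean) As String
    If awake And Len(parsed) = 0 Then
        UnrecognizedPrompt = "Sorry, please repeat that."
    Else
        UnrecognizedPrompt = ""
    End If
End Function
```

    If the function returns any text, the event handler would still need to deactivate speech recognition before handing that text to the text-to-speech engine, for the sound card reasons described above.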

Defining PowerPoint and Speech Objects

    Let's call the next code module SpeechModule. To create this module, select Insert | Module from the Visual Basic Editor menu. Go to the Properties window and change the name of the module to SpeechModule. The first section of the new module defines the objects needed for speech and for navigating through the PowerPoint slides.

    The code in Figure 4 shows the definitions and the initialization subroutine you need. The App object provides access to the top of the object hierarchy used by PowerPoint. The SClass object variable provides access to the SpeechClass code that I developed.

    The second section of the SpeechModule controls the initialization process. The Init subroutine creates an instance of the speech control and connects it to the speech server. Once you have the speech object variables in place, you need to set the SpeechClass so it will be able to intercept speech events. First I set the Speech variable in the class module to point to the speech instance that I have just created, then I initialize the PowerPoint objects. The App variable is created first to enable access to the active presentation.

Defining the Grammars

    You may have noticed the GrammarFromFile method of the DirectSR object. The computer.txt file referenced in the DirectSR.GrammarFromFile method contains my grammar, or list of recognized voice commands. I created a simple file because I wanted my computer to stay asleep until I gave it the magic word. The word I chose, computer, can be replaced with one of your choosing. Here is the code for computer.txt:
[Grammar]
langid=1033
type=cfg
[<Start>]
<Start>=computer "Computer"

    The langid setting 1033 is English, while type=cfg stands for context-free grammar. The <Start> tags define each of the recognized voice commands. The first command can be read as: listen for "computer" and set "Computer" as the parsed string to send to the DirectSR_PhraseFinish event that I defined in my SpeechClass class module. Until it recognizes the word "computer," it will do nothing.
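
    Grammar files in this format are plain text, so they can also be generated from code. The following is a sketch rather than part of the article's project: GrammarText and WriteGrammar are hypothetical helpers, the "phrase|parsed" encoding of each command is an arbitrary convention chosen for this example, and the layout assumes that the grammar compiler expects one setting or rule per line.

```vba
Option Explicit

' Hypothetical helper: builds grammar text with one <Start> rule
' for each "phrase|parsed" entry in the pairs array.
Public Function GrammarText(ByVal pairs As Variant) As String
    Dim result As String
    Dim item As String
    Dim sep As Long
    Dim i As Long

    result = "[Grammar]" & vbCrLf & _
             "langid=1033" & vbCrLf & _
             "type=cfg" & vbCrLf & _
             "[<Start>]" & vbCrLf

    For i = LBound(pairs) To UBound(pairs)
        item = pairs(i)
        sep = InStr(item, "|")
        ' Text before the bar is the spoken phrase,
        ' text after it is the parsed string.
        result = result & "<Start>=" & Left$(item, sep - 1) & _
                 " """ & Mid$(item, sep + 1) & """" & vbCrLf
    Next i

    GrammarText = result
End Function

' Hypothetical helper: writes the generated grammar to a file.
Public Sub WriteGrammar(ByVal path As String, ByVal pairs As Variant)
    Dim f As Integer
    f = FreeFile
    Open path For Output As #f
    Print #f, GrammarText(pairs);
    Close #f
End Sub
```

    A call such as WriteGrammar "C:\computer.txt", Array("computer|Computer") would then produce the wake-word grammar; the path here is only a placeholder.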

    When the PhraseFinish event fires, it must do three things: deactivate the speech recognition engine, use the text-to-speech engine to tell the speaker that "I am listening," and load a new grammar with more commands for navigating through the PowerPoint presentation. The new grammar, voice1.txt, looks like the following:
[Grammar]
langid=1033
type=cfg
[<start>]
<start>=next slide "Next"
<start>=back "Back"
<start>=close "Close"
<start>=go to sleep "Sleep"

    Now you need to expand the cases in your code for the DirectSR_PhraseFinish event to the code in Figure 5. The Next and Back cases call the Next and Previous methods of the PowerPoint slide show view and allow you to move forward and backward through the presentation. Many more commands could be added to the grammar and the event's corresponding cases to expand what you can do with speech commands during your presentation. Notice also that the Sleep case reloads the original grammar so that nothing will be recognized until the word "computer" is spoken again.

    This brings up one of the nice features of the Shure Wireless TCHS microphone: it has a mute switch that lets you turn it off, which comes in really handy if you need to do a lot of talking and may need to say your magic word in another context.
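
    The command cases described in this section might be sketched as follows. This is not the article's Figure 5 listing: RunSlideCommand is a hypothetical helper, and the First and Last cases assume two extra grammar entries (for example, <start>=first slide "First" and <start>=last slide "Last") that are not in voice1.txt. Only the PowerPoint slide show view is touched here; the Sleep case is left to the caller because it involves reloading the grammar on the speech recognition object.

```vba
Option Explicit

' Hypothetical helper: applies a parsed command to the running slide show.
' Returns True if the command was handled here, False otherwise.
Public Function RunSlideCommand(ByVal parsed As String) As Boolean
    ' Nothing to control unless a slide show is running.
    If Application.SlideShowWindows.Count = 0 Then Exit Function

    RunSlideCommand = True
    With Application.SlideShowWindows(1).View
        Select Case parsed
            Case "Next"
                .Next
            Case "Back"
                .Previous
            Case "First"        ' assumes an added grammar entry
                .First
            Case "Last"         ' assumes an added grammar entry
                .Last
            Case "Close"
                .Exit
            Case Else           ' "Sleep", blank, or unknown: caller decides
                RunSlideCommand = False
        End Select
    End With
End Function
```

    From the phrase-finished handler, a test such as If Not RunSlideCommand(parsed) Then would leave the Sleep command and blank strings to the handler's own cases.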

Providing Access to the Speech Session

    PowerPoint has two different ways to allow you to turn on speech recognition. The easiest way is to designate a keyword within the text that serves as a link, activating the session. Position the mouse anywhere inside the selected keyword and right-click. Select Action Settings from the popup menu. In the Action Settings dialog, click on the "Run macro" radio button and select Init from the listbox (see Figure 6). Click OK and the keyword will be selected, underlined, and displayed in the default color, red. When you click on the keyword at presentation time, the link will activate the Init procedure.

Figure 6: Configuring Action Settings

    The second technique places a button on the surface of the slide and programs it to call the session when the button is pressed (Click event). Placing the button is done via the Control Toolbox (see Figure 7), one of the toolbars available with PowerPoint 2000. If your copy of PowerPoint does not display the toolbox, right-click anywhere on the toolbar area and check the Control Toolbox option.

Figure 7: Add a Button

    To add the button, select the button icon (third from the left in the second row) and then draw and size it on the surface of the slide. Double-click on the button to invoke the Visual Basic Editor. If this is the first button you place on the slide, the editor will take two actions. First, the editor will add the slide to the project list. Second, a code window for the slide will open and the skeleton of a subroutine for capturing the Click event will be ready for coding. Simply code Init as the body of the subroutine:
Private Sub CommandButton1_Click()
    ' Start the speech session when the button is clicked.
    Init
End Sub

The Windows Sound System

    One of the most important parts of getting all of this to work well is setting up the Windows Sound System and positioning the microphone properly. You can choose among a number of different sound cards and microphones. The Ensoniq PCI and Turtle Beach Montego sound cards have received good reviews for speech recognition. The most common microphones are a close-talk or headset microphone that is held close to the mouth, a handheld model such as the Philips SpeechMike (which I highly recommend, as it includes a trackball and programmable buttons), or a medium-distance microphone that rests on the computer 30 to 60 centimeters away from the speaker. A headset microphone is needed for noisy environments.

    A microphone setup wizard, micwiz.exe, comes with the SAPI SDK (usually found in C:\Program Files\Microsoft Speech SDK\Misc\). You should run this program to make sure your microphone and sound system are working properly. I set up a shortcut to the wizard on my desktop so that I can run it before I give a presentation. It sets the volume levels and gives readings on background noise levels. In case the wizard fails to give you an OK on your setup, you must adjust your sound system settings manually. The following is a very lengthy and detailed set of instructions, but will often fix the problem.

    If you do not already have the Volume Control applet running, follow these steps:
Double-click the speaker icon in your taskbar.
If you don't have a speaker icon, you need to turn this feature on from the Control Panel. Double-click the Multimedia control panel, and select "Show volume control on the taskbar." Now double-click the speaker icon in your taskbar.

    Once the Volume Control applet has appeared, select the Options menu and make sure the Advanced Controls menu item is checked. If this menu item is disabled, you don't need to do anything.

    Check the playback volume:
Select the Options menu, then Properties.
In the Properties dialog, make sure that the right Mixer device is selected. If you don't know which one you're using and there's more than one choice, you'll need to repeat all of the following steps for each Mixer device.
Select the Playback radio button. Make sure that all of the checkboxes in the "Show the following volume controls" listbox are checked, then press OK.
Look at the first vertical slider for the volume control. It's usually called Volume Control. Make sure that the slider is not all the way at the bottom, and that Mute is not checked.
Find the slider named Wave or Wave Out. Make sure that the slider is someplace near the middle or above, and that Mute is not checked.
If there's a slider named Microphone, make sure that it is set to the bottom, and that it is muted. If you don't do this, you might hear a loud feedback whine when speech recognition starts listening.

    Check the recording volume:
Select the Options menu, then Properties.
In the Properties dialog, make sure that the right Mixer device is selected. If you don't know which one you're using and there's more than one choice, then you will need to repeat all of the following steps for each Mixer device.
Click on the Recording radio button. Make sure that all of the checkboxes in the "Show the following volume controls" listbox are checked and then press OK.
Find the slider named Microphone. Make sure the slider is someplace near the middle and that the checkbox is selected.
Make sure the checkboxes on the other sliders are not selected; otherwise speech recognition may pick up audio from your CD-ROM drive or other devices.
If there's an Advanced button under the Microphone slider, click it. A dialog will be displayed. If the dialog has options for Automatic gain, you should make sure they're checked and press OK. Most sound cards need this checked, although if your microphone volume is too loud, you may need to turn the setting off.

    Check the Other volume:
Select the Options menu, then Properties.
In the Properties dialog, make sure that the right Mixer device is selected. If you don't know which one you're using and there's more than one choice, then you will need to repeat the following steps for each Mixer device.
If the Other radio button is disabled, skip the rest of this section. Otherwise, click on the Other radio button, make sure that all of the checkboxes in the "Show the following volume controls" listbox are checked, and press OK.
Find the slider named Microphone. Make sure the slider is someplace near the middle and that the checkbox is selected.
Make sure the checkboxes on the other sliders are not selected; otherwise speech recognition may pick up audio from your CD player or other devices.
If there's an Advanced button under the Microphone slider, click it. A dialog will pop up. If the dialog has options for Automatic gain, make sure they're checked and press OK. Most sound cards need this checked, although if your microphone volume is too loud you may need to turn the setting off.

    Check the settings using Sound Recorder:
Go to Sound Recorder by selecting Start | Programs | Accessories | Multimedia | Sound Recorder. (The Multimedia menu is called Entertainment in Windows 98.)
Click the Edit menu item, and then Audio Properties. Where it says Preferred Quality pick CD Quality in the dropdown list. Make sure the Recording Volume slider is about halfway and click OK.
Click the File menu and Properties. If the Audio Format does not say "PCM 44.100 kHz, 16 Bit, Stereo," press Convert Now so that it does and click OK.
Click on the red button (record) and dictate. As you record, the flat green line should break into a waveform as the sound comes through. Click the rectangular button (stop) after recording for a few seconds.
Click the playback arrow to play back the sound, and listen carefully. If the sound is clear, with little or no static, electronic feedback, or background noise from the environment, your sound card and microphone are set up correctly.

    You should also test the settings by returning to the Microphone Setup Wizard. You may want to manipulate the volume controls while you are in the "Make sure the microphone is working" dialog so you can see the effects of manipulating the mixer.

    If the microphone element moves even slightly away from the optimal position, your recognition accuracy may significantly deteriorate. For optimal speech recognition, make sure you position the microphone carefully and consistently every time you use it.

    To position a close-talk microphone such as the Shure Wireless TCHS:
Squeeze the foam rubber windscreen so that you feel the microphone element.
Make sure the front of the microphone element points toward your mouth. A colored dot, the word Talk, or some other label may indicate the front.
Position the element so that the back of your thumb, which you are using to squeeze the element, just touches one corner of your mouth.
Keep the microphone about a thumb's width from the corner of your mouth, and not directly in front of your mouth.

Presenting . . .

    You are now ready to start your presentation. When you click your button or linked keyword, your Init subroutine will start and turn on the speech recognition engine. If all goes well, it will definitely impress your friends. Best of luck with other audiences!


From the July 1999 issue of Microsoft Internet Developer.