Speech Recognition

Windows CE for the Auto PC provides the Speech Application Programming Interface (SAPI) to simplify interaction between applications and the speech recognition engine. SAPI defines a standard way for applications to integrate speech features.

The following illustration shows a basic speech recognition system.

Object                       Function
Voice Command                Provides a high-level interface for discrete speech recognition.
Speech Recognition Engine    Provides a low-level speech recognition interface.
Audio Source                 Enables the speech recognition engine to obtain standard audio input from a WAV input device. The engine does not have to consider what kind of WAV input device is available or what kind of audio drivers or APIs are supported. Windows CE for the Auto PC supports a WAV format quality of 11.025 kHz, 16-bit, mono.
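
To make the audio format concrete, the following sketch computes the raw data rate of an 11.025 kHz, 16-bit, mono stream and writes one second of silence in that format using Python's standard wave module. It is an illustration only, not Auto PC code; the function names bytes_per_second and make_silent_wav are hypothetical.

```python
import io
import wave

SAMPLE_RATE_HZ = 11025   # 11.025 kHz, from the Audio Source description
SAMPLE_WIDTH_BYTES = 2   # 16-bit samples
CHANNELS = 1             # mono


def bytes_per_second(rate_hz: int, width_bytes: int, channels: int) -> int:
    """Raw PCM data rate for an uncompressed stream."""
    return rate_hz * width_bytes * channels


def make_silent_wav(seconds: int) -> bytes:
    """Build an in-memory WAV file of silence in the format above."""
    rate = bytes_per_second(SAMPLE_RATE_HZ, SAMPLE_WIDTH_BYTES, CHANNELS)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(b"\x00" * rate * seconds)
    return buffer.getvalue()
```

At this format, uncompressed audio arrives at 11025 x 2 x 1 = 22050 bytes per second, which is the amount of data the engine has to process for each second of microphone input.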

A speech recognition engine recognizes speech commands and sends the results to applications. On an Auto PC, the speech recognition engine supports discrete speech recognition, which resolves a speech command delineated by pauses. A speech command can be one word, such as "Radio," or a short phrase, such as "What Can I Say?" Discrete speech recognition does not require complex computation because pauses signify the start and end of each speech command. It also allows for higher accuracy and lower memory requirements than continuous speech recognition, which recognizes uninterrupted speech.
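
The idea that pauses mark the start and end of a command can be sketched with a simple energy threshold. The example below is not the Auto PC engine's algorithm, which this section does not describe; it assumes the audio has already been reduced to one energy value per frame, and the function name split_on_pauses and its parameters are hypothetical.

```python
from typing import List, Tuple


def split_on_pauses(
    frame_energies: List[float],
    silence_threshold: float,
    min_pause_frames: int,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indexes, end exclusive, for each utterance.

    A run of at least min_pause_frames quiet frames ends the current utterance.
    Shorter quiet runs are treated as gaps inside the same utterance.
    """
    utterances: List[Tuple[int, int]] = []
    start = None   # index of the first loud frame of the current utterance
    quiet_run = 0  # consecutive quiet frames since the last loud frame
    for i, energy in enumerate(frame_energies):
        if energy > silence_threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_pause_frames:
                # The utterance ended where the quiet run began.
                utterances.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # Input ended mid-utterance; trim any trailing quiet frames.
        utterances.append((start, len(frame_energies) - quiet_run))
    return utterances
```

Because each utterance is isolated before recognition, a recognizer built this way only has to compare one short segment against a small set of commands, which is consistent with the lower computation and memory cost described above.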

Discrete speech recognition imitates a desktop computer menu by providing users with a list of commands to choose from. Discrete speech recognition users speak a command, whereas desktop computer users select a command from a visual menu. A speech recognition engine resolves audio input from a microphone into text or symbols that map to the speech command menu for the application.
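
The mapping from recognized text to a command menu can be pictured as a table lookup. The sketch below assumes the engine has already produced a text string; the names normalize, resolve_command, and MENU, and the action identifiers, are hypothetical and are not part of SAPI.

```python
import string
from typing import Dict, Optional


def normalize(phrase: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(phrase.lower().translate(table).split())


def resolve_command(recognized_text: str, menu: Dict[str, str]) -> Optional[str]:
    """Return the action mapped to the spoken phrase, or None if it is not in the menu."""
    lookup = {normalize(phrase): action for phrase, action in menu.items()}
    return lookup.get(normalize(recognized_text))


# Hypothetical speech command menu for one application.
MENU = {"Radio": "open_radio", "What Can I Say?": "show_help"}
```

A phrase that is not in the menu resolves to None, just as a visual menu offers no entry for a command the application does not support.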

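To perform speech recognition, an application requires:

An audio source object.
A speech recognition engine object.
A grammar for the speech recognition engine.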

The audio source object and speech recognition engine object are included in Windows CE for the Auto PC. Your application must provide only the grammar to the speech recognition engine and handle the results returned.
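
The division of labor described here, in which the application supplies a grammar and handles results while the platform supplies everything else, can be sketched with a stand-in engine. The class StubDiscreteEngine and its methods are hypothetical and exist only for illustration; they are not SAPI interfaces, and a real application would use the COM interfaces that the platform exports.

```python
from typing import Callable, List, Optional


class StubDiscreteEngine:
    """Hypothetical stand-in for the platform engine; not a SAPI class."""

    def __init__(self) -> None:
        self._grammar: List[str] = []
        self._on_result: Optional[Callable[[str], None]] = None

    def load_grammar(self, phrases: List[str], on_result: Callable[[str], None]) -> None:
        """Store the application's phrase list and its result callback."""
        self._grammar = [p.lower() for p in phrases]
        self._on_result = on_result

    def simulate_utterance(self, text: str) -> bool:
        """Pretend the user spoke text; notify the app only if it is in the grammar."""
        if text.lower() in self._grammar and self._on_result is not None:
            self._on_result(text.lower())
            return True
        return False


# Application side: provide the grammar and collect the results.
results: List[str] = []
engine = StubDiscreteEngine()
engine.load_grammar(["Radio", "What Can I Say?"], results.append)
engine.simulate_utterance("radio")       # in the grammar, so the callback fires
engine.simulate_utterance("play music")  # not in the grammar, so it is ignored
```

In this sketch the application never touches audio. It only declares which phrases it cares about and reacts when one of them is reported.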

The following illustration shows how an application sends a grammar to the speech recognition engine.

Note: An original equipment manufacturer (OEM) supplies the speech recognition engine for Windows CE for the Auto PC. Although an OEM can install a speech recognition engine for continuous speech or for discrete speech, either one functions as a discrete speech recognition engine on an Auto PC.

If an OEM chooses to supply an alternate speech recognition engine, the engine must export a specified set of Microsoft Component Object Model (COM) interfaces. These interfaces permit communication with SAPI, with applications, and with an audio source object that handles the details of taking input from a microphone and converting it into digital form.