Microsoft(R) Windows[TM] for Pen Computing--Moderation of the Recognition Process

Stephen Liffick

Created: March 20, 1992

ABSTRACT

This article discusses the primary data structures in Microsoft(R) Windows[TM] for Pen Computing and the methods used by application programmers to moderate the recognition process with these structures. It provides a high-level overview of the recognition process followed by detailed information on the methods of modification and control over this process available to applications.

REVIEW: DATA FLOW AND THE RECOGNITION PROCESS*

Most of the time, a pen behaves like a mouse. Pen actions generate mouse events. For example, if the pen touches the surface of the digitizing device, a WM_LBUTTONDOWN message is generated. If it moves across the surface of the digitizing device, WM_MOUSEMOVE messages are generated. When the pen leaves the surface of the digitizing device, a WM_LBUTTONUP message is generated.

The recognition process begins with a call to the Recognize function in response to a pen event (usually a WM_LBUTTONDOWN message). When an application receives the mouse event over an inking or a recognition area, it calls the Recognize function so that inking and recognition can begin.

The Recognize function does not return until the RC Manager sends the results to the application and the user terminates recognition mode—usually through time-out (by lifting the pen and waiting). As a result of calling Recognize, ink is drawn, data is recognized, dictionaries verify results, and one or more WM_RCRESULT messages are sent to the window handle provided by the caller.

The function returns as soon as the last WM_RCRESULT message has been processed. The application performs any necessary cleanup and returns from the WM_LBUTTONDOWN message handler.

Note:

The application must return from WM_LBUTTONDOWN explicitly without letting the DefWindowProc function see the message. The application does not receive a WM_LBUTTONUP message. The “pen up” event that terminated recognition is passed back as part of the ink data in the final WM_RCRESULT message.
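The blocking, callback-style flow described above can be pictured with a small mock. This is only a sketch under stated assumptions: mock_recognize, MockWndProc, AppState, and the MOCK_ codes are hypothetical stand-ins invented for illustration, not the pen API. The point is the ordering: every result is delivered to the window procedure before the call returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder codes for this sketch only; the real message and REC
   values come from the pen headers and are not reproduced here. */
enum { MOCK_WM_RCRESULT = 1 };
enum { MOCK_REC_OK = 0 };

/* Hypothetical stand-in for a window procedure. */
typedef void (*MockWndProc)(int msg, int recCode, const char *text,
                            void *state);

/* State an application might keep for one inking window. */
typedef struct {
    int  resultsSeen;   /* result messages handled so far */
    char last[32];      /* text of the most recent result */
} AppState;

/* Mock of the blocking call: delivers every result to the window
   procedure and only then returns to the caller. */
int mock_recognize(MockWndProc proc, void *state,
                   const char *const *chunks, size_t count)
{
    assert(proc != NULL);
    for (size_t i = 0; i < count; ++i)
        proc(MOCK_WM_RCRESULT, MOCK_REC_OK, chunks[i], state);
    return MOCK_REC_OK;
}

/* Result handler: records each result as it arrives. */
void app_wnd_proc(int msg, int recCode, const char *text, void *state)
{
    AppState *app = (AppState *)state;
    if (msg != MOCK_WM_RCRESULT || recCode != MOCK_REC_OK)
        return;
    app->resultsSeen++;
    strncpy(app->last, text, sizeof app->last - 1);
    app->last[sizeof app->last - 1] = '\0';
}

/* Shape of a "pen down" handler: start recognition, then clean up and
   return without passing the message on for default processing. */
int on_pen_down(AppState *app, const char *const *chunks, size_t count)
{
    app->resultsSeen = 0;
    app->last[0] = '\0';
    int rec = mock_recognize(app_wnd_proc, app, chunks, count);
    (void)rec;                  /* a real handler would inspect this */
    return app->resultsSeen;    /* all results were already handled */
}
```

By the time on_pen_down returns, the handler has already seen every result, which is why an application must be ready to process results before it starts recognition.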

A HIGH-LEVEL OVERVIEW OF THE RECOGNITION PROCESS

Figure 1 shows the steps in the recognition process.

Figure 1.

Step 1: Display and Buffering. The RC Manager takes raw data from the pen driver and displays it as ink on the screen. This step is incidental to the recognition process; its purpose is to give the user the feel of a pen on paper. As ink is displayed on the screen, it is simultaneously buffered for use in shape recognition.

Step 2: Shape Recognition. “Shape recognition” is a generic term describing the process of turning raw pen data into predefined symbols such as roman characters, geometric figures, Kanji characters, and so on. This phase begins with a recognizer calling back into the RC Manager to query the buffered data. Recognizers generally receive the data in convenient chunks and process it to determine the most likely set of symbols associated with the ink.

Shape recognition is often unable to resolve ambiguities in user input (see Figure 2).

Figure 2.

To a reader familiar with the English language, Figure 2 reads kit. To a recognizer, however, it might just as well be lcit. The recognizer does not have enough useful information to help make this decision. The spacing of the letters could be used to make a determination, but the relatively close spacing of the l and the c is possibly accidental. A recognizer might return that the input has a 40-percent chance of being kit and a 60-percent chance of being lcit.

Step 3: Postprocessing. The postprocessing facility adds a measure of contextual intelligence to the recognition process by comparing a particular recognition result against a set of expected results (for example, a set of English words). In the previous example, a postprocessor would replace the lcit result with kit because lcit is not an English word.
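The kit versus lcit example can be sketched as a few lines of C. The Candidate type, the confidence figures, and the word list are illustrative assumptions; an actual postprocessor works on recognition results rather than plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One recognizer alternative with its confidence (0..100).
   Hypothetical type for this sketch only. */
typedef struct {
    const char *text;
    int         confidence;
} Candidate;

/* Returns 1 if word appears in the list of expected results. */
int in_word_list(const char *word, const char *const *words, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (strcmp(word, words[i]) == 0)
            return 1;
    return 0;
}

/* Picks the most confident candidate that is an expected word. If no
   candidate is listed, falls back to the most confident one overall. */
const char *postprocess(const Candidate *cands, size_t nCands,
                        const char *const *words, size_t nWords)
{
    assert(cands != NULL && nCands > 0);
    const Candidate *bestAny = &cands[0];
    const Candidate *bestListed = NULL;
    for (size_t i = 0; i < nCands; ++i) {
        if (cands[i].confidence > bestAny->confidence)
            bestAny = &cands[i];
        if (in_word_list(cands[i].text, words, nWords) &&
            (bestListed == NULL ||
             cands[i].confidence > bestListed->confidence))
            bestListed = &cands[i];
    }
    return bestListed != NULL ? bestListed->text : bestAny->text;
}
```

Given the candidates lcit (60) and kit (40) and a word list that contains kit, postprocess returns kit even though the shape recognizer ranked it second.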

Step 4: Timing of results. The final step of the process ensures that the results are returned to the caller in the method requested—on character, word, or line boundaries or when the recognition event is complete. The RC Manager passes results back to the application at the requested intervals, thereby completing the recognition process.

THE PRIMARY MODERATING DATA STRUCTURE--RC

An application can control the recognition process through the configuration options provided by the recognition context (RC) data structure.

What's My Ink?

The application can use the nInkWidth and rgbInk elements of the RC structure to specify the width and color of the ink left by the pen as the user moves it across the screen. nInkWidth is an integer in display coordinates (pixels) that describes the width of the ink to draw, and rgbInk is a DWORD that contains a red-green-blue (RGB) value used to render the ink on the screen. For speed reasons ink is drawn only in solid colors. The GetNearestColor function is used with the RGB value to determine the solid color closest to the requested color.

When Is Recognition Over?

Before beginning the recognition process, the application designer should decide when the recognition event will be considered complete. Several termination options are available with corresponding RC components for setting each option.

The input type influences this choice. For example, all system gestures consist of a single stroke, so recognition can terminate after the stroke is entered. If free-hand drawing is allowed, recognition can terminate only when the pen leaves the bounding rectangle of the drawing area. The standard method is to wait for the pen to time out—that is, to wait until a specific interval of time has elapsed without new data from the pen—and then terminate recognition.

Termination conditions are set with the lPcm bit field in the RC structure. The pen completion mode (PCM) flags can be combined with the bitwise OR operator to create the desired termination “complex.” The flags and their meanings are:

PCM_PENUP: Terminates recognition as soon as the user lifts the pen from the tablet surface. This option limits user input to a single stroke before returning recognition results to the caller. PCM_PENUP is useful for gestures-only fields. It can also be used by applications that perform some special action on single input strokes—for example, applications that implement scrolling with flicks of the pen or allow only system gestures for a given area of the screen.

PCM_RANGE: Terminates recognition as soon as the pen leaves the range of the tablet’s proximity detection mechanism. This option makes sense only for tablet devices that support proximity detection. It is useful when an application seeks termination after a single letter of input or immediately after the user finishes writing. The disadvantage of PCM_RANGE is that the distance associated with the “range” is not under the application’s control; it is a hardware-specific trait that is not configurable with the pen application programming interface (API). This fact makes using PCM_RANGE somewhat tricky. Applications should test this option extensively on target hardware to ensure that user variances in writing do not result in premature termination.

PCM_RECTBOUND: Terminates recognition with the next “pen down” action outside the bounding rectangle. This termination option keeps the pen in inking mode for long periods of time—for example, until the user indicates that recognition should ensue by tapping outside a signature field. Another common use for this option is to limit the inking area to a window’s client area.

The bounding rectangle is provided by the RC structure rectBound element. This is a RECT structure that specifies the bounding rectangle in screen coordinates or, if the RCO_TABLETCOORD flag is set, in tablet coordinates (1000ths of an inch).

PCM_RECTEXCLUDE: Terminates recognition when a “pen down” action occurs within the specified rectangle. This option is useful for a window button that terminates recognition when pressed. The rectangle is specified by the RC structure rectExclude element. Like rectBound, rectExclude is in screen coordinates or, if the RCO_TABLETCOORD flag is set, in tablet coordinates.

PCM_TIMEOUT: Terminates recognition if no pen activity occurs for the specified duration. The RC structure wTimeOut element indicates how long, in milliseconds, the RC Manager should wait before time-out. This method of recognition termination is the most common. Users generally like to wait and see how their handwriting turns out after entering characters and have a tendency to pause for thought during writing. These two reasons make PCM_TIMEOUT a natural choice for recognition termination.

Telling the Recognizer What to Expect

To achieve high recognition rates, applications must tell the recognizer what type of input to expect from the user whenever possible. A recognizer can use this information to process ink from the pen and to derive recognized symbols more intelligently. An application can answer different types of questions (such as where, what, when, how, and who) to improve recognition.

Where?

Applications may provide some context to help users write neatly and to achieve high levels of recognition. For example, they may draw lines on the screen to serve as input guides, like ruled paper, or use the bedit window class that Windows for Pen Computing provides (Figure 3). The input metaphor for the bedit class is a comb.

Figure 3.

The bedit class provides cells in which users write. The pen API allows the application (or, in this case, the window class) to pass along any visual cues to the recognizer to facilitate the recognition process. Most recognizers can leverage the “form” used for writing to decide how ink should be recognized. This information is passed to the recognizer through an RC structure element called guide.

The guide element is itself a structure—a GUIDE structure. A fully defined GUIDE structure describes a grid used by a recognizer to “fit” the ink entered by the user. In the preceding example, lc was confused with k. The recognizer could have based its lc versus k decision on grid information provided by the guide element, relying on the fact that l and c reside within a single grid unit and hence must be part of the same character.

The guide element also helps determine whether a letter is uppercase or lowercase when this distinction is not obvious from the shape of the letter (for example, c/C, k/K, o/O, s/S, u/U, and w/W). The recognizer can make this determination, but any information the application provides to the writer regarding baseline and midline rules can aid this process.

The wRcOrient and wRcDirect elements provide additional information on how the user will input data:

Although rarely used, the wRcOrient element lets an application tell the recognizer where the logical origin of the digitizer is located—that is, where the system should logically locate the origin when processing ink. The coordinates returned from the digitizing device never change—they are simply interpreted differently. This option is useful for applications that let users label the axis of a chart.

The wRcDirect element specifies the primary and secondary directions for input. A recognizer is not required to support this capability (the Microsoft Recognizer does not). Most languages have a primary and a secondary direction for writing. In English, the primary direction is from left to right, and the secondary direction is from top to bottom. In Chinese, the primary direction is from top to bottom, and the secondary direction is from right to left. A recognizer can leverage this information to parse various input orientations correctly.

What?

There are two parts to this question that applications can answer through the RC structure:

Which recognizer is going to handle the input?

What type of input should that recognizer expect?

The first question is handled through the hrec element of the RC structure. hrec is a handle to a recognizer that turns the ink data into a recognized set of symbols. Applications can “load” recognizers by hand with the InitRecognizer function. This is the action InitRC takes when it generates a default RC structure. If hrec is NULL, no recognizer is called; an application typically uses the NULL value if the intended result of the recognition event is only to display and store the ink the user enters.

The second question is what type of input to expect from the user. The three areas in which the application can assist the recognizer are expected characters, priority, and language.

Expected characters

In many situations, especially those involving form or dialog box fields, an application will “know” what type of characters the user is likely to enter. If this is the case, the application can improve the recognition rate by giving this information to Windows for Pen Computing. For example, if the application expects only numeric input, it should pass that information to the recognizer, enabling the recognizer to preferentially compare ink entered by the user to known prototypes for numbers. A large number of contextual hints, referred to as alphabet codes (ALCs), can be provided through the 32-bit alc element of the RC structure:

ALC_ALL	ALC_DEFAULT	ALC_LCALPHA	ALC_UCALPHA
ALC_ALPHA	ALC_NUMERIC	ALC_ALPHANUMERIC	ALC_PUNC
ALC_MATH	ALC_MONETARY	ALC_OTHER	ALC_INTL
ALC_WHITE	ALC_NONPRINT	ALC_SYSMINIMUM	ALC_GESTURE
ALC_USEBITMAP	ALC_DBCS	ALC_HIRAGANA	ALC_KATAKANA
ALC_KANJI	ALC_OEM	ALC_RESERVED	ALC_NOPRIORITY

An application can combine the ALC codes with the bitwise OR operator to precisely describe the type of input expected. The majority of the ALC codes are straightforward; the following describes those that are less obvious or otherwise of interest:

ALC_OTHER: Recognizes all symbols not included in ALPHANUMERIC, MONETARY, MATH, and PUNC.

ALC_WHITE: Recognizes spaces between characters. An application might choose not to specify ALC_WHITE if it is expecting purely contiguous input—for example, in a zip code field where a space would force the user to edit the results.

ALC_NONPRINT: Recognizes nonprintable characters or keystrokes such as the ESC key and function keys. Some recognizers have special means to enter these glyphs.

ALC_INTL: Recognizes all ANSI characters not included on the current code page. This feature is desirable if users commonly enter text in more than one language.

ALC_DBCS: Returns characters in double-byte character format (that is, in UNICODE).

ALC_USEBITMAP: Provides a set of enabled characters from which the recognition result will be determined. An application can toggle recognition for any of the 256 characters in the ANSI character set—literally turning on recognition for only those characters that the user is expected to enter—through the rgbfAlc element. For example, if a field has a single-letter code with only five acceptable results, an application can set the bit in the rgbfAlc array corresponding to each result. The recognizer then preferentially maps the ink entered by the user to one of the acceptable inputs.
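A 256-entry bit array of the kind ALC_USEBITMAP relies on can be sketched as below. The helper names and the bit layout (character code divided by 8 selects the byte, the remainder selects the bit) are assumptions made for this illustration; the pen headers define the actual layout of rgbfAlc, and an application would also need to include ALC_USEBITMAP in the alc element.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_ALC_BYTES 32   /* 256 characters / 8 bits per byte */

/* Hypothetical wrapper around a 256-bit character mask. */
typedef struct {
    unsigned char rgbfAlc[SKETCH_ALC_BYTES];
} SketchAlcBitmap;

/* Disables every character. */
void alc_clear(SketchAlcBitmap *b)
{
    memset(b->rgbfAlc, 0, sizeof b->rgbfAlc);
}

/* Enables recognition of a single character. */
void alc_enable(SketchAlcBitmap *b, unsigned char ch)
{
    b->rgbfAlc[ch / 8] |= (unsigned char)(1u << (ch % 8));
}

/* Returns 1 if the character is enabled, 0 otherwise. */
int alc_is_enabled(const SketchAlcBitmap *b, unsigned char ch)
{
    return (b->rgbfAlc[ch / 8] >> (ch % 8)) & 1;
}

/* Enables every character in a NUL-terminated string. */
void alc_enable_all(SketchAlcBitmap *b, const char *chars)
{
    assert(b != NULL && chars != NULL);
    for (; *chars != '\0'; ++chars)
        alc_enable(b, (unsigned char)*chars);
}
```

For a single-letter code field that accepts only A, B, C, D, and F, the application would clear the mask and call alc_enable_all with "ABCDF".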

Priority

Another ace in the application’s hand is the ability to give precedence to one ALC code out of all codes expected, through the alcPriority element of the RC structure. This element should be set to the ALC code to be given the highest priority in making recognition decisions. For example, if an application uses ALC_UCALPHA | ALC_LCALPHA but expects most users to enter uppercase letters, alcPriority can be set to ALC_UCALPHA. This causes the recognizer to weight results that include uppercase characters more heavily. It also helps the recognizer make determinations regarding letters that have the same lowercase and uppercase appearance.

Note:

The alcPriority element does not support ALC_USEBITMAP.

Language

The final element of the “what?” question is the language of the expected input. An application can indicate that a subset of languages should be enabled during recognition instead of turning on the entire ANSI code page with the ALC_INTL flag. The lpLanguage element points to an array of concatenated three-letter language codes describing the current set of languages expected from the user. For U.S. versions of Windows for Pen Computing, this element generally points to a single language, but multiple language values are possible for European and other multinational pen platform users. By default, this element is derived from the sLanguage setting of the [Intl] section in the WIN.INI file, but it can be expanded to support a specific application during handwriting recognition.

When?

The application can answer two questions of this type:

When should results be returned?

At what point does the ink stored in the internal pen buffer become significant to the recognition event that is about to begin?

Returning results

Like most roman character recognizers, the Microsoft Recognizer can return recognition results at different intervals: at word boundaries, at stroke boundaries, at character boundaries, at new lines, or when recognition is complete. An application can specify the desired interval with the wResultMode element of the RC structure. The wResultMode setting ensures that results are not returned more quickly than the interval requested. For example, an application may request results stroke by stroke but may not receive any results until an entire word or sentence has been entered.

Determining significant points

Windows for Pen Computing uses a buffer to store pen events separately from the system queue of mouse events. This buffer contains the high-resolution data from the digitizer used by recognizers to provide adequate recognition. Because the buffer for pen events (in PENWIN) is separate from the buffer for mouse events (in the kernel), a mouse event is not inherently associated with the appropriate pen event. The Windows for Pen Computing API is designed specifically to circumvent this problem.

Windows for Pen Computing submits all pen events as mouse events; however, along with the simple x,y screen coordinates, Windows for Pen Computing passes a pointer that associates the mouse event with the pen event that generated it. This pointer is stored with the queued mouse event in the kernel where applications can query it. An application can get this information from the kernel queue by calling the GetMessageExtraInfo function while handling the mouse message in question. For example, an application should call GetMessageExtraInfo when handling a WM_LBUTTONDOWN message to get the pen event pointer associated with this mouse event. An application can either assign the wEventRef element of the RC structure to this extra information or use the RC_WDEFAULT flag to force Windows for Pen Computing to assign it on the application’s behalf before calling the Recognize function.

How?

The RC structure elements clErrorLevel, lpfnYield, and wRcOptions affect how recognition proceeds. An application can use these values to implement special capabilities or to manage special considerations during recognition:

clErrorLevel is a value between 1 and 100, representing the probability of a particular recognition result being correct. The application can determine the level at which a recognizer “gives up” on input and returns “I don’t know” rather than making a potentially bad guess. A recognition result is considered unrecognized below this error level. This setting can be particularly important—for example, for the social security field in a form where low error rates are an absolute necessity. In this case, the application wants a result only if the recognizer is certain that it has recognized the input correctly. The application can enforce this certainty by setting clErrorLevel to a high number.

lpfnYield is a long pointer to an application-provided function that the Recognize function should call when it needs to yield the CPU. If lpfnYield is NULL, the default Windows Yield function is called instead. This is useful for a time-critical application interested in ensuring that adequate time is available to perform background processing even during recognition. It is the lpfnYield function’s responsibility to eventually call the Windows Yield function. The default handler is not called unless lpfnYield is NULL.

wRcOptions specifies various control panel options. It is a bitwise combination of the following flags:

RCO_BOXED: Indicates that the guide element of the RC structure contains valid data that should be leveraged in the recognition process. The guide element is ignored unless this flag is set.

RCO_DISABLEGESMAP: Disables the Gesture Manager’s ability to replace circle letter gestures with one or more keystroke combinations. An application that has reserved the circle letter gestures should not use this flag to “turn off” any user-provided meanings for those gestures. The Gesture Manager is a macro layer. As with all macro layers, the user must understand that any input bound to a macro cannot be used in the context of another application because it never reaches the application—it becomes “macro-ized” before it gets there. This behavior is as true for the Gesture Manager circle letter gestures as it is for keystroke macro recorders.

RCO_NOFLASHCURSOR and RCO_NOFLASHUNKNOWN: If a recognition result is SYV_COPY or SYV_UNKNOWN, the pen cursor changes briefly to a “copy” or “question mark” cursor. An application can turn off cursor feedback with these options.

RCO_NOHIDECURSOR: Prevents the cursor from being hidden during inking. For example, a drawing package with an opaque digitizer would want to leave the cursor visible to ensure that the user has adequate visual feedback. Cursor feedback is unnecessary with an integrated digitizer display because the user presses the pen directly on the desired location.

RCO_NOHOOK: Prevents the systemwide recognition results hook from being called before those results are passed to an application. Applications with reentrancy problems that use hooks generally use this option. The application can simply turn off the ability to hook recognition results when it is likely to cause problems.

RCO_NOSPACEBREAK: Informs the recognizer to send entire sentences instead of individual words to the dictionary (see the “Dictionary Processing” section for more information). This option is useful for a custom dictionary with special contextual or natural language parsing capabilities that would be crippled by breaking up input at every space in the input stream.

RCO_SAVEALLDATA: Ensures that the RC Manager returns all data associated with the pen. The RC Manager normally records ink and passes it back to the application. By default, it does not save all ink; it returns only the points associated with the pen being down. If the RCO_SAVEALLDATA flag is set, the RC Manager also returns data associated with the pen when it was not in contact with the digitizing surface, assuming that the digitizer is capable of reporting this information.

RCO_SAVEHPENDATA: Prevents pen data from being thrown away. The ink the RC Manager returns to the application is normally transient. That is, it is immediately thrown away after the application returns from WM_RCRESULT. If an application wants to save the data, it must either copy it before returning from WM_RCRESULT or set RCO_SAVEHPENDATA to prevent the RC Manager from throwing the ink away. By setting this flag, the application takes responsibility for freeing the pen data.

RCO_SUGGEST: Frees the dictionary to make suggestions that are not necessarily any of the recognizer’s alternatives. In general, a dictionary can promote only a less likely alternative as the first choice; it cannot make suggestions of its own. This flag tells the dictionary that it can be liberal in the type of help it provides, given a particular recognition result.

RCO_TABLETCOORD: Indicates that all coordinate values in the RC structure have already been converted to tablet coordinates. By default, all coordinate values in the RC structure are in screen coordinates.

Who?

The final important question an application can answer through the RC structure is “who?” That is:

Who is entering data?

Who is interested in seeing the results of recognition?

Three RC structure elements, lpUser, wRcPreferences, and hwnd, help provide this information:

lpUser points to the name of the current user. The current user name is accessible through the Handwriting applet in the Control Panel. It serves as the basis for all training and identifies personal characteristics (such as left-handedness or right-handedness and desired time-out) used by the recognizer. This information should be available at recognition time because all training is inherently user specific, and handedness information can be especially important during shape recognition. The Control Panel supports the addition of new users. Multiple users can use the same machine and have their own training and preferences, thereby assuring maximum recognition rates.

The wRcPreferences element of the RC structure also contains information on the current user’s preferences and is used with the lpUser element. wRcPreferences currently specifies whether the user is left-handed or right-handed and includes several reserved bits for internal use.

An application can use the hwnd element of the RC structure to indicate where results should be sent. The specified window receives the WM_RCRESULT message containing recognition results, as discussed later in this article.

Dictionary Processing

After recognition is complete, dictionaries postprocess the results. The RC structure allows the application to moderate and control the dictionary-processing stage.

Each dictionary dynamic link library (DLL) must export a procedure called DictionaryProc. The rglpdf element of the RC structure contains an array of pointers to these dictionary procedures—one pointer for each dictionary DLL an application wants in the review loop. Recognition results are passed to these dictionaries, one after the other, until a dictionary indicates that it has corrected or modified a recognition result. In general, dictionary processing occurs on a word-by-word basis. When a dictionary corrects a word, the results string is updated with the new result and dictionary processing continues with the next word.
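The review loop described above amounts to walking an array of function pointers until one of them reports a correction. The sketch below uses a simplified, hypothetical callback type (SketchDictProc) that operates on plain strings; the real DictionaryProc interface is different and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns 1 and writes a replacement into out
   if it corrected the word, or returns 0 if it has nothing to offer. */
typedef int (*SketchDictProc)(const char *word, char *out, size_t outSize);

/* Offers the word to each dictionary in turn and stops at the first
   one that corrects it. Returns that dictionary's index, or -1 (with
   the word copied through unchanged) if none did. */
int run_dictionaries(const SketchDictProc *procs, size_t nProcs,
                     const char *word, char *out, size_t outSize)
{
    assert(word != NULL && out != NULL && outSize > 0);
    for (size_t i = 0; i < nProcs; ++i) {
        if (procs[i] != NULL && procs[i](word, out, outSize))
            return (int)i;
    }
    strncpy(out, word, outSize - 1);
    out[outSize - 1] = '\0';
    return -1;
}

/* Toy dictionary that never corrects anything. */
int dict_units(const char *word, char *out, size_t outSize)
{
    (void)word; (void)out; (void)outSize;
    return 0;
}

/* Toy dictionary that knows one correction. */
int dict_english(const char *word, char *out, size_t outSize)
{
    if (strcmp(word, "lcit") != 0)
        return 0;
    strncpy(out, "kit", outSize - 1);
    out[outSize - 1] = '\0';
    return 1;
}
```

With the chain { dict_units, dict_english }, the word lcit passes through the first dictionary untouched and is corrected to kit by the second.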

The application can use the RCO_NOSPACEBREAK option to pass entire sentences to dictionaries for correction. This option is not supported by the Microsoft dictionary; it may be used by third-party dictionaries because of the advantages of greater contextual information.

An application can also set the level at which a result is too certain to be passed to the dictionary. The wTryDictionary element of the RC structure contains a value between 1 and 100 reserved for this purpose. If the certainty of a particular recognition result is relatively low, applications will probably want the dictionary consulted. If the certainty of a particular recognition result is relatively high, applications will probably not want the dictionary consulted, no matter how the string is garbled.
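Taken together, clErrorLevel and wTryDictionary divide the confidence scale into three bands. The function below is a sketch of that idea only: the ResultRoute names are invented for illustration, and whether each comparison is strict or inclusive, and whether the RC Manager combines the two thresholds in exactly this way, are assumptions.

```c
#include <assert.h>

/* Hypothetical outcome labels for this sketch. */
typedef enum {
    RESULT_UNRECOGNIZED,     /* below clErrorLevel: report "unknown" */
    RESULT_TRY_DICTIONARY,   /* uncertain enough to consult dictionaries */
    RESULT_ACCEPT_AS_IS      /* too certain to bother the dictionary */
} ResultRoute;

/* Routes a result by its confidence (0..100) against the two RC
   thresholds. Comparison boundaries are assumptions of this sketch. */
ResultRoute route_result(int confidence, int clErrorLevel,
                         int wTryDictionary)
{
    assert(confidence >= 0 && confidence <= 100);
    if (confidence < clErrorLevel)
        return RESULT_UNRECOGNIZED;
    if (confidence < wTryDictionary)
        return RESULT_TRY_DICTIONARY;
    return RESULT_ACCEPT_AS_IS;
}
```

For example, with clErrorLevel at 50 and wTryDictionary at 90, a result recognized with confidence 70 would be handed to the dictionaries, while one at 95 would be returned untouched.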

PROCESSING RECOGNITION RESULTS: THE RCRESULT STRUCTURE

All recognition results are sent from Windows for Pen Computing to pen applications in the RCRESULT structure form. The following sections explain the RCRESULT structure and its components.

The WM_RCRESULT Message

The WM_RCRESULT message carries RCRESULT structures back to applications. Windows for Pen Computing sends WM_RCRESULT messages to applications with lParam pointing to an RCRESULT structure. Applications must be prepared to receive this message before calling the Recognize function because all WM_RCRESULT messages associated with a particular recognition event are received before Recognize returns. In effect, the WM_RCRESULT message is a “callback” to the application.

WM_RCRESULT can arrive more than once for a given recognition event, depending on the interval requested by the application. Each message contains a pointer to a new, self-contained RCRESULT structure that embodies the recognition results from the last WM_RCRESULT message to the present.

An application can determine why the message was sent—because recognition is over or because a character, word, or line boundary was reached—by examining wParam. This parameter contains an REC code that describes why recognition was terminated. For example, REC_OK means recognition is over and the application can proceed as expected. REC_BUSY means recognition could not be completed because another application is currently using the recognizer. The REC codes are the same as those returned by the Recognize function.

Symbols and Symbol Values

Windows for Pen Computing defines a 32-bit space for storing recognition results. Values are allocated from this space for geometric shapes, gestures, letters of the alphabet, Kanji, Katakana, musical notes, resistors, capacitors, and so on. For example, a unique 32-bit value is associated with the letter a, another unique 32-bit value is associated with b, and unique 32-bit values are associated with the rectangle shape, the space gesture, and so on.

The 32-bit values associated with various symbols are known as symbol values (SYVs). The pen APIs use symbol values to refer to recognized input internally.

The Symbol Graph

The first element of the RCRESULT structure, syg, conveys the alternative results of user input to the application. This data structure is called a symbol graph, colloquially defined as “the likely alternatives for the ink entered.” A symbol graph is a directed graph of symbol values that describes the recognition event in as much detail as possible.

For example, in our previous discussion kit was confused with lcit because too much space was left between the l and c parts of the letter k. If the recognizer considered lc a more likely result than k, the symbol graph would represent the input as follows:

{ lc | k }it

If the recognizer considered k more likely than lc, the symbol graph would be:

{ k | lc }it

Each RCRESULT structure contains a symbol graph that fully describes all the results generated since the last RCRESULT structure was sent to the application. In most cases, the symbol graph contains all recognized input associated with a recognition event.

It is possible for multiple symbol values to hold a place in the graph. For example, the possible meaning lc is actually two distinct symbol values jointly holding a place in the recognized input—the same place that a single symbol value, k, may hold properly.

This characteristic of handwritten input—the inability to know with certainty which letter or letters are associated with an ink stream—necessitates a data structure that supports multiple alternatives with multiple meanings for each location in the recognized result. Furthermore, each meaning potentially represents a group of symbol values. To help make probability decisions, the symbol graph must bind locations in the ink stream to multiple alternatives of one or more symbol values, each with associated probabilities. As it happens, the symbol graph contains two data structures that provide this information when evaluated together.

The first half of the data structure contains an array of symbol correspondence (SYC) structures. An SYC structure delineates a specific chunk of ink from the ink stream entered by a user. Each SYC contains a first stroke and a last stroke; the first and last strokes and all strokes in between define the chunk of ink associated with the SYC. The symbol graph contains an array of SYC structures, each mapped to a different part of the ink input. Taken together, the SYC structures map to all ink associated with the RCRESULT structure being handled.

The second half of the data structure contains an array of symbol elements (SYEs). An SYE contains a symbol value, a confidence (probability) level, and a pointer into the array of SYCs. Each symbol value in the recognized input has an SYE with its own confidence level and pointer. Thus, by using the SYE, each symbol value in the array can be mapped to the ink from which it came, and the relative certainty with which it was recognized can be known.

Therefore, to determine that a single chunk of ink actually maps to multiple characters, one need only observe that multiple SYEs map to the same SYC structure. In other words, more than a single symbol value is associated with the same chunk of ink.
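The relationship between the two arrays can be sketched in a few lines of C. The structure and field names below (SketchSyc, SketchSye, sycIndex, and so on) are hypothetical simplifications chosen for illustration; they are not the actual SYC and SYE definitions from the system headers, and an array index stands in for the pointer into the SYC array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the SYC and SYE structures.
   Field names are assumptions made for this sketch only. */
typedef struct {
    int firstStroke;   /* index of the first stroke in this chunk of ink */
    int lastStroke;    /* index of the last stroke in this chunk of ink  */
} SketchSyc;

typedef struct {
    long symbol;       /* symbol value, shown here as a character code   */
    int  confidence;   /* confidence level reported for this symbol      */
    int  sycIndex;     /* index of the SYC this symbol was derived from  */
} SketchSye;

/* Number of strokes covered by one chunk of ink (both ends inclusive). */
static int ChunkStrokeCount(const SketchSyc *syc)
{
    assert(syc != NULL && syc->lastStroke >= syc->firstStroke);
    return syc->lastStroke - syc->firstStroke + 1;
}

/* Count how many symbol elements refer to the chunk at sycIndex.
   A result greater than 1 means the chunk maps to multiple symbols. */
static int CountSymbolsForChunk(const SketchSye *syes, size_t count, int sycIndex)
{
    int n = 0;
    size_t i;
    assert(syes != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (syes[i].sycIndex == sycIndex)
            n++;
    }
    return n;
}
```

With SYEs for l and c both referring to chunk 0, CountSymbolsForChunk reports two symbol values for that chunk, which is the condition described above.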

A comparison of the confidence levels associated with SYEs determines which result is more likely. The cost of a particular path measures how bad a match it is, that is, how far the definition of a prototype must be stretched to accept the path as a possible match. This cost is easily computed through the symbol graph by summing the costs of the symbol values on the path and dividing the total by the number of symbol values in the potential mapping. Thus, the lowest cost solution can be obtained and provided as the best guess. Higher cost solutions can remain in the symbol graph as alternatives; dictionaries are free to promote an alternative to “best guess” if appropriate.
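The cost computation described above amounts to an average. The following is a minimal sketch, assuming each candidate path is available as a plain array of per-symbol costs; the SketchPath type and both function names are hypothetical and are not part of the Windows for Pen Computing API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical representation of one candidate path through the
   symbol graph: the per-symbol costs along that path. */
typedef struct {
    const int *costs;  /* cost of each symbol value on the path */
    size_t     count;  /* number of symbol values on the path   */
} SketchPath;

/* Average cost of a path: total cost divided by the number of symbols. */
static double PathCost(const SketchPath *path)
{
    long total = 0;
    size_t i;
    assert(path != NULL && path->count > 0);
    for (i = 0; i < path->count; i++)
        total += path->costs[i];
    return (double)total / (double)path->count;
}

/* Index of the path with the lowest average cost.
   Ties are resolved in favor of the earliest path. */
static size_t BestPath(const SketchPath *paths, size_t pathCount)
{
    size_t best = 0;
    size_t i;
    assert(paths != NULL && pathCount > 0);
    for (i = 1; i < pathCount; i++) {
        if (PathCost(&paths[i]) < PathCost(&paths[best]))
            best = i;
    }
    return best;
}
```

Dividing by the number of symbol values keeps a two-symbol reading such as lc comparable with a one-symbol reading such as k; without the division, longer paths would be penalized simply for containing more symbols.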

The symbol graph allows applications to generate accurate choice lists and to target input appropriately by illustrating the exact relationship between the symbols recognized and the ink entered by the user. Given these capabilities, the symbol graph can be considered the heart of the recognition results.

The Best Guess

The RCRESULT structure contains three elements that provide best-guess information. Together, they describe what the recognizer and dictionaries believe to be the desired meaning of the ink entered by the user.

The lpsyv element is a string of symbol values that map to the best guess. The best guess can originate in one of three ways:

It can be the lowest cost path (in terms of error level) in the symbol graph discussed previously. An API (EnumSymbols) actually generates such an lpsyv from a symbol graph, leading one to believe that the lpsyv element is redundant. However, the lpsyv element can be the result of other forces that require its presence in the RCRESULT structure.

It can be a dictionary-promoted path through the symbol graph or a dictionary suggestion not related to the symbol graph. In the former case, a dictionary has promoted one of the higher cost paths in the symbol graph to “first place,” recommending it as the best guess. To determine whether a dictionary has changed the best guess, the application must compare the lowest cost solution (obtainable through EnumSymbols) with lpsyv.

It can be the result of a gesture mapping. Because circle letter gestures can be mapped to character strings by the user, a circle letter gesture may have been entered and turned into a string of symbol values by Windows for Pen Computing. To determine whether this has occurred, the application can compare lpsyv with the EnumSymbols result as discussed previously or can check the wResultsType of the RCRESULT structure for the RCRT_GESTURETRANSLATED flag. If set, this flag indicates that a gesture was translated.
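The three cases above can be told apart with one flag test and one comparison. The sketch below assumes that the application has already obtained the lowest cost string (the text names EnumSymbols for this; the call itself is not shown). The SketchSyv type, the numeric value given to the flag, and the function names are assumptions made for illustration only.

```c
#include <stddef.h>

typedef long SketchSyv;  /* stand-in for a symbol value */

/* Assumed bit value for illustration; the real RCRT_GESTURETRANSLATED
   constant comes from the system headers. */
#define SKETCH_RCRT_GESTURETRANSLATED 0x0008u

typedef enum {
    GUESS_LOWEST_COST,   /* best guess is the lowest cost path          */
    GUESS_DICTIONARY,    /* a dictionary promoted or supplied the guess */
    GUESS_GESTURE        /* a gesture was translated into the guess     */
} GuessOrigin;

/* Nonzero if both symbol strings have the same length and contents. */
static int SameSymbols(const SketchSyv *a, size_t na,
                       const SketchSyv *b, size_t nb)
{
    size_t i;
    if (na != nb)
        return 0;
    for (i = 0; i < na; i++) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}

/* Decide where the best guess came from. lowestCost stands in for the
   string the application derived from the symbol graph. */
static GuessOrigin ClassifyBestGuess(const SketchSyv *bestGuess, size_t cBest,
                                     const SketchSyv *lowestCost, size_t cLowest,
                                     unsigned resultsType)
{
    if (resultsType & SKETCH_RCRT_GESTURETRANSLATED)
        return GUESS_GESTURE;
    if (SameSymbols(bestGuess, cBest, lowestCost, cLowest))
        return GUESS_LOWEST_COST;
    return GUESS_DICTIONARY;
}
```

Testing the flag first is a design choice: it avoids the string comparison when the result type already identifies a translated gesture.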

Additional information about the lpsyv element is provided in cSyv, which represents the number of symbol values in the lpsyv string, and hSyv, which is the handle to the memory block from which lpsyv was allocated.

If the lpsyv element contains a string of symbol values associated with printable characters, applications can translate the string of SYVs to a string of characters with the SymbolToCharacter API. This step is required to generate a normal C string.
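To illustrate the kind of conversion involved, the sketch below turns a string of symbol values into a null-terminated C string. It does not call SymbolToCharacter and does not reproduce its behavior. The encoding shown (a class tag in the high word and the character code in the low byte), the terminator value, and all names prefixed with Sketch or SKETCH_ are assumptions for illustration only.

```c
#include <stddef.h>

typedef unsigned long SketchSyv;  /* stand-in for a symbol value */

/* Assumed encoding for this sketch only. */
#define SKETCH_SYV_NULL    0x00000000UL  /* assumed end-of-string marker  */
#define SKETCH_CLASS_CHAR  0x00010000UL  /* assumed tag: printable symbol */
#define SKETCH_CLASS_MASK  0xFFFF0000UL  /* high word holds the class tag */

/* Copy printable symbols into out as characters and terminate the string.
   Stops at the terminator, at the first non-character symbol (for
   example, a gesture), or when out is full.
   Returns the number of characters written, excluding the terminator. */
static size_t SketchSymbolsToString(const SketchSyv *syv, size_t cSyv,
                                    char *out, size_t outSize)
{
    size_t i;
    size_t n = 0;
    if (syv == NULL || out == NULL || outSize == 0)
        return 0;
    for (i = 0; i < cSyv && syv[i] != SKETCH_SYV_NULL && n + 1 < outSize; i++) {
        if ((syv[i] & SKETCH_CLASS_MASK) != SKETCH_CLASS_CHAR)
            break;
        out[n++] = (char)(syv[i] & 0xFFUL);
    }
    out[n] = '\0';
    return n;
}
```

Passing cSyv as the symbol count and a caller-owned buffer with its size keeps the sketch free of allocation and makes truncation explicit.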

The Location and Position of the Input

Three RCRESULT structure elements, nBaseLine, nMidLine, and rectBoundInk, provide information about the location and position of the ink entered by the user:

nBaseLine is the recognizer’s best guess for the baseline of the ink entered by the user. If this value is not known, nBaseLine is zero. If the Microsoft Recognizer is being used, nBaseLine is always zero because the Microsoft Recognizer does not support this element.

nMidLine is the recognizer’s best guess for the midline of the ink entered by the user. If this value is not known, nMidLine is zero. If the Microsoft Recognizer is being used, nMidLine is always zero because the Microsoft Recognizer does not support this element.

rectBoundInk is a Windows RECT structure that contains the bounding rectangle of the ink entered by the user. This is typically used either to invalidate the area of the screen on which inking occurred or to update the display in the appropriate location, for example, with the recognized text. Note that rectBoundInk is computed with ink width taken into account and is the bounding rectangle for all data entered by the user. Windows for Pen Computing does not guarantee that the rectBoundInk in the RCRESULT structure will lie entirely within the bounding rectangle that the application provided for input.

Contextual Information

Two elements of the RCRESULT structure, lprc and wResultsType, provide information about the recognition event but are not part of the actual recognition results:

lprc is a far pointer to the RC structure passed to Recognize.

wResultsType is a flag that describes how the recognition event proceeded. Possible values for wResultsType include the following:

RCRT_ALREADYPROCESSED: Informs the application that the recognition result was processed before being sent to the application. This includes recognition hooks and work performed by the Gesture Manager on behalf of the application.

RCRT_GESTURE: Tells the application that the recognition result was a gesture. If seen in conjunction with the RCRT_GESTURETRANSLATED flag, this bit indicates that a circle letter gesture was translated into another gesture.

RCRT_GESTURETOKEYS: Indicates that a user-defined gesture was mapped directly into keys. This bit is seen only in conjunction with the RCRT_GESTURETRANSLATED flag and is necessary if the user maps a circle letter gesture to unprintable keys (for example, to CTRL, ESC, or function keys).

RCRT_GESTURETRANSLATED: Describes what action the Gesture Manager took on the recognition result, as discussed previously. The three possible results are:

If the gesture is mapped to printable keys (for example, normal text) by the user, no other flags are set and the application processes results as they appear in the lpsyv element in the RCRESULT structure.

If the gesture is mapped to unprintable keys, Windows for Pen Computing automatically generates the keystrokes corresponding to the gesture mapping. The application need not take any action on the results.

If the gesture is mapped to keystrokes associated with standard keyboard shortcuts for Cut, Copy, Paste, or Undo, the circle letter gesture is translated into the equivalent gesture, that is, SYV_CUT, SYV_COPY, SYV_PASTE, or SYV_UNDO.

RCRT_NORECOG: Indicates that the hrec element of the RC structure was NULL and that no recognition was attempted for the result.

RCRT_NOTHINGRECOG: Indicates that the recognizer was not able to recognize the ink as any of its symbols.

RCRT_UNIDENTIFIED: Indicates that some of the recognition results could not be identified. Applications that can display alternative lists or otherwise handle errors can leverage this information to make those mechanisms available.
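The flag combinations described above can be summarized as a small decision function. This is a sketch under stated assumptions: the numeric bit values are invented for illustration (the real RCRT_ constants come from the system headers), the SketchAction names are hypothetical, and the sketch deliberately takes no action on RCRT_ALREADYPROCESSED, leaving that policy to the application.

```c
/* Assumed bit values for illustration only. */
#define SKETCH_RCRT_ALREADYPROCESSED   0x0001u
#define SKETCH_RCRT_GESTURE            0x0002u
#define SKETCH_RCRT_GESTURETOKEYS      0x0004u
#define SKETCH_RCRT_GESTURETRANSLATED  0x0008u
#define SKETCH_RCRT_NORECOG            0x0010u
#define SKETCH_RCRT_NOTHINGRECOG       0x0020u
#define SKETCH_RCRT_UNIDENTIFIED       0x0040u

typedef enum {
    ACTION_NONE,          /* nothing for the application to do        */
    ACTION_NO_TEXT,       /* no recognized symbols are available      */
    ACTION_INSERT_TEXT,   /* treat the best guess as text             */
    ACTION_RUN_GESTURE    /* treat the best guess as a gesture        */
} SketchAction;

/* Map a result-type flag word to the action the text describes. */
static SketchAction ActionForResult(unsigned flags)
{
    if (flags & (SKETCH_RCRT_NORECOG | SKETCH_RCRT_NOTHINGRECOG))
        return ACTION_NO_TEXT;
    if (flags & SKETCH_RCRT_GESTURETRANSLATED) {
        if (flags & SKETCH_RCRT_GESTURETOKEYS)
            return ACTION_NONE;         /* keystrokes generated by the system */
        if (flags & SKETCH_RCRT_GESTURE)
            return ACTION_RUN_GESTURE;  /* translated into another gesture    */
        return ACTION_INSERT_TEXT;      /* mapped to printable keys           */
    }
    if (flags & SKETCH_RCRT_GESTURE)
        return ACTION_RUN_GESTURE;
    return ACTION_INSERT_TEXT;
}

/* Nonzero if the application may want to offer a list of alternatives. */
static int ShouldOfferAlternatives(unsigned flags)
{
    return (flags & SKETCH_RCRT_UNIDENTIFIED) != 0;
}
```

Checking the "no recognition" flags first keeps the remaining branches free of cases in which there is no usable best guess.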

The Ink

The final two elements of the RCRESULT structure, pntEnd and hpendata, contain information about the ink entered by the user:

The pntEnd element contains the last point of the ink data in tablet coordinates.

The hpendata element is a handle to a PENDATA data structure that contains all ink information entered by the user. Applications that manage ink in any way will reference this element extensively. The HPENDATA data type is used throughout the system, including internally by recognizers, whenever ink must be managed. This data type is accepted and manipulated through a series of ink management APIs. The functions an application can perform on ink include recognizing it, training it, scaling it, drawing it, adding points to it, retrieving points from it, copying it, and saving it.

CONCLUSION

Applications manage their interaction with Windows for Pen Computing primarily through the RC and RCRESULT data structures. An application must understand these data structures and how they tailor and monitor the recognition process in order to implement the full range of pen functionality.

*For a detailed discussion of the information in this section, see the "Microsoft Windows for Pen Computing System Architecture" article.