The Input Method Editor and Unicode

There are two issues involved with Unicode handling and the IME. One is the existence of Unicode versions of IME routines, and the other is getting the IME to return Unicode characters (rather than DBCS) in the WM_CHAR and WM_IME_CHAR messages.

All IME routines that handle strings have a Unicode equivalent, even under Windows 98. The IME differs from other Unicode function implementations in Windows, however, in that the Unicode versions return the size of a buffer in bytes rather than in 16-bit Unicode characters.
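Because the size comes back in bytes, code that indexes the buffer as 16-bit characters has to convert it first. The following is a minimal sketch of that conversion, not an IME API: ime_wchar, ime_bytes_to_chars, and ime_alloc_bytes are hypothetical names, and the byte count is assumed to come from a Unicode IME routine such as ImmGetCompositionStringW, with negative values treated as error codes.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Windows WCHAR type (one 16-bit Unicode character),
   so the sketch also compiles outside Windows. */
typedef uint16_t ime_wchar;

/* Hypothetical helper: convert a size reported in bytes by a Unicode IME
   routine into a count of 16-bit characters. Negative values are assumed
   to be error codes and are passed through unchanged. */
long ime_bytes_to_chars(long byte_count)
{
    if (byte_count < 0) {
        return byte_count;
    }
    return byte_count / (long)sizeof(ime_wchar);
}

/* Hypothetical helper: number of bytes to allocate for the reported size
   plus one terminating null character. That the reported size excludes
   the terminator is an assumption of this sketch. */
size_t ime_alloc_bytes(long byte_count)
{
    if (byte_count < 0) {
        return 0;
    }
    return (size_t)byte_count + sizeof(ime_wchar);
}
```

For example, a reported size of 10 bytes corresponds to 5 Unicode characters, and a buffer of 12 bytes leaves room for a terminating null.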

You can register the window class with the RegisterClassW function under Windows NT (and only under Windows NT; under Windows 9x this function is stubbed out) to cause the WM_CHAR and WM_IME_CHAR messages to deliver Unicode characters in the wParam parameter rather than DBCS characters.
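The difference shows up in how a window procedure reads wParam. The sketch below is illustrative only: dbcs_pair, split_dbcs_wparam, unicode_from_wparam, DemoWndProc, register_unicode_class, and the class name are hypothetical, and the DBCS layout (lead byte in bits 8-15 of wParam for WM_IME_CHAR) is stated as an assumption. The Windows-specific part is guarded so that the helpers compile on other platforms.

```c
#include <stdint.h>

/* Lead and trail bytes of one double-byte (DBCS) character. */
typedef struct {
    unsigned char lead;
    unsigned char trail;
} dbcs_pair;

/* Hypothetical helper for a class registered with the ANSI function:
   WM_IME_CHAR is assumed to pack a double-byte character into wParam
   with the lead byte in bits 8-15 and the trail byte in bits 0-7. */
dbcs_pair split_dbcs_wparam(uintptr_t wparam)
{
    dbcs_pair pair;
    pair.lead = (unsigned char)((wparam >> 8) & 0xFF);
    pair.trail = (unsigned char)(wparam & 0xFF);
    return pair;
}

/* Hypothetical helper for a class registered with RegisterClassW:
   the low 16 bits of wParam hold one 16-bit Unicode character. */
uint16_t unicode_from_wparam(uintptr_t wparam)
{
    return (uint16_t)(wparam & 0xFFFF);
}

#ifdef _WIN32
#include <windows.h>

/* Last character received, kept only so the sketch has something to do. */
static uint16_t g_last_char;

static LRESULT CALLBACK DemoWndProc(HWND hwnd, UINT msg,
                                    WPARAM wParam, LPARAM lParam)
{
    switch (msg) {
    case WM_CHAR:
    case WM_IME_CHAR:
        /* Unicode, because the class below is registered with
           RegisterClassW. */
        g_last_char = unicode_from_wparam((uintptr_t)wParam);
        return 0;
    default:
        return DefWindowProcW(hwnd, msg, wParam, lParam);
    }
}

/* Registers a Unicode window class; returns 0 on failure. */
ATOM register_unicode_class(HINSTANCE instance)
{
    WNDCLASSW wc;
    ZeroMemory(&wc, sizeof(wc));
    wc.lpfnWndProc = DemoWndProc;
    wc.hInstance = instance;
    wc.lpszClassName = L"ImeUnicodeDemo"; /* hypothetical class name */
    return RegisterClassW(&wc);
}
#endif
```

Under the layout assumed above, a wParam of 0x82A0 in an ANSI window splits into lead byte 0x82 and trail byte 0xA0, while in a Unicode window the whole 16-bit value is the character.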

Using Reconversion with the Input Method Editor

In Windows 98 and Windows NT 5.0, the IME implements a new feature called reconversion. Normally the IME determines the list of candidates based only on what is typed. Reconversion allows the IME to determine the candidates (or a single candidate) based on the sentence in which the text appears (its context). There are three types of reconversion: simple, normal, and enhanced.

Reconversion is useful when a user notices a composition error in the document. In this case the user can select the error and choose reconversion from a menu, and the IME will use the context to determine the best replacement.