By Jack Lin, Software Design Engineer
Microsoft Corporation
December 2004
An input method editor (IME) is a program that allows easy text entry using a standard keyboard for East Asian languages such as Chinese, Japanese, Korean, and other languages with complex characters. For example, with IMEs a user can type complex characters in a word processor, or a player of a massive multiplayer online game can chat with friends in complex characters.
This topic explains how you can implement a basic IME edit control in a full-screen Microsoft DirectX application. Applications that take advantage of DXUT automatically get IME functionality. For applications that do not make use of the framework, this topic describes how to add IME support to an edit control.
Contents:
IMEs map keyboard input to phonetic components or other language elements specific to a selected language. In a typical scenario, the user types keys that represent pronunciation of a complex character. If the IME recognizes the pronunciation as valid, it presents the user with a list of word or phrase candidates from which the user can select a final choice. The chosen word is then sent to the application through a series of Microsoft Windows WM_CHAR messages. Because the IME works at a level below the application by intercepting keyboard input, the presence of an IME is transparent to the application. Almost all Windows applications can readily take advantage of IMEs without being aware of their existence and without requiring special coding.
A typical IME displays several windows to guide the user through character entry, as shown in the following examples.
Window Type | Description | IME Output |
---|---|---|
A. Reading Window | Contains keystrokes from the keyboard; typically changes after each keystroke. | reading string |
B. Composition Window | Contains the collection of characters that the user has composed with the IME. These characters are drawn by the IME on top of the application. When the user notifies the IME that the composition string is satisfactory, the IME then sends the composition string to the application via a series of WM_CHAR messages. | composition string |
C. Candidate Window | When the user has entered a valid pronunciation, the IME displays a list of candidate characters that all match the given pronunciation. The user then selects the intended character from this list, and the IME adds this character to the Composition Window display. | the next character in the composition string |
D. Input Locale indicator | Shows the language the user has selected for keyboard input. This indicator is embedded in the Windows taskbar. The input language can be selected by opening the Regional and Language Options Control Panel and then clicking Details on the Languages tab. | - |
In DXUT, the CDXUTIMEEditBox class implements IME functionality. This class is derived from the CDXUTEditBox class, the basic edit control provided by the framework. CDXUTIMEEditBox extends that edit control to support IMEs by overriding the CDXUTIMEEditBox methods. The classes are designed this way to help developers learn what they need to take from the framework to implement IME support in their own edit controls. The rest of this topic explains how the framework, and CDXUTIMEEditBox in particular, overrides a basic edit control to implement IME functionality.
Most of the IME-specific variables in CDXUTIMEEditBox are declared as static, because many IME buffers and states are specific to the process. For instance, a process has only one buffer for the composition string. Even if the process has ten edit controls, they will all be sharing the same composition string buffer. The composition string buffer for CDXUTIMEEditBox is therefore static, preventing the application from taking up unnecessary memory space.
CDXUTIMEEditBox is implemented in the following DXUT code:
(SDK root)\Samples\C++\Common\DXUTgui.cpp
Normally an IME uses standard Windows procedures to create a window (see Using Windows). Under normal circumstances, this produces satisfactory results. However, when the application presents in full-screen mode, as is common for games, standard windows no longer work and may not display on top of the application. To overcome this issue, the application must draw the IME windows itself instead of relying on Windows to perform this task.
When the default IME window creation behavior does not provide what an application requires, the application can override the IME window handling. An application can achieve this by processing IME-related messages and calling the Input Method Manager (IMM) API. See Input Method Editor.
When a user interacts with an IME to input complex characters, the IMM sends messages to the application to notify it of important events, such as starting a composition or showing the candidate window. An application typically ignores these messages and passes them to the default message handler, which causes the IME to handle them. When the application, instead of the default handler, handles the messages, it controls exactly what happens at each of the IME events. Often the message handler retrieves the content of the various IME windows by calling the IMM API. Once the application has this information, it can properly draw the IME windows itself when it needs to render to the display.
An IME needs to get the reading string, hide the reading window, and get the orientation of reading window. This table shows the the functionalities per IME version:
Getting reading string | Hiding reading window | Orientation of reading window | |
---|---|---|---|
Before version 6.0 | Access IME private data directly. See "4 Structure" | Trap IME private messages. See "3 Messages" | Examine registry information. See "5 Registry Information" |
After version 6.0 | GetReadingString | ShowReadingWindow | GetReadingString |
The following messages don't have to be processed for newer IME that implements ShowReadingWindow().
The following messages are trapped by application message handler (i.e. they are not passed to DefWindowProc) to prevent the reading window from showing up.
Msg == WM_IME_NOTIFY wParam == IMN_PRIVATE lParam == 1, 2 (CHT IME version 4.2, 4.3 and 4.4 / CHS IME 4.1 and 4.2) lParam == 16, 17, 26, 27, 28 (CHT IME version 5.0, 5.1, 5.2 / CHS IME 5.3)
The following examples illustrate how to get reading string information from older IME that doesn't have GetReadingString(). The code generates the following outputs:
DWORD dwlen | Length of the reading string |
DWORD dwerr | Index of error char |
LPWSTR wstr | Pointer to the reading string |
BOOL unicode | If true, the reading string is in Unicode format. Otherwise it's in multibyte format. |
LPINPUTCONTEXT lpIMC = _ImmLockIMC(himc); LPBYTE p = *(LPBYTE *)((LPBYTE)_ImmLockIMCC(lpIMC->hPrivate) + 24); if (!p) break; dwlen = *(DWORD *)(p + 7*4 + 32*4); dwerr = *(DWORD *)(p + 8*4 + 32*4); wstr = (WCHAR *)(p + 56); unicode = TRUE;
LPINPUTCONTEXT lpIMC = _ImmLockIMC(himc); LPBYTE p = *(LPBYTE *)((LPBYTE)_ImmLockIMCC(lpIMC->hPrivate) + 3*4); if (!p) break; p = *(LPBYTE *)((LPBYTE)p + 1*4 + 5*4 + 4*2 ); if (!p) break; dwlen = *(DWORD *)(p + 1*4 + (16*2+2*4) + 5*4 + 16); dwerr = *(DWORD *)(p + 1*4 + (16*2+2*4) + 5*4 + 16 + 1*4); wstr = (WCHAR *)(p + 1*4 + (16*2+2*4) + 5*4); unicode = FALSE;
LPINPUTCONTEXT lpIMC = _ImmLockIMC(himc); LPBYTE p = *(LPBYTE *)((LPBYTE)_ImmLockIMCC(lpIMC->hPrivate) + 4); if (!p) break; p = *(LPBYTE *)((LPBYTE)p + 1*4 + 5*4); if (!p) break; dwlen = *(DWORD *)(p + 1*4 + (16*2+2*4) + 5*4 + 16 * 2); dwerr = *(DWORD *)(p + 1*4 + (16*2+2*4) + 5*4 + 16 * 2 + 1*4); wstr = (WCHAR *) (p + 1*4 + (16*2+2*4) + 5*4); unicode = TRUE;
// GetImeId(1) returns VS_FIXEDFILEINFO:: dwProductVersionLS of IME file int offset = ( GetImeId( 1 ) >= 0x00000002 ) ? 8 : 7; LPINPUTCONTEXT lpIMC = _ImmLockIMC(himc); BYTE p = *(LPBYTE *)((LPBYTE)_ImmLockIMCC(lpIMC->hPrivate) + offset * 4); if (!p) break; dwlen = *(DWORD *)(p + 7*4 + 16*2*4); dwerr = *(DWORD *)(p + 8*4 + 16*2*4); dwerr = min(dwerr, dwlen); wstr = (WCHAR *)(p + 6*4 + 16*2*1); unicode = TRUE;
int nTcharSize = IsNT() ? sizeof(WCHAR) : sizeof(char); LPINPUTCONTEXT lpIMC = _ImmLockIMC(himc); BYTE p = *(LPBYTE *)((LPBYTE)_ImmLockIMCC(lpIMC->hPrivate) + 1*4 + 1*4 + 6*4); if (!p) break; dwlen = *(DWORD *)(p + 1*4 + (16*2+2*4) + 5*4 + 16 * nTcharSize); dwerr = *(DWORD *)(p + 1*4 + (16*2+2*4) + 5*4 + 16 * nTcharSize + 1*4); wstr = (WCHAR *) (p + 1*4 + (16*2+2*4) + 5*4); unicode = IsNT() ? TRUE : FALSE;
A full-screen application must properly handle the following IME-related messages:
The IMM sends a WM_INPUTLANGCHANGE message to the active window of an application after the input locale has been changed by the user with a key combination (usually ALT+SHIFT), or with the input locale indicator on the taskbar or language bar. The language bar is an on-screen control with which the user can configure a text service. (See How to show the language bar.) The following screen shot shows a language selection list that is displayed when the user clicks on the locale indicator.
When the IMM sends a WM_INPUTLANGCHANGE message, CDXUTIMEEditBox must perform several important tasks:
The IMM sends a WM_IME_SETCONTEXT message when a window of the application is activated. The lParam parameter of this message contains a flag that indicates to the IME which windows should get drawn and which should not. Because the application is handling all of the drawing, it does not need the IME to draw any of the IME windows. Therefore, the application's message handler simply sets lParam to 0 and returns.
The IMM sends a WM_IME_STARTCOMPOSITION message to the application when an IME composition is about to begin as a result of keystrokes by the user. If the IME uses the composition window, it displays the current composition string in a composition window. CDXUTIMEEditBox handles this message by performing two tasks:
The IMM sends a WM_IME_COMPOSITION message to the application when the user enters a keystroke to change the composition string. The value of lParam indicates what type of information the application can retrieve from the Input Method Manager (IMM). The application should retrieve the available information by calling ImmGetCompositionString and then should save the information in its private buffer so that it can render the IME elements later.
CDXUTIMEEditBox checks for and retrieves the following composition string data:
WM_IME_COMPOSITION lParam Flag Value | Data | Description |
---|---|---|
GCS_COMPATTR | Composition Attribute | This attribute contains information such as the status of each character in the composition string (for example, converted or non-converted). This information is needed because CDXUTIMEEditBox colors the composition string characters differently based upon their attributes. |
GCS_COMPCLAUSE | Composition Clause Information | This clause information is used when the Japanese IME is active. When a Japanese composition string is converted, characters may be grouped together as a clause that gets converted to a single entity. When the user moves the cursor, CDXUTIMEEditBox uses this information to highlight the entire clause, instead of just a single character within the clause. |
GCS_COMPSTR | Composition String | This string is the up-to-date string being composed by the user. This is also the string displayed in the composition window. |
GCS_CURSORPOS | Composition Cursor Position | The composition window implements a cursor, similar to the cursor in an edit box. The application can retrieve the cursor position when processing the WM_IME_COMPOSITION message in order to draw the cursor properly. |
GCS_RESULTSTR | Result String | The result string is available when the user is about to complete the composition process. This string should be retrieved and the characters should be sent to the edit box. |
The IMM sends a WM_IME_ENDCOMPOSITION message to the application when the IME composition operation is ending. This can occur when the user presses the ENTER key to approve the composition string, or the ESC key to cancel the composition. CDXUTIMEEditBox handles this message by setting the composition string buffer to be empty. It then sets s_bHideCaret to FALSE because the composition window is closed and the cursor in the edit box should again be visible.
The CDXUTIMEEditBox message handler also sets s_bShowReadingWindow to FALSE. This flag controls whether the class draws the reading window when the edit box renders itself, so it must be set to FALSE when a composition ends.
The IMM sends a WM_IME_NOTIFY message to the application whenever an IME window changes. An application that handles the drawing of the IME windows should process this message so that it is aware of any update to the content of the window. The wParam indicates the command or the change that is taking place. CDXUTIMEEditBox handles the following commands:
IME Command | Description |
---|---|
IMN_SETOPENSTATUS | Sent to the application when the open status of the input context is changed. For a language that uses IME, this command can be sent when the user changes from native conversion mode to non-native, or vice versa. (See WM_INPUTLANGCHANGE for definitions of native and non-native modes.) CDXUTIMEEditBox handles this command similar to the way it handles WM_INPUTLANGCHANGE. It must retrieve the current input language and conversion mode, and it must initialize the indicator text and color. |
IMN_OPENCANDIDATE / IMN_CHANGECANDIDATE | Sent to the application when the candidate window is about to be opened or updated, respectively. The candidate window opens when a user wishes to change the converted text choice. The window is updated when a user moves the selection indicator or changes the page. CDXUTIMEEditBox uses one message handler for both of these commands because the tasks required are exactly the same:
|
IMN_CLOSECANDIDATE | Sent to the application when a candidate window is about to close. This happens when a user has made a selection from the candidate list. CDXUTIMEEditBox handles this command by setting the visible flag of the candidate window to FALSE and then clearing the candidate string buffer. |
IMN_PRIVATE | Sent to the application when the IME has updated its reading string as a result of the user typing or removing characters. The application should retrieve the reading string and save it for rendering. CDXUTIMEEditBox has two methods to retrieve the reading string, based upon how reading strings are supported in the IME:
|
Rendering of the IME elements and windows is straightforward. CDXUTIMEEditBox lets the base class render first because IME windows should appear on top of the edit control. After the base edit box is rendered, CDXUTIMEEditBox checks the visibility flag of each IME window (indicator, composition, candidate, and reading window) and draws the window if it should be visible. See Default IME Behavior for descriptions of the different IME window types.
The input locale indicator is rendered before any other IME windows because it is an element that is always displayed. It should therefore appear below other IME windows. CDXUTIMEEditBox renders the indicator by calling the RenderIndicator method, in which the indicator font color is determined by examining the s_ImeState static variable, which reflects the current IME conversion mode. When the IME is enabled and native conversion is active, the method uses m_IndicatorImeColor as the indicator color. If the IME is disabled or is in non-native conversion mode, m_IndicatorImeColor is used to draw the indicator text. By default, the indicator window itself is drawn to the right of the edit box. Applications can change this behavior by overriding the RenderIndicator method.
The following figure shows the different appearances of an input locale indicator for English, Japanese in alphanumeric conversion mode, and Japanese in native conversion mode:
The drawing of the composition window is handled in the RenderComposition method of CDXUTIMEEditBox. The composition window floats above the edit box. It should be drawn at the cursor position of the underlying edit control. CDXUTIMEEditBox handles the rendering as follows:
The cursor should blink for Korean IMEs, but it should not for other IMEs. RenderComposition determines whether the cursor should be visible based upon timer values when the Korean IME is used.
The reading and candidate windows are rendered by the same CDXUTIMEEditBox method, RenderCandidateReadingWindow. Both windows contain an array of strings for vertical layout, or a single string in the case of horizontal layout. The bulk of the code in RenderCandidateReadingWindow is used to position the window so that no part of the window falls outside the application window and gets clipped.
IMEs sometimes contain advanced features to improve the ease of inputting text. Some of the features found in newer IMEs are shown in the following figures. These advanced features are not present in DXUT. Implementing support for these advanced features can be challenging because there is no interface defined to obtain the necessary information from the IMEs.
Advanced Traditional Chinese IME with expanded candidate list:
Advanced Japanese IME with some candidate entries that contain additional text to describe their meanings:
Advanced Korean IME that includes a handwriting recognition system:
The following registry information is checked to determine the orientation of the reading window, when the current IME is older CHT New Phonetic that doesn't implement GetReadingString().
Key | Value |
---|---|
HKCU\software\microsoft\windows\currentversion\IME_Name | keyboard mapping |
Where: IME_Name is MSTCIPH if the IME file version is 5.1 or later; otherwise IME_Name is TINTLGNT.
The orientation of reading window is horizontal if either:
If neither condition is met, the reading window is vertical.
Operating System | CHT IME Version |
---|---|
Windows 98 | 4.2 |
Windows 2000 | 4.3 |
unknown | 4.4 |
Windows ME | 5.0 |
Office XP | 5.1 |
Windows XP | 5.2 |
Standalone web downloadble | 6.0 |
For additional information, see the following:
Gets reading string information.
The reading string length.
If the return value is greater than the value of uReadingBufLen, then all output parameters are undefined.
This function is implemented in CHT IME 6.0 or later, and can be acquired by GetProcAddress on an IME module handle; the IME module handle can be acquired by ImmGetIMEFileName and LoadLibrary.
Declared in Imm.h.
Use Imm.lib.
Show (or hide) the reading window.
Return TRUE if successful or FALSE if otherwise.
This function is implemented in CHT IME 6.0 or later, and can be acquired by GetProcAddress on an IME module handle; the IME module handle can be acquired by ImmGetIMEFileName and LoadLibrary.
Declared in Imm.h
Use Imm.lib.