THE WINDOWS CHARACTER SETS

I mentioned earlier that letter keys preceded by dead-character keys generate WM_CHAR messages where wParam is the ASCII code for a character with a diacritic. This may be a little puzzling because the ASCII character set doesn't include any codes for characters with diacritics. What exactly is the value of wParam in this case? The answer to this question requires that we tackle the subject of character sets, a topic that may at first seem more appropriate for a later discussion about character fonts. However, it is also of vital importance in keyboard handling.

The standard 7-bit ASCII character set defines codes from 0 through 31 (1FH) and 127 (7FH) as control characters, and it defines codes from 32 (20H) through 126 (7EH) as displayable characters. None of these characters have diacritics. Because personal computers use 8-bit bytes, computer manufacturers often define character sets that use 256 codes rather than the 128 ASCII codes. The additional codes may be assigned characters with diacritics. The resultant "extended character set" then includes the ASCII character set and up to 128 other characters.
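The ranges just described can be sketched as a small C routine. This is only an illustration: the names CodeClass and ClassifyCode are hypothetical and are not part of Windows or the C library. The routine says nothing about which character an extended code stands for, because that depends entirely on the character set in use.

```c
/* Hypothetical classification of an 8-bit character code, following
   the ranges given in the text. Not a Windows or C library API. */
typedef enum
{
    CODE_CONTROL,      /* 0 through 31 (1FH) and 127 (7FH)  */
    CODE_DISPLAYABLE,  /* 32 (20H) through 126 (7EH)        */
    CODE_EXTENDED      /* 128 (80H) through 255 (FFH)       */
} CodeClass;

CodeClass ClassifyCode(unsigned char code)
{
    /* Codes with the high bit set lie outside 7-bit ASCII. */
    if (code >= 0x80)
        return CODE_EXTENDED;

    /* ASCII control characters: below 20H, plus 7FH. */
    if (code < 0x20 || code == 0x7F)
        return CODE_CONTROL;

    /* Everything else in ASCII is a displayable character. */
    return CODE_DISPLAYABLE;
}
```

Under these assumptions, ClassifyCode(0x41) (the letter A) yields CODE_DISPLAYABLE, while any code of 80H or above yields CODE_EXTENDED, whatever character a particular extended character set assigns to it.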

If Windows supported such an extended character set, displaying characters with diacritics would be easy. But Windows doesn't support a simple extended character set. Windows supports two extended character sets. Unfortunately, the presence of these two character sets doesn't make things twice as easy.