18.1.4 Character Sets

All fonts use a character set. A character set contains punctuation marks, numerals, uppercase and lowercase letters, and all other printable characters. Each element of a character set is identified by a number.

Most character sets used in Windows are supersets of the U.S. ASCII character set, which defines characters for the 96 numeric values from 32 through 127. There are four major groups of character sets:

Windows

OEM

Symbol

Vendor-specific

18.1.4.1 Windows Character Set

The Windows character set is the most commonly used character set in Windows programming. It is essentially equivalent to the ANSI character set. The blank character is the first character in the Windows character set. It has a hexadecimal value of 0x20 (decimal 32). The last character in the Windows character set has a hexadecimal value of 0xFF (decimal 255).

Many fonts specify a default character. Whenever a request is made for a character that is not in the font, GDI provides this default character. Many fonts using the Windows character set specify the period (.) as the default character. TrueType fonts typically use an open box as the default character.
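The substitution rule can be illustrated with a minimal sketch. The `Font` class and `resolve` function below are hypothetical names invented for this illustration; they are not part of GDI, and a real font records its default character in its own metrics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    """Hypothetical font model, not a GDI structure."""
    glyphs: frozenset      # character codes the font contains
    default_char: int      # code drawn for any character the font lacks


def resolve(font: Font, codes: bytes) -> bytes:
    """Replace every code missing from the font with its default character."""
    return bytes(c if c in font.glyphs else font.default_char for c in codes)


# A toy font covering 0x20 through 0x7E, with the period as its default.
toy = Font(glyphs=frozenset(range(0x20, 0x7F)), default_char=ord("."))

print(resolve(toy, b"caf\xe9"))   # b'caf.'
```

Character 0xE9 is outside the toy font's range, so the period appears in its place; characters the font does contain pass through unchanged.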

Fonts use a break character to separate words and justify text. Most fonts using the Windows character set specify the blank character, whose hexadecimal value is 0x20 (decimal 32), as the break character.
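As a rough illustration of how a break character supports justification, the sketch below widens the breaks in a line until the line fills a given width. It assumes a fixed-pitch font in which every character occupies one cell; the function name `justify` and its parameters are hypothetical and do not correspond to any Windows function.

```python
def justify(line: bytes, width: int, break_char: int = 0x20) -> bytes:
    """Pad `line` to `width` cells by widening its break characters.

    Extra cells are spread as evenly as possible; leftover cells go to
    the leftmost breaks. Assumes one cell per character (fixed pitch).
    """
    brk = bytes([break_char])
    words = [w for w in line.split(brk) if w]   # ignore repeated breaks
    gaps = len(words) - 1
    if gaps < 1:
        return line                             # no break to stretch
    total_pad = width - sum(len(w) for w in words)
    if total_pad < gaps:
        return brk.join(words)                  # line is already too long
    base, extra = divmod(total_pad, gaps)
    out = bytearray()
    for i, word in enumerate(words[:-1]):
        out += word
        out += brk * (base + (1 if i < extra else 0))
    out += words[-1]
    return bytes(out)


print(justify(b"one two three", 17))   # b'one   two   three'
```

Only the break characters grow, so the words themselves are never altered. A proportional font would distribute pixels rather than whole cells, but the principle is the same.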

For Windows version 3.1, 24 characters have been added to the Windows code page:

Character  Name                     Windows character code

‚          base line single quote   130
ƒ          florin                   131
„          base line double quote   132
…          ellipsis                 133
†          dagger                   134
‡          double dagger            135
ˆ          circumflex               136
‰          permille                 137
Š          S Hacek                  138
‹          left single guillemet    139
Œ          OE ligature              140
‘          left single quote        145
’          right single quote       146
“          left double quote        147
”          right double quote       148
•          bullet                   149
–          en dash                  150
—          em dash                  151
˜          tilde                    152
™          trademark ligature       153
š          s Hacek                  154
›          right single guillemet   155
œ          oe ligature              156
Ÿ          Y Dieresis               159

The characters for left and right single quote were added to the character set for the release of Windows version 3.0.
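The codes in the table above can be checked with a short sketch. It assumes that Python's `cp1252` codec is an acceptable stand-in for the Windows code page; that codec reflects the present-day code page, which defines a few positions beyond the 24 listed here.

```python
import unicodedata

# The 24 character codes listed in the table above.
ADDED_CODES = list(range(130, 141)) + list(range(145, 157)) + [159]


def describe(code: int) -> str:
    """Return 'code  glyph  UNICODE NAME' for one Windows character code."""
    ch = bytes([code]).decode("cp1252")   # assumption: cp1252 ~ Windows set
    return f"{code}  {ch}  {unicodedata.name(ch)}"


for code in ADDED_CODES:
    print(describe(code))
```

Each line of output pairs a code with its glyph and Unicode name, which makes it easy to compare against the names used in the table.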

18.1.4.2 OEM Character Set

The OEM character set is typically used in full-screen MS-DOS sessions for screen display. Characters 32 through 127 are usually the same in the OEM, U.S. ASCII, and Windows character sets. The other characters in the OEM character set (0 through 31 and 128 through 255) correspond to the characters that can be displayed in a full-screen MS-DOS session. These characters are generally different from the Windows characters.
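The relationship between the two sets can be sketched by comparing them code by code. The example assumes that code page 437 (the U.S. MS-DOS code page) serves as the OEM character set and that Python's `cp1252` codec stands in for the Windows character set; other systems may use a different OEM code page.

```python
WINDOWS = "cp1252"   # stand-in for the Windows character set
OEM = "cp437"        # assumed OEM code page (U.S. MS-DOS)


def decode(code: int, codec: str) -> str:
    """Decode one character code; undefined positions become U+FFFD."""
    return bytes([code]).decode(codec, errors="replace")


shared = [c for c in range(32, 128) if decode(c, WINDOWS) == decode(c, OEM)]
differ = [c for c in range(128, 256) if decode(c, WINDOWS) != decode(c, OEM)]

print(len(shared))   # 96
print(len(differ))   # number of upper codes whose characters differ

# The same code selects different characters in the two sets...
print(hex(0xE9), decode(0xE9, WINDOWS), decode(0xE9, OEM))
# ...and the same character has different codes.
print("é".encode(WINDOWS), "é".encode(OEM))
```

Text that contains only codes 32 through 127 can move between the two sets unchanged, while text that uses the upper range needs translation.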

18.1.4.3 Symbol Character Set

The Symbol character set contains special characters typically used to represent mathematical and scientific formulas.

18.1.4.4 Vendor-Specific Character Sets

Many printers and other output devices provide fonts based on character sets that differ from the Windows and OEM sets, such as the EBCDIC character set. When an application uses one of these character sets, the printer driver translates from the Windows character set to the vendor-specific character set.
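The translation step might look like the following sketch. It is not driver code: the function name `to_vendor` is hypothetical, Python's `cp1252` codec stands in for the Windows character set, and `cp037` is one EBCDIC variant chosen only for illustration.

```python
WINDOWS = "cp1252"   # stand-in for the Windows character set
VENDOR = "cp037"     # one EBCDIC variant, chosen only for illustration


def to_vendor(data: bytes) -> bytes:
    """Translate Windows-coded bytes to the vendor character set.

    Characters with no vendor equivalent become '?' (errors="replace").
    """
    text = data.decode(WINDOWS, errors="replace")
    return text.encode(VENDOR, errors="replace")


print(to_vendor(b"A B").hex())   # c140c2
```

The letters and the blank keep their meaning but receive entirely different codes, which is why the translation must happen before the data reaches the device.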