Single-byte Character Sets

A single-byte character set is a mapping of up to 256 individual characters to their identifying numeric values, each of which fits in one byte. The character codes 0x20 through 0x7E represent standardized displayable characters, but the characters represented by the remaining codes vary among character sets. The ASCII character set covers the range 0x00 through 0x7F.
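The difference between the standardized range and the remaining codes can be seen with a short experiment. The sketch below is only an illustration: it uses Python's built-in cp437 and cp1252 codecs as stand-ins for an OEM and an ANSI code page, and the variable names are hypothetical.

```python
# Illustration only: Python's "cp437" and "cp1252" codecs stand in for an
# OEM code page and an ANSI code page respectively (an assumption made here).

# Character codes 0x20 through 0x7E, the standardized displayable range.
printable = bytes(range(0x20, 0x7F))

# Both code pages decode this range to the same characters.
same_printable = printable.decode("cp437") == printable.decode("cp1252")

# A code above 0x7F is interpreted differently by each code page.
high = bytes([0x82])
oem_char = high.decode("cp437")    # U+00E9, e with acute accent
ansi_char = high.decode("cp1252")  # U+201A, single low-9 quotation mark

print(same_printable, repr(oem_char), repr(ansi_char))
```

The same byte value therefore cannot be displayed correctly unless the code page it was written in is known.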

The ANSI character set is used by the window manager and the graphics device interface (GDI), but the MS-DOS file allocation table (FAT) file system uses a character set called the original equipment manufacturer (OEM) character set. Variations on these character sets, called code pages, include different special characters, typically customized for a language or group of languages. The OEM code page generally used in the United States is code page 437.

Win32-based applications can use Unicode to avoid the inconsistencies of varied code pages and as an aid in developing easily localized applications.

An application can use the GetACP function to retrieve the ANSI code-page identifier for the system or use the GetOEMCP function to retrieve the OEM code-page identifier.
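As a rough sketch, both identifiers can be read from a script by calling GetACP and GetOEMCP in kernel32 through Python's ctypes module. The helper name get_code_pages is hypothetical, and the function returns None on platforms other than Windows, where these functions are not available.

```python
import sys

def get_code_pages():
    """Return (ansi_code_page, oem_code_page) on Windows, or None elsewhere.

    get_code_pages is a hypothetical helper name used for illustration.
    """
    if sys.platform != "win32":
        return None
    import ctypes
    kernel32 = ctypes.windll.kernel32
    # GetACP and GetOEMCP take no arguments and return the identifier.
    return int(kernel32.GetACP()), int(kernel32.GetOEMCP())

pages = get_code_pages()
if pages is not None:
    print("ANSI code page: %d, OEM code page: %d" % pages)
```

On a system configured for United States English, the OEM value would typically be 437, as noted above.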

The OemToChar and OemToCharBuff functions allow an application to convert a character or string from the OEM code page to either the ANSI code page or Unicode. To convert in the other direction, you can use either the CharToOem or CharToOemBuff function. In addition, an application can use the MultiByteToWideChar and WideCharToMultiByte functions to map single-byte character set (SBCS) strings to Unicode and Unicode strings to SBCS strings.
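The conversions described above all pass through the same idea: interpret the bytes under one code page, then re-express the resulting characters under another. The following sketch mimics that idea with Python's codecs rather than calling the Win32 functions themselves; the helper names, and the choice of cp437 and cp1252 as the OEM and ANSI code pages, are assumptions made for illustration.

```python
# Hypothetical helpers that imitate OEM <-> ANSI conversion using Python codecs.
# cp437 (OEM) and cp1252 (ANSI) are assumed defaults, not values read from
# the system.

def oem_to_ansi(data, oem="cp437", ansi="cp1252"):
    # Decode the OEM bytes to Unicode text, then encode in the ANSI code page.
    return data.decode(oem).encode(ansi)

def ansi_to_oem(data, oem="cp437", ansi="cp1252"):
    # The reverse direction: ANSI bytes to Unicode text to OEM bytes.
    return data.decode(ansi).encode(oem)

oem_bytes = b"caf\x82"               # "cafe" with an accented e, in cp437
text = oem_bytes.decode("cp437")     # SBCS string to Unicode
ansi_bytes = oem_to_ansi(oem_bytes)  # the accented e becomes 0xE9 in cp1252
round_trip = ansi_to_oem(ansi_bytes)

print(repr(text), ansi_bytes, round_trip)
```

With the default strict error handling used here, Python raises an exception when a character exists in one code page but not the other, so this sketch does not reproduce whatever substitution the Win32 functions perform in that case.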

The GetCPInfo function fills a CPINFO structure with information that includes the size, in bytes, of the largest character in the code page and the default character that is substituted when a character being converted has no corresponding entry in the code page.
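Both pieces of information can be illustrated without calling GetCPInfo. The sketch below is an analogy built on Python's cp437 codec, not a reading of the CPINFO structure: it checks that every character in the code page occupies one byte, and it uses Python's own replacement behavior to show what substituting a default character looks like. The helper name encode_with_default is hypothetical.

```python
# Analogy only: this uses Python's cp437 codec and its "replace" error
# handler; it does not read the values Windows reports in CPINFO.

def encode_with_default(text, code_page):
    # Characters with no entry in the code page are replaced by the codec's
    # substitute character instead of raising an error.
    return text.encode(code_page, errors="replace")

# Every one of the 256 codes decodes to a character that encodes back to a
# single byte, so the largest character in this code page is one byte.
all_chars = bytes(range(256)).decode("cp437")
max_char_size = max(len(ch.encode("cp437")) for ch in all_chars)

# The euro sign (U+20AC) has no entry in cp437, so it is replaced.
encoded = encode_with_default("\u20ac5", "cp437")

print(max_char_size, encoded)
```

An application that needs the actual values for a given code page should take them from the CPINFO structure rather than assume them.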