Code Pages and Unicode

A code page is an ordering or encoding of a standard set of characters within a specific locale. This encoding provides a consistent way for computer devices to exchange and process data. Each code page includes a common set of core characters (the first 128 characters of the code page). Windows NT supports several code pages, including ANSI and OEM code pages. ANSI code pages are supported for Windows 3.1 compatibility; OEM code pages are supported for MS-DOS and OS/2 compatibility. Other code pages are available, based on the installed locale, for use in data translation. These include secondary OEM code pages, MAC code pages, and EBCDIC code pages.
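As a rough illustration of the shared core and the differing extended characters, the following Python sketch decodes the same byte values under several ANSI and OEM code pages. It assumes that Python's bundled codecs (cp1252, cp437, and so on) are acceptable stand-ins for the Windows NT code page tables. The helper names are hypothetical and are not part of any Windows interface.

```python
# Sketch only: Python's bundled codecs stand in for the Windows NT code page tables.
ANSI_AND_OEM = ["cp1250", "cp1251", "cp1252", "cp1253", "cp1254",
                "cp437", "cp850", "cp852", "cp866"]

CORE = bytes(range(128))  # the 128 core (ASCII) byte values


def shares_ascii_core(codec: str) -> bool:
    """Return True if bytes 0-127 decode to the same characters as ASCII."""
    return CORE.decode(codec) == CORE.decode("ascii")


def decode_byte(value: int, codec: str) -> str:
    """Decode a single byte value under the given code page."""
    return bytes([value]).decode(codec)


if __name__ == "__main__":
    # The core is identical everywhere; byte 0xE9 is an extended character
    # whose meaning depends on the code page.
    for codec in ANSI_AND_OEM:
        print(codec, shares_ascii_core(codec), repr(decode_byte(0xE9, codec)))
```

Under these codecs, the core bytes decode identically in every listed code page, while a single extended byte such as 0xE9 decodes to a different character depending on the code page. The EBCDIC code pages are left out on purpose because they arrange even the basic Latin letters differently.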

The following table shows the various code pages supported in Windows NT.

Table C.2 Windows NT Code Pages

Code page name	Number	Type
Windows 3.1 Eastern European	1250	ANSI
Windows 3.1 Cyrillic	1251	ANSI
Windows 3.1 US (ANSI)	1252	ANSI
Windows 3.1 Greek	1253	ANSI
Windows 3.1 Turkish	1254	ANSI
MS-DOS U.S.	437	OEM
MS-DOS Greek	737	OEM
MS-DOS Multilingual (Latin I)	850	OEM
MS-DOS Slavic (Latin II)	852	OEM
IBM Cyrillic (primarily Russian)	855	OEM
IBM Turkish	857	OEM
MS-DOS Portuguese	860	OEM
MS-DOS Icelandic	861	OEM
MS-DOS Canadian-French	863	OEM
MS-DOS Nordic	865	OEM
MS-DOS Russian (former USSR)	866	OEM
IBM Modern Greek	869	OEM
Macintosh Roman	10000	MAC
Macintosh Greek I	10006	MAC
Macintosh Cyrillic	10007	MAC
Macintosh Latin II	10029	MAC
Macintosh Icelandic	10079	MAC
Macintosh Turkish	10081	MAC
EBCDIC	037	EBCDIC
EBCDIC "500V1"	500	EBCDIC
EBCDIC	1026	EBCDIC
EBCDIC	875	EBCDIC

Windows NT uses Unicode (the Basic Multilingual Plane, or BMP, of ISO/IEC 10646) for all internal text processing. Unicode is a 16-bit, fixed-width character encoding standard with enough encoding space to accommodate the characters of most of the world's modern writing systems. All character sets and code pages supported by Windows NT can be mapped to Unicode.
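Because every supported code page maps to Unicode, translating data between two code pages can be treated as a decode into Unicode followed by an encode into the target. The Python sketch below illustrates that idea under the assumption that Python's bundled codecs (cp850, cp1252, cp037, mac_roman) approximate the corresponding Windows NT tables. The transcode function is a hypothetical helper, not a Windows NT API.

```python
# Sketch only: Unicode as the pivot between code pages.
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Translate bytes from one code page to another by way of Unicode."""
    text = data.decode(source)   # code page bytes -> Unicode text
    return text.encode(target)   # Unicode text -> target code page bytes


if __name__ == "__main__":
    oem = "caf\u00e9".encode("cp850")            # OEM (MS-DOS Multilingual) bytes
    print(oem)                                   # raw OEM bytes
    print(transcode(oem, "cp850", "cp1252"))     # same text as ANSI bytes
    print(transcode(oem, "cp850", "mac_roman"))  # same text as Macintosh Roman bytes
    print(transcode(b"A", "cp1252", "cp037"))    # same letter as an EBCDIC byte
    # Each of these characters occupies one 16-bit unit in UTF-16.
    print(len("caf\u00e9".encode("utf-16-le")))
```

The byte values change from one code page to the next, but the intermediate Unicode text stays the same, which is what makes a single internal representation convenient.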

By using Unicode-enabled applications, users can benefit from multilingual processing and a rich selection of characters.

For more information, see The Unicode Standard, Version 1.0 (The Unicode Consortium; Addison-Wesley Publishing Company, Inc., 1991).

Note: Most code pages have a core set of characters in common (the ASCII characters, which are the first 128 characters in the code page). In addition, each code page includes some unique "extended" characters that are not available in other code pages. Do not use these extended characters in server names, computer names, or share names, and do not use them with applications that are used across the network. The FAT and HPFS file systems, which use the OEM code page, must translate any filename character they don't recognize to a best-fit character, to no character, or to an unrecognized character.
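The three outcomes described in the note can be imitated in Python to see why extended characters are risky in names. This is only a sketch: the "replace" and "ignore" error handlers stand in for the unrecognized-character and no-character cases, and the best_fit helper approximates a best-fit mapping by stripping accents, which is an assumption and not the actual Windows NT translation table. The function names and the sample filename are hypothetical.

```python
# Sketch only: imitating what can happen to a filename character that the
# OEM code page (cp437 here) does not contain.
import unicodedata


def is_portable_name(name: str) -> bool:
    """Return True if the name uses only the 128 core (ASCII) characters."""
    return name.isascii()


def best_fit(text: str, codec: str) -> bytes:
    """Approximate a best-fit mapping by dropping accents from characters
    that the target code page cannot represent (an assumed stand-in)."""
    pieces = []
    for ch in text:
        try:
            ch.encode(codec)
            pieces.append(ch)
        except UnicodeEncodeError:
            decomposed = unicodedata.normalize("NFKD", ch)
            pieces.append("".join(c for c in decomposed
                                  if not unicodedata.combining(c)))
    # Anything still unrepresentable becomes "?".
    return "".join(pieces).encode(codec, errors="replace")


if __name__ == "__main__":
    name = "\u0160koda.txt"  # starts with S-caron, present in 1250/1252 but not 437
    print(is_portable_name(name))
    print(name.encode("cp437", errors="replace"))  # unrecognized character
    print(name.encode("cp437", errors="ignore"))   # no character
    print(best_fit(name, "cp437"))                 # best-fit character
```

A check such as is_portable_name is one simple way to reject server, computer, and share names that would not survive translation between code pages.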