Talking with MS-DOS

If Windows were the only operating environment running on a machine, then you could forget about the OEM character set and use only the ANSI character set. However, users can create files in the MS-DOS environment and use them in Windows; they can also create files in Windows and use them when back in MS-DOS. Unfortunately, MS-DOS uses the OEM character set.

Here's an example of the communications problems that can occur. Suppose that a German-speaking PC user creates a file named ÜBUNGEN.TXT ("practice exercises") in an MS-DOS program such as EDLIN. On the IBM PC, the Ü is part of the IBM (that is, OEM) character set and has a code of 154 or 9AH. (When using MS-DOS with a U.S. keyboard on an IBM PC, you can also create this letter by typing Alt-154 using the numeric keypad.) MS-DOS uses that character code in the directory entry of the file.

If a Windows program uses MS-DOS function calls to obtain a directory of files and then writes them directly to the display using an ANSI character set font, the first letter of ÜBUNGEN will show up as a solid block, because the code 154 is one of the undefined characters in the ANSI character set. The Windows program needs to convert the IBM extended character set code of 154 (9AH) to an ANSI character set code of 220 (or DCH), which is the letter Ü in the ANSI character set. That's what the Windows function OemToAnsi does for you. It requires two far pointers to strings. The OEM characters in the first string are converted to ANSI characters and stored in the second string:

OemToAnsi (lpszOemStr, lpszAnsiStr) ;
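To make the calling pattern concrete, here is a minimal portable sketch of the same idea. The function oem_to_ansi_sketch is a hypothetical stand-in, not the Windows routine: it handles only a handful of German letters (using the IBM PC code page 437 and Windows ANSI codes for them) and passes everything else through unchanged. Like OemToAnsi, it takes a zero-terminated source string and a destination buffer supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-character mapping for a few German letters only.
   Left: IBM PC (OEM) code. Right: Windows ANSI code. */
static unsigned char oem_to_ansi_char(unsigned char c)
{
    switch (c) {
    case 0x8E: return 0xC4;  /* Ä */
    case 0x99: return 0xD6;  /* Ö */
    case 0x9A: return 0xDC;  /* Ü: 154 becomes 220 */
    case 0x84: return 0xE4;  /* ä */
    case 0x94: return 0xF6;  /* ö */
    case 0x81: return 0xFC;  /* ü */
    case 0xE1: return 0xDF;  /* ß */
    default:   return c;     /* sketch only: all other codes pass through */
    }
}

/* Same calling shape as OemToAnsi: the destination must be at least as
   large as the source, including the terminating zero. */
void oem_to_ansi_sketch(const char *oem, char *ansi)
{
    assert(oem != NULL && ansi != NULL);
    while (*oem != '\0')
        *ansi++ = (char)oem_to_ansi_char((unsigned char)*oem++);
    *ansi = '\0';
}
```

Passing the directory-entry name "\x9A" "BUNGEN.TXT" through this sketch leaves the ASCII letters alone and changes only the first byte from 9AH to DCH, so the name displays correctly in an ANSI font.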

Now let's take the opposite example. The German-speaking user wants your Windows program to create a file named ÜBUNGEN.TXT. The filename entered by the user has a 220 (DCH) as the first character. If you use an MS-DOS function call to open the file, MS-DOS uses that character in the filename. When the user later looks at the file under MS-DOS, the first character shows up as a block. Before you use the MS-DOS function calls, you must convert the filename to the OEM character set:

AnsiToOem (lpszAnsiStr, lpszOemStr) ;

This converts a 220 (DCH) to a 154 (9AH). Windows also includes two functions named AnsiToOemBuff and OemToAnsiBuff that do not require a zero-terminated string.
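The counted-buffer variants are useful when the text is not zero-terminated, such as a block read from a file. The following is a rough sketch of that style in the ANSI-to-OEM direction. The name ansi_to_oem_buff_sketch and its three-argument shape (source, destination, byte count) are assumptions made for illustration, and the mapping again covers only a few German letters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-character mapping, the reverse of the OEM-to-ANSI case.
   Left: Windows ANSI code. Right: IBM PC (OEM) code. */
static unsigned char ansi_to_oem_char(unsigned char c)
{
    switch (c) {
    case 0xC4: return 0x8E;  /* Ä */
    case 0xD6: return 0x99;  /* Ö */
    case 0xDC: return 0x9A;  /* Ü: 220 becomes 154 */
    case 0xE4: return 0x84;  /* ä */
    case 0xF6: return 0x94;  /* ö */
    case 0xFC: return 0x81;  /* ü */
    case 0xDF: return 0xE1;  /* ß */
    default:   return c;     /* sketch only: all other codes pass through */
    }
}

/* Converts exactly count bytes. No terminating zero is read or written,
   so bytes beyond count in the destination are left untouched. */
void ansi_to_oem_buff_sketch(const char *ansi, char *oem, size_t count)
{
    size_t i;

    assert(ansi != NULL && oem != NULL);
    for (i = 0; i < count; i++)
        oem[i] = (char)ansi_to_oem_char((unsigned char)ansi[i]);
}
```

Because the length is explicit, the caller can convert part of a larger buffer without having to plant a zero byte in it first.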

Windows has an OpenFile call that converts the filename from the ANSI character set to the OEM character set for you. If you use OpenFile, don't do your own AnsiToOem conversion, or the name will be converted twice. If you use MS-DOS function calls to obtain lists of filenames (as the Windows File Manager program does), then these filenames should be passed through OemToAnsi before being displayed.

Converting the contents of files is another problem that arises when files are used in both Windows and MS-DOS. If your Windows program uses files that you are certain have been created in an MS-DOS program, then you may need to pass the text contents of the file through the OemToAnsi function. (For instance, Windows WRITE does this when converting Microsoft Word files to WRITE format.) Similarly, if your Windows program is preparing a file for use in an MS-DOS program, you may want to use AnsiToOem to convert the text.

The OemToAnsi and AnsiToOem functions are located in the keyboard driver. They incorporate very simple lookup tables. The OemToAnsi routine converts an OEM code from 80H through FFH to a character code in the ANSI set that most closely resembles the OEM character. In some cases, this conversion is only grossly approximate. For instance, most of the line-drawing characters in the IBM character set are translated as plus signs, dashes, and vertical lines. Most of the OEM codes from 00H through 1FH are not translated to ANSI codes.
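A lookup table of this kind might be sketched as follows. This is an illustration under stated assumptions, not the keyboard driver's actual table: the names are hypothetical, only a few single-line box-drawing codes from the IBM PC character set are filled in, and every other code from 80H through FFH is given a '?' placeholder where a real table would hold the closest ANSI match.

```c
#include <assert.h>

static unsigned char oem_to_ansi_table[256];
static int table_ready = 0;

/* Fills the hypothetical 256-entry table. Codes below 80H (including
   00H through 1FH) are left as they are. */
void build_oem_to_ansi_table(void)
{
    /* Single-line corners, tees, and the cross in the IBM PC set. */
    static const unsigned char joints[] = {
        0xDA, 0xBF, 0xC0, 0xD9,   /* corners */
        0xC3, 0xB4, 0xC2, 0xC1,   /* tees */
        0xC5                      /* cross */
    };
    unsigned int i;

    for (i = 0; i < 256; i++)
        oem_to_ansi_table[i] = (unsigned char)(i < 0x80 ? i : '?');

    oem_to_ansi_table[0x9A] = 0xDC;   /* Ü has an exact ANSI match */
    oem_to_ansi_table[0xB3] = '|';    /* vertical line */
    oem_to_ansi_table[0xC4] = '-';    /* horizontal line */
    for (i = 0; i < sizeof joints; i++)
        oem_to_ansi_table[joints[i]] = '+';

    table_ready = 1;
}

/* One table lookup per character; the table must be built first. */
unsigned char oem_to_ansi_lookup(unsigned char oem)
{
    assert(table_ready);
    return oem_to_ansi_table[oem];
}
```

With this table, the top edge of a box drawn with the codes DAH, C4H, and BFH comes out as the three ASCII characters "+-+", which shows why the conversion is only approximate.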

The AnsiToOem routine converts ANSI codes from A0H through FFH into codes in the OEM set. The accented characters in the ANSI character set that do not appear in the OEM character set are translated into regular ASCII codes for the characters without the diacritics.
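The fallback for accented letters can be sketched in the same way. The function below is hypothetical and covers only a few sample letters: Ü and É, which exist in the IBM PC character set and map to their OEM codes, and several accented letters that do not exist there and so fall back to the plain ASCII letter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mapping for a few sample ANSI letters. */
static unsigned char ansi_to_oem_fallback(unsigned char c)
{
    switch (c) {
    case 0xDC: return 0x9A;            /* Ü exists in the OEM set */
    case 0xC9: return 0x90;            /* É exists in the OEM set */
    case 0xC0: case 0xC1:
    case 0xC2: case 0xC3: return 'A';  /* À Á Â Ã: no OEM equivalent */
    case 0xC8: case 0xCA:
    case 0xCB: return 'E';             /* È Ê Ë: no OEM equivalent */
    case 0xE3: return 'a';             /* ã: no OEM equivalent */
    case 0xF5: return 'o';             /* õ: no OEM equivalent */
    default:   return c;               /* sketch only: pass through */
    }
}

/* Zero-terminated string version, same calling shape as AnsiToOem. */
void ansi_to_oem_sketch(const char *ansi, char *oem)
{
    assert(ansi != NULL && oem != NULL);
    while (*ansi != '\0')
        *oem++ = (char)ansi_to_oem_fallback((unsigned char)*ansi++);
    *oem = '\0';
}
```

Note that this direction loses information: once À has become a plain A, converting back to ANSI cannot restore the accent.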