Language-Specific Code

Windows operating systems carry a great deal of language- and locale-specific information, including sorting and case-conversion algorithms. (See Chapter 5.) There is no reason to carry proprietary sort, case, or character property tables in your code unless they are for languages that the system does not support. If you are concerned about the overhead of continually calling the system, call it once at startup to build static tables. The following code returns accurate information for only a small percentage of the languages that Windows supports:
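The startup-table idea mentioned above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes a single-byte code page and uses the standard C run-time functions setlocale and toupper as a stand-in for "the system". The names g_upper_table, init_case_table, and fast_upper are hypothetical.

```c
#include <ctype.h>
#include <locale.h>

/* Hypothetical names; one entry per byte value of a single-byte code page. */
static unsigned char g_upper_table[256];

/* Call once at startup. Passing "" requests the user's default locale;
   if the request fails, the locale already in effect is used. */
void init_case_table(const char *locale_name)
{
    int i;

    setlocale(LC_CTYPE, locale_name);

    /* Ask the run-time library once per byte value and cache the answer. */
    for (i = 0; i < 256; i++)
        g_upper_table[i] = (unsigned char)toupper(i);
}

/* Later conversions are plain table lookups. */
int fast_upper(int ch)
{
    if (ch < 0 || ch > 255)
        return ch;              /* outside the table: leave unchanged */
    return g_upper_table[ch];
}
```

A program would call init_case_table("") once during initialization and fast_upper afterward. The table reflects whatever locale was active when it was built, so it would need to be rebuilt if the locale changes.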

// Handles only the unaccented ASCII letters a-z;
// every other character is returned unchanged.
int CharUpper (int ch)
{
    if (ch >= 'a' && ch <= 'z')
        return ch - 'a' + 'A';
    else
        return ch;
}

Outside the ASCII range, you cannot simply add or subtract 32 to convert case. For some languages, such as French, German, and Turkish, that approach does not work correctly for all characters. Other languages, such as Arabic and Japanese, have no concept of case at all. Even the following optimization does not work correctly for Turkish, where the uppercase form of i is the dotted capital İ rather than I:

// optimized CharUpper
int CharUpperOptimized (int ch)
{
    if (ch >= 'A' && ch <= 'Z')
        return ch;
    // ASCII fast path: maps 'i' to 'I' for every language,
    // which is the wrong answer for Turkish.
    if (ch >= 'a' && ch <= 'z')
        return ch - 'a' + 'A';
    else
        return (CharUpper(ch));
}
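The two failure modes described above can be made concrete with a small sketch. It is illustrative only. The function naive_upper_cp1252 is a hypothetical routine that stretches the subtract-32 trick over the upper half of code page 1252, and language_aware_upper is a hypothetical stand-in for a language-aware system call that knows just the two Turkish special cases. Neither is a real Windows API.

```c
/* Code page 1252 byte values used in this sketch. */
#define CP1252_SMALL_E_ACUTE    0xE9
#define CP1252_CAPITAL_E_ACUTE  0xC9
#define CP1252_DIVISION_SIGN    0xF7
#define CP1252_MULTIPLY_SIGN    0xD7
#define CP1252_SMALL_Y_DIAER    0xFF
#define CP1252_SHARP_S          0xDF

/* Unicode code points for the Turkish letters. */
#define DOTTED_CAPITAL_I        0x0130
#define DOTLESS_SMALL_I         0x0131

/* Hypothetical: applies "subtract 32" to 0xE0..0xFF as well as a-z. */
int naive_upper_cp1252(int ch)
{
    if ((ch >= 'a' && ch <= 'z') || (ch >= 0xE0 && ch <= 0xFF))
        return ch - 32;
    return ch;
}

/* Hypothetical stand-in for a language-aware system call.
   Only the Turkish i / dotless i pair is special-cased here. */
int language_aware_upper(int ch, int is_turkish)
{
    if (is_turkish && ch == 'i')
        return DOTTED_CAPITAL_I;
    if (is_turkish && ch == DOTLESS_SMALL_I)
        return 'I';
    if (ch >= 'a' && ch <= 'z')
        return ch - 'a' + 'A';
    return ch;
}

/* Same shape as the optimized routine: the ASCII fast path answers
   for 'i' before the language-aware call is ever consulted. */
int optimized_upper(int ch, int is_turkish)
{
    if (ch >= 'A' && ch <= 'Z')
        return ch;
    if (ch >= 'a' && ch <= 'z')
        return ch - 'a' + 'A';
    return language_aware_upper(ch, is_turkish);
}
```

With these definitions, naive_upper_cp1252 happens to map the lowercase e with acute accent (0xE9) to its capital (0xC9), but it also turns the division sign (0xF7) into the multiplication sign (0xD7) and the y with diaeresis (0xFF) into the German sharp s (0xDF). For Turkish, optimized_upper('i', 1) returns 'I', while language_aware_upper('i', 1) returns the dotted capital I (0x0130).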

Obviously, it's important to make your code as fast as possible. Just be careful to test the special cases of other languages when you optimize routines. Traditional algorithms for mapping characters to lowercase, determining which characters are alphabetic, and comparing characters are based on sequences built into the ASCII character set, and they will often fail for languages such as Czech, Hungarian, Spanish, and Turkish.

When in doubt, call the system. A great deal of research went into creating the character parsing and formatting (date, time, currency, and number) routines available in Windows. Save yourself some trouble and take advantage of the information that the system provides.

Alternatively, if you use Microsoft Visual C++, you can use locale-sensitive run-time functions. The help files for Visual C++ 2 contain a comprehensive list of these routines under the topic "Locale" in the Run-Time Routines category.
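As a rough illustration of the difference between ASCII-based tests and locale-sensitive run-time functions, the sketch below pairs each traditional routine with a counterpart built on the standard C functions isalpha and strcoll. The wrapper names (ascii_is_alpha, locale_is_alpha, ascii_compare, locale_compare) are hypothetical, and the sketch assumes a single-byte code page.

```c
#include <ctype.h>
#include <locale.h>
#include <string.h>

/* Traditional test: true only for the 52 unaccented ASCII letters. */
int ascii_is_alpha(int ch)
{
    return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
}

/* Locale-sensitive test: the answer depends on the current LC_CTYPE
   locale. ch is treated as a byte value in the current code page. */
int locale_is_alpha(int ch)
{
    return isalpha((unsigned char)ch) != 0;
}

/* Traditional comparison: orders strings by raw byte value, so every
   uppercase ASCII letter sorts before every lowercase one. */
int ascii_compare(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Locale-sensitive comparison: orders strings by the collation rules
   of the current LC_COLLATE locale. */
int locale_compare(const char *a, const char *b)
{
    return strcoll(a, b);
}
```

A program would typically call setlocale(LC_ALL, "") at startup to adopt the user's default locale before using the locale-sensitive versions. Locale names other than "C" and "" are platform-specific, so the results for a given language depend on which locales are installed.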