Eliminating Compile Dependencies

A major incentive for creating internationalized core code is not having to recompile source files to create localized editions of your program. If you eliminate compile dependencies by avoiding hard-coded strings or constants and by not putting localizable resources in header files, your localization process will go much faster.

Very large programs take hours to compile, even with powerful processors. Building debug and nondebug executables from scratch for each language takes significantly longer (unless you buy a number of expensive machines) and requires large amounts of disk space. If you share common code, you have to compile your main executable only once. To create localized editions, you compile only the localized resource files and link them to the executable or to a separate DLL. It's even possible to eliminate compilation entirely by using localization tools that allow translators to edit executables directly.

This "no-compile" strategy has a number of obvious benefits. Separating code from localizable resources helps you write internationalized code. Significantly reduced compile times allow for faster turnaround in the development, testing, and translation cycle. Testers have to test only one executable. Translators, especially third-party consultants, don't need to have access to source code: with only an executable, resource files, and a build script, they can compile a localized program to check their work.
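
The separation of code from localizable strings described above can be sketched in portable C. This is only an illustration, not the mechanism Windows itself uses: the names STRINGTABLE, SetStringTable, GetString, and the IDS_ identifiers are hypothetical, and both language tables sit in one file here for brevity, whereas in practice each would be built from its own resource file. The core code refers to strings only by ID, so adding a language never touches it.

```c
#include <stddef.h>

/* String IDs shared by the core code and every language's resources. */
enum { IDS_GREETING, IDS_FAREWELL, IDS_COUNT };

/* One table per language; in a real product each would be built
   separately from the core executable (hypothetical layout). */
typedef struct _STRINGTABLE
{
    const char *const *rgsz;      /* strings indexed by ID */
    int                cStrings;  /* number of entries     */
} STRINGTABLE;

static const char *const g_rgszEnglish[IDS_COUNT] = { "Hello", "Goodbye" };
static const char *const g_rgszGerman[IDS_COUNT]  = { "Hallo", "Auf Wiedersehen" };

const STRINGTABLE g_tableEnglish = { g_rgszEnglish, IDS_COUNT };
const STRINGTABLE g_tableGerman  = { g_rgszGerman,  IDS_COUNT };

/* The table currently in use; English until another one is selected. */
static const STRINGTABLE *g_pActiveTable = &g_tableEnglish;

/* Selects the table for the current language edition. */
void SetStringTable(const STRINGTABLE *pTable)
{
    if (pTable != NULL)
        g_pActiveTable = pTable;
}

/* Returns the localized string for an ID, or "" if the ID is out of range. */
const char *GetString(int ids)
{
    if (ids < 0 || ids >= g_pActiveTable->cStrings)
        return "";
    return g_pActiveTable->rgsz[ids];
}
```

Core code would then call GetString(IDS_GREETING) wherever it previously used a hard-coded literal, and switching editions means supplying a different table rather than recompiling the callers.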

If your program is not based on Unicode (see Chapter 3), you might need more than one core executable: one for all single-byte, left-to-right languages; one for multibyte languages; and one for single-byte, bidirectional languages. Multibyte and right-to-left languages require special input and layout handling. The efficiency gained by sharing core code is not always worth the overhead of handling all character sets in all language editions of a program. The Windows 95 team, for example, decided that enabling all language editions of Windows for double-byte character sets (DBCSs) would delay the release of single-byte editions of Windows and make them significantly larger. Thus, additional processing required for DBCS editions of Windows is isolated in the source files with #ifdef statements. Bidirectional functionality is contained in add-on libraries. The three code bases for Windows 95, however, are used to produce 30 different language editions. Most of the Windows 95 code is still shared, and most language editions are no-compile.

Even though source files for Windows contain #ifdef statements to distinguish code bases, #ifdefs are never used to handle special cases for single languages. The best place to put #ifdef statements is in header files. (See Chapter 3.) Don't use code with #ifdef statements to fix bugs that occur only in the non–domestic-language editions. This practice makes international concerns too easy to ignore; with the increasing importance of multilingual documents, it's harder to label bugs as being "international only." Often the code surrounding #ifdefs is poorly designed and the #ifdef is simply a hack. You can be sure that developers who use #ifdefs and hacks to fix international-related problems will forget to update the code inside the #ifdefs when they update the surrounding code.

You can easily replace language-specific #ifdef statements with run-time if and switch statements. The following fragment from a hypothetical program's startup code uses #ifdefs to omit features for certain language editions:

#ifdef USA
#define DEFAULT_PAPER_SIZE 1 // USLetter
#endif

#ifdef JAPAN
#define NO_SPELL_CHECKER
#define FAREAST
#define DEFAULT_PAPER_SIZE 2 // A4
#endif

#ifdef KOREA
#define NO_SPELL_CHECKER
#define FAREAST
#define DEFAULT_PAPER_SIZE 2 // A4
#endif

...

#ifndef NO_SPELL_CHECKER
InitSpellChecker(); // no spell-checker for Japan or Korea
#endif

...

#ifdef JAPAN
InitImperialCalendar(); // This is for Japan only.
#endif

...

#ifdef FAREAST
InitIME(); // needed for all Far East builds
#endif

This code is rewritten below to use a global variable that is initialized at startup time. Before the program calls any code related to spell-checkers, IMEs, or paper sizes, it checks this variable. (See Chapter 4 for an explanation of PRIMARYLANGID and LANGIDFROMLCID.)

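typedef struct _LOCINFO
{
    int  fSpellChecker;       // nonzero if this edition has a spell-checker
    int  fUsesIME;            // nonzero if this edition needs an IME
    int  DefaultPaperSize;    // 1 = USLetter, 2 = A4
    LCID lcidDefaultLocale;   // default locale for this edition
} LOCINFO;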

...

LOCINFO locinfo;

GetLocInfo(&locinfo); // function implemented in language DLL

if (locinfo.fSpellChecker)
    InitSpellChecker();

if (locinfo.fUsesIME)
    InitIME();

if (PRIMARYLANGID(LANGIDFROMLCID(locinfo.lcidDefaultLocale)) ==
    LANG_JAPANESE)
{
    InitImperialCalendar();
}

...
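
The text says only that GetLocInfo is implemented in the language DLL. As a minimal sketch, a Japanese edition's version might look like the following. The table name g_locinfoJapanese and the PAPER_ constants are hypothetical, the LCID typedef is a stand-in so that the fragment compiles without the Windows headers, and the settings mirror the JAPAN block of the #ifdef example (no spell-checker, IME required, A4 paper).

```c
/* Stand-in for the Win32 LCID type so the sketch builds without <windows.h>. */
typedef unsigned long LCID;

typedef struct _LOCINFO
{
    int  fSpellChecker;
    int  fUsesIME;
    int  DefaultPaperSize;
    LCID lcidDefaultLocale;
} LOCINFO;

/* Paper-size codes, following the values in the #define example. */
#define PAPER_USLETTER 1
#define PAPER_A4       2

/* Settings for this edition only (hypothetical table). Each language
   DLL would carry its own copy with different values. */
static const LOCINFO g_locinfoJapanese =
{
    0,          /* fSpellChecker: none for this edition        */
    1,          /* fUsesIME: required for Japanese input       */
    PAPER_A4,   /* DefaultPaperSize                            */
    0x0411      /* lcidDefaultLocale: Japanese (Japan)         */
};

/* Exported by the language DLL; the core executable calls it at startup. */
void GetLocInfo(LOCINFO *pLocInfo)
{
    *pLocInfo = g_locinfoJapanese;
}
```

Because the core code sees only the filled-in structure, supporting a new edition means shipping a new DLL with a different table rather than adding another #ifdef branch.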

If your application is based on Unicode, sharing a single set of source files and creating a single worldwide binary is a much simpler process. While you won't have to deal with differences in character sets, you will still have to deal with a few differences in the APIs that ship with different editions of Windows. For example, the Input Method Editor API functions for Far Eastern languages are included in the API set for Far East editions of Windows only. While they are stubbed in non–Far East editions of Windows 95, they are not stubbed in European-language editions of Windows NT 3.5. You can still maintain a single binary that will run on all editions of Windows NT 3.5 by using the Win32 API GetProcAddress() to retrieve the addresses of the IME functions at run time instead of linking to them and calling them directly. (See Chapter 7.)
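
The run-time binding pattern can be sketched in portable C as follows. This is an illustration under stated assumptions, not the actual IME interface: the entry-point names "ImeOpen" and "ImeClose", the IMEAPI table, and the RESOLVER callback are all hypothetical. On Windows the resolver would wrap GetProcAddress() on a loaded module; here two sample resolvers simulate an edition that lacks the functions and one that provides them.

```c
#include <stddef.h>
#include <string.h>

/* Generic function-pointer type, playing the role of Win32's FARPROC. */
typedef void (*GENERICPROC)(void);

/* Looks up a function by name and returns NULL if it is not available. */
typedef GENERICPROC (*RESOLVER)(const char *pszName);

/* Hypothetical IME entry points the program would like to use. */
typedef int  (*PFN_IMEOPEN)(void);
typedef void (*PFN_IMECLOSE)(void);

typedef struct _IMEAPI
{
    PFN_IMEOPEN  pfnOpen;
    PFN_IMECLOSE pfnClose;
} IMEAPI;

/* Fills in the table. Returns 1 if every function was found, 0 otherwise.
   On failure the table is cleared so that callers cannot use half of it. */
int BindImeApi(IMEAPI *pApi, RESOLVER resolve)
{
    pApi->pfnOpen  = (PFN_IMEOPEN)resolve("ImeOpen");
    pApi->pfnClose = (PFN_IMECLOSE)resolve("ImeClose");

    if (pApi->pfnOpen == NULL || pApi->pfnClose == NULL)
    {
        pApi->pfnOpen  = NULL;
        pApi->pfnClose = NULL;
        return 0;
    }
    return 1;
}

/* Startup helper: uses the IME only when its functions could be bound. */
int InitIMEIfPresent(IMEAPI *pApi, RESOLVER resolve)
{
    if (!BindImeApi(pApi, resolve))
        return 0;               /* no IME on this edition; run without it */
    return pApi->pfnOpen();
}

/* Sample resolver simulating an edition that exports no IME functions. */
GENERICPROC ResolveNothing(const char *pszName)
{
    (void)pszName;
    return NULL;
}

/* Stubs and a resolver simulating an edition that does export them. */
static int  StubImeOpen(void)  { return 1; }
static void StubImeClose(void) { }

GENERICPROC ResolveStubs(const char *pszName)
{
    if (strcmp(pszName, "ImeOpen") == 0)
        return (GENERICPROC)StubImeOpen;
    if (strcmp(pszName, "ImeClose") == 0)
        return (GENERICPROC)StubImeClose;
    return NULL;
}
```

Because the executable contains no load-time import for the optional functions, it still starts on an edition that lacks them; the only cost is one extra check before the IME is used.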