C.5. Application Compatibility
For example, if an application program supporting Unicode works in English, it should be easily localizable into Korean, given operating system support. The main areas that cause problems in localization are input, manipulation and rendering:
- Input is easier with jamos, since the keyboard input can exactly match the characters in the data stream. There is no requirement for application programs to support input methods, which removes a significant burden.
- Manipulation includes cases such as concatenation or truncation of text. Conjoining jamos must not be confused with a double-byte character set (DBCS), such as Shift-JIS, where there is a mixture of codes with different lengths. A major problem with DBCS is that if bytes are treated in isolation (or misinterpreted as belonging to a single-byte character set [SBCS]), then the text will be misparsed. For example, if one byte of a double-byte code is misinterpreted as a single-byte character and removed from a text stream, the meaning of the surrounding bytes can be completely corrupted.
The individual jamos maintain their independent identity: If a character is removed from a text stream, for example, the surrounding characters maintain their correct interpretation. However, programs may want to preserve syllable block boundaries, which does require some analysis of the text.
- Rendering is not generally a problem for application compatibility. In modern systems it is handled by the operating system, and does not require any additional work on the part of the application program.
- The storage of Korean text using conjoining jamos takes about 2.2 times as many bytes as it does when stored as precomposed Hangul syllables. (The exact figure depends on the particular composition of the text: The factor of 2.2 is based on samples with half of the Korean syllables having two jamos, the others having three, and 20 percent of the text consisting of other characters such as space, punctuation, and so on.) The number of characters in Korean text expressed in jamos is roughly equivalent to the corresponding English text: Systems and application programs that can handle the volume of data necessary for English Unicode will easily handle that of Korean.
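The factor of 2.2 in the storage estimate can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that canonical decomposition (NFD in Python's `unicodedata` module) is an acceptable stand-in for storing text as conjoining jamos. It counts characters rather than bytes, which gives the same ratio under a fixed-width 16-bit encoding.

```python
import unicodedata


def expected_factor(two_jamo_share=0.5, other_share=0.2):
    """Expansion factor predicted by the stated text composition.

    two_jamo_share: fraction of Hangul syllables made of two jamos
                    (the remainder are assumed to have three).
    other_share:    fraction of the text that is not Hangul
                    (space, punctuation, and so on), one character each.
    """
    jamos_per_syllable = 2 * two_jamo_share + 3 * (1 - two_jamo_share)
    hangul_share = 1 - other_share
    return hangul_share * jamos_per_syllable + other_share


def measured_factor(text):
    """Characters in jamo form divided by characters in precomposed form."""
    precomposed = unicodedata.normalize("NFC", text)
    decomposed = unicodedata.normalize("NFD", precomposed)
    return len(decomposed) / len(precomposed)


# Two three-jamo syllables, a space, two two-jamo syllables:
# the same mix as the assumptions in the text.
sample = "\ud55c\uae00 \uac00\ub098"

print(round(expected_factor(), 2))        # 0.8 * 2.5 + 0.2
print(round(measured_factor(sample), 2))  # 11 jamo-form characters / 5
```

With other proportions of two-jamo syllables or of non-Hangul characters, both functions give a different figure, which is why the exact factor depends on the composition of the text.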
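The manipulation point above, that removing one jamo leaves its neighbors intact while truncation may still want to respect syllable block boundaries, can be sketched as follows. This is a simplified illustration, not a complete segmentation algorithm: the helper names are hypothetical, the boundary rule (start a new block at a leading consonant that does not follow another leading consonant) is an assumption that ignores filler characters and other edge cases, and NFD/NFC normalization is used only to convert between precomposed and jamo forms.

```python
import unicodedata


def is_jamo(ch):
    """True for characters in the Hangul Jamo block (U+1100..U+11FF)."""
    return "\u1100" <= ch <= "\u11ff"


def is_leading(ch):
    """True for leading (initial) consonants (U+1100..U+115F)."""
    return "\u1100" <= ch <= "\u115f"


def syllable_blocks(text):
    """Split jamo text into syllable blocks using a simplified rule."""
    blocks = []
    for ch in text:
        starts_new = (
            not blocks
            or not is_jamo(ch)
            or not is_jamo(blocks[-1][-1])
            or (is_leading(ch) and not is_leading(blocks[-1][-1]))
        )
        if starts_new:
            blocks.append(ch)
        else:
            blocks[-1] += ch
    return blocks


# Five precomposed characters stored as eleven conjoining jamos.
jamos = unicodedata.normalize("NFD", "\ud55c\uae00 \uac00\ub098")

# Removing one jamo (the final consonant of the first syllable)
# changes only that syllable; every other character is unaffected.
without_final = jamos[:2] + jamos[3:]
recomposed = unicodedata.normalize("NFC", without_final)

# Truncating on block boundaries keeps whole syllables.
blocks = syllable_blocks(jamos)
first_two = "".join(blocks[:2])

print(len(blocks))      # number of syllable blocks and other characters
print(len(first_two))   # jamos in the first two blocks
```

Truncating at an arbitrary jamo offset would still leave valid characters, only a visually incomplete syllable; the block analysis is what a program adds when it wants to avoid that.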