Platform SDK: International Features
An application that uses complex scripts cannot rely on a simple approach to formatting and display. First, the width of a complex script character depends on its context, so widths cannot be saved in simple tables. Second, breaking between words in scripts such as Thai requires dictionary support, because there is no separator character between Thai words. Third, Arabic, Hebrew, Farsi, Urdu, and other bidirectional text require reordering before display. Finally, some form of font association is often required to use complex scripts easily.
To deal adequately with these issues, Uniscribe uses the paragraph as the unit for display. Note that this means Uniscribe must be used for the entire paragraph, even if sections of the paragraph do not contain complex scripts.
Before using Uniscribe, an application divides the paragraph into runs, that is, strings of characters with the same style. The style depends on what the application has implemented, but typically includes such attributes as font, size, and color. Uniscribe divides the paragraph into items, that is, strings that have the same script and direction. The application then applies the item information to its runs to produce runs that each have a single style, script, and direction.
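The merge of style runs and items described above can be sketched as a single pass over both lists. This is a minimal illustration, not Uniscribe code: the StyleRun, Item, and MergedRun types and the merge_runs function are hypothetical names introduced here, and the sketch assumes both lists are sorted and cover the paragraph from character 0 without gaps. In a real application the item boundaries would come from the array that ScriptItemize fills in.

```c
/* Hypothetical types for illustration only; these are not Uniscribe structures. */
typedef struct {
    int start;   /* index of the first character in the paragraph */
    int length;  /* number of characters in the run */
    int style;   /* application-defined style id (font, size, color, ...) */
} StyleRun;

typedef struct {
    int start;   /* index of the first character of the item */
    int script;  /* opaque script id */
    int rtl;     /* 1 for right-to-left, 0 for left-to-right */
} Item;

typedef struct {
    int start, length, style, script, rtl;
} MergedRun;

/*
 * Splits style runs at item boundaries so that every output run has one
 * style, one script, and one direction.
 * Assumes runs and items are sorted by start, begin at 0, and together
 * cover text_len characters without gaps.
 * Returns the number of runs written, or -1 if out is too small.
 */
int merge_runs(const StyleRun *runs, int nruns,
               const Item *items, int nitems, int text_len,
               MergedRun *out, int max_out)
{
    int r = 0, i = 0, pos = 0, n = 0;

    while (pos < text_len && r < nruns && i < nitems) {
        int run_end  = runs[r].start + runs[r].length;
        int item_end = (i + 1 < nitems) ? items[i + 1].start : text_len;
        int end      = (run_end < item_end) ? run_end : item_end;

        if (n == max_out)
            return -1;

        out[n].start  = pos;
        out[n].length = end - pos;
        out[n].style  = runs[r].style;
        out[n].script = items[i].script;
        out[n].rtl    = items[i].rtl;
        n++;

        pos = end;
        if (pos == run_end)  r++;   /* style run fully consumed */
        if (pos == item_end) i++;   /* item fully consumed */
    }
    return n;
}
```

For example, two style runs covering characters 0-4 and 5-9, combined with items starting at 0, 3, and 8, produce four merged runs that start at 0, 3, 5, and 8.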
Uniscribe identifies the clusters in each run and determines the size of each cluster. A cluster is a script-defined, indivisible character grouping. For European languages, a cluster is a single character, but in languages such as Thai, it is a grouping of glyphs. Uniscribe sums the cluster sizes to determine the size of a run. The application then sums the lengths of the runs until they overflow a line (or reach the margin), and divides the run that overflows between the current line and the next line. For each line, a map is built from visual position to a run. For each run, the code points are shaped into glyphs, which are then positioned and rendered.
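The summing described above can be illustrated with two small helpers. This is a sketch under stated assumptions rather than Uniscribe code: run_width and first_overflowing_run are hypothetical names, and the widths are plain integers in whatever unit the application measures with. In a real application the per-glyph advance widths would come from ScriptPlace.

```c
/* Hypothetical helper: the width of a run is the sum of its advance widths. */
int run_width(const int *advances, int count)
{
    int total = 0;
    int i;
    for (i = 0; i < count; i++)
        total += advances[i];
    return total;
}

/*
 * Hypothetical helper: starting at run index `first`, adds run widths
 * (in logical order) until the next run would not fit in line_width.
 * Returns the index of the first run that does not fit, or nruns if all
 * remaining runs fit. If used is not NULL, *used receives the width taken
 * by the runs that do fit. The caller decides where inside the
 * overflowing run to break.
 */
int first_overflowing_run(const int *run_widths, int nruns, int first,
                          int line_width, int *used)
{
    int total = 0;
    int i;
    for (i = first; i < nruns; i++) {
        if (total + run_widths[i] > line_width)
            break;
        total += run_widths[i];
    }
    if (used)
        *used = total;
    return i;
}
```

With run widths of 40, 30, 50, and 20 and a line width of 100, the first two runs fit (70 units) and the run at index 2 is the one that must be divided between the current line and the next.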
With this overview in mind, we can look at the process in detail and how Uniscribe fits in. An application does text layout, or formatting, one time. Then it either saves the glyphs and positions for display purposes or it generates them each time it displays the text. The trade-off is speed vs. memory. Typically, an application will generate the glyphs and positions each time for display, so the process is presented as a layout procedure and a display procedure.
To Lay out Text Using Uniscribe
This procedure assumes that the application has already divided the paragraph into runs.
This completes layout of the line. Repeat steps 6 through 10 for each line in the paragraph. However, if the application needed to break the last run on the line, call ScriptShape to reshape the remaining part of the run as the first run on the next line.
To Display Text Using Uniscribe
This procedure is done for each line. It assumes that the text has already been laid out using Uniscribe, and that the glyphs and positions from the layout process were not saved. If speed is a concern, an application can save the glyphs and positions from the layout procedure and start at step 2.
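The overview mentions that a map from visual position to run is built for each line. The following sketch shows one way such a map could be derived from the bidirectional embedding level of each run, by reversing sequences of runs from the highest level down to the lowest odd level. The function names are hypothetical and the code is an illustration of the reordering idea, not the Uniscribe implementation; in a real application, ScriptLayout produces this kind of mapping from an array of run embedding levels.

```c
/* Reverses order[lo..hi] in place. */
static void reverse_range(int *order, int lo, int hi)
{
    while (lo < hi) {
        int tmp = order[lo];
        order[lo] = order[hi];
        order[hi] = tmp;
        lo++;
        hi--;
    }
}

/*
 * Hypothetical helper: fills visual_to_logical so that entry v holds the
 * logical index of the run displayed at visual position v (left to right).
 * levels[i] is the embedding level of logical run i: even levels are
 * left-to-right, odd levels are right-to-left.
 */
void visual_order(const unsigned char *levels, int nruns,
                  int *visual_to_logical)
{
    int highest = 0;
    int lowest_odd = 255;   /* stays 255 when the line has no RTL runs */
    int level, i;

    for (i = 0; i < nruns; i++) {
        visual_to_logical[i] = i;
        if (levels[i] > highest)
            highest = levels[i];
        if ((levels[i] & 1) && levels[i] < lowest_odd)
            lowest_odd = levels[i];
    }

    /* From the highest level down to the lowest odd level, reverse every
       contiguous sequence of runs at that level or higher. */
    for (level = highest; level >= lowest_odd; level--) {
        i = 0;
        while (i < nruns) {
            if (levels[visual_to_logical[i]] >= level) {
                int j = i;
                while (j + 1 < nruns &&
                       levels[visual_to_logical[j + 1]] >= level)
                    j++;
                reverse_range(visual_to_logical, i, j);
                i = j + 1;
            } else {
                i++;
            }
        }
    }
}
```

For a line whose runs have levels 0, 1, 1, 0, the two right-to-left runs swap places, giving the visual order 0, 2, 1, 3. A line with only left-to-right runs keeps its logical order.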