Platform SDK: International Features

Caret Placement and Hit Testing

ScriptShape breaks text in complex scripts into clusters. Character reordering always occurs within cluster boundaries, and the clusters themselves are guaranteed to advance monotonically in reading order.

Conventions for caret placement within clusters depend on the script. For the Arabic script, if the caret position is set between a base character and its combining mark, then the caret is displayed halfway through the base character. For the Thai script, the caret may not be positioned within a cluster. Thus, when the user advances the caret, the application must advance past all the glyphs that make up the cluster.

The ScriptXtoCP and ScriptCPtoX functions translate between caret positions (in code point offsets) and x positions (in pixels). Both functions require the attribute and position information returned by ScriptShape and ScriptPlace. If you do not save this information, including the advance widths, you may want to do hit testing and caret placement immediately after you display each run, while the information is still available. As an alternative, you could cache enough information to do hit testing and caret placement on the current line without reprocessing the paragraph.
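The caching alternative can be illustrated with a small, portable model. This is a sketch only, not how Uniscribe works internally: the RunCache structure and both functions are hypothetical names, and the model assumes a left-to-right run with exactly one glyph per code point. It also clamps out-of-range x values to the first or last code point, which may differ from what ScriptXtoCP reports.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-run cache: edge[i] is the x offset (in pixels) of the
   leading edge of code point i, and edge[count] is the total run width.
   Assumes a left-to-right run with one glyph per code point. */
typedef struct {
    const int *edge;   /* count + 1 entries, non-decreasing */
    size_t     count;  /* number of code points in the run  */
} RunCache;

/* Code point offset to x: returns the x offset of the leading edge
   (trailing == 0) or trailing edge (trailing != 0) of code point cp. */
int cache_cp_to_x(const RunCache *rc, size_t cp, int trailing)
{
    assert(cp < rc->count);
    return rc->edge[cp + (trailing ? 1 : 0)];
}

/* x to code point offset: returns the code point under x and stores 1 in
   *trailing if x fell in the trailing half of that code point, else 0.
   Values of x outside the run are clamped to the first or last code point. */
size_t cache_x_to_cp(const RunCache *rc, int x, int *trailing)
{
    assert(rc->count > 0);
    size_t cp = 0;
    while (cp + 1 < rc->count && x >= rc->edge[cp + 1])
        cp++;
    int mid = rc->edge[cp] + (rc->edge[cp + 1] - rc->edge[cp]) / 2;
    *trailing = (x >= mid) ? 1 : 0;
    return cp;
}
```

A real cache for complex scripts would also need the logical cluster map and the run direction, because a cluster can span several code points and glyphs.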

ScriptXtoCP returns a trailing edge flag so the caller knows which side of the character or cluster the user has clicked on. The value of the flag is either zero or the width of the character or cluster in code points. The returned character position is the position of the character on which the user clicked. Most editors place the caret at the character or cluster edge nearest to the click. To achieve this, add the flag value to the returned character position.

For languages such as Thai, where the user conventionally does not want to place the caret inside a cluster, ScriptXtoCP sets the trailing edge flag to zero or the cluster width in code points. For languages such as Arabic, where the user expects to be able to edit within a cluster, ScriptXtoCP sets the trailing edge flag to zero or one.
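The arithmetic for turning a hit-test result into a caret position is small enough to show directly. The following is a minimal sketch: caret_from_hit is a hypothetical helper, and its two parameters stand for the character position and trailing edge flag that a hit test reports.

```c
#include <assert.h>

/* Caret position (code point offset) nearest to a click.
   cp:       character position reported by the hit test
   trailing: trailing edge flag, either 0 or the width of the character
             or cluster in code points */
int caret_from_hit(int cp, int trailing)
{
    assert(trailing >= 0);
    return cp + trailing;
}
```

As an illustration, assume a Thai cluster of three code points that starts at offset 4, and assume the hit test reports the first code point of the cluster. A click on the trailing half gives position 4 and flag 3, so the caret lands at offset 7, after the whole cluster. For an Arabic character at offset 4, a click on the trailing half gives flag 1, so the caret lands at offset 5 and may fall inside a cluster.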

To help the client establish valid locations for the caret when handling the arrow keys, Uniscribe reports valid caret positions in the fCharStop member of the logical attributes returned by ScriptBreak. fCharStop is TRUE for most characters and FALSE for characters in the interior of a cluster in scripts such as Thai. Check the fNeedsCaretInfo flag in the SCRIPT_PROPERTIES for an item to see whether it is necessary to call ScriptBreak to check for valid caret positions. If the fNeedsCaretInfo member is FALSE, all code points are valid caret positions.
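Arrow-key handling over these flags can be sketched as follows. This is an illustrative model rather than Uniscribe code: the stop array is a hypothetical copy of the fCharStop values, one entry per code point, and the sketch assumes that a nonzero entry at index i means the caret may be placed immediately before code point i. The end of the text (offset count) is always treated as a valid position.

```c
#include <assert.h>
#include <stddef.h>

/* Moves the caret forward to the next valid position.
   stop[i] models fCharStop for code point i (nonzero = valid caret stop).
   Returns count when there is no later stop inside the text. */
size_t next_caret_stop(const unsigned char *stop, size_t count, size_t caret)
{
    assert(caret <= count);
    if (caret == count)
        return count;              /* already at the end of the text */
    size_t i = caret + 1;
    while (i < count && !stop[i])  /* skip positions inside a cluster */
        i++;
    return i;
}

/* Moves the caret backward to the previous valid position.
   Offset 0 is always treated as valid. */
size_t prev_caret_stop(const unsigned char *stop, size_t count, size_t caret)
{
    assert(caret <= count);
    if (caret == 0)
        return 0;                  /* already at the start of the text */
    size_t i = caret - 1;
    while (i > 0 && !stop[i])      /* skip positions inside a cluster */
        i--;
    return i;
}
```

When fNeedsCaretInfo is FALSE for an item, a caller could skip this scan and move the caret one code point at a time, since every code point is then a valid stop.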