3.2. Zero-Width Joining

U+200C    ZERO WIDTH NON-JOINER
U+200D    ZERO WIDTH JOINER


In the merger with ISO/IEC 10646-1, the semantics of these two characters have been given a narrow interpretation. This brings added precision to the explanation given in Volume 1, p. 77.

The intent of these characters is to address cursive graphical connections between the glyphs of a script, for example, in scripts like Arabic whose printed form emulates handwriting. NON-JOINER and JOINER are best thought of as behaving like tiny letters that neighboring glyphs may connect to (JOINER) or avoid connecting to (NON-JOINER). They are thus processed as ordinary cursive letters rather than as control characters.

NON-JOINER and JOINER affect how the two neighboring glyphs connect to them, not to each other. As such, they have no direct relationship with ligature formation; in particular, JOINER does not in any way request that its two neighbors be ligatures to each other. Indeed, both NON-JOINER and JOINER may break up ligatures by interrupting the character sequence required to form the ligature.
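The last point can be illustrated with a minimal Python sketch of a toy shaper that forms ligatures by matching character sequences. The LIGATURES table, the glyph name "<fi-ligature>", and the shape function are hypothetical and exist only for this illustration; real rendering systems are considerably more involved. The sketch shows only that an intervening JOINER or NON-JOINER leaves the sequence f, i unmatched.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER

# Hypothetical ligature table for a toy shaper: character sequence -> glyph name.
LIGATURES = {"fi": "<fi-ligature>"}


def shape(text):
    """Substitute a glyph name for every character sequence in LIGATURES."""
    for sequence, glyph in LIGATURES.items():
        text = text.replace(sequence, glyph)
    return text


print(unicodedata.name(ZWNJ), "/", unicodedata.name(ZWJ))
print(ascii(shape("fish")))               # f and i are adjacent: ligature formed
print(ascii(shape("f" + ZWJ + "ish")))    # JOINER interrupts the sequence
print(ascii(shape("f" + ZWNJ + "ish")))   # NON-JOINER interrupts it as well
```

In this model neither character requests or suppresses the ligature as such; the ligature simply fails to form because the sequence it depends on is no longer contiguous.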

The precise relationship between cursive appearance and ligated appearance may differ from script to script, and therefore the precise usage of these characters is script-dependent. In the case of Latin typography, cursiveness (handwriting emulation) and ligatures are independent. Thus the text in Volume 1, p. 77, may be clarified as follows:

f + JOINER + i will not form the ligature ﬁ. Instead, if cursive versions of the f and i are available in the font, each will independently connect to the JOINER on the appropriate side (having the same appearance as f + i).

Usage of optional ligatures such as ﬁ is not currently controlled by any codes within the Unicode Standard, but is determined by protocols or resources external to the text sequence.

As further illustration, let a hyphen stand for a cursive connection to a preceding or following letter. In that case, a cursive Latin font would produce the following results:

Unicodes                       Rendering

f i s h                        f- -i- -s- -h  (optionally using a ligature: ﬁ- -s- -h)
f JOINER i s h                 f- -i- -s- -h
f NON-JOINER i s h             f i- -s- -h
f JOINER NON-JOINER i s h      f- i- -s- -h
f NON-JOINER JOINER i s h      f -i- -s- -h
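The hyphen notation above can be reproduced with a short Python sketch. It assumes a purely cursive Latin font in which every letter can connect on both sides, and it models JOINER and NON-JOINER as invisible neighbors that a letter either connects to or avoids. The function names connects_to and render are hypothetical, and the inputs place JOINER and NON-JOINER between f and i in the way the renderings imply; optional ligatures are not modeled.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def connects_to(neighbor):
    """A letter connects toward another letter or a JOINER, but not toward
    a NON-JOINER or the edge of the text (represented by None)."""
    return neighbor is not None and neighbor != ZWNJ


def render(text):
    """Return the hyphen notation for text, assuming every letter is cursive."""
    pieces = []
    for i, ch in enumerate(text):
        if ch in (ZWJ, ZWNJ):
            continue  # zero-width characters have no visible form of their own
        before = text[i - 1] if i > 0 else None
        after = text[i + 1] if i + 1 < len(text) else None
        left = "-" if connects_to(before) else ""
        right = "-" if connects_to(after) else ""
        pieces.append(left + ch + right)
    return " ".join(pieces)


for sample in ("fish",
               "f" + ZWJ + "ish",
               "f" + ZWNJ + "ish",
               "f" + ZWJ + ZWNJ + "ish",
               "f" + ZWNJ + ZWJ + "ish"):
    print(render(sample))
```

Each letter looks only at its immediate neighbor, which is why the combination JOINER + NON-JOINER lets f connect on its right while i stays unconnected on its left, and the reverse order gives the mirror result.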


With regard to the Arabic script, the statements in Volume 1, p. 77, remain correct. In Volume 2, p. 390, according to Arabic rules L2 and L3, the JOINER can be used to get the appearance in parentheses.

With regard to conjuncts in Indic scripts, the statements in Volume 1, pp. 53–56, and Volume 2, pp. 399–414, remain correct. However, for clarity the term ligature should be replaced by the term conjunct throughout pp. 399–414.