Character Text

Character text has the following syntax:

<char> <ptext> | <atext> | '{' <char> '}'
<ptext> (<chrfmt>* <data>+ )+
<data> #PCDATA | <spec> | <pict> | <obj> | <do> | <foot> | <annot> | <field> | <idx> | <toc> | <book>

Font (character) Formatting Properties

These control words (described as <chrfmt> in the syntax description) change font (character) formatting properties. A control word preceding plain text turns on the specified attribute. Some control words (indicated in the following table by an asterisk following the description) can be turned off by the control word followed by 0. For example, \b turns on bold, while \b0 turns off bold.
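The on/off rule can be sketched in Python as follows. This is only an illustration: the names CONTROL_WORD and toggle_state are hypothetical, and treating every nonzero parameter as "on" is an assumption (the text above only states that 0 turns the attribute off).

```python
import re

# A control word: backslash, lowercase letters, optional signed numeric parameter.
CONTROL_WORD = re.compile(r"\\([a-z]+)(-?\d+)?")


def toggle_state(control: str):
    r"""Return (name, enabled) for a toggle control word such as \b or \b0."""
    match = CONTROL_WORD.fullmatch(control)
    if match is None:
        raise ValueError(f"not a control word: {control!r}")
    name, param = match.groups()
    # No parameter turns the attribute on; an explicit 0 turns it off.
    # Assumption: any other parameter value also counts as "on".
    return name, param is None or int(param) != 0


print(toggle_state(r"\b"))   # bold on
print(toggle_state(r"\b0"))  # bold off
```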

The font (character)-formatting control words are listed in the following table.

Control word Meaning
\plain Reset font (character) formatting properties to a default value defined by the application (for example, bold, underline and italic are disabled; font size is reset to 12 pt). The associated font (character) formatting properties (described in the section "Associated Character Properties") are also reset.
\animtextN Animated text properties. N can have the following values:
1 Las Vegas Lights
2 Blinking background
3 Sparkle text
4 Marching black ants
5 Marching red ants
6 Shimmer

\b Bold.*
\caps All capitals.*
\charscalexN Character scaling value. The N argument is a value representing a percentage (the default is 100).
\deleted Marks the text as deleted under revision marking.*
\dnN Subscript position in half-points (the default is 6).
\embo Emboss.
\impr Engrave.
\sub Subscripts text and shrinks point size according to font information.
\nosupersub Turns off superscripting or subscripting.
\expndN Expansion or compression of the space between characters in quarter-points; a negative value compresses (the default is 0).
\expndtwN Expansion or compression of the space between characters in twips; a negative value compresses. For backward compatibility, both \expndtw and \expnd should be emitted.
\kerningN Point size (in half-points) above which to kern character pairs. \kerning0 turns off kerning.
\fN Font number. N refers to an entry in the font table.
\fsN Font size in half-points (the default is 24).
\i Italic.*
\outl Outline.*
\scaps Small capitals.*
\shad Shadow.*
\strike Strikethrough.*
\strikedl Double strikethrough.
\ul Continuous underline. \ul0 turns off all underlining.
\uld Dotted underline.
\uldash Dash underline.
\uldashd Dot dash underline.
\uldashdd Dot dot dash underline.
\uldb Double underline.
\ulnone Stops all underlining.
\ulth Thick underline.
\ulw Word underline.
\ulwave Wave underline.
\upN Superscript position in half-points (the default is 6).
\super Superscripts text and shrinks point size according to font information.
\v Hidden text.*
\cfN Foreground color (the default is 0).
\cbN Background color (the default is 0).
\rtlch The character data following this control word will be treated as a right-to-left run.
\ltrch The character data following this control word will be treated as a left-to-right run (the default).
\csN Designates character style. If a character style is specified, style properties must be specified with the character run. N refers to an entry in the style table.
\cchsN Indicates any characters not belonging to the default document character set and tells which character set they do belong to. Macintosh character sets are represented by values greater than 255. The values for N correspond to the values for the \fcharset control word.
\langN Applies a language to a character. N is a number corresponding to a language. The \plain control word resets the language property to the language defined by \deflangN in the document properties.
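Several of the control words above take their argument in different units. The following Python sketch shows the conversions and how a writer might emit both \expndtw and \expnd for the same spacing. It assumes the usual definition of a twip as one twentieth of a point (so a quarter-point is 5 twips); the function names are hypothetical, and truncating toward zero when converting to quarter-points is a design choice of this sketch, not something the table specifies.

```python
def half_points_to_points(n: int) -> float:
    """Convert a \\fsN, \\upN, or \\dnN argument (half-points) to points."""
    return n / 2


def expansion_controls(twips: int) -> str:
    """Emit \\expndtw followed by the equivalent \\expnd for older readers."""
    # Assumption: 1 point = 20 twips, so 1 quarter-point = 5 twips.
    quarter_points = int(twips / 5)  # truncates toward zero
    return f"\\expndtw{twips}\\expnd{quarter_points}"


print(half_points_to_points(24))  # default \fs24 in points
print(expansion_controls(20))     # 1 pt of expansion
print(expansion_controls(-10))    # half a point of compression
```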

The following table defines the standard languages used by Microsoft. This table was generated by the Unicode group for use with TrueType and Unicode.

Language name Language ID
No language 0x0400
Albanian 0x041c
Arabic 0x0401
Bahasa 0x0421
Belgian Dutch 0x0813
Belgian French 0x080c
Brazilian Portuguese 0x0416
Bulgarian 0x0402
Catalan 0x0403
Croato-Serbian (Latin) 0x041a
Czech 0x0405
Danish 0x0406
Dutch 0x0413
English (Australian) 0x0c09
English (U.K.) 0x0809
English (U.S.) 0x0409
Finnish 0x040b
French 0x040c
French (Canadian) 0x0c0c
German 0x0407
Greek 0x0408
Hebrew 0x040d
Hungarian 0x040e
Icelandic 0x040f
Italian 0x0410
Japanese 0x0411
Korean 0x0412
Norwegian (Bokmal) 0x0414
Norwegian (Nynorsk) 0x0814
Polish 0x0415
Portuguese 0x0816
Rhaeto-Romanic 0x0417
Romanian 0x0418
Russian 0x0419
Serbo-Croatian (Cyrillic) 0x081a
Simplified Chinese 0x0804
Slovak 0x041b
Spanish (Castilian) 0x040a
Spanish (Mexican) 0x080a
Swedish 0x041d
Swiss French 0x100c
Swiss German 0x0807
Swiss Italian 0x0810
Thai 0x041e
Traditional Chinese 0x0404
Turkish 0x041f
Urdu 0x0420
Sesotho (Sotho) 0x0430
Afrikaans 0x0436
Zulu 0x0435
Xhosa 0x0434
Venda 0x0433
Tswana 0x0432
Tsonga 0x0431
Farsi (Persian) 0x0429
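The table lists language IDs in hexadecimal, while control-word arguments in RTF are written in decimal. A minimal Python sketch of the conversion follows; the dictionary holds only a few rows from the table, and the names LANGUAGE_IDS, lang_control, and split_language_id are hypothetical. The split into a primary language (low 10 bits) and a sublanguage (high 6 bits) follows the Windows language identifier convention, which is an assumption here and is not stated in this section.

```python
# A few rows copied from the table above (hexadecimal language IDs).
LANGUAGE_IDS = {
    "English (U.S.)": 0x0409,
    "French": 0x040C,
    "German": 0x0407,
    "Swiss German": 0x0807,
}


def lang_control(name: str) -> str:
    """Return the \\langN control word for a language name from the table."""
    return f"\\lang{LANGUAGE_IDS[name]}"  # the argument is written in decimal


def split_language_id(lang_id: int):
    """Split an ID into (primary language, sublanguage).

    Assumption: Windows LANGID layout, low 10 bits primary, high 6 bits sub.
    """
    return lang_id & 0x3FF, lang_id >> 10


print(lang_control("English (U.S.)"))
print(split_language_id(0x0407), split_language_id(0x0807))
```

Under that assumed layout, German and Swiss German share the same primary language value and differ only in the sublanguage.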

To read negative \expnd values from Word for the Macintosh, an RTF reader should use only the low-order 6 bits of the value read. Word for the Macintosh does not emit negative values for \expnd. Instead, it treats values from 57 through 63 as –7 through –1, respectively (the low-order 6 bits of 57 through 63 are the same as –7 through –1).
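A reader could apply this rule roughly as in the Python sketch below. The function name decode_mac_expnd is hypothetical. Only the range 57 through 63 is mapped to negative values, as described above; how other values above 31 should be handled is not stated, so they are left unchanged.

```python
def decode_mac_expnd(n: int) -> int:
    """Interpret an \\expnd argument written by Word for the Macintosh."""
    low6 = n & 0x3F  # keep only the low-order 6 bits
    if 57 <= low6 <= 63:
        return low6 - 64  # 57..63 stand for -7..-1
    return low6


# The low-order 6 bits of -7 are the same as those of 57.
print((-7) & 0x3F)
print(decode_mac_expnd(57), decode_mac_expnd(63), decode_mac_expnd(4))
```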

Character Borders and Shading

Character shading has the following syntax.

<shading> (\chshdng | <pat>) \chcfpat? \chcbpat?
<pat> \chbghoriz | \chbgvert | \chbgfdiag | \chbgbdiag | \chbgcross | \chbgdcross | \chbgdkhoriz | \chbgdkvert | \chbgdkfdiag | \chbgdkbdiag | \chbgdkcross | \chbgdkdcross

Control word Meaning
\chbrdr Character border (border always appears on all sides).
\chshdngN Character shading. The N argument is a value representing the shading of the text in hundredths of a percent.
\chcfpatN N is the color of the background pattern, specified as an index into the document’s color table.
\chcbpatN N is the fill color, specified as an index into the document's color table.
\chbghoriz Specifies a horizontal background pattern for the text.
\chbgvert Specifies a vertical background pattern for the text.
\chbgfdiag Specifies a forward diagonal background pattern for the text (\\\\).
\chbgbdiag Specifies a backward diagonal background pattern for the text (////).
\chbgcross Specifies a cross background pattern for the text.
\chbgdcross Specifies a diagonal cross background pattern for the text.
\chbgdkhoriz Specifies a dark horizontal background pattern for the text.
\chbgdkvert Specifies a dark vertical background pattern for the text.
\chbgdkfdiag Specifies a dark forward diagonal background pattern for the text (\\\\).
\chbgdkbdiag Specifies a dark backward diagonal background pattern for the text (////).
\chbgdkcross Specifies a dark cross background pattern for the text.
\chbgdkdcross Specifies a dark diagonal cross background pattern for the text.
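As an illustration of the <shading> syntax, the following Python sketch builds a shading sequence from a percentage and two optional color-table indexes. The function name char_shading and its parameter names are hypothetical; the only facts taken from the table are the control words and the unit of \chshdngN (hundredths of a percent).

```python
from typing import Optional


def char_shading(percent: float,
                 pattern_color: Optional[int] = None,
                 fill_color: Optional[int] = None) -> str:
    """Build \\chshdngN with optional \\chcfpatN and \\chcbpatN."""
    # \chshdngN is expressed in hundredths of a percent: 25% -> 2500.
    parts = [f"\\chshdng{round(percent * 100)}"]
    if pattern_color is not None:
        parts.append(f"\\chcfpat{pattern_color}")  # index into the color table
    if fill_color is not None:
        parts.append(f"\\chcbpat{fill_color}")     # index into the color table
    return "".join(parts)


print(char_shading(25, pattern_color=1, fill_color=2))
```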

The color, width, and border style keywords for character borders are the same as the keywords for paragraph borders.

Control word Meaning
Track Changes (Revision Mark) properties
\revised Text has been added since revision marking was turned on.
\revauthN Index into the revision table. The content of the Nth group in the revision table is considered to be the author of that revision.
\revdttmN Time of the revision. The 32-bit DTTM structure is emitted as a long integer.
\crauthN Index into the revision table. The content of the Nth group in the revision table is considered to be the author of that revision.

Note This keyword is used to indicate formatting revisions, such as bold, italic, and so on.

\crdateN Time of the revision. The 32-bit DTTM structure is emitted as a long integer.
\revauthdelN Index into the revision table. The content of the Nth group in the revision table is considered to be the author of that deletion.
\revdttmdelN Time of the deletion. The 32-bit DTTM structure is emitted as a long integer.
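This section does not describe the bit layout of the DTTM structure. The Python sketch below assumes the layout documented for Word's binary file format: minutes in bits 0-5, hours in bits 6-10, day of month in bits 11-15, month in bits 16-19, years since 1900 in bits 20-28, and weekday (0 = Sunday) in bits 29-31. Treat that layout, and the function names, as assumptions to verify against the DTTM definition you are targeting.

```python
from datetime import datetime


def decode_dttm(n: int) -> datetime:
    """Decode a \\revdttmN-style argument under the assumed DTTM layout."""
    n &= 0xFFFFFFFF  # the value may have been written as a signed long
    minute = n & 0x3F
    hour = (n >> 6) & 0x1F
    day = (n >> 11) & 0x1F
    month = (n >> 16) & 0x0F
    year = 1900 + ((n >> 20) & 0x1FF)
    return datetime(year, month, day, hour, minute)  # weekday bits are ignored


def encode_dttm(dt: datetime) -> int:
    """Encode a datetime as an unsigned 32-bit DTTM value (assumed layout)."""
    weekday = (dt.weekday() + 1) % 7  # assumed convention: 0 = Sunday
    return (dt.minute | dt.hour << 6 | dt.day << 11 | dt.month << 16
            | (dt.year - 1900) << 20 | weekday << 29)


stamp = datetime(1997, 10, 28, 14, 30)
print(decode_dttm(encode_dttm(stamp)))
```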

Associated Character Properties

Bidirectional-aware text processors often need to associate a Latin (or other left-to-right) font with an Arabic or Hebrew (or other right-to-left) font. The association is needed to match commonly used pairs of fonts in name, size, and other attributes. Although RTF defines a broad variety of associated character properties, any implementation may choose not to implement a particular associated character property and share the property between the Latin and Arabic fonts.

Property association uses the following syntax:

<atext> <ltrrun> | <rtlrun>
<ltrrun> \rtlch \af & <aprops>* \ltrch <ptext>
<rtlrun> \ltrch \af & <aprops>* \rtlch <ptext>

Here are some examples of property association:

\ltrch\af2\ab\aul\rtlch\ul Sample Text

This is a right-to-left run. Text will use the default bidirectional font, and will be underlined. The left-to-right font associated with this run is font 2 (in the font table) with bolding and underlining.

\plain\rtlch\ltrch Sample Text 

This is a left-to-right run. The right-to-left font and the left-to-right font use the default font (specified by \deff).

\rtlch\af5\ab\ai\ltrch\ul Sample Text

This is a left-to-right run. The right-to-left font is font 5, bold and italicized. The left-to-right font is the default font, underlined. If the reader does not support underlining in the associated font, both fonts will be underlined.
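The <ltrrun> and <rtlrun> productions can be sketched as a small Python helper. The function associated_run and its parameters are hypothetical; the sketch only concatenates control words in the order the syntax gives (associated direction and properties first, then the direction and properties of the run itself) and does not validate them.

```python
def associated_run(text: str, rtl: bool,
                   assoc_props: str = "", main_props: str = "") -> str:
    """Build a run with associated character properties.

    rtl=False builds an <ltrrun>; rtl=True builds an <rtlrun>.
    """
    if rtl:
        # <rtlrun>: \ltrch <associated props> \rtlch <run props> text
        return f"\\ltrch{assoc_props}\\rtlch{main_props} {text}"
    # <ltrrun>: \rtlch <associated props> \ltrch <run props> text
    return f"\\rtlch{assoc_props}\\ltrch{main_props} {text}"


# Left-to-right run, associated font 5 in bold italic, run text underlined.
print(associated_run("Sample Text", rtl=False,
                     assoc_props="\\af5\\ab\\ai", main_props="\\ul"))
```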

The property association control words (described as <aprops> in the syntax description) are listed in the following table. Some control words (indicated in the following table by an asterisk following the description) can be turned off by the control word followed by 0.

Control word Meaning
\ab Associated font is bold.*
\acaps Associated font is all capitals.*
\acfN Associated foreground color (the default is 0).
\adnN Subscript position of the associated font in half-points (the default is 6).
\aexpndN Expansion or compression of the space between characters in quarter-points; a negative value compresses (the default is 0).
\afN Associated font number (the default is 0).
\afsN Associated font size in half-points (the default is 24).
\ai Associated font is italic.*
\alangN Language ID for the associated font. (This uses the same language ID codes as \langN, listed in the language table earlier in this section.)
\aoutl Associated font is outline.*
\ascaps Associated font is small capitals.*
\ashad Associated font is shadow.*
\astrike Associated font is strikethrough.*
\aul Associated font is continuous underline. \aul0 turns off all underlining for the associated font.
\auld Associated font is dotted underline.
\auldb Associated font is double underline.
\aulnone Associated font is no longer underlined.
\aulw Associated font is word underline.
\aupN Superscript position in half-points (the default is 6).

Highlighting

This property applies highlighting to text. The formatting is not a character format, so it cannot be part of a style definition.

Control Word Definition
\highlightN Highlights the specified text. N specifies the color.

For \highlight, the N argument can have the following values:

Value Description
1 Black
2 Blue
3 Cyan
4 Green
5 Magenta
6 Red
7 Yellow
8 Unused
9 Dark Blue
10 Dark Cyan
11 Dark Green
12 Dark Magenta
13 Dark Red
14 Dark Yellow
15 Dark Gray
16 Light Gray

Special Characters

The RTF Specification includes control words for special characters (described as <spec> in the character-text syntax description). If a special-character control word is not recognized by the RTF reader, it is ignored, and the text following it is considered plain text. The RTF Specification is flexible enough to allow new special characters to be added for interchange with other software.

The special RTF characters are listed in the following table.

Control word Meaning
\chdate Current date (as in headers).
\chdpl Current date in long format (for example, Tuesday, October 28, 1997).
\chdpa Current date in abbreviated format (for example, Tue, Oct 28, 1997).
\chtime Current time (as in headers).
\chpgn Current page number (as in headers).
\sectnum Current section number (as in headers).
\chftn Automatic footnote reference (footnotes follow in a group).
\chatn Annotation reference (annotation text follows in a group).
\chftnsep Anchoring character for footnote separator.
\chftnsepc Anchoring character for footnote continuation.
\cell End of table cell.
\row End of table row.
\par End of paragraph.
\sect End of section and paragraph.
\page Required page break.
\column Required column break.
\line Required line break (no paragraph break).
\softpage Nonrequired page break. Emitted as it appears in galley view.
\softcol Nonrequired column break. Emitted as it appears in galley view.
\softline Nonrequired line break. Emitted as it appears in galley view.
\softlheightN Nonrequired line height. This is emitted as a prefix to each line.
\tab Tab character.
\emdash Em-dash (—).
\endash En-dash (–).
\emspace Nonbreaking space equal to width of character "m" in current font. Some old RTF writers use the construct ‘{\emspace  }’ (with two spaces before the closing brace) to trick readers unaware of \emspace into parsing a regular space. A reader should interpret this as an \emspace and a regular space.
\enspace Nonbreaking space equal to width of character "n" in current font. Some old RTF writers use the construct ‘{\enspace  }’ (with two spaces before the closing brace) to trick readers unaware of \enspace into parsing a regular space. A reader should interpret this as an \enspace and a regular space.
\bullet Bullet character.
\lquote Left single quotation mark.
\rquote Right single quotation mark.
\ldblquote Left double quotation mark.
\rdblquote Right double quotation mark.
\| Formula character. (Used by Word 5.1 for the Macintosh as the beginning delimiter for a string of formula typesetting commands.)
\~ Nonbreaking space.
\- Optional hyphen.
\_ Nonbreaking hyphen.
\: Specifies a subentry in an index entry.
\* Marks a destination whose text should be ignored if not understood by the RTF reader.
\'hh A hexadecimal value, based on the specified character set (may be used to identify 8-bit values).
\ltrmark The following characters should be displayed from left to right; usually found at the start of \ltrch runs.
\rtlmark The following characters should be displayed from right to left; usually found at the start of \rtlch runs.
\zwj Zero-width joiner. This is used for ligating (joining) characters.
\zwnj Zero-width nonjoiner. This is used for unligating a character.
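As an example of handling the \'hh control symbol, the Python sketch below replaces each escape with the character it denotes. The function name decode_hex_escapes is hypothetical, and the default of code page 1252 is an assumption for illustration; a real reader would take the character set from the document. The sketch decodes one byte at a time, so it only suits single-byte character sets.

```python
import re

# \'hh: backslash, apostrophe, two hexadecimal digits.
HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")


def decode_hex_escapes(text: str, encoding: str = "cp1252") -> str:
    r"""Replace each \'hh escape with its character in the given encoding."""
    def replace(match):
        byte_value = int(match.group(1), 16)
        return bytes([byte_value]).decode(encoding)
    return HEX_ESCAPE.sub(replace, text)


print(decode_hex_escapes(r"caf\'e9"))
```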

A carriage return (character value 13) or linefeed (character value 10) is treated as a \par control word if the character is preceded by a backslash. The backslash is required; without it, the RTF reader ignores the carriage return or linefeed. (You may also want to insert a carriage-return/linefeed pair without backslashes at least every 255 characters for better text transmission over communication lines.)

A tab (character value 9) should be treated as a \tab control word. Not all RTF readers understand this; therefore, an RTF writer should always emit the control word for tabs.
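The two rules above can be sketched as a normalization pass in Python. This is a simplified illustration, not a full RTF tokenizer: the function name is hypothetical, and it does not treat binary data or other contexts specially. A backslash followed by any other character is copied through as a pair so that an escaped backslash is not misread.

```python
def normalize_whitespace_controls(rtf: str) -> str:
    """Rewrite backslash-CR/LF as \\par and raw tabs as \\tab; drop bare CR/LF."""
    out = []
    i = 0
    while i < len(rtf):
        ch = rtf[i]
        if ch == "\\" and i + 1 < len(rtf):
            nxt = rtf[i + 1]
            if nxt in "\r\n":
                out.append("\\par ")   # backslash + CR or LF acts as \par
            else:
                out.append(ch + nxt)   # any other escape is copied unchanged
            i += 2
        elif ch in "\r\n":
            i += 1                     # a bare CR or LF is ignored
        elif ch == "\t":
            out.append("\\tab ")       # a raw tab acts as \tab
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(normalize_whitespace_controls("one\\\ntwo\r\nthree\tfour"))
```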

The following are the code values for the special characters listed.

Control word Word for Windows and OS/2 Apple Macintosh
\bullet 149 0xA5
\endash 150 0xD0
\emdash 151 0xD1
\lquote 145 0xD4
\rquote 146 0xD5
\ldblquote 147 0xD2
\rdblquote 148 0xD3
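For readers that work in Unicode, the Windows column can be converted with a standard codec. The Python sketch below assumes that the Windows values are code points in code page 1252; that assumption, and the names WINDOWS_CODES and to_unicode, are not part of the table above.

```python
# Windows column of the table above (decimal code values).
WINDOWS_CODES = {
    "bullet": 149,
    "endash": 150,
    "emdash": 151,
    "lquote": 145,
    "rquote": 146,
    "ldblquote": 147,
    "rdblquote": 148,
}


def to_unicode(control: str) -> str:
    """Return the Unicode character for a special-character control word."""
    # Assumption: the Windows code values are code page 1252 code points.
    return bytes([WINDOWS_CODES[control]]).decode("cp1252")


for name in WINDOWS_CODES:
    print(name, hex(ord(to_unicode(name))))
```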