Character Constants

Character constants are one or more members of the “source character set,” the character set in which a program is written, surrounded by single quotation marks ('). They are used to represent characters in the “execution character set,” the character set on the machine where the program executes.

Microsoft Specific

For Microsoft C++, the source and execution character sets are both ASCII.

There are three kinds of character constants:

Normal character constants

Multicharacter constants

Wide character constants

Note:

Use wide character constants in place of multicharacter constants to ensure portability and upward compatibility.

Character constants are specified as one or more characters enclosed in single quotation marks. For example:

char ch = 'x';          // Specify normal character constant.
int mbch = 'ab';        // Specify system-dependent
                        // multicharacter constant.
wchar_t wcch = L'ab';   // Specify wide character constant.

Note that mbch is of type int. If it were declared as type char, the second byte would not be retained. The number of meaningful characters in a multicharacter constant is equal to the expression sizeof( int ). For 16-bit targets (/G0, /G1, and /G2 compilation options), this is 2; for 32-bit targets (flat-model compilation), this is 4. Specifying too many characters for a multicharacter constant generates an error message.

Syntax

character-constant:
'c-char-sequence'
L'c-char-sequence'

c-char-sequence:
c-char
c-char-sequence c-char

c-char:
any member of the source character set except the single quote ('),
backslash (\), or newline character
escape-sequence

escape-sequence:
simple-escape-sequence
octal-escape-sequence
hexadecimal-escape-sequence

simple-escape-sequence: one of
\' \" \? \\
\a \b \f \n \r \t \v

octal-escape-sequence:
\octal-digit
\octal-digit octal-digit
\octal-digit octal-digit octal-digit

hexadecimal-escape-sequence:
\x hexadecimal-digit
hexadecimal-escape-sequence hexadecimal-digit

Microsoft C++ supports normal, multicharacter, and wide character constants. Use wide character constants to specify members of the extended execution character set (for example, to support an international application). Normal character constants have type char, multicharacter constants have type int, and wide character constants have type wchar_t. (The type wchar_t is defined in the standard include files STDDEF.H, STDLIB.H, and STRING.H. The wide-character functions, however, are prototyped only in STDLIB.H.)
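The types named above can be checked at compile time. The following is a minimal sketch, assuming a compiler that supports C++11 (for decltype and static_assert); the helper name check_constant_sizes is hypothetical and not part of any library. Multicharacter constants are left out because their support and value vary between implementations.

```cpp
#include <cassert>
#include <type_traits>

// A normal character constant has type char.
static_assert(std::is_same<decltype('x'), char>::value,
              "'x' should have type char");

// A wide character constant (L prefix) has type wchar_t.
static_assert(std::is_same<decltype(L'x'), wchar_t>::value,
              "L'x' should have type wchar_t");

// Hypothetical helper: confirm at run time that each constant
// occupies the storage of its type.
inline void check_constant_sizes() {
    assert(sizeof('x') == 1);                 // sizeof(char) is always 1
    assert(sizeof(L'x') == sizeof(wchar_t));  // size of wchar_t varies by platform
}
```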

The only difference in specification between normal and wide character constants is that wide character constants are preceded by the letter L. For example:

char schar = 'x';               // Normal character constant
wchar_t wchar = L'\x81\x19';    // Wide character constant

Table 1.3 shows reserved or nongraphic characters that are system dependent or not allowed within character constants. These characters should be represented with escape sequences.

Table 1.3 C++ Reserved or Nongraphic Characters

Character               ASCII Representation   ASCII Value   Escape Sequence
Newline                 NL (LF)                10 or 0x0a    \n
Horizontal tab          HT                     9             \t
Vertical tab            VT                     11 or 0x0b    \v
Backspace               BS                     8             \b
Carriage return         CR                     13 or 0x0d    \r
Formfeed                FF                     12 or 0x0c    \f
Alert                   BEL                    7             \a
Backslash               \                      92 or 0x5c    \\
Question mark           ?                      63 or 0x3f    \?
Single quotation mark   '                      39 or 0x27    \'
Double quotation mark   "                      34 or 0x22    \"
Octal number            ooo                    –––           \ooo
Hexadecimal number      hhh                    –––           \xhhh
Null character          NUL                    0             \0
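The values in Table 1.3 can be verified directly. The sketch below assumes an ASCII-compatible execution character set and a C++11 compiler; code_of and check_newline_forms are hypothetical helper names introduced only for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: numeric code of a character as a non-negative int.
constexpr int code_of(char c) { return static_cast<unsigned char>(c); }

// Compile-time checks; these hold only when the execution
// character set is ASCII-compatible.
static_assert(code_of('\n') == 10, "newline");
static_assert(code_of('\t') == 9,  "horizontal tab");
static_assert(code_of('\v') == 11, "vertical tab");
static_assert(code_of('\b') == 8,  "backspace");
static_assert(code_of('\r') == 13, "carriage return");
static_assert(code_of('\f') == 12, "formfeed");
static_assert(code_of('\a') == 7,  "alert");
static_assert(code_of('\0') == 0,  "null character");

// Hypothetical runtime check: the simple escape and its numeric
// forms all name the same character.
inline void check_newline_forms() {
    assert('\n' == '\012');  // octal 12 is decimal 10
    assert('\n' == '\x0a');  // hexadecimal 0a is decimal 10
}
```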

Important:

If the character following the backslash does not specify a legal escape sequence, the result is implementation defined. In Microsoft C++, the character following the backslash is taken literally, as though the escape were not present, and a level 1 warning (“unrecognized character escape sequence”) is issued.

Octal escape sequences, specified in the form \ooo, consist of a backslash and one, two, or three octal digits. Hexadecimal escape sequences, specified in the form \xhhh, consist of the characters \x followed by a sequence of hexadecimal digits. Unlike octal escape sequences, there is no limit on the number of hexadecimal digits in an escape sequence.

Octal escape sequences are terminated by the first character that is not an octal digit, or after three octal digits have been read. For example:

wchar_t och = L'\076a';   // Sequence terminates at a
char ch = '\233';         // Sequence terminates after 3 characters
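The termination rules for octal escapes are easier to observe in string literals, where the characters left over after the escape remain visible. This is a sketch under the assumptions of an ASCII-compatible character set and a C++11 compiler; literal_length and check_octal_values are hypothetical helpers, not library functions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of characters in a string literal,
// not counting the terminating null character.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N - 1; }

// The escape stops at 'a', which is not an octal digit,
// so the literal holds two characters: '\076' and 'a'.
static_assert(literal_length("\076a") == 2, "escape ends at first non-octal digit");

// At most three octal digits are consumed, so the literal
// holds two characters: '\101' and '2'.
static_assert(literal_length("\1012") == 2, "escape ends after three digits");

// Hypothetical runtime check of the resulting values (ASCII assumed).
inline void check_octal_values() {
    assert('\076' == '>');   // octal 76 is decimal 62
    assert('\101' == 'A');   // octal 101 is decimal 65
    // Octal 233 is decimal 155; the cast avoids depending on
    // whether plain char is signed.
    assert(static_cast<unsigned char>('\233') == 155);
}
```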

Similarly, hexadecimal escape sequences terminate at the first character that is not a hexadecimal digit. Because hexadecimal digits include the letters a through f (and A through F), make sure the escape sequence terminates at the intended digit.
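One common way to end a hexadecimal escape at the intended digit, when it is followed by a character that is itself a hexadecimal digit, is to split a string literal so that the escape closes with the first piece. The sketch below assumes an ASCII-compatible character set and a C++11 compiler; literal_length and check_hex_values are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of characters in a string literal,
// not counting the terminating null character.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N - 1; }

// Written as one literal, "\x1bcd" would be read as a single escape
// with the digits 1bcd. Splitting the literal ends the escape after
// "1b"; adjacent string literals are concatenated after escapes are
// processed, giving three characters: '\x1b', 'c', and 'd'.
static_assert(literal_length("\x1b" "cd") == 3, "escape ends at the literal boundary");

// There is no limit on the number of digits, so leading zeros
// are absorbed into a single escape.
static_assert(literal_length("\x000041") == 1, "one character despite six digits");

// Hypothetical runtime check of the resulting values (ASCII assumed).
inline void check_hex_values() {
    assert('\x41' == 'A');       // hexadecimal 41 is decimal 65
    assert('\x000041' == 'A');   // same value with leading zeros
}
```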

Because the single quotation mark (') encloses character constants, use the escape sequence \' to represent enclosed single quotation marks. The double quotation mark (") can be represented without an escape sequence. The backslash character (\) is a line-continuation character when placed at the end of a line. If you want a backslash character to appear within a character constant, you must type two backslashes in a row (\\). (See Appendix A, “Phases of Translation,” for more information about line continuation.)
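As a brief illustration of the quoting rules above, the following sketch defines the three characters as constants. It assumes an ASCII-compatible character set and a C++11 compiler; the constant names and check_quote_codes are hypothetical.

```cpp
#include <cassert>

constexpr char single_quote         = '\'';  // escape required inside '...'
constexpr char double_quote         = '"';   // no escape needed in a character constant
constexpr char escaped_double_quote = '\"';  // the escaped form is also accepted
constexpr char backslash            = '\\';  // two backslashes give one backslash

// Both spellings of the double quotation mark name the same character.
static_assert(double_quote == escaped_double_quote, "same character");

// Hypothetical runtime check against the ASCII values in Table 1.3.
inline void check_quote_codes() {
    assert(single_quote == 39);
    assert(double_quote == 34);
    assert(backslash == 92);
}
```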