String Literals

A string literal consists of zero or more characters from the source character set surrounded by double quotation marks ("). A string literal represents a sequence of characters, which, taken together, forms a null-terminated string. While some C++ class libraries, including the Microsoft libraries, supply sophisticated string-handling functionality, the strings defined in the language are relatively simple.

Syntax

string-literal:
"s-char-sequence opt"
L"s-char-sequence opt"

s-char-sequence:
s-char
s-char-sequence s-char

s-char:
any member of the source character set except double quotation marks ("),
backslash (\), or newline
escape-sequence

String literals have these types:

char[n], where n is the length of the string (in characters) plus 1 for the terminating '\0' that marks the end of the string

wchar_t[n], for wide-character strings, where n likewise includes the terminating null character
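The size rule can be checked with sizeof. This is a minimal sketch; the helper name literalLength is illustrative and does not come from the text. Note that in standard C++ the array elements are const-qualified.

```cpp
#include <cstddef>
#include <cstring>

// A literal of four characters occupies five array elements:
// '1', '2', '3', '4', and the terminating '\0'.
static_assert(sizeof("1234") == 5, "four characters plus the terminator");

// An empty literal still holds the terminator.
static_assert(sizeof("") == 1, "only the terminating null character");

// Hypothetical helper: the length seen by the C string functions,
// which stop at the terminator and do not count it.
std::size_t literalLength()
{
    return std::strlen("1234");
}
```

Calling literalLength() yields the character count without the terminator, one less than the sizeof result.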

The result of modifying a string constant is undefined. For example:

char *szStr = "1234";
szStr[2] = 'A';  // Results undefined
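When a modifiable string is needed, one common approach is to initialize a character array from the literal, so that the program changes its own copy. The sketch below assumes a hypothetical helper named modifyCopy.

```cpp
#include <cstring>

// Hypothetical helper: copies the literal into a writable array,
// changes one character, and reports whether the result is "12A4".
bool modifyCopy()
{
    char szBuf[] = "1234";  // the array owns its storage, initialized from the literal
    szBuf[2] = 'A';         // well defined: modifies the array, not the literal
    return std::strcmp(szBuf, "12A4") == 0;
}
```

Because szBuf is an array rather than a pointer to the literal, the assignment has defined behavior.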

Microsoft Specific

In some cases, identical string literals may be “folded” to save space in the executable file. In string-literal folding, the compiler causes all references to a particular string literal to point to the same location in memory, instead of having each reference point to a separate instance of the string literal:

#include <iostream.h>
#include <string.h>

// Define two pointers that refer to identical
// string literals.
char *sz1 = "A String";
char *sz2 = "A String";

void main()
{
    // Reverse sz1
    for( int i = 0, j = strlen( sz1 ) - 1; i < j; ++i, --j )
    {
        char chTmp = sz1[i];
        sz1[i] = sz1[j];
        sz1[j] = chTmp;
    }

    // Display the result of the program.
    cout << "sz1 = " << sz1 << endl;
    cout << "sz2 = " << sz2 << endl;
}

If the literals are not folded, the output of the program is:

sz1 = gnirtS A
sz2 = A String

However, if the strings are folded, the output of the program is:

sz1 = gnirtS A
sz2 = gnirtS A
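Folding can be observed without modifying a literal by comparing addresses. The following sketch uses two hypothetical helpers. Whether the two pointers compare equal is unspecified and may change with the compiler and its options, so a program should not depend on either answer.

```cpp
#include <cstring>

// Hypothetical helper: reports whether two identical literals share storage.
// The result is unspecified; it depends on whether the compiler folds them.
bool literalsShareStorage()
{
    const char *sz1 = "A String";
    const char *sz2 = "A String";
    return sz1 == sz2;  // compares addresses, not contents
}

// The contents compare equal whether or not the literals are folded.
bool literalsHaveEqualContents()
{
    return std::strcmp("A String", "A String") == 0;
}
```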

When specifying string literals, adjacent strings are concatenated. Therefore, this declaration:

char szStr[] = "12" "34";

is identical to this declaration:

char szStr[] = "1234";

This concatenation of adjacent strings makes it easy to specify long strings across multiple lines:

cout << "Four score and seven years "
        "ago, our forefathers brought forth "
        "upon this continent a new nation.";

In the preceding example, the entire string “Four score and seven years ago, our forefathers brought forth upon this continent a new nation.” is spliced together. This string might also have been specified using line splicing as follows:

cout << "Four score and seven years \
ago, our forefathers brought forth \
upon this continent a new nation.";
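The two techniques produce the same characters, which the following sketch checks on a shortened string. The names szConcatenated, szSpliced, and sameText are illustrative only.

```cpp
#include <cstring>

// Adjacent literals: the compiler joins the pieces into one string.
const char szConcatenated[] = "Four score and seven years "
                              "ago";

// Line splicing: the backslash and the newline are removed, so the
// continuation must begin in the first column to avoid extra spaces.
const char szSpliced[] = "Four score and seven years \
ago";

// Same text, therefore the same array size.
static_assert(sizeof(szConcatenated) == sizeof(szSpliced),
              "both forms hold the same number of characters");

// Hypothetical helper: true when both arrays hold identical text.
bool sameText()
{
    return std::strcmp(szConcatenated, szSpliced) == 0;
}
```

With line splicing, any indentation on the continuation line becomes part of the string, which is one reason adjacent-literal concatenation is often easier to format.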

After all adjacent strings in the constant have been concatenated, the null character, '\0', is appended to provide an end-of-string marker for C string-handling functions.

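When a string ends with an octal or hexadecimal escape sequence and the next string begins with a digit, string concatenation can yield surprising results, because each escape sequence is interpreted before adjacent strings are joined. Consider the following two declarations: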

char szStr1[] = "\01" "23";
char szStr2[] = "\0123";

While it is natural to assume that szStr1 and szStr2 contain the same values, the values they actually contain are shown in Figure 1.1.
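The difference can also be checked in code. In this sketch, firstElementsDiffer is a hypothetical helper; the element values follow from the rule that an octal escape sequence takes at most three digits.

```cpp
// Escape sequences are interpreted before adjacent literals are joined.
const char szStr1[] = "\01" "23";  // '\01', '2', '3', '\0'
const char szStr2[] = "\0123";     // '\012', '3', '\0'

static_assert(sizeof(szStr1) == 4, "three characters plus the terminator");
static_assert(sizeof(szStr2) == 3, "two characters plus the terminator");

// Hypothetical helper: true when the leading elements hold the values
// described above (octal 01 is 1, octal 012 is 10).
bool firstElementsDiffer()
{
    return szStr1[0] == 1 && szStr2[0] == 10;
}
```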

Microsoft Specific

The maximum length of a string literal is 2,048 bytes. This limit applies both to strings of type char[] and wchar_t[].

Determine the size of string objects by counting the number of characters and adding 1 for the terminating '\0'.

Because the double quotation mark (") encloses strings, use the escape sequence (\") to represent enclosed double quotation marks. The single quotation mark (') can be represented without an escape sequence. The backslash character (\) is a line-continuation character when placed at the end of a line. If you want a backslash character to appear within a string, you must type two backslashes (\\). (For more information about line continuation, see Appendix A, “Phases of Translation.”)
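A short sketch of these escape rules follows. The array names and the helper pathHasSingleBackslash are illustrative; the sizes count the stored characters plus the terminator.

```cpp
const char szQuoted[] = "\"Hi\"";    // stores "Hi" including both quotation marks
const char szSingle[] = "it's";      // a single quotation mark needs no escape
const char szPath[]   = "C:\\TEMP";  // two backslashes in source, one in the string

static_assert(sizeof(szQuoted) == 5, "four characters plus the terminator");
static_assert(sizeof(szSingle) == 5, "four characters plus the terminator");
static_assert(sizeof(szPath) == 8, "seven characters plus the terminator");

// Hypothetical helper: true when exactly one backslash separates
// the colon from the letter T in the stored string.
bool pathHasSingleBackslash()
{
    return szPath[1] == ':' && szPath[2] == '\\' && szPath[3] == 'T';
}
```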

To specify a string of type wide character (wchar_t[]), precede the opening double quotation mark with the character L. For example:

wchar_t wszStr[] = L"1a1g";
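As a rough illustration, the size of a wide-character string can be checked the same way as a narrow one, with each element occupying sizeof(wchar_t) bytes. The helper wideLength is hypothetical.

```cpp
#include <cstddef>
#include <cwchar>

const wchar_t wszWide[] = L"1a1g";

// Four wide characters plus the terminating wide null character.
static_assert(sizeof(wszWide) == 5 * sizeof(wchar_t),
              "five elements of type wchar_t");

// Hypothetical helper: length in wide characters, excluding the terminator.
std::size_t wideLength()
{
    return std::wcslen(wszWide);
}
```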

All normal escape codes listed in “Character Constants” are valid in string constants. For example:

cout << "First line\nSecond line";
cout << "Error! Take corrective action\a";

Because a hexadecimal escape code terminates at the first character that is not a hexadecimal digit, string constants with embedded hexadecimal escape codes can produce unexpected results. The following example is intended to create a string literal containing ASCII 5, followed by the characters f, i, v, and e:

"\x05five"

The actual result is a hexadecimal 5F, which is the ASCII code for an underscore, followed by the characters ive. The following example produces the desired results:

"\005five"     // Use octal constant.
"\x05" "five"  // Use string splicing.
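The three forms can be compared in a small sketch. The array and helper names are illustrative, and the check against 0x5F assumes an ASCII execution character set, as the text does.

```cpp
const char szWrong[]    = "\x05five";     // 0x5F, then 'i', 'v', 'e'
const char szOctal[]    = "\005five";     // 5, then 'f', 'i', 'v', 'e'
const char szHexSplit[] = "\x05" "five";  // 5, then 'f', 'i', 'v', 'e'

static_assert(sizeof(szWrong) == 5, "the escape absorbed the letter f");
static_assert(sizeof(szOctal) == 6, "five characters plus the terminator");
static_assert(sizeof(szHexSplit) == 6, "five characters plus the terminator");

// Hypothetical helper: true when the hexadecimal escape consumed the 'f'.
bool hexEscapeSwallowsF()
{
    return szWrong[0] == 0x5F && szWrong[1] == 'i';
}

// Hypothetical helper: true when both corrected forms start with 5 and 'f'.
bool fixesAgree()
{
    return szOctal[0] == 5 && szOctal[1] == 'f'
        && szHexSplit[0] == 5 && szHexSplit[1] == 'f';
}
```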