Chapter 1 Lexical Conventions

This chapter introduces the fundamental elements of a C++ program as the compiler sees them. These “lexical elements” are used to construct statements, definitions, declarations, and so on, which in turn make up complete programs. The lexical elements are:

Tokens

Comments

Identifiers

C++ keywords

Punctuators

Operators

Literals

Although the C++ operators are summarized in this chapter, a complete discussion of operators is deferred until Chapter 4, “Expressions.”

C++ programs, like C programs, consist of one or more files. Each of these files is translated in the following conceptual order (the actual order follows the “as if” rule: translation must occur as if these steps had been followed):

1. Lexical tokenizing. In this translation phase, character mapping, trigraph processing, line splicing, and tokenization are performed.

2. Preprocessing. This translation phase brings in ancillary source files referenced by #include directives, handles “stringizing” and “charizing” directives, and performs token pasting and macro expansion (see Chapter 13, “Preprocessing,” for more information about preprocessor behavior). The result of the preprocessing phase is a sequence of “tokens,” which, taken together, defines a “translation unit.”

Preprocessor directives always begin with the number-sign (#) character (that is, the first non-white-space character on the line must be a number sign). Only one preprocessor directive can appear on a given line. For example:

#include <iostream>   // Include the text of <iostream> in the
                      // translation unit.
#define NDEBUG        // Define NDEBUG (its replacement text
                      // is empty).

3. Code generation. This translation phase uses the tokens generated in the preprocessing phase to generate object code.

During this phase, syntactic and semantic checking of the source code is performed.

See Appendix A, “Phases of Translation,” for more specific information about how a source program is translated.

Note:

The C++ preprocessor is a strict superset of the ANSI C preprocessor. It differs in its support for the single-line comment, in its definition of the __cplusplus constant, and in its support for the C++ operators:

.*

->*

::

(For more information about these operators, see “Operators” and Chapter 4, “Expressions”; for more information about comments, see “Comments.”)
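The differences listed in the note can be seen together in a small sketch. The namespace demo, the structure Point, and the two functions are hypothetical names chosen for illustration. The sketch assumes only that the file is compiled as C++, which it checks through the __cplusplus constant.

```cpp
// __cplusplus is defined only when the file is translated as C++.
#ifndef __cplusplus
#error "This sketch must be compiled as C++"
#endif

namespace demo {             // hypothetical namespace for illustration
struct Point {
    int x;
    int y;
    int sum() const { return x + y; }
};
}

// Uses :: to name a member of demo::Point and .* to apply a
// pointer to a data member to an object.
int member_via_object() {
    demo::Point p = {3, 4};
    int demo::Point::*field = &demo::Point::y;
    return p.*field;         // reads p.y
}

// Uses ->* to call a member function through a pointer to an object.
int member_via_pointer() {
    demo::Point p = {3, 4};
    demo::Point* pp = &p;
    int (demo::Point::*fn)() const = &demo::Point::sum;
    return (pp->*fn)();      // calls p.sum()
}
```

Here member_via_object() returns 4 and member_via_pointer() returns 7. Each of .*, ->*, and :: is a single token; a C preprocessor that did not recognize them would split them into several tokens.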