19.1.5 Problem #5: Cast versus Parenthesized Expression

Consider the productions:

CastExpression:
( PrimitiveType ) UnaryExpression
( ReferenceType ) UnaryExpressionNotPlusMinus

Now consider the partial input:

class Problem5 { Problem5() { super((matthew)

When the parser is considering the token matthew, with one-token lookahead to the symbol ), it cannot yet tell whether (matthew) will be a parenthesized expression, as in:

super((matthew), 9);

or a cast, as in:

super((matthew)baz, 9);

Consequently, after the parser reduces matthew to the nonterminal Name, it cannot tell with only one-token lookahead whether Name should be further reduced to PostfixExpression and ultimately to Expression (for a parenthesized expression) or to ClassOrInterfaceType and then to ReferenceType (for a cast). The productions shown above therefore result in a grammar that is not LALR(1).
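
The two readings can be seen side by side in a small, self-contained sketch. It is illustrative only: the class Problem5Demo, the nested type matthew, and the describe overloads are hypothetical names introduced here, and ordinary method calls stand in for the super(...) calls used above. The only point is that the same token sequence (matthew) is an expression in one method and a cast in the other, and that the token after the closing parenthesis is what distinguishes them.

```java
class Problem5Demo {
    // Hypothetical type that shares its name with the local variable below.
    static class matthew { }

    // Two overloads, so the result reveals how each argument was interpreted.
    static String describe(int value, int n) { return "int"; }
    static String describe(matthew value, int n) { return "matthew"; }

    static String parenthesized() {
        int matthew = 4;
        // ")" is followed by ",": (matthew) is a parenthesized expression.
        return describe((matthew), 9);
    }

    static String cast() {
        Object baz = new matthew();
        // ")" is followed by an identifier: (matthew) is a cast to the type matthew.
        return describe((matthew) baz, 9);
    }

    public static void main(String[] args) {
        System.out.println(parenthesized());
        System.out.println(cast());
    }
}
```

In both methods the parser has seen exactly the same tokens up to and including matthew, with ) as the lookahead, which is the situation described above.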

The solution is to eliminate the use of the nonterminal ReferenceType in the definition of CastExpression, which requires some reworking of both alternatives to avoid other ambiguities:

CastExpression:
( PrimitiveType Dims_opt ) UnaryExpression
( Expression ) UnaryExpressionNotPlusMinus
( Name Dims ) UnaryExpressionNotPlusMinus

This allows the parser to reduce matthew to Expression and then leave it there, delaying the decision as to whether a parenthesized expression or a cast is in progress. Inappropriate variants such as:

(int[])+3

and:

(matthew+1)baz

must then be weeded out and rejected by a later stage of compiler analysis.
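
The later check can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis any particular compiler performs: the class CastChecker, the enum Parenthesized, and the method isValidCast are hypothetical, and the sketch assumes that an earlier stage has already classified what the relaxed grammar found between the parentheses and whether the operand begins with a unary + or -.

```java
final class CastChecker {
    /** Hypothetical classification of what the relaxed grammar matched inside ( ). */
    enum Parenthesized {
        PRIMITIVE_TYPE,        // e.g. (int)
        PRIMITIVE_ARRAY_TYPE,  // e.g. (int[])
        NAME,                  // e.g. (matthew), parsed as Expression
        NAME_ARRAY_TYPE,       // e.g. (matthew[])
        OTHER_EXPRESSION       // e.g. (matthew+1)
    }

    /**
     * Returns true if a construct accepted by the relaxed CastExpression
     * productions is also a legal cast under the original productions.
     */
    static boolean isValidCast(Parenthesized inside, boolean operandStartsWithPlusOrMinus) {
        switch (inside) {
            case PRIMITIVE_TYPE:
                // A cast to a primitive type accepts any UnaryExpression.
                return true;
            case PRIMITIVE_ARRAY_TYPE:
            case NAME:
            case NAME_ARRAY_TYPE:
                // These are reference types, so the operand must be a
                // UnaryExpressionNotPlusMinus. For the two Name forms the
                // relaxed grammar already enforces this; the check matters
                // for the primitive array case.
                return !operandStartsWithPlusOrMinus;
            default:
                // An arbitrary expression in parentheses is not a type.
                return false;
        }
    }

    public static void main(String[] args) {
        // (int[])+3 : array type followed by a unary plus
        System.out.println(isValidCast(Parenthesized.PRIMITIVE_ARRAY_TYPE, true));
        // (matthew+1)baz : the parenthesized part is not a name
        System.out.println(isValidCast(Parenthesized.OTHER_EXPRESSION, false));
        // (matthew)baz : an ordinary cast to a reference type
        System.out.println(isValidCast(Parenthesized.NAME, false));
    }
}
```

Under these assumptions, the two inappropriate variants shown above are rejected, while (matthew)baz and (int)+3 are accepted.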

The remaining sections of this chapter constitute a LALR(1) grammar for Java syntax, in which the five problems described above have been solved.