Reliable Software Logo

C++ In Action: Language

Specification

The purpose of the program is to accept arithmetic expressions from the user, evaluate them, and display the results. The expressions are to be parsed by a simple top-down recursive descent parser (if you have no idea what I'm talking about, don't panic-it's just a name for what I'm going to describe in detail later). The parser's goal is to convert the input string into an arithmetic tree. The simple grammar accepted by the parser is defined as follows: The parser is looking for an expression.

  1. An expression is
    1. a term followed by a plus or a minus sign followed by another expression.
    2. Sometimes an expression doesn’t contain any plus or minus signs, and then it is just equal to a term.
  2. A term is
    1. a factor multiplied or divided by another term.
    2. Sometimes a term contains no multiplication or division and then it is just equal to a factor.
  3. A factor can be
    1. a number,
    2. an identifier corresponding to a variable,
    3. a minus sign followed by a factor (unary minus), or
    4. the whole expression enclosed within parentheses.

For instance, the expression 1 + x * (2 - y) is parsed as in Figure 2-5

Example

Figure 2-5 The parsing of an arithmetic expression. The arrows with numbers correspond to rules of grammar that were used to make each parsing step.


This grammar is simple and natural, but it has one flaw—the arithmetic operators are right associative instead of being left associative. That means, for instance, that a - b + c will be parsed as a - (b + c) which is not exactly what we’d expect. There are standard ways of fixing this grammar or modifying the parser, but that’s beyond the scope of this section. (We will revisit this issue later.)

Since we want our calculator to be able to use variables, we will need a symbol table. A symbol table remembers names of variables. We will reuse the string table described in the previous section.

We’ll also need to be able to assign values to variables. We can do it by expanding the definition of expression to include the following clause

  1. An expression can also be
      a term followed by the equal sign followed by an expression.

Not every term though is a correct left-hand side of an assignment. It has to be an lvalue—something that can be the left hand side of an assignment. In our case the only allowed type of lvalue will be an identifier corresponding to a variable.

We will also introduce a scanner, which will convert the input string into a series of tokens. It will recognize arithmetic operators, numbers, and identifiers. This way the parser will have an easier job--matching sequences of tokens to grammatical productions.


NextNext: Stubbed Implementation