Virtual member functions, virtual destructors, pure virtual functions, protected data members.

I will demonstrate the use of polymorphism in an example of a data structure: the arithmetic tree. An arithmetic expression can be converted into a tree structure whose internal nodes are arithmetic operators and whose leaf nodes are numbers. Figure 2-3 shows an example of a tree that corresponds to the expression 2 * (3 + 4) + 5. Analyzing it from the root towards the leaves, we first encounter the plus node, whose children are the two terms that are to be added. The left child is a product of two factors. The left factor is the number 2 and the right factor is the sum of 3 and 4. The right child of the top-level plus node is the number 5. Notice that the tree representation doesn't require any parentheses or the knowledge of operator precedence. It uniquely describes the calculation to be performed.

We will represent the nodes of the arithmetic tree as objects inheriting from a single class `Node`. The direct descendants of `Node` are `NumNode`, representing a number, and `BinNode`, representing a binary operator. For simplicity, we will restrict ourselves to only two classes derived from `BinNode`: `AddNode` and `MultNode`. Figure 2-4 shows the class hierarchy I have just described. Abstract classes are classes that cannot be instantiated; they only serve as parents for other classes. I'll explain this term in a moment.

What are the operations we would like to perform on a node? We would like to be able to calculate its value and, at some point, destroy it. The `Calc` method returns a `double` as the result of the calculation of the node's value. Of course, for some nodes the calculation may involve the recursive calculation of their children. The method is `const` since it doesn't change the node itself. Since each type of node has to provide its own implementation of the `Calc` method, we make this function virtual. However, there is no "default" implementation of `Calc` for an arbitrary `Node`. A virtual function declared without an implementation (inherited or otherwise) is called **pure virtual**. That's the meaning of `= 0` in the declaration of `Calc`.

A class that has one or more pure virtual functions is called an **abstract class** and it cannot be instantiated (no object of this class can be created). Only classes that are derived from it, and which provide their own implementations of all the pure virtual functions, can be instantiated. Notice that our sample arithmetic tree has instances of `AddNode`, `MultNode`, and `NumNode`, but no instances of `Node` or `BinNode`.
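The rule can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; `ConstNode` and `CalcThroughBase` are hypothetical names introduced only for this illustration, and `Node` repeats the interface described in the text.

```cpp
#include <cassert>
#include <type_traits>

// Same shape as the Node interface described in the text.
class Node
{
public:
    virtual ~Node () {}
    virtual double Calc () const = 0;   // pure virtual
};

// Hypothetical leaf class: it implements every pure virtual function,
// so it can be instantiated.
class ConstNode: public Node
{
public:
    explicit ConstNode (double value) : _value (value) {}
    double Calc () const override { return _value; }
private:
    const double _value;
};

static_assert (std::is_abstract<Node>::value,
               "Node has a pure virtual function");
static_assert (!std::is_abstract<ConstNode>::value,
               "ConstNode overrides all of them");

// Node n;        // would not compile: Node is abstract
// new Node;      // would not compile for the same reason

double CalcThroughBase ()
{
    ConstNode leaf (42.0);
    const Node & node = leaf;   // references and pointers to Node are fine
    double result = node.Calc ();
    assert (result == 42.0);    // the ConstNode implementation was called
    return result;
}
```

An abstract class can still be used as the type of a pointer or a reference, which is exactly how the tree will hold its children.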

A rule of thumb is that, if a class has a virtual function, it probably needs a virtual destructor as well. Once we decide to pay the overhead of a vtable pointer, all subsequent virtual functions don't increase the size of the object. So, in such a case, adding a virtual destructor doesn't add any significant overhead.
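The size argument can be observed with `sizeof`. This is only a sketch: the structs `Plain`, `OneVirtual`, and `ThreeVirtuals` are hypothetical, and the object layout is implementation-defined, so the numbers are what mainstream compilers typically produce rather than something the language guarantees.

```cpp
#include <cstddef>

// Hypothetical structs, used only to compare object sizes.
struct Plain
{
    double value;
};

struct OneVirtual
{
    virtual ~OneVirtual () {}
    double value;
};

struct ThreeVirtuals
{
    virtual ~ThreeVirtuals () {}
    virtual double Calc () const { return value; }
    virtual void Reset () { value = 0.0; }
    double value;
};

// On typical implementations the first virtual function adds one hidden
// vtable pointer to every object (plus any padding).
const std::size_t vptrCost = sizeof (OneVirtual) - sizeof (Plain);

// Further virtual functions enlarge the per-class vtable, not the object.
const std::size_t extraCost = sizeof (ThreeVirtuals) - sizeof (OneVirtual);
```

On such implementations `vptrCost` is the size of one pointer (possibly rounded up for alignment) and `extraCost` is zero.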

In our case we can anticipate that some of the descendant nodes will have to destroy their children in their destructors, so we really need a virtual destructor. The destructor needs a body even in an abstract class, because it is actually called by the destructors of the derived classes; declaring it pure virtual would not remove that requirement. That's why I gave it an empty body. (Even though I made it inline, the compiler will create a function body for it, because it needs to stick a pointer to it into the virtual table.)

```cpp
class Node
{
public:
    virtual ~Node () {}
    virtual double Calc () const = 0;
};
```
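To see what the virtual destructor buys us, here is a minimal sketch that deletes an object through a pointer to its base class and records which destructors run. The classes `Base` and `Derived` and the function `DestroyThroughBasePointer` are hypothetical and not part of the arithmetic tree.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class; its destructor appends to a shared log.
class Base
{
public:
    explicit Base (std::string & log) : _log (log) {}
    virtual ~Base () { _log += "~Base "; }
protected:
    std::string & _log;
};

// Hypothetical derived class with its own destructor.
class Derived: public Base
{
public:
    explicit Derived (std::string & log) : Base (log) {}
    ~Derived () override { _log += "~Derived "; }
};

std::string DestroyThroughBasePointer ()
{
    std::string log;
    Base * p = new Derived (log);
    delete p;   // virtual dispatch: ~Derived runs first, then ~Base
    assert (log == "~Derived ~Base ");
    return log;
}
```

Had `~Base` not been declared `virtual`, deleting a `Derived` through a `Base *` would have undefined behavior; in practice the derived destructor is usually skipped.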

`NumNode` stores a `double` value that is initialized in its constructor. It also overrides the `Calc` virtual function. In this case, `Calc` simply returns the value stored in the node.

```cpp
#include <iostream>
using std::cout;
using std::endl;

class NumNode: public Node
{
public:
    NumNode (double num) : _num (num) {}
    double Calc () const;
private:
    const double _num;
};

double NumNode::Calc () const
{
    cout << "Numeric node " << _num << endl;
    return _num;
}
```

`BinNode` has two children that are pointers to nodes. They are initialized in the constructor and deleted in the destructor, and never reseated in between; this is why I could make them `const` pointers. I did not make them *pointers to* `const`, because the `BinNode` owns its children and is the one that destroys them (strictly speaking, standard C++ allows `delete` through a pointer to `const` as well). The `Calc` method is still pure virtual, inherited from `Node`; only the descendants of `BinNode` will know how to implement it.

```cpp
class BinNode: public Node
{
public:
    BinNode (Node * pLeft, Node * pRight)
        : _pLeft (pLeft), _pRight (pRight) {}
    ~BinNode ();
protected:
    Node * const _pLeft;
    Node * const _pRight;
};

BinNode::~BinNode ()
{
    delete _pLeft;
    delete _pRight;
}
```
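The difference between a `const` pointer and a pointer to `const` is easy to mix up, so here is a small sketch of both. The struct `Value` and the function `ConstPointerDemo` are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical type with one const and one non-const method.
struct Value
{
    int n;
    void Bump () { ++n; }
    int Get () const { return n; }
};

int ConstPointerDemo ()
{
    Value a = { 1 };
    Value b = { 10 };

    Value * const constPtr = &a;    // the pointer itself cannot be reseated
    constPtr->Bump ();              // OK: the pointee is not const
    // constPtr = &b;               // would not compile

    const Value * ptrToConst = &a;  // the pointee is read-only through it
    ptrToConst = &b;                // OK: the pointer may be reseated
    // ptrToConst->Bump ();         // would not compile: Bump is non-const

    int sum = constPtr->Get () + ptrToConst->Get ();
    assert (sum == 12);             // a.n is now 2, b.n is 10
    return sum;
}
```

The members `_pLeft` and `_pRight` of `BinNode` are of the first kind, `Node * const`.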

This is where you first see the advantage of polymorphism. A binary node can have children which are arbitrary nodes. Each of them can be a number node, an addition node, or a multiplication node. There are nine possible combinations of children — it would be silly to make separate classes for each of them (consider, for instance, `AddNodeWithLeftMultNodeAndRightNumberNode`). We had no choice but to accept and store pointers to children as more general pointers to `Nodes`. Yet, when we call destructors through them, we need to call different functions to destroy different nodes. For instance, `AddNode` has a different destructor than a `NumNode` (which has an empty one), and so on. This is why we had to make the destructors of `Nodes` virtual.

Notice that the two data members of `BinNode` are not `private`; they are `protected`. This qualification is slightly weaker than `private`. A private data member or method cannot be accessed from any code outside of the implementation of the given class (or its friends), not even from the code of a *derived* class. Had we made `_pLeft` and `_pRight` private, we'd have to provide public methods to set and get them. That would be tantamount to exposing them to everybody. By making them `protected` we are letting classes *derived* from `BinNode` manipulate them while, at the same time, barring anybody else from doing so.

**Table 1**

| Access specifier | Who can access such a member? |
| --- | --- |
| `public` | anybody |
| `protected` | the class itself, its friends, and derived classes |
| `private` | only the class itself and its friends |
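The three access levels in Table 1 can be tried out in a few lines. This is a sketch with hypothetical classes `Base` and `Derived`; the commented-out lines are the ones a compiler would reject.

```cpp
#include <cassert>

// Hypothetical class with one member at each of the non-public levels.
class Base
{
public:
    Base () : _protectedValue (1), _privateValue (2) {}
    // public: callable by anybody
    int PublicSum () const { return _protectedValue + _privateValue; }
protected:
    int _protectedValue;    // the class, its friends, and derived classes
private:
    int _privateValue;      // only the class and its friends
};

class Derived: public Base
{
public:
    int ReadProtected () const { return _protectedValue; }  // OK
    // int ReadPrivate () const { return _privateValue; }   // would not compile
};

int AccessDemo ()
{
    Derived d;
    // d._protectedValue = 5;   // would not compile: outside code is barred
    assert (d.ReadProtected () == 1);
    assert (d.PublicSum () == 3);
    return d.ReadProtected () + d.PublicSum ();
}
```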

The class `AddNode` is derived from `BinNode`.

```cpp
class AddNode: public BinNode
{
public:
    AddNode (Node * pLeft, Node * pRight)
        : BinNode (pLeft, pRight) {}
    double Calc () const;
};
```

It provides its own implementation of `Calc`. This is where you see the advantages of polymorphism again. We let the child nodes calculate themselves. Since the `Calc` method is virtual, they will do the right thing based on their actual class, and not on the class of the pointer (`Node *`). The two results of calling `Calc` are added and the sum returned.

```cpp
double AddNode::Calc () const
{
    cout << "Adding\n";
    return _pLeft->Calc () + _pRight->Calc ();
}
```

Notice how the method of `AddNode` directly accesses its parent's data members `_pLeft` and `_pRight`. Were they declared private, such access would be flagged as an error by the compiler.

For completeness, here's the implementation of the `MultNode` and a simple test program.

```cpp
class MultNode: public BinNode
{
public:
    MultNode (Node * pLeft, Node * pRight)
        : BinNode (pLeft, pRight) {}
    double Calc () const;
};

double MultNode::Calc () const
{
    cout << "Multiplying\n";
    return _pLeft->Calc () * _pRight->Calc ();
}

int main ()
{
    // ( 20.0 + (-10.0) ) * 0.1
    Node * pNode1 = new NumNode (20.0);
    Node * pNode2 = new NumNode (-10.0);
    Node * pNode3 = new AddNode (pNode1, pNode2);
    Node * pNode4 = new NumNode (0.1);
    Node * pNode5 = new MultNode (pNode3, pNode4);
    cout << "Calculating the tree\n";
    // tell the root to calculate itself
    double x = pNode5->Calc ();
    cout << x << endl;
    delete pNode5; // and all children
}
```
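As a further sketch, the same hierarchy can build the tree that corresponds to the expression 2 * (3 + 4) + 5 discussed at the beginning of this section. To keep the block self-contained, the classes are repeated here in a condensed form, without the tracing output and with the C++11 `override` keyword; the function name `CalcFigureTree` is hypothetical.

```cpp
#include <cassert>

class Node
{
public:
    virtual ~Node () {}
    virtual double Calc () const = 0;
};

class NumNode: public Node
{
public:
    explicit NumNode (double num) : _num (num) {}
    double Calc () const override { return _num; }
private:
    const double _num;
};

class BinNode: public Node
{
public:
    BinNode (Node * pLeft, Node * pRight)
        : _pLeft (pLeft), _pRight (pRight) {}
    ~BinNode () override { delete _pLeft; delete _pRight; }
protected:
    Node * const _pLeft;
    Node * const _pRight;
};

class AddNode: public BinNode
{
public:
    AddNode (Node * pLeft, Node * pRight) : BinNode (pLeft, pRight) {}
    double Calc () const override
    { return _pLeft->Calc () + _pRight->Calc (); }
};

class MultNode: public BinNode
{
public:
    MultNode (Node * pLeft, Node * pRight) : BinNode (pLeft, pRight) {}
    double Calc () const override
    { return _pLeft->Calc () * _pRight->Calc (); }
};

double CalcFigureTree ()
{
    // 2 * (3 + 4) + 5, built from the leaves up
    Node * pTwo = new NumNode (2.0);
    Node * pThree = new NumNode (3.0);
    Node * pFour = new NumNode (4.0);
    Node * pSum = new AddNode (pThree, pFour);      // 3 + 4
    Node * pProduct = new MultNode (pTwo, pSum);    // 2 * (3 + 4)
    Node * pFive = new NumNode (5.0);
    Node * pRoot = new AddNode (pProduct, pFive);   // ... + 5

    double result = pRoot->Calc ();   // each node dispatches on its real class
    delete pRoot;                     // the root deletes the whole tree
    assert (result == 19.0);
    return result;
}
```

Only the root is deleted explicitly; every `BinNode` destructor takes care of its own two children, which is the reason the destructors had to be virtual.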