Netting C++

EEK!—Time to Design the Mouse

Stanley B. Lippman

Contents

Where to Begin
Sentient Entities

In a series of recent columns, I have been working through the design of a software simulation of a mouse (EEK!) and its environment using the resources of the Microsoft® .NET Framework and C++/CLI, the revised C++ language binding to .NET that was introduced in Visual Studio® 2005. In my last column (msdn.microsoft.com/msdnmag/issues/07/10/nettingC), I looked at how EEK! can be initialized using an XML world description file and the facilities of the System::Xml and System::Data namespaces. This time, I'll show you the beginnings of the actual Mouse class design.

Where to Begin

In EEK!, a mouse is a kind of sentient entity. All of us working in .NET recognize that an is-a relationship is represented using class inheritance, which provides the first type/subtype class relationship I'll create:

public ref class SentientEntity abstract { ... };
public ref class Mouse : SentientEntity { ... };

Unlike native C++, under .NET, a class is distinguished by its projected use within your design. Inheritance is supported only by reference classes, indicated in C++/CLI by the contextual keyword, ref, followed by the reserved keyword, class. (Unlike a reserved keyword, which a programmer may never use as an identifier, a contextual keyword is treated as a keyword only in certain program contexts; otherwise, the program is free to use it.) In the case of ref, it is treated as a keyword only if it occurs before the class keyword.

A reference class is a garbage-collected, heap-allocated entity manipulated by a named handle. The C++/CLI reserved keyword, gcnew, is used to allocate memory from the .NET managed heap and construct an instance of the reference class:

Mouse ^milkyWhite = gcnew Mouse;

The hat (^) is a C++/CLI token indicating that milkyWhite is a handle to a reference class object of type Mouse residing on the managed heap.

What this is-a relationship between the Mouse and SentientEntity class implies is that I need to design the abstract SentientEntity base class before I tackle the Mouse class. I'll begin by considering how a SentientEntity functions within the simulation.

Sentient Entities

All entities within the simulation, other than sentient entities, are fully described within the XML world file, as I described in my previous column. Sentient entities are different. The simulation manipulates them through a set of four functions:

  1. void onInit performs all the necessary initialization of the sentient entity with the simulated environment, such as binding the mouse's internal map to the initial world tile where the mouse is released into the simulation.
  2. void onTick performs the actual behavior of the sentient entity on each tick of the simulation. The simulator iterates across each sentient entity within the environment, invoking its onTick method in turn.
  3. void onTerminate performs all cleanup behavior at the close of the simulation, such as persisting the internal state of the mouse to a database.
  4. void onLog provides the sentient entity's logging service at the indicated level of detail. (Recall that default arguments are not supported in .NET.)

In order to generalize these methods, I am going to factor them into a separate interface class:

interface class ISimulation 
{
    void onInit();
    void onTick();
    void onTerminate();
    void onLog( int levelOfDetail );
};

Interfaces represent a refinement of the original object-oriented model as supported by C++. They sidestep the complexity associated with multiple inheritance, and they are considerably more lightweight and simpler to manage. In EEK!'s design, any class (be it a rock or a rodent, a tree or a titmouse) that wishes to be an active participant in the simulation need only implement the ISimulation interface. In fact, that becomes the definition of agency within EEK!
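For readers more at home in ISO C++, the same idea can be sketched with a class that contains nothing but pure virtual functions. This is only an illustration under stated assumptions: Rodent, Rock, and the runTicks driver are hypothetical names of mine, not part of EEK!, and the real simulator is written in C++/CLI rather than ISO C++.

```cpp
#include <cassert>
#include <vector>

// ISO C++ stand-in for the ISimulation interface: pure virtual functions only.
struct ISimulation {
    virtual ~ISimulation() = default;
    virtual void onInit() = 0;
    virtual void onTick() = 0;
    virtual void onTerminate() = 0;
    virtual void onLog(int levelOfDetail) = 0;
};

// Hypothetical participant: counts the ticks it receives.
class Rodent : public ISimulation {
public:
    void onInit() override { ticks_ = 0; }
    void onTick() override { ++ticks_; }
    void onTerminate() override {}
    void onLog(int) override {}
    int ticks() const { return ticks_; }
private:
    int ticks_ = 0;
};

// Hypothetical participant: takes part in the simulation but does nothing.
class Rock : public ISimulation {
public:
    void onInit() override {}
    void onTick() override {}
    void onTerminate() override {}
    void onLog(int) override {}
};

// Hypothetical driver: it knows only the interface, never the concrete types.
// Returns the total number of onTick calls it made.
int runTicks(const std::vector<ISimulation*>& participants, int tickCount) {
    assert(tickCount >= 0);
    int calls = 0;
    for (ISimulation* p : participants) p->onInit();
    for (int t = 0; t < tickCount; ++t) {
        for (ISimulation* p : participants) {
            p->onTick();
            ++calls;
        }
    }
    for (ISimulation* p : participants) p->onTerminate();
    return calls;
}
```

The driver never names Rodent or Rock, so adding a new kind of participant means implementing the interface and nothing more.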

Because the SentientEntity class does not represent an actual entity in the simulation, but represents the base class from which all actual entities are derived, it must be declared as abstract and declare the methods associated with ISimulation. In C++/CLI, you write that code as shown in Figure 1.

Figure 1 The SentientEntity Abstract Base Class

public ref class SentientEntity abstract
    : ISimulation {
public:
    // ...
    virtual void onInit() abstract;
    virtual void onTick() abstract;
    virtual void onTerminate() abstract;
    virtual void onLog( int levelOfDetail ) abstract;
};

As you can see, the abstract keyword does double duty, both as a class and as a virtual method specifier. Within the class head, it indicates that the class may not be instantiated. SentientEntity is an incomplete class that serves as the root of a hierarchy of specialized derived classes. EEK! manipulates specific kinds of sentient entities, such as cats and mice, through this common abstract base class. It does not need to know the actual nature of its entities: that is the essential quality of a successful object-oriented design.

Within each function prototype, following the signature of the virtual method, the abstract keyword indicates that this is a pure virtual function. That is, there is no implementation associated with this method, and all concrete classes derived from SentientEntity, such as the Mouse, must provide their own implementations.

These uses of the keyword abstract highlight the intention of the class designer in a much clearer fashion than the equivalent ISO-C++ declaration that's pictured here:

class SentientEntity : public ISimulation {
public:
    // ...
    virtual void onInit() = 0;
    virtual void onTick() = 0;
    virtual void onTerminate() = 0;
    virtual void onLog( int levelOfDetail ) = 0;
};
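The ISO C++ compiler enforces the same rule that the abstract class specifier states outright: a class with at least one unimplemented pure virtual function cannot be instantiated. Here is a minimal sketch that checks this with the standard std::is_abstract trait. The Mouse implementation and the tickThroughBase helper are hypothetical, added only so there is a concrete class to compare against.

```cpp
#include <cassert>
#include <type_traits>

struct ISimulation {
    virtual ~ISimulation() = default;
    virtual void onInit() = 0;
    virtual void onTick() = 0;
    virtual void onTerminate() = 0;
    virtual void onLog(int levelOfDetail) = 0;
};

// Still abstract: it re-declares the pure virtual functions without defining them.
class SentientEntity : public ISimulation {
public:
    virtual void onInit() = 0;
    virtual void onTick() = 0;
    virtual void onTerminate() = 0;
    virtual void onLog(int levelOfDetail) = 0;
};

// Hypothetical concrete class: every pure virtual function gets an implementation.
class Mouse : public SentientEntity {
public:
    void onInit() override {}
    void onTick() override { ++ticks_; }
    void onTerminate() override {}
    void onLog(int) override {}
    int ticks() const { return ticks_; }
private:
    int ticks_ = 0;
};

// Checked at compile time: the base is abstract, the derived class is not.
static_assert(std::is_abstract<SentientEntity>::value, "SentientEntity is abstract");
static_assert(!std::is_abstract<Mouse>::value, "Mouse is concrete");

// Hypothetical helper: drives a Mouse purely through the abstract base.
int tickThroughBase(int n) {
    assert(n >= 0);
    Mouse mouse;
    SentientEntity& entity = mouse;
    for (int i = 0; i < n; ++i) entity.onTick();
    return mouse.ticks();
}
```

Writing `SentientEntity e;` in this sketch is a compile-time error, which is exactly the guarantee the abstract keyword advertises in the class head.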

In Stroustrup's original design of what came to be called Release 2.0 of C++ back in the late 1980s, he actually introduced a keyword called abstract in the final weeks prior to release of cfront 2.0 (Stroustrup's C++ compiler on which I worked). However, it was decided that it was too close to the release date to establish a reserved keyword that could potentially break a great deal of code.

A non-contextual keyword, recall, functions as a reserved word and therefore may not be used as an identifier within the client's program. This is why keywords such as ref and value in C++/CLI were made contextual rather than reserved keywords.

For some reason, the original ISO-C++ standard chose not to reintroduce the abstract keyword. I hope the revised ISO-C++ standard currently underway revisits this issue. Its uses within C++/CLI clearly demonstrate its elegance over the ISO-C++ syntax for the specification of a pure virtual function.

The last members necessary to complete the implementation of SentientEntity are a constructor and destructor, together with a data member to store a handle to the constructor's one argument, and a property declaration to retrieve it (see Figure 2).

Figure 2 SentientEntity Constructor

public ref class SentientEntity abstract
    : ISimulation {
public:
    ~SentientEntity();
    property Genome^ genome {
        Genome^ get(){ return _genome; }
    }
    // ... ISimulation methods from Figure 1
protected:
    SentientEntity( Genome^ );
private:
    Genome^ _genome;
};

A fundamental design pattern of an object-oriented class hierarchy is the declaration of its root abstract base class constructor as protected (rather than public) and its destructor as both public and virtual.

Because I manipulate the various sentient entities within EEK! indirectly through a SentientEntity handle, the general program must be able to invoke the destructor in order to free the resources associated with an entity. This is why the destructor needs to be public.

Because my handle refers not to an instance of a SentientEntity but to an instance of a concrete derived class, in order to correctly invoke that unknown derived class destructor, the destructor must be virtual. Of course, the astute reader will object, "you do not declare it as virtual in Figure 2."

True, but I did that to make the following point: because it is so necessary for a ref class destructor to be virtual, and because every ref class is implicitly derived from Object, whose destructor is virtual, all ref class destructors within C++/CLI are always treated as virtual. You cannot have a non-virtual ref class destructor.
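In ISO C++, by comparison, nothing is implicit: the base class destructor must be declared virtual by hand, or deleting a derived object through a base pointer is undefined behavior. The following sketch shows the correct ISO C++ form. The destructionLog and destroyThroughBase helpers are hypothetical names used only to record the order of the calls.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: a shared log recording which destructors ran.
std::string& destructionLog() {
    static std::string log;
    return log;
}

class SentientEntity {
public:
    // In ISO C++ the virtual keyword must be written explicitly here.
    virtual ~SentientEntity() { destructionLog() += "SentientEntity;"; }
};

class Mouse : public SentientEntity {
public:
    ~Mouse() override { destructionLog() += "Mouse;"; }
};

// Destroys a Mouse through a base class pointer and reports what ran.
std::string destroyThroughBase() {
    destructionLog().clear();
    SentientEntity* entity = new Mouse;
    delete entity;  // dispatches to ~Mouse() first, then ~SentientEntity()
    return destructionLog();
}

void demo() {
    assert(destroyThroughBase() == "Mouse;SentientEntity;");
}
```

Because the derived destructor runs first, any resources that Mouse owns are released before the base class tears down its own state.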

There are other things that need to be said about destructors with regard to both .NET and C++/CLI, but I will defer those discussions until I design the derived Mouse class.

Although the only handles I manipulate within EEK! are SentientEntities, I am not actually permitted to create an instance of a SentientEntity since it is an abstract class:

Genome^ getGenome( System::String^ );

// illegal: SentientEntity is abstract
SentientEntity^ mw = gcnew SentientEntity( getGenome( "Mouse" ));

Therefore, to prevent such an illegal call, I declare the constructor to be protected. A protected member, recall, is inaccessible to the general program but fully accessible to its derived classes. The only place where the SentientEntity constructor is invoked is within the initialization list of its derived class constructors, such as that of the derived Mouse class.

A private member of a base class is not directly accessible to its derived class. I make the genome member private so that all access is forced to go through the (hopefully) inlined get property (see Figure 2). This way, if I want to instrument its access, it's nicely encapsulated and all the code can go in one easily identified place.
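The same two decisions, a protected constructor and a private member behind a read-only accessor, carry over directly to ISO C++. What follows is a sketch under stated assumptions: the Genome struct and its species field are hypothetical stand-ins, std::shared_ptr stands in for the garbage-collected handle, and a plain member function stands in for the get property.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the column's Genome type.
struct Genome {
    std::string species;
};

class SentientEntity {
public:
    virtual ~SentientEntity() = default;
    // Read-only access: the single place to instrument if that is ever needed.
    const Genome& genome() const { return *genome_; }
protected:
    // Reachable only from derived class constructors.
    explicit SentientEntity(std::shared_ptr<Genome> g) : genome_(std::move(g)) {
        assert(genome_ != nullptr);
    }
private:
    std::shared_ptr<Genome> genome_;  // not directly visible to Mouse
};

class Mouse : public SentientEntity {
public:
    // The base constructor is invoked from the member initialization list.
    explicit Mouse(std::shared_ptr<Genome> g) : SentientEntity(std::move(g)) {}
};

// The general program cannot construct the base, but it can construct a Mouse.
static_assert(!std::is_constructible<SentientEntity, std::shared_ptr<Genome>>::value,
              "base constructor is protected");
static_assert(std::is_constructible<Mouse, std::shared_ptr<Genome>>::value,
              "derived constructor is public");

// Hypothetical helper: reads the genome through the base class accessor.
std::string speciesOf(const SentientEntity& entity) {
    return entity.genome().species;
}
```

Note that SentientEntity in this sketch has no pure virtual functions, so it is the protected constructor alone that keeps the general program from creating one.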

In C++/CLI, there is a convenient shorthand notation for properties that implement get as a field return and set as a field value assignment; it's called a trivial property, and it's shown here:

public value class SmallInt {
public:
    // trivial property
    property int Value;
};

Internally, this is expanded to a non-accessible backing store field of the property type together with the simple get/set accessor pair, as seen here:

public value class SmallInt {
    int backingStore;
public:
    // pseudo-internal expansion of trivial property
    property int Value{
        int get(){ return backingStore; }
        void set( int value ){ backingStore = value; }
    }
};

However, it is not possible to restrict a trivial property to generating only a get accessor. (To support that, it would be necessary to extend the declaration syntax to allow for an initial value to be specified. Currently, there is no way to indicate a value to the compiler with which to initialize the backing store.) Therefore, you would only use a trivial property if a set accessor were safe for your application. In this case, it is not.

Finally, there is the question of the relationship between the constructor and onInit, and the destructor and onTerminate. Recall that both onInit and onTerminate are virtual functions. The question I wish to address is whether or not it is a good idea for onInit to be invoked from within the constructor.

When Stroustrup first designed C++, the invocation of a virtual function within a base class constructor caused the derived class instance to be invoked. After all, this reflected the normal semantics of virtual function behavior. In addition, it is trivial to implement. What he missed, at first, is that class initialization does not reflect normal semantics. If it did, we could have virtual constructors.

The problem is that in C++, classes are constructed from the bottom up. When the SentientEntity constructor is executing, the Mouse constructor has not yet been invoked. Therefore, invoking the Mouse's onInit within the SentientEntity constructor results in undefined behavior. In Release 2.0, Stroustrup redefined the language behavior: a virtual function invoked within a base class constructor (or destructor) calls the instance associated with the base class rather than the derived class.
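The redefined ISO C++ behavior is easy to observe. In this minimal sketch, whoAmI and seenByConstructor are hypothetical names chosen for the illustration. The virtual function is deliberately not pure, since calling a pure virtual function from a constructor is undefined behavior.

```cpp
#include <cassert>
#include <string>

class SentientEntity {
public:
    // Virtual call during base construction: the Mouse part does not exist yet,
    // so ISO C++ dispatches to SentientEntity::whoAmI.
    SentientEntity() { seenByConstructor_ = whoAmI(); }
    virtual ~SentientEntity() = default;
    virtual std::string whoAmI() const { return "SentientEntity"; }
    const std::string& seenByConstructor() const { return seenByConstructor_; }
private:
    std::string seenByConstructor_;
};

class Mouse : public SentientEntity {
public:
    std::string whoAmI() const override { return "Mouse"; }
};

void demo() {
    Mouse mouse;
    // Inside the base constructor, the base class instance was called...
    assert(mouse.seenByConstructor() == "SentientEntity");
    // ...but once construction is complete, normal virtual dispatch applies.
    assert(mouse.whoAmI() == "Mouse");
}
```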

In contrast, in .NET, the virtual instance invoked is that of the derived class. This means that if a virtual method is invoked from a base class constructor, the derived class override can execute before the derived instance has been fully constructed, and thus before the state relied on by that override has been fully initialized. For that reason, the CLR team within Microsoft recommends that we never invoke a virtual function within either a constructor or a destructor.

The other distinction between onInit and the constructor (or onTerminate and the destructor) is that the two ISimulation methods attach the class instance to (or detach it from) the simulation, while the constructor (or destructor) sets up or tears down the internal state of the class instance.
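One way to picture this separation is to let the constructor and destructor touch only internal state, and to have the simulation call onInit and onTerminate on a fully constructed object. The sketch below is in ISO C++ and is not EEK!'s actual engine. The journal of events and the runLifecycle driver are hypothetical devices for showing the order of the calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

class SentientEntity {
public:
    virtual ~SentientEntity() = default;
    virtual void onInit() = 0;
    virtual void onTerminate() = 0;
};

class Mouse : public SentientEntity {
public:
    // Constructor and destructor deal only with the object's own state.
    explicit Mouse(std::vector<std::string>& journal) : journal_(journal) {
        journal_.push_back("construct");
    }
    ~Mouse() override { journal_.push_back("destroy"); }
    // The two simulation hooks run only on a fully constructed object.
    void onInit() override { journal_.push_back("onInit"); }
    void onTerminate() override { journal_.push_back("onTerminate"); }
private:
    std::vector<std::string>& journal_;
};

// Hypothetical driver: construct first, then attach, detach, and destroy.
std::vector<std::string> runLifecycle() {
    std::vector<std::string> journal;
    {
        Mouse mouse(journal);
        SentientEntity& entity = mouse;
        entity.onInit();
        entity.onTerminate();
    }  // the destructor runs here, after onTerminate
    return journal;
}

void demo() {
    const std::vector<std::string> expected{
        "construct", "onInit", "onTerminate", "destroy"};
    assert(runLifecycle() == expected);
}
```

Because no virtual function is called from the constructor or the destructor, the sketch behaves the same way under either the ISO C++ or the .NET dispatch rule.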

The benefit of this loosely coupled design strategy is that we can implement the simulation engine independently of the sentient entities that it manipulates. If we had a group of developers, this would represent a functional partition of the design space. The user interface would represent a third functional partition.

Send your questions and comments for Stanley to purecpp@microsoft.com.

Stanley B. Lippman is best known for working on C++ in the 1980s and 1990s with its inventor, Bjarne Stroustrup, at Bell Laboratories, and for his introductory textbook, C++ Primer (Addison-Wesley, 2005), now in its fourth edition. Stan also worked with the Visual C++ team at Microsoft on the invention of C++/CLI, which is the focus of this column.