Yes, Value Types Have No Destructors
A reader comments
Sender: Joe Pacheco
re: Integrating Dispose and Finalize into the Language
What about destructors for value classes? Was it considered when you did the language design?
To become truly usable they are still missing couple of things, copy constructors being one of them. It's possible that it has some implementation difficulties but if we abstract from them for a moment, the question is if such value classes with copy constructors and destructors are of any value.
Possible applications could be containers for native objects or managed equivalent of smart pointers. I was also thinking about something similar to gcroot<> implemented with value classes that doesn't need to "root" the managed object.
I’ve forwarded the question to Brandon Bray who is one of the language design leaders of the emerging C++/CLI ECMA standard, and of Microsoft’s C++/CLR implementation of that standard in Visual Studio 2005 (there, I think I’ve mastered that nomenclature). He can detail the design thinking that led to the original inclusion of a destructor and default constructor for a value type and the implementation obstacle(s) that subsequently led to the removal of these special member functions (delicately referred to as SMFs). The fundamental obstacle is the inability of the compiler to guarantee access to the object at creation and destruction points to invoke the appropriate method.
Both the question and the inevitable detailing of frustrated design intentions end up portraying the CLR as an obstacle to the realization of an at least more perfect C++/CLI language. That is certainly one perspective. I thought I would offer a counter perspective, and … well, then duck.
The fundamental nature of a value type is that it is blitable – that is, we can completely reproduce it with integrity by copying a fixed-size, contiguous segment of memory. All we need to implement this is the address of its first byte, and its total size in bytes. For a value type to be blitable, there are two necessary characteristics. One, it needs to maintain all its state information within itself (yes, this implies that pointers can screw up blitability). Two, it must not contain auxiliary state that, if corrupted, leaves it undefined – for example, a virtual function table pointer. The canonical value type is an integer. This is the litmus test of every type system.
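To make the "address plus size" idea concrete, here is a minimal sketch in standard C++ (not C++/CLI). The `Point` type and the helper names `blit_copy` and `blit_preserves_state` are hypothetical, and `std::is_trivially_copyable` is used only as the closest standard approximation of what is called blitable here.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical pure-state type: two coordinates and nothing else.
struct Point {
    int x;
    int y;
};

// Standard C++ approximation of "blitable": a byte-for-byte copy is a valid copy.
static_assert(std::is_trivially_copyable<Point>::value,
              "Point can be reproduced by copying its bytes");

// Reproduce a Point using only the address of its first byte and its size.
inline Point blit_copy(const Point& source) {
    Point target;
    std::memcpy(&target, &source, sizeof(Point));
    return target;
}

// Returns true when the blitted copy carries the same state as the original.
inline bool blit_preserves_state() {
    Point original{3, 4};
    Point copy = blit_copy(original);
    assert(sizeof(copy) == sizeof(original));  // fixed-size by construction
    return copy.x == original.x && copy.y == original.y;
}
```

Calling `blit_preserves_state()` should return `true`: nothing beyond the raw bytes is needed to reproduce the object.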
An entity is not blitable under the following conditions (because this is a blog and not a text, I invoke the right of not being exhaustive):
- It contains a copy constructor or copy assignment operator. Copying now is a semantic rather than physical operation.
- It contains either a member object or base class subobject that is not blitable. This requires memberwise rather than bitwise copy.
- It contains an auxiliary state member that supports implementation and which can be compromised by physical copying. The canonical example is the assignment of a derived class object (not pointer or reference) to a base class object. Except in the trivial case of an identical active set of virtual functions between the base and derived class, the bitwise copying of the derived object to the base class object corrupts the base class object’s internal virtual table pointer. Therefore, a form of memberwise copy is required.
- It contains one or more pointer or reference members that exhibit shallow copy and address dynamically allocated memory, and the type itself defines a destructor that reclaims that memory. This is an implementation issue at the user level – a milder form of the third condition above – but one which has been a pitfall of C++.
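The conditions above can be roughly checked in standard C++ with `std::is_trivially_copyable`. This is a sketch under assumptions: every type name below is hypothetical, and the standard trait is only an approximation of blitability (for instance, it still accepts a struct holding a raw pointer as long as no destructor or copy operation is declared).

```cpp
#include <string>
#include <type_traits>

// Pure state: qualifies for bitwise copy.
struct PureState { double x; double y; };

// Condition 1: a user-provided copy constructor makes copying semantic.
struct HasCopyCtor {
    int value;
    HasCopyCtor() : value(0) {}
    HasCopyCtor(const HasCopyCtor& other) : value(other.value) {}
};

// Condition 2: a member that is itself not bitwise copyable.
struct HasNonBlitableMember { std::string name; };

// Condition 3: auxiliary implementation state (a virtual table pointer).
struct HasVirtual {
    int value = 0;
    virtual int id() const { return value; }
};

// Condition 4: an owning pointer plus a destructor that reclaims the memory.
// Copying is disabled here so the sketch cannot double-delete by accident.
struct OwnsBuffer {
    int* data;
    OwnsBuffer() : data(new int[4]()) {}
    ~OwnsBuffer() { delete[] data; }
    OwnsBuffer(const OwnsBuffer&) = delete;
    OwnsBuffer& operator=(const OwnsBuffer&) = delete;
};

// Hypothetical helper wrapping the standard trait.
template <typename T>
constexpr bool is_blitable() {
    return std::is_trivially_copyable<T>::value;
}

static_assert(is_blitable<PureState>(), "pure state copies bitwise");
static_assert(!is_blitable<HasCopyCtor>(), "copy constructor");
static_assert(!is_blitable<HasNonBlitableMember>(), "non-blitable member");
static_assert(!is_blitable<HasVirtual>(), "virtual table pointer");
static_assert(!is_blitable<OwnsBuffer>(), "owning pointer with destructor");
```

All five `static_assert` lines are evaluated at compile time, so the sketch documents each condition without running anything.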
That is, the ideal value type is pure state – think of a point with two or more coordinate values. It does not need a copy constructor or copy assignment operator because state is blitable. It does not need a destructor because the extent of state is its lifetime, and it is independent of all other state in other entities of this or any other type. A default constructor is not necessary because there is only state, not infrastructure to set up, and at best an object with a default initialization is in a safe but meaningless state – that is, it can be recognized as requiring initial values. So, an optimal value type – a pure value type – is one without these four SMFs. This is what the CLR provides.
Yes, there are conceptual extensions that we would like to provide. The obvious one, which you mention explicitly and which I discuss in an earlier blog, is to wrap pointers to native types. To do this without massive memory leaks, one needs a destructor. Unfortunately, the compiler is unable to guarantee that it can in all cases intervene before object destruction in order to invoke the method, and therefore it is not really possible to support a destructor for a value type. Is this terrible? Well, it is a real disappointment when we wrap a native pointer and create massive numbers of instances, since wrapping it in a reference type doubles the number of heap allocations (one for the reference, and one for the native).
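The leak is easy to reproduce in standard C++, which stands in here for the C++/CLI case. This is only a sketch: `Native`, `ValueStyleWrapper`, `OwningWrapper`, and both helper functions are hypothetical names, and the instance counter exists purely so the effect can be observed.

```cpp
// Hypothetical native type that counts how many instances are alive.
struct Native {
    static int live;
    Native() { ++live; }
    ~Native() { --live; }
};
int Native::live = 0;

// Pure-state wrapper with no destructor, analogous to a CLI value type.
struct ValueStyleWrapper {
    Native* ptr;
};

// Wrapper with a destructor, which a CLI value type cannot have.
class OwningWrapper {
public:
    OwningWrapper() : ptr_(new Native) {}
    ~OwningWrapper() { delete ptr_; }
    OwningWrapper(const OwningWrapper&) = delete;
    OwningWrapper& operator=(const OwningWrapper&) = delete;
private:
    Native* ptr_;
};

// Number of Native objects still alive after a destructor-less wrapper
// goes out of scope. The demo deletes the object afterwards to stay tidy.
inline int leaked_by_value_style_wrapper() {
    const int before = Native::live;
    Native* raw = nullptr;
    {
        ValueStyleWrapper w{new Native};
        raw = w.ptr;  // remembered only so the demo can clean up
    }                 // w disappears; nothing releases the Native
    const int leaked = Native::live - before;
    delete raw;
    return leaked;
}

// Same measurement for the wrapper that has a destructor.
inline int leaked_by_owning_wrapper() {
    const int before = Native::live;
    {
        OwningWrapper w;
    }                 // destructor runs here and releases the Native
    return Native::live - before;
}
```

The first helper reports one surviving `Native` per wrapper, the second reports none, which is the difference a destructor makes for this use case.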
How likely a scenario is this? Personally, I’m not sure. I spent a few days more than a year ago walking through a number of graded scenarios using a vector and matrix pair of classes that I had implemented in my days as a graphics programmer. I compiled and timed a program exercising these folks natively, then simply compiled and timed it using IJW. Then I wrapped the folks first in a value type – it leaked – and then a sealed reference, then reimplemented the folks in turn as value and reference types. (Of course, reference types in the original managed extensions don’t support operator overloading so that was kind of useless.) I also wrapped a nonsense query application I had written for the third edition of my C++ Primer. My sense from this exercise is that those things which would gain in performance by a value wrapper are those things that in practice are best ported directly into the CLI. But I have no practical data. Back in the late 1980s and early 1990s when the wrapper pattern became widespread in the C++ community, wrapping proved primarily of benefit over complex systems, such as sockets or X windows – or to shield an application from a specific API such as a database in case one had to for whatever reason replace it.
The CLI separates reference and value types in a way that is unnatural to the C++ programmer; that is, it constrains what each type is permitted in the body of its definition. C++ does not separate value and reference semantics in the definition of its types but, rather, in the declaration of its objects. This allows us a continuum rather than the ... oh, discrete quantum types of the CLI. This permits a great deal more flexibility, but at the cost of sometimes spectacular complexity in the understanding of our programs. The exemplar of this is probably best captured in the difference between the C++ template facility and the CLI generic mechanism.
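As a small illustration of choosing value or reference semantics at the point of declaration, here is a sketch in standard C++. The type `Coord` and the function `declaration_decides_semantics` are hypothetical names introduced only for this example.

```cpp
// Hypothetical type: nothing in its definition says "value" or "reference".
struct Coord {
    int x;
    int y;
};

// The same type is used with value and reference semantics, chosen
// entirely by how each object is declared.
inline bool declaration_decides_semantics() {
    Coord original{1, 2};

    Coord copy = original;    // value semantics: an independent object
    Coord& alias = original;  // reference semantics: another name for original
    Coord* ptr = &original;   // reference semantics through a pointer

    copy.x = 10;   // changes only the copy
    alias.y = 20;  // changes original
    ptr->x = 30;   // changes original

    return original.x == 30 && original.y == 20 &&
           copy.x == 10 && copy.y == 2;
}
```

The function returns `true`: the copy drifts away from the original while the alias and the pointer keep modifying it, all with one type definition.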