Transactions: a personal journey

(by Dave Detlefs)


My involvement in the current Microsoft transactional memory project is my second attempt to tilt at the windmill of integrating transactional concepts more tightly into general-purpose programming languages. In the late 1980s, while working on my Ph.D., I was a member of the Avalon/C++ project at Carnegie Mellon University, working with Professors Jeannette Wing and Maurice Herlihy. This was an attempt to build support for transactions into C++. The effort grew alongside CMU's Camelot project, headed by Alfred Spector (now at Google): a C library that provided transactional semantics. Camelot required the user to insert calls to begin and end transactions, to lock and log memory before updating it, and so on -- in short, it provided all the capabilities you need, but it also had all the convenience and safety issues any library-based system has: you have to go through the trouble of inserting all the right calls, and if you don't, you get no checking, just difficult-to-debug errors. So augmenting a language to insert the appropriate library calls automatically, based on some higher-level specification of atomicity, was obviously attractive. Since the main C++ compiler available at the time was still AT&T's "cfront", which translated C++ to C, it was quite natural to modify the compiler to augment the translation with the required Camelot library calls.


Avalon/C++ was just one example of a small academic cottage industry. The University of Newcastle had a quite similar effort called Arjuna. But probably the best known of these was the Argus language project done by Barbara Liskov's group at MIT. (If I may briefly indulge in a bit of abject fandom, I think Professor Liskov is one of the real giants in the history of academic programming language research. I'm hardly an unbiased observer: she was my original undergraduate advisor, and I did my undergraduate thesis work in a closely-allied group. I base my statement on the fact that I started to become a mature programmer while programming in the CLU language that she and her team designed and implemented. If you go look this up, you will be struck by how completely this language anticipated the design of current commercially successful languages, like Java or C# -- and I was able to use it *in 1979*. Of course, other languages extant at the time had similar ideas, especially Simula, but CLU had (almost) everything: strong typing, garbage collection, data abstraction, even generics and iterators. This is by no means meant to diminish the achievements of the designers of C# or other languages: the task of the designer of a commercial language is usually to tastefully choose which bold academic ideas from 10-20 years ago were successful.)


Argus and Avalon/C++ were similar to current transactional memory efforts in many ways, but differed in one very fundamental way. Transactional memory, as the term is currently understood, takes the database "ACID" properties and strips them down to A ("atomic") and I ("isolated"). C ("consistent") doesn't really apply -- in the database world, you're allowed to declaratively specify invariants that must hold, and at the end of a transaction, the transaction system checks whether you might have invalidated one of those "consistency invariants." If so, it doesn't commit the transaction, thus preserving this user-defined notion of "consistency". Transactional memory doesn't usually include such a notion of consistency (though the word "consistency" is used in the TM literature for a quite different concept -- whether the current transaction has seen a consistent view of memory when optimistic reading techniques are used).


This leaves D = "durability." In the database world, this property means that if a transaction commits, its effects will survive subsequent system crashes. That is, if a crash happens, the state after the system recovers will reflect all the committed transactions -- if your ATM tells you that you've successfully transferred $10,000 into your account, and then there's a city-wide power failure immediately thereafter, you can go home in the dark confident that your money is there. Current TM systems (including the one we're working on) treat durability as a non-goal -- they're just trying to replace locks as a synchronization mechanism (and, at least in our case, to simplify error-handling code via failure atomicity).


The earlier group of languages, however, did make durability a goal. I'll describe the Argus design for this; the Avalon/C++ and Arjuna designs were fairly similar. Argus, like its predecessor CLU, provides the ability to define abstract data types (i.e., classes). In addition, it allows the definition of a special form of abstract type called a guardian: a guardian is an abstraction whose state is durable, and will be recovered to a transaction-consistent state after a crash. Guardians are somewhat like CLR app domains, or the "agents" that communicate in concurrent languages based on message passing (e.g., CSP or Hewitt's Actors language), in the following way: there is no sharing of objects between guardians -- each guardian encapsulates a set of objects disjoint from all others. Operations on guardians are like remote procedure calls, with any non-primitive arguments (or return values) marshaled by deep copy, to preserve the no-sharing rule. Guardians live on nodes, i.e., machines, but you're not supposed to care whether a guardian you're communicating with is on the same or a different node -- in either case, its operations are remote procedure calls that execute on the guardian as transactions (and in fact nested transactions if the invoking code is already in a transaction, as when one guardian operation invokes an operation on a second guardian); the caller, as usual with network RPCs, must be prepared for the call to raise a "failure" exception for arbitrary reasons. But note that the caller now knows that the system is in a reasonable state, since that failure indicates that the operation's transaction rolled back, restoring invariants. If the calling code is also in a transaction, it always has the option of aborting itself and raising a failure exception to its caller.


The operations themselves may introduce internal concurrency -- and this concurrency may also be managed by (sub)transactions: Argus already had the concept of "parallel nested transactions." As with a normal abstract type, the guardian state is a set of instance variables. Some of these are labeled "stable", and the rest are considered to be "volatile" -- but this doesn't mean what it means in C/C++ or similar languages. The stable variables determine the guardian state that should survive a crash, and the volatile variables should represent derived state that can be recomputed from the stable state after a crash (like caches or database indices).
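To make the stable/volatile split concrete, here is a minimal sketch of the idea in modern (Python) terms. This is purely illustrative -- Argus provided language-level support and real durable storage, whereas all the names and methods below are invented, and a "crash" is simulated by simply rebuilding the derived state:

```python
class Guardian:
    """Illustrative sketch of Argus's stable/volatile split.

    `stable` stands in for state that would be logged to durable
    storage; `by_length` is volatile, derived state (like a cache
    or index) that is recomputed from the stable state on recovery.
    """

    def __init__(self):
        self.stable = {}            # word -> meaning; would survive a crash
        self._rebuild_volatile()    # volatile state is derived, not stored

    def _rebuild_volatile(self):
        # Recompute the volatile index entirely from the stable state.
        self.by_length = {}
        for word in self.stable:
            self.by_length.setdefault(len(word), set()).add(word)

    def add_word(self, word, meaning):
        # In Argus, an operation like this would run as a transaction
        # on the guardian; here we just update both kinds of state.
        self.stable[word] = meaning
        self.by_length.setdefault(len(word), set()).add(word)

    def recover(self):
        # After a (simulated) crash, only `stable` survives;
        # the volatile index is rebuilt from it.
        self._rebuild_volatile()
```

The point of the split is that only the stable variables need the expensive logging and recovery machinery; anything recomputable is declared volatile and simply rebuilt.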


The type system is modified to define both atomic and non-atomic types. The basic primitive types are atomic, but aggregate type constructors like arrays or structs now come in two flavors, atomic and not. The atomic versions handle transaction locking and logging. Since Argus was an object-oriented system, it's not enough to say that the stable instance variables of a guardian are recovered: we also have to worry about what is reachable from those variables in the pointer graph. The intention (I forget whether it was checked) is that the transitive closure of what's reachable from the stable roots of a guardian should consist of atomic types, and that entire object graph is recovered. The closure reachable (only) from volatile roots may contain cheaper non-atomic types.


A user-defined type may use those non-atomic types, but nevertheless declare itself to be atomic. Mechanisms are provided for users to do explicit transaction-level locking and logging. In many cases this can allow more concurrency, and require less overhead, than if the abstraction were implemented directly with the built-in atomic type constructors. This facility is quite similar to what the TM research community today discusses under the name "open nested transactions."
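The classic example of why explicit, semantic logging buys concurrency is a counter: increments commute, so instead of logging the old memory contents (which serializes every transaction touching the counter), the abstraction logs the inverse *operation*. The sketch below illustrates that idea only; the class and method names are invented and nothing here reflects actual Argus or TM-system APIs:

```python
class AtomicCounter:
    """Illustrative sketch of a user-defined atomic type.

    Rather than logging raw old values (as a built-in atomic type
    would), it keeps a semantic undo log of inverse operations,
    so commuting increments from different transactions need not
    conflict with each other.
    """

    def __init__(self):
        self.value = 0
        self.undo_log = []   # inverse operations for the current transaction

    def increment(self, n=1):
        self.value += n
        # Log the inverse operation, not the old memory state.
        self.undo_log.append(lambda n=n: self._apply(-n))

    def _apply(self, n):
        self.value += n

    def commit(self):
        # On commit, the undo actions are simply discarded.
        self.undo_log.clear()

    def abort(self):
        # On abort, run the undo actions in reverse order.
        while self.undo_log:
            self.undo_log.pop()()
```

Aborting after two increments restores the original value, while committed increments are never undone -- which is exactly the behavior the built-in atomic constructors provide, obtained here with less serialization.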


For all of these great features, none of these languages succeeded in the commercial marketplace. Of course, they were research projects, not really intended to achieve commercial success, but rather to influence industrial practice at some time in the future. The world of object-oriented databases, once popular but now somewhat eclipsed, could be seen as one of the intellectual progeny of this work. There are also now several systems that note the similarity between a row in a relational database table and the representation of an object type, and try to allow semi-automatic translation between these views, to allow manipulation of database tables to look more like "normal programming" than construction of SQL commands. Microsoft's ADO.NET and Java's JDO are examples. While I understand the reasons for the success of these models, to me, these seem (with all due respect) like somewhat clunky attempts to regain the marriage of transactions, persistence, and ordinary objects and programming offered by Argus, Avalon, and the like. But perhaps I'm biased :-)


My own Ph.D. thesis dealt with doing concurrent garbage collection in such a system. The essential problem is that the underlying transaction system dealt with physical addresses of data, but the garbage collector might be moving objects, and thus changing addresses. The collection may take much longer than the individual transactions in the system, which is why you want to do it concurrently, so you can't treat the collection itself like a transaction. I was in a race to finish with an MIT grad student, Elliot (now Hillel) Kolodner, who was working on much the same problem. While tense for a while, this eventually turned out fine for all concerned: our solutions differed enough not to cause problems. Hillel, long since an eminent colleague in GC research, and I have agreed that while I got my thesis finished first, his was the better solution. (I'd advise any Ph.D. student to take my half of that bargain, by the way!)


What lessons should the current round of transactional memory implementations draw from this previous work? Well, we ought to be aware of the work, and let it inform us when it's appropriate: for example, it would probably be better if more people understood that there's a history to ideas like parallel or open nesting -- that these ideas had relevant precursors in the literature. On the other hand, having worked on both problems, I'm struck by how much the deletion of one letter from "ACID" changes the landscape. Ratios of costs change, different issues come to the fore -- in many ways, it's a completely new world. The engineering of the Argus/Avalon languages had concerns more akin to those of a database system -- how much data you write to stable disk storage, and when, being the overriding concern. Committing a TM transaction is (or ought to be!) a much lighter-weight operation. So intuitions formed in the previous world may not be applicable in the TM world. But still, it's nice to have some experience from which to start!