Mismatch between object lifetime when bridging between COM and managed code

From time to time, I see object hierarchies like:

interface IParent : IUnknown
{
    [propget] HRESULT Child([out, retval] IChild ** ppChild);
}

dispinterface IChildEvents
{
    // ...
}

interface IChild : IUnknown
{
}

coclass Parent
{
    [default] interface IParent;
}

[noncreatable]
coclass Child
{
    [default] interface IChild;
    [default, source] dispinterface IChildEvents;
}

This describes a simple parent object with a child that exposes connection-point-based events. The user imports the type library built from this pseudo-IDL with tlbimp.exe to generate an interop assembly. In some C# code, they write:

IParent parent = // ...
parent.Child.myevent += MyHandler;

 

The intent seems clear to someone coming from a JScript programming background, but the results will be unpredictable.

 

Breaking this down, COM objects are bridged to managed code by runtime callable wrappers. For a given COM object in a given app domain, there should be a single runtime callable wrapper (with some caveats to be addressed in a future post). A runtime callable wrapper is a full-blown managed object that “manages” a given COM object. By “manages,” I mean that it maps the interfaces supported by the object, handles its lifetime and eventing, and automatically performs cross-apartment / cross-context calls so that the managed object appears to be apartment agnostic.

 

In the C# code snippet above, there are two runtime callable wrappers (RCWs from now on). The first one is referenced as long as the variable “parent” is in scope. The second one is created on the fly in response to “parent.Child” and is never assigned to anything. From the garbage collector’s point of view, it is eligible for collection as soon as the statement completes.

 

The problem is that the bridge between COM connection point based eventing and managed delegates resides on the RCW. In this case, the user has set up a delegate to subscribe for events. An implementation of the event sink mapping back to the delegate is created and registered using IConnectionPoint::Advise; however, the event sink is part of the RCW and will be destroyed as soon as the RCW is collected.

 

This will most likely work as expected in simple cases, but events will stop arriving as soon as the RCW for the child object is collected. In my experience, this is the worst type of failure: the code appears to work and then breaks at some unpredictable point later.

 

The correct solution is:

IParent parent = // ...
IChild child = parent.Child;
child.myevent += MyHandler;

 

Then keep “child” referenced somewhere (in a field, for example) for as long as the event subscription needs to remain valid.
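To make the failure mode concrete, here is a small model in portable C++. It is only an analogy, not how the CLR implements RCWs: EventSource, Wrapper, and the two Count... functions are hypothetical names I made up, and I am assuming that destroying the wrapper unregisters its sink. Also note that C++ destroys the temporary at the end of the statement, so the model fails immediately, whereas in managed code the events stop at whatever later point the garbage collector gets around to the RCW.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the COM child object and its connection point.
class EventSource {
public:
    using Handler = std::function<void(int)>;

    // Plays the role of IConnectionPoint::Advise: registers a sink, returns a cookie.
    int Advise(Handler sink) {
        int cookie = nextCookie_++;
        sinks_.emplace(cookie, std::move(sink));
        return cookie;
    }

    // Plays the role of IConnectionPoint::Unadvise.
    void Unadvise(int cookie) { sinks_.erase(cookie); }

    // Fires the event to every sink that is still registered.
    void Fire(int value) {
        for (auto& entry : sinks_) entry.second(value);
    }

private:
    int nextCookie_ = 1;
    std::map<int, Handler> sinks_;
};

// Hypothetical stand-in for the RCW: the sink registrations live and die with it.
class Wrapper {
public:
    explicit Wrapper(std::shared_ptr<EventSource> source) : source_(std::move(source)) {}
    Wrapper(const Wrapper&) = delete;
    Wrapper& operator=(const Wrapper&) = delete;

    // Assumption of this model: tearing down the wrapper unregisters its sinks.
    ~Wrapper() {
        for (int cookie : cookies_) source_->Unadvise(cookie);
    }

    void Subscribe(EventSource::Handler handler) {
        cookies_.push_back(source_->Advise(std::move(handler)));
    }

private:
    std::shared_ptr<EventSource> source_;
    std::vector<int> cookies_;
};

// Mirrors subscribing through "parent.Child": the wrapper is a temporary.
int CountWithTemporaryWrapper() {
    auto source = std::make_shared<EventSource>();
    int received = 0;
    Wrapper(source).Subscribe([&received](int) { ++received; });  // wrapper dies here
    source->Fire(1);
    return received;
}

// Mirrors the fix: the wrapper is kept alive while events are expected.
int CountWithKeptWrapper() {
    auto source = std::make_shared<EventSource>();
    int received = 0;
    Wrapper child(source);
    child.Subscribe([&received](int) { ++received; });
    source->Fire(1);
    source->Fire(2);
    return received;
}
```

In this model CountWithTemporaryWrapper sees no events while CountWithKeptWrapper sees both, which is the same shape as the managed problem: the subscription is only as durable as the wrapper that owns the sink.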

 

This is somewhat annoying since you would look at the original code and expect it to work. Digging into how COM Interop works, I have realized that this is just an artifact of the design and not something that can be easily solved. It is also just one specific instance of the general mismatch between COM lifetime and managed lifetime.

 

COM lifetime is very explicit and deterministic: when the reference count drops to zero, the object is destroyed. Some people get caught up in this determinism and actually break the rules of COM. A good example is another parent/child sample:

class Child
{
    // ...
    // Breaks the COM rules: no reference count is actually maintained.
    STDMETHOD_(ULONG, AddRef)() { return 2; }
    STDMETHOD_(ULONG, Release)() { return 1; }
    // ...
};

class Parent
{
    // ...
    Child m_child;  // embedded in the parent and destroyed along with it

    STDMETHODIMP GetChild(IChild ** ppChild)
    {
        *ppChild = &m_child;
        return S_OK;
    }
    // ...
};

 

In this case, the child is embedded directly in the parent and doesn’t maintain its own lifetime. This is generally a bad idea, but it usually works when called from C++ code that looks like:

IParent * pParent = // ...
IChild * pChild = NULL;
HRESULT hr = pParent->GetChild(&pChild);
if (SUCCEEDED(hr))
{
    // ...
    pChild->Release();
}
pParent->Release();

Note that, by the rules of COM, it would be perfectly valid to fetch the child and then immediately release the pointer to the parent. With this implementation that would crash, since the parent would be destroyed and take the child with it, but releasing in that order is not common programming practice. When creating a technology that builds on something like COM, programming against the elegance of the COM spec isn’t too hard to get right (mostly...). Trying to support behavior that relies on assumptions and convention instead of following the rules is hard. We definitely try to recognize and support really common conventions, but there are limits to what can be done.

 

In managed code, RCWs are released by garbage collection, and there is no guaranteed ordering. Each RCW is just another managed object that is available for collection, so the parent RCW can easily be collected first (causing a Release under the covers). That destroys the child and causes an access violation (AV) when the child RCW is collected later.

 

Unlike the first eventing example, code like this is clearly playing fast and loose with COM, but people still get annoyed when code that works in their existing VB6 or MFC code base fails when called through managed code.
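For contrast, here is one way an embedded child can follow the rules: it forwards AddRef and Release to the parent that owns it, so a reference to the child keeps the parent alive. This is a minimal sketch in portable C++ rather than real COM. IRefCounted is a hypothetical stand-in for the lifetime half of IUnknown (no QueryInterface), the counter is not thread safe, and the destroyed flag and Value method exist only so the behavior can be observed.

```cpp
// Hypothetical stand-in for the AddRef/Release half of IUnknown.
struct IRefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IRefCounted() = default;
};

class Parent;

// Embedded child that delegates its lifetime to the owning parent.
class Child : public IRefCounted {
public:
    explicit Child(Parent* owner) : owner_(owner) {}
    unsigned long AddRef() override;
    unsigned long Release() override;
    int Value() const { return 42; }  // something to call on a live child
private:
    Parent* owner_;
};

class Parent : public IRefCounted {
public:
    // The flag lets a caller observe when the parent is really destroyed.
    static Parent* Create(bool* destroyedFlag) { return new Parent(destroyedFlag); }

    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long remaining = --refs_;
        if (remaining == 0) delete this;
        return remaining;
    }

    // Hands out the child with a reference taken on the caller's behalf.
    void GetChild(Child** ppChild) {
        m_child.AddRef();
        *ppChild = &m_child;
    }

private:
    explicit Parent(bool* destroyedFlag) : destroyed_(destroyedFlag), m_child(this) {}
    ~Parent() override { *destroyed_ = true; }

    unsigned long refs_ = 1;  // the creator holds the first reference
    bool* destroyed_;
    Child m_child;            // still embedded, but no longer lifetime-unsafe
};

inline unsigned long Child::AddRef() { return owner_->AddRef(); }
inline unsigned long Child::Release() { return owner_->Release(); }

// Releases the parent first, the order that crashes the sample above.
bool ParentFirstIsSafe() {
    bool destroyed = false;
    Parent* parent = Parent::Create(&destroyed);
    Child* child = nullptr;
    parent->GetChild(&child);
    parent->Release();  // the child's reference keeps the parent alive
    bool aliveWhileChildHeld = !destroyed && child->Value() == 42;
    child->Release();   // last reference: parent and child go away together
    return aliveWhileChildHeld && destroyed;
}

// Releases the child first, the conventional order.
bool ChildFirstIsSafe() {
    bool destroyed = false;
    Parent* parent = Parent::Create(&destroyed);
    Child* child = nullptr;
    parent->GetChild(&child);
    child->Release();
    bool aliveWhileParentHeld = !destroyed;
    parent->Release();
    return aliveWhileParentHeld && destroyed;
}
```

With this arrangement the release order no longer matters, which is exactly the property an unordered garbage collector needs. The trade-off is that holding only the child keeps the entire parent alive.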

 

In general, bridging between two large, full-featured object models is a challenging problem. Given the size of the problem space, I alternate between being impressed by how well COM Interop works and being surprised that anything works at all. I’m going to follow up with a couple more object mismatches before branching off into other topics. I have my own hit list of things to cover, but if people have things they would like to see discussed, please let me know. I’m the main development owner of the Interop space, so I’d also love to hear your feedback on the product and suggestions for things to add or improve.