Thursday, March 22, 2012

C++ interface multiple inheritance

Warning: if you're not a programmer, stop reading now.


Earlier this week I was thinking about multiple inheritance in C++. The practice is often discouraged because of challenges like the diamond problem, whose solutions are complex enough to outweigh the benefits of multiple inheritance in the first place. However, everyone says that multiple inheritance of interfaces (base classes with only pure virtual methods) is not only okay, but encouraged. This concept has been rolled into languages like Java and C#, which otherwise don't have multiple inheritance!

But this bothered me. In C++, virtual methods are generally implemented with a virtual function table (vtable), and each object carries a pointer to it. But how does that work with multiple inheritance? How does the holder of an interface pointer know how to find the virtual functions that matter to its interface, as opposed to the other interfaces? Does it even work, or will my code start breaking if I try this? I spent way too much time Bing-ing about looking for the answer with no success. Finally I decided to find out myself.

My test setup is a native C++ Win32 console application, compiled with Microsoft Visual Studio 2010. My output is 32-bit (sizeof(void*)==4).

1. How do we find the right vtable?

#include <stdio.h>
#include <tchar.h>

class IFoo {
public:
    virtual void Foo() = 0;
};
class IBar {
public:
    virtual void Bar() = 0;
};
class CImpl : public IFoo, public IBar
{
public:
    void Foo() { printf("Foo"); }
    void Bar() { printf("Bar"); }
};

int _tmain(int argc, _TCHAR* argv[])
{
    CImpl *pImpl = new CImpl;
    IFoo *pFoo = pImpl;   // pointer to the first base
    IBar *pBar = pImpl;   // pointer to the second base
    printf("Ptrs: pImpl = %p, pFoo = %p, pBar = %p\n", pImpl, pFoo, pBar);
    delete pImpl;
    return 0;
}

Ptrs: pImpl = 000C6F98, pFoo = 000C6F98, pBar = 000C6F9C 

Well, that answers that. When we take a pointer to the second interface, the compiler adjusts the pointer (here by 4 bytes) so that it points to the vtable pointer that interface expects inside the object.

2. But what about the "this" pointer?

So, pBar isn't actually pointing to the start of the object. How does "this" work? Is it corrupt when we call a method through the offset pointer?

class CImpl : public IFoo, public IBar
{
public:
    void Foo() { printf("Foo this: %p\n", this); }
    void Bar() { printf("Bar this: %p\n", this); }
};

int _tmain(int argc, _TCHAR* argv[])
{
    CImpl *pImpl = new CImpl;
    IFoo *pFoo = pImpl;
    IBar *pBar = pImpl;
    printf("Ptrs: pImpl = %p, pFoo = %p, pBar = %p\n", pImpl, pFoo, pBar);
    pFoo->Foo(); pBar->Bar();     // call through the interface pointers
    pImpl->Foo(); pImpl->Bar();   // call through the implementation pointer
    delete pImpl;
    return 0;
}

Ptrs: pImpl = 000C6F98, pFoo = 000C6F98, pBar = 000C6F9C 
Foo this: 000C6F98     Bar this: 000C6F98
Foo this: 000C6F98     Bar this: 000C6F98

Huh. They're both the same, and match pImpl. How did that happen? We just showed that the pointers are different! To answer this, I looked at the assembly. On x86, member functions use the "thiscall" convention, which passes the "this" pointer in ECX. Let's see what happens.


When calling through the adjusted pBar pointer, the caller passes this = pBar. That implies the CImpl::Bar implementation must internally adjust the pointer back from IBar to CImpl. But that alone wouldn't work when calling through a CImpl pointer directly. So the compiler, knowing that CImpl::Bar() is meant to look like IBar::Bar(), actually passes this = pImpl + 4 bytes in that case as well.

Oh, you're clever, VC++.

3. What about the same method defined in multiple interfaces?

A fun thing about interfaces is that different interfaces can declare the same method. Since it's just a name (well, a signature), an implementer providing a suitable method can simultaneously satisfy both interfaces.

class IFoo {
public:
    virtual void Foo() = 0;
    virtual void Omg() = 0;
};
class IBar {
public:
    virtual void Bar() = 0;
    virtual void Omg() = 0;
};
class CImpl : public IFoo, public IBar
{
public:
    void Foo() { printf("Foo"); }
    void Bar() { printf("Bar"); }
    void Omg() { printf("Omg"); }   // single override satisfies both interfaces
};

But how would this work? After all, we just discovered that CImpl::Bar() expects "this" to look like IBar. What does CImpl::Omg() make "this" look like - IFoo or IBar? It can't be both - they point to different locations. But, looking at the vtable in the debugger quickly shows the trick.


Aha, VC++ is cleverer than I gave it credit for. pFoo directs Omg() straight to CImpl::Omg(), while pBar directs Omg() to Omg`adjustor{4}'. I'd bet the intent of this function is to adjust the "this" pointer to match IFoo. Then CImpl::Omg() can assume all callers are referencing it as IFoo::Omg(), and "this" is interpreted appropriately.

Conclusion

You can use "interface-like" classes in C++ just like you would in C# or Java, including multiple inheritance, and the compiler will magically take care of making it work. It accomplishes this through manipulation of your interface pointers and your "this" pointers when using the objects, and through thunk methods where there is ambiguity.

Trust your compiler. Program with interfaces. Win!

2 comments:

Unknown said...

It's all fun and games until windbg loses track of all the black magic :(

Joe said...

It's all fun and games until someone uses 4 interfaces and the debugger can't deal...