Date: Tue, 27 Jul 2004 16:44:07 -0500 (CDT)
From: Chris Lattner
Subject: Object model stuff: vtables & typeinfo

Here is some concrete information on how I do stuff in the MSIL front-end.
Any SingleInheritance+Interfaces language could be implemented with a
similar design. This is mostly background info for the next email, which is
about interfaces.

The C# example we'll be talking about:

class Base {
  double X;
  public virtual void foo() { }
}
class Derived : Base {
  float Y;
  public virtual void bar() { foo(); }
  public static void Main() {
    new Base();
    new Derived();
  }
}

The objects are laid out like this (as you would expect):

%System.Object = type { %llvm_msil_vtable_base*, opaque }
%Base = type { %System.Object, double }
%Derived = type { %Base, float }

The opaque element in System.Object is specified by the
MSIL-objectmodel.ll file. It defines the object header and stuff, but isn't
really related to this email.

llvm_msil_vtable_base is the common header used by all vtables. It is
defined as:

%llvm_msil_vtable_base = type { %llvm_msil_type_info*,
                                %llvm_msil_gc_info*, uint }
%llvm_msil_type_info = type { short*, uint, %llvm_msil_type_info* }
%llvm_msil_gc_info = type sbyte   ;; placeholder

The contents of the type_info will be different for you, I would expect.
For MSIL it's basically a string for the class name (short*/uint pair). The
type info pointer in the type info is a pointer to the superclass type info
object, used for dynamic casts. The GC info is not currently defined.

Okay, the 'new Base()' gets compiled into:

%tmp = malloc %Base     ; <%Base*> [#uses=2]
%tmp1 = getelementptr %Base* %tmp, uint 0, uint 0, uint 0
store %llvm_msil_vtable_base* getelementptr (
          { %llvm_msil_vtable_base, void (%Base*)* }* %Base.vft,
          uint 0, uint 0),
      %llvm_msil_vtable_base** %tmp1
call void %Base..ctor( %Base* %tmp )

The malloc call will eventually be a call to the gc routine (which provides
zeroed memory), but you get the idea.
This just stores the vtable pointer into the object memory, then calls the
ctor.

The vtable for Base looks like this:

%Base.vft = constant { %llvm_msil_vtable_base, void (%Base*)* } {
  ; pointer to Base.ti, no gc info, no flags.
  %llvm_msil_vtable_base { %llvm_msil_type_info* %Base.ti,
                           sbyte* null, uint 0 },
  ; First virtual function pointer
  void (%Base*)* %Base.foo
}

The Base type info looks like this:

%Base.ti = constant %llvm_msil_type_info {
  ;; The string "Base" in wide characters
  short* getelementptr ([4 x short]* %str, uint 0, uint 0), uint 4,
  ;; Superclass type info
  %llvm_msil_type_info* %System.Object.ti
}

Okay, the construction of Derived looks identical:

%tmp = malloc %Derived     ; <%Derived*> [#uses=2]
%tmp1 = getelementptr %Derived* %tmp, uint 0, uint 0, uint 0, uint 0
store %llvm_msil_vtable_base* getelementptr (
          { %llvm_msil_vtable_base, void (%Base*)*,
            void (%Derived*)* }* %Derived.vft,
          uint 0, uint 0),
      %llvm_msil_vtable_base** %tmp1
call void %Derived..ctor( %Derived* %tmp )

The Derived vtable looks like so:

%Derived.vft = constant { %llvm_msil_vtable_base, void (%Base*)*,
                          void (%Derived*)* } {
  ;; type info
  %llvm_msil_vtable_base { %llvm_msil_type_info* %Derived.ti,
                           sbyte* null, uint 0 },
  ;; Pointer to foo function. If 'Derived' overrode foo, this ptr would
  ;; be to Derived.foo instead of Base.foo
  void (%Base*)* %Base.foo,
  ;; New pointer for bar
  void (%Derived*)* %Derived.bar
}

The Derived typeinfo looks like:

%Derived.ti = constant %llvm_msil_type_info {
  ;; "Derived" as a wchar string
  short* getelementptr ([7 x short]* %str, uint 0, uint 0), uint 7,
  ;; Superclass typeinfo pointer
  %llvm_msil_type_info* %Base.ti
}

Okay, so now let's look at a vcall, such as in Derived.bar:

void %Derived.bar(%Derived* %this) {
  ; Load the pointer to the vtable
  %tmp = getelementptr %Derived* %this, uint 0, uint 0, uint 0, uint 0
  %vtbl = load %llvm_msil_vtable_base** %tmp

  ; Cast the vtable pointer to be the appropriate type.
  %vft = cast %llvm_msil_vtable_base* %vtbl
           to { %llvm_msil_vtable_base, void (%Base*)* }*

  ; Get a pointer to the foo slot and load it.
  %vtslot = getelementptr { %llvm_msil_vtable_base, void (%Base*)* }* %vft,
                          uint 0, uint 1
  %fn = load void (%Base*)** %vtslot

  ; Upcast from Derived to base (yaay containment)
  %base = getelementptr %Derived* %this, uint 0, uint 0

  ; Call the function
  call void %fn( %Base* %base )
  ret void
}

For reference, note that there are two loads (one of the vtable pointer,
one of the function address) and an indcall. The vtable load would be CSE'd
away in many common cases though.

Next up, interfaces.

-Chris
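
For readers who think more easily in C, the layout and vcall sequence
described above can be approximated as follows. This is only a sketch under
stated assumptions, not what the MSIL front-end emits: every C name here
(TypeInfo, VTableBase, BaseVTable, is_instance_of, foo_calls, and so on) is
made up for illustration, the class name is a narrow string instead of wide
characters, the opaque object header is dropped, constructors are omitted,
and the vtables are nested by containment where the email uses flat
structs. The superclass walk in is_instance_of is one plausible reading of
"used for dynamic casts", not a documented algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Type info: class name plus a link to the superclass type info. */
typedef struct TypeInfo {
    const char *name;              /* narrow here; the email uses short* */
    unsigned length;
    const struct TypeInfo *super;  /* NULL at the root (assumption) */
} TypeInfo;

/* Common header at the start of every vtable. */
typedef struct VTableBase {
    const TypeInfo *type_info;
    const void *gc_info;           /* placeholder, as in the email */
    unsigned flags;
} VTableBase;

/* Objects nest by containment, so an upcast is just a field address. */
typedef struct Object  { const VTableBase *vtable; } Object;
typedef struct Base    { Object object; double X; } Base;
typedef struct Derived { Base base; float Y; } Derived;

/* Vtables nest the same way: Derived's table begins with Base's slots. */
typedef struct BaseVTable    { VTableBase header; void (*foo)(Base *); } BaseVTable;
typedef struct DerivedVTable { BaseVTable base; void (*bar)(Derived *); } DerivedVTable;

int foo_calls = 0;                 /* hypothetical counter to observe dispatch */

void Base_foo(Base *self) { (void)self; foo_calls++; }

/* The vcall inside Derived.bar: two loads, then an indirect call. */
void Derived_bar(Derived *self) {
    const VTableBase *vtbl = self->base.object.vtable;  /* load the vtable pointer */
    const BaseVTable *vft = (const BaseVTable *)vtbl;   /* cast to the typed vtable */
    void (*fn)(Base *) = vft->foo;                      /* load the foo slot */
    fn(&self->base);                                    /* upcast, then call */
}

static const TypeInfo Object_ti  = { "System.Object", 13, NULL };
static const TypeInfo Base_ti    = { "Base", 4, &Object_ti };
static const TypeInfo Derived_ti = { "Derived", 7, &Base_ti };

static const BaseVTable Base_vft = { { &Base_ti, NULL, 0 }, Base_foo };
static const DerivedVTable Derived_vft = {
    { { &Derived_ti, NULL, 0 }, Base_foo },  /* foo is not overridden */
    Derived_bar                              /* new slot for bar */
};

/* 'new': zeroed allocation, then store the vtable pointer. */
Base *new_Base(void) {
    Base *b = calloc(1, sizeof *b);
    if (b) b->object.vtable = &Base_vft.header;
    return b;
}

Derived *new_Derived(void) {
    Derived *d = calloc(1, sizeof *d);
    if (d) d->base.object.vtable = &Derived_vft.base.header;
    return d;
}

/* Dynamic-cast check: walk the superclass chain looking for the target. */
int is_instance_of(const Object *obj, const TypeInfo *target) {
    assert(obj != NULL && obj->vtable != NULL);
    for (const TypeInfo *ti = obj->vtable->type_info; ti != NULL; ti = ti->super)
        if (ti == target) return 1;
    return 0;
}
```

Calling Derived_bar on an object from new_Derived goes through the vtable
and reaches Base_foo, and is_instance_of(&d->base.object, &Base_ti) succeeds
because Derived_ti links to Base_ti. Nesting BaseVTable inside DerivedVTable
keeps the vtable cast well-defined in C while giving the same slot order as
the flat layout in the email.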