OOP Class Notes 09/26/13
- If we want to add fields in a subclass, where do we store that data in memory?
- Below the memory reserved for the superclass.
- Code from the subclass can access the memory above using the same offsets as code from the superclass.
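A minimal sketch of this layout rule, assuming two hypothetical classes A and B (B extends A) written as plain structs; the names `__A`, `__B`, and `readInt32At` are illustrative and do not come from the class code. It only shows that a field inherited from A sits at the same offset in both layouts, so code that knows A's offsets can read a B.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // int32_t
#include <cstring>   // std::memcpy

// Hypothetical data layout for a class A with two fields.
struct __A {
  int32_t x;
  int32_t y;
};

// Hypothetical data layout for "class B extends A", written without
// C++ inheritance: A's fields first, in the same order, then B's new field.
struct __B {
  int32_t x;
  int32_t y;
  int32_t z;   // new in B, stored below the memory reserved for A
};

// The inherited fields must sit at the same offsets in both layouts.
static_assert(offsetof(__A, x) == offsetof(__B, x), "x moved");
static_assert(offsetof(__A, y) == offsetof(__B, y), "y moved");

// Reads a 32-bit field at a given byte offset from the start of an object.
inline int32_t readInt32At(const void* base, std::size_t offset) {
  int32_t value;
  std::memcpy(&value, static_cast<const char*>(base) + offset, sizeof value);
  return value;
}
```

With `__B b = {1, 2, 3};`, calling `readInt32At(&b, offsetof(__A, y))` uses an offset computed from A's layout to read B's inherited field.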
- If we have polymorphic data structures of variable sizes, how should we pass the data?
- By reference. Hence in Java, objects are always accessed and passed through references (the reference is copied, never the object itself)
- For virtual methods we have a per class (for all instances) vtable containing pointers to the different implementations
- Every instance has a vptr pointing to the vtable
- For example, at offset zero in the vtables we have the `toString` method
- New methods added by a subclass extend the vtable
- Overridden methods go into an existing slot
- `Object` and `String` have their own vtables, and `String`'s vtable is an overridden clone of `Object`'s vtable
- In Java when you declare a subclass that extends a superclass, you clone the superclass' vtable and add the addresses of any new methods that are not private or static to the vtable
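The clone, override, and extend steps can be sketched with two hypothetical classes A and B (B extends A). All names here (`__A`, `__B`, `get`, `scaled`, `bonus`) are assumptions made for illustration. The inherited slot reuses A's implementation through a function pointer cast, the same trick the class code uses for `getClass`; it relies on `__B` starting with the same layout as `__A`.

```cpp
#include <cassert>
#include <cstdint>

struct __A;
struct __B;
typedef __A* A;
typedef __B* B;

// Hypothetical vtable layout for A: two slots.
struct __A_VT {
  int32_t (*get)(A);      // slot 0
  int32_t (*scaled)(A);   // slot 1
};

// Vtable layout for B: A's slots in the same order, then B's new slot.
struct __B_VT {
  int32_t (*get)(B);      // slot 0: inherited unchanged
  int32_t (*scaled)(B);   // slot 1: overridden by B
  int32_t (*bonus)(B);    // slot 2: new in B, extends the vtable
};

struct __A {
  __A_VT* __vptr;
  int32_t value;
  explicit __A(int32_t v) : __vptr(&__vtable), value(v) {}
  static int32_t get(A self) { return self->value; }
  static int32_t scaled(A self) { return 2 * self->value; }
  static __A_VT __vtable;
};

struct __B {
  __B_VT* __vptr;
  int32_t value;          // same position as __A::value
  explicit __B(int32_t v) : __vptr(&__vtable), value(v) {}
  static int32_t scaled(B self) { return 10 * self->value; }
  static int32_t bonus(B self) { return self->value + 1; }
  static __B_VT __vtable;
};

__A_VT __A::__vtable = { &__A::get, &__A::scaled };

__B_VT __B::__vtable = {
  (int32_t (*)(B)) &__A::get,   // cloned from A's vtable (cast for the implicit this)
  &__B::scaled,                 // overridden: new pointer in the existing slot
  &__B::bonus                   // new method: appended after the cloned slots
};
```

Given `__B b(3);`, the call `b.__vptr->scaled(&b)` reaches B's override, while `b.__vptr->get(&b)` still runs A's implementation.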
- Here is a visual representation:
- Please note that the data layouts of object instances and their respective vtables are not necessarily adjacent in memory
- Everything is inside the `lang` namespace which is inside the `java` namespace
- We are using double underscore convention to prefix names which are internal to the translator.
```cpp
// Forward declarations of data layout and vtables.
struct __Object;
struct __Object_VT;

struct __String;
struct __String_VT;

struct __Class;
struct __Class_VT;

// Definition of type names, which are equivalent to Java semantics,
// i.e., an instance is the address of the object's data layout.
typedef __Object* Object;
typedef __Class* Class;
typedef __String* String;
```

- We then use `typedef` to define the Java type names as pointers to the internal structs that are prefixed with the double underscores.
- Data layout of `java.lang.Object` in C++:

```cpp
struct __Object {
  __Object_VT* __vptr;

  // The constructor.
  __Object();

  // The methods implemented by java.lang.Object.
  static int32_t hashCode(Object);
  static bool equals(Object, Object);
  static Class getClass(Object);
  static String toString(Object);

  // The function returning the class object representing
  // java.lang.Object.
  static Class __class();

  // The vtable for java.lang.Object.
  static __Object_VT __vtable;
};
```

- `__Object` has a `__vptr` field which points to its virtual method table, a static `__vtable` field shared by all instances, the built-in methods `hashCode`, `equals`, `getClass`, and `toString`, and a `__class` method which returns the class object representing itself.
  - Remember that every instance of `__Object` has a pointer to the vtable, otherwise the whole premise of virtual methods would not work.
- We use `static Class __class()` as opposed to declaring `__class` as a static variable to avoid the possibility that it is initialized after other static variables that depend on it. See this resource for more details about the "static initialization order fiasco".
- `__Object`'s vtable layout:

```cpp
struct __Object_VT {
  Class __isa;
  int32_t (*hashCode)(Object);
  bool (*equals)(Object, Object);
  Class (*getClass)(Object);
  String (*toString)(Object);

  __Object_VT()
  : __isa(__Object::__class()),
    hashCode(&__Object::hashCode),
    equals(&__Object::equals),
    getClass(&__Object::getClass),
    toString(&__Object::toString) {
  }
};
```

- `__Object_VT` has an `__isa` field which points to its class, followed by pointers to the methods of `java.lang.Object`. Its no-argument constructor, `__Object_VT()`, stores the addresses of `__Object`'s `hashCode`, `equals`, `getClass`, and `toString` methods in the appropriate fields of the vtable.
  - For example, `int32_t (*hashCode)(Object)` is a pointer to a function that takes an `Object` as a parameter and returns an `int32_t`.
  - Notice how we use `&` to take the address of each method in the no-argument constructor.
- Remember that in Java `this` is an implicit argument of every instance method, so each vtable method takes an `Object` (a pointer to `__Object`) as its first parameter.
- `int` in Java has 32 bits, but in C++ its size depends on the architecture, so we specify `int32_t` for `hashCode`'s return type.
- The data layout of `java.lang.String` in C++ is a clone of `__Object`, except that it adds a `data` field of the C++ `std::string` type and the `length` and `charAt` methods.

```cpp
struct __String {
  __String_VT* __vptr;
  std::string data;

  // The constructor.
  __String(std::string data);

  // The methods implemented by java.lang.String.
  static int32_t hashCode(String);
  static bool equals(String, Object);
  static String toString(String);
  static int32_t length(String);
  static char charAt(String, int32_t);

  // The function returning the class object representing
  // java.lang.String.
  static Class __class();

  // The vtable for java.lang.String.
  static __String_VT __vtable;
};
```

- The vtable for `__String`:

```cpp
// The vtable layout for java.lang.String.
struct __String_VT {
  Class __isa;
  int32_t (*hashCode)(String);
  bool (*equals)(String, Object);
  Class (*getClass)(String);
  String (*toString)(String);
  int32_t (*length)(String);
  char (*charAt)(String, int32_t);

  __String_VT()
  : __isa(__String::__class()),
    hashCode(&__String::hashCode),
    equals(&__String::equals),
    getClass((Class(*)(String))&__Object::getClass),
    toString(&__String::toString),
    length(&__String::length),
    charAt(&__String::charAt) {
  }
};
```

- The type of the first parameter for `__Object`'s `getClass` and `__String`'s `getClass` differ (the implicit `this`), so we need a cast: `getClass((Class(*)(String)) …)`
- We also need to define `java.lang.Class` because every object needs a class object, which is static and shared by all instances of the class.
- The `Class` objects are used to keep track of the dynamic type of objects.
- The data layout of `java.lang.Class` in C++:

```cpp
struct __Class {
  __Class_VT* __vptr;
  String name;
  Class parent;

  // The constructor.
  __Class(String name, Class parent);

  // The instance methods of java.lang.Class.
  static String toString(Class);
  static String getName(Class);
  static Class getSuperclass(Class);
  static bool isInstance(Class, Object);

  // The function returning the class object representing
  // java.lang.Class.
  static Class __class();

  // The vtable for java.lang.Class.
  static __Class_VT __vtable;
};
```

- `__Class` has a `name` field to denote the class's name as well as a `parent` field to reference the parent class. The latter is used to implement the `getSuperclass` method, which returns a reference to the superclass.

```cpp
// The vtable layout for java.lang.Class.
struct __Class_VT {
  Class __isa;
  int32_t (*hashCode)(Class);
  bool (*equals)(Class, Object);
  Class (*getClass)(Class);
  String (*toString)(Class);
  String (*getName)(Class);
  Class (*getSuperclass)(Class);
  bool (*isInstance)(Class, Object);

  __Class_VT()
  : __isa(__Class::__class()),
    hashCode((int32_t(*)(Class))&__Object::hashCode),
    equals((bool(*)(Class,Object))&__Object::equals),
    getClass((Class(*)(Class))&__Object::getClass),
    toString(&__Class::toString),
    getName(&__Class::getName),
    getSuperclass(&__Class::getSuperclass),
    isInstance(&__Class::isInstance) {
  }
};
```

- We have a `__rt` namespace with a `null` function that returns the canonical null value and a `literal` function that converts a C string to a `java.lang.String` object instead of letting C++ implicitly convert the C string to a `std::string`.

```cpp
namespace __rt {

  // The function returning the canonical null value.
  java::lang::Object null();

  // Function for converting a C string literal to a translated
  // Java string.
  inline java::lang::String literal(const char * s) {
    // C++ implicitly converts the C string to a std::string.
    return new java::lang::__String(s);
  }

}
```

- Implementation of `java.lang.Object`:

```cpp
// java.lang.Object()
__Object::__Object() : __vptr(&__vtable) {}

// java.lang.Object.hashCode()
int32_t __Object::hashCode(Object __this) {
  return (int32_t)(intptr_t)__this;
}

// java.lang.Object.equals(Object)
bool __Object::equals(Object __this, Object other) {
  return __this == other;
}

// java.lang.Object.getClass()
Class __Object::getClass(Object __this) {
  return __this->__vptr->__isa;
}

// java.lang.Object.toString()
String __Object::toString(Object __this) {
  // Class k = this.getClass();
  Class k = __this->__vptr->getClass(__this);

  std::ostringstream sout;
  sout << k->__vptr->getName(k)->data
       << '@' << std::hex << (uintptr_t)__this;
  return new __String(sout.str());
}

// Internal accessor for java.lang.Object's class.
Class __Object::__class() {
  static Class k =
    new __Class(__rt::literal("java.lang.Object"), (Class)__rt::null());
  return k;
}

// The vtable for java.lang.Object. Note that this definition
// invokes the default no-arg constructor for __Object_VT.
__Object_VT __Object::__vtable;
```

- For `hashCode` we cannot cast `__this` directly to `int32_t` because that doesn't work on 64-bit architectures. So we cast first to `intptr_t` and then to `int32_t`.
- `hashCode` and the other methods take `this` as an implicit parameter. Since `this` is a reserved keyword in C++, we use `Object __this` as a reference to the instance receiving the method call.
- See the rest of the code in java_lang.cc and main.cc
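The reason for writing `__class()` as a function with a local static, rather than as a static variable, can be illustrated with a small sketch. The type `ClassRecord`, the function `objectClass`, and the variable `cachedName` are hypothetical stand-ins, not part of the class code. The point is only that a function-local static is initialized the first time the function is called, so another static that depends on it always sees a constructed value.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a class object; only carries a name.
struct ClassRecord {
  std::string name;
  explicit ClassRecord(const std::string& n) : name(n) {}
};

// Construct on first use: the local static is initialized during the
// first call, no matter when that call happens during program startup.
inline ClassRecord* objectClass() {
  static ClassRecord* k = new ClassRecord("java.lang.Object");
  return k;
}

// A namespace-scope static whose initializer depends on the class object.
// Calling objectClass() here forces the record to exist first, so the
// initialization order of the two statics no longer matters.
static const std::string cachedName = objectClass()->name;
```

Every later call to `objectClass()` returns the same pointer, which matches the idea of one class object shared by all instances.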
- The implementation of `__Class` is important because without it we would not be able to track the dynamic type of objects. `Class` is what links objects in the inheritance hierarchy.
- `isInstance` traverses the inheritance hierarchy upwards (until it hits `null`) to determine whether an object is an instance of a given class:

```cpp
// java.lang.Class.isInstance(Object)
bool __Class::isInstance(Class __this, Object o) {
  Class k = o->__vptr->getClass(o);

  do {
    if (__this->__vptr->equals(__this, (Object) k)) return true;

    k = k->__vptr->getSuperclass(k);
  } while ((Class)__rt::null() != k);

  return false;
}
```

- We have no notion of classes, inheritance, or virtual methods in the target language of the translator. That is, we need to translate these concepts by hand because using C++ inheritance and virtual methods in our translator IS NOT ALLOWED.
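The upward walk that `isInstance` performs can be tried out in isolation with a stripped-down sketch. The `ClassInfo` struct, the `isSubclassOf` function, and the sample hierarchy (A extends Object, B and C extend A) are assumptions made for illustration; the real code goes through `__vptr` and `getSuperclass` instead of reading a field directly.

```cpp
#include <cassert>

// Hypothetical, minimal class record: a name and a link to the parent.
struct ClassInfo {
  const char* name;
  const ClassInfo* parent;   // nullptr for the root of the hierarchy
};

// Returns true if k is target itself or one of target's descendants,
// by following parent links until the root has been passed.
inline bool isSubclassOf(const ClassInfo* k, const ClassInfo* target) {
  while (k != nullptr) {
    if (k == target) return true;
    k = k->parent;
  }
  return false;
}

// Sample hierarchy: Object <- A <- B, and Object <- A <- C.
static const ClassInfo objectInfo = { "java.lang.Object", nullptr };
static const ClassInfo aInfo = { "A", &objectInfo };
static const ClassInfo bInfo = { "B", &aInfo };
static const ClassInfo cInfo = { "C", &aInfo };
```

Here `isSubclassOf(&bInfo, &objectInfo)` succeeds after two steps up the chain, while `isSubclassOf(&cInfo, &bInfo)` fails because the walk from C never passes through its sibling B.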
- In a statically typed language we can build a per class vtable that represents the behavior of each class.
- We only need the contract once for all instances, and we can hook the behavior up to the vtable with a pointer.
- Then the question becomes, how do we fill things in.
- So, say we have a class B, which inherits from A, which inherits from `Object`, and we want to do the data layout for B.
  - B, by definition of inheritance, has all the same fields and data as A.
  - So the data layout for B consists of the data layout for A, and then the new data of B appended to that.
- Similarly, the data layout for A consists of the data layout for `Object`, and the new data of A appended.
- We know that the data layout for `Object` only consists of a vptr, because we programmed it today.
- If we have a class C that is also a subclass of A, its data layout will also consist of the data layout of A and `Object`, but because it is a sibling of B, it will not have access to the data that is unique to B.
- The vtable for `Object` also has the `__isa` pointer and the pointers to the four methods we need; that's the contract of `__Object`.
- If we override a method with a new implementation, we already know what slot it has in the vtable.
- That is, overriding of virtual methods is implemented by replacing a pointer in the vtable.
- The vptr that a C++ object needs in order to use virtual methods adds 8 bytes to the object (one pointer on a 64-bit architecture). Adding more virtual methods does not increase the object's size any further, i.e., once a class has a single virtual method the size increase is fixed.

| | Without Virtual Methods | With Virtual Methods |
|---|---|---|
| `sizeof(Point)` | 32 bytes = 4 doubles | 40 bytes = 1 `Point` + pointer for vtable |
| `sizeof(ColorPoint)` | 40 bytes = 1 `Point` + 1 `Color` + padding | 48 bytes = 1 `ColorPoint` + pointer for vtable |
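The size comparison above can be checked on your own machine with a short sketch. It uses real C++ `virtual` only to measure what the compiler does (this is not something the translator may emit), and the struct names `PlainPoint`, `VirtualPoint`, and `BusyPoint` are made up for the example. Exact sizes are implementation-defined, so the 32 and 40 byte figures should be read as typical for a 64-bit platform rather than guaranteed.

```cpp
#include <cassert>
#include <cstddef>

// Four doubles, no virtual methods: no vptr is stored in the object.
struct PlainPoint {
  double x, y, z, w;
};

// Same fields with virtual methods: the compiler adds a hidden vptr.
struct VirtualPoint {
  double x, y, z, w;
  virtual ~VirtualPoint() {}
  virtual double first() const { return x; }
};

// More virtual methods: the vtable grows, but each object still holds
// only the single vptr.
struct BusyPoint {
  double x, y, z, w;
  virtual ~BusyPoint() {}
  virtual double first() const { return x; }
  virtual double second() const { return y; }
  virtual double third() const { return z; }
};

// Extra bytes per object caused by having virtual methods at all.
inline std::size_t vptrOverhead() {
  return sizeof(VirtualPoint) - sizeof(PlainPoint);
}
```

Comparing `sizeof(VirtualPoint)` with `sizeof(BusyPoint)` shows that the per-object cost is paid once, regardless of how many virtual methods the class declares.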
Implementing inheritance and virtual method dispatch by hand
We talked about inheritance on Tuesday 9/24, but the challenge of
implementing it efficiently is that we have to worry about the layout
in memory. We will first generally go over inheritance and virtual
methods in Java by drawing the data layout of classes, instances, and
vtables. Then we will write C++ code that
implements java.lang.Object, java.lang.String
and java.lang.Class so that you can see what these
structures look like in C++ code without using inheritance and virtual methods.
Note that you CANNOT use C++ inheritance or virtual methods in your translator.
This is why we are going to write these Java object structures by hand in C++.