Inheritance is not Subtyping

Because dissing OOP is cool!

10 min read · Jul 27, 2019

My first real programming experience was in college, with C++. After the mandatory lessons on conditionals, loops and functions we reached classes and objects, at which point the professor listed the advantages of Object Oriented Programming. The year after, during a course on logic and programming languages, another professor methodically debunked everything we had previously been taught.

By now, everyone with a few months of programming experience should know how bad OOP is. There is a variety of reasons that make it a cumbersome and toxic approach and that caused immense damage to the industry over the years. I am most saddened to see new, promising programming languages being implemented on top of OOP foundations just because, despite its glaring shortcomings, it’s still the standard approach.

However, this article is not about how OOP is bad; it’s about how, on top of that, most of its implementations are based on a misconception.

Strictly speaking, Object Oriented Programming is based on 3 simple principles:

  • Encapsulation : providing tools to restrict access to names
  • Inheritance : a strategy to re-use code
  • Subtyping : a particular take on polymorphism

Of these, encapsulation is by no means exclusive to OOP: most programming languages implement some form of it, and it is a most useful tool for every library interface. It is not the topic of this discussion.

Neither is inheritance: as a code reuse tool it is widely recognized as useless and it is slowly being discarded. Just don’t use it and no harm will come.

Subtyping on the other hand is something worth analyzing. It is just another form of polymorphism, and not the most convenient one. While with time OOP languages added more modern features to replace it (e.g. interfaces, row polymorphism, parametric polymorphism or even duck typing), even new ones keep supporting it by inertia.

In particular, although the three OOP principles are presented as separate aspects, in practice inheritance and subtyping are always fused together: that is, the only way to create a subtype B of A is to inherit from A.

Subtypes are forcibly created by inheritance

This should not be the case at all. First of all, we can have subtypes that are not inherited — or better yet, other forms of polymorphism.

Moreover, reaching subtyping exclusively through inheritance is generally fuzzy. Some of the mechanics leveraged by OOP languages do work, but they often rest on shaky theoretical foundations, leading to implementations riddled with patches and holes that surface as less than intuitive results.

What is a Subtype

Let’s begin with a brief introduction to subtyping. Type B is called a subtype of type A if an instance of B can be used in every situation where an instance of A is required. In a way, the subtype does everything the supertype does and then some more, so it can be used as a substitute for the “parent” or “supertype”.

It is easy to visualize with sets: type A includes all instances of type B (just like all ducks are birds), thus any instance of B can be used where an A would be expected

It is a very simple concept for base types. For example, many programming languages treat integers as if they were subtypes of floating point numbers: it is easy to see how integers are just special floats. It can, however, become tricky for complex types. How do we tell when an object’s type is a subtype?

Objects are nothing but records containing fields and functions (equipped with self-reference); it makes sense to consider a record B a subtype of another record A when all of its contents are subtypes in turn: a record of 3 integers is a subtype of a record of 3 floats. This is pretty straightforward for fields and variables, but what about methods?

Functions have types like any other element in a programming language. The type of a function is written using arrows (->) to indicate the passing of arguments and the yielding of a result. The type of a function bool equalfloat(float, float) that returns true if two numbers are the same would be float->float->bool. Intuitively one might decide that a subtype of this function should have subtypes in both arguments and return values: bool equalint(int, int). Unfortunately, that would be wrong.

The definition of subtyping states that the subtype should be usable in place of the supertype; however it is immediate to see that the call equalfloat(1.5,2.2) cannot be replaced with equalint(1.5, 2.2) (ignore for a moment that floats and ints can be compared with the same operator).

Parameters in function types must be handled differently for subtyping relations to work: they must go in the opposite direction and become supertypes of the parent function’s arguments. In this light, equalfloat is a subtype of equalint. Types that follow this opposite direction are called contravariant, as opposed to covariant types, which need to be subtypes in turn.

Note that the return type of functions follows the intuitive subtyping relationship: foo is a subtype of bar if it is covariant in its return type and contravariant in its arguments.
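As a minimal sketch of this rule, Java's use-site wildcards can spell out the variance by hand. The Bird and Duck classes and the apply helper are hypothetical names introduced only for illustration: a function from Bird to Duck is accepted wherever a function from Duck to Bird is required, because its argument type is wider and its return type is narrower.

```java
import java.util.function.Function;

// Hypothetical types for illustration: every Duck is a Bird
class Bird {
    String name() { return "bird"; }
}

class Duck extends Bird {
    @Override
    String name() { return "duck"; }
}

class VarianceSketch {
    // Accepts anything usable as a "Duck -> Bird" function:
    // contravariant in the argument (? super Duck),
    // covariant in the result (? extends Bird)
    static String apply(Function<? super Duck, ? extends Bird> f, Duck d) {
        return f.apply(d).name();
    }

    // A "Bird -> Duck" function: wider argument, narrower result
    static Duck toDuck(Bird b) {
        return new Duck();
    }

    public static void main(String[] args) {
        Function<Bird, Duck> birdToDuck = VarianceSketch::toDuck;
        // Bird -> Duck is used in place of Duck -> Bird
        System.out.println(apply(birdToDuck, new Duck())); // prints "duck"
    }
}
```

Going the other way, passing a Function<Duck, Bird> where a Function<? super Bird, ? extends Duck> is expected, is rejected by the compiler.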

Given this short introduction to subtyping, let’s take a look at a simple example.

Numbers and Colored Numbers

Assume we want to work with Numbers, just like integers but custom-branded. To better identify OOP as the enemy it is, I shall use Java-like syntax.

Our objects will be very simple: an integer value and a method eq to compare between Numbers:

class Number {
    int n;

    Number(int n) {
        this.n = n;
    }

    // Compares this Number with another one by value
    boolean eq(Number other) {
        if (this.n == other.n) {
            System.out.println("The two Numbers are equal");
            return true;
        } else {
            System.out.println("The two Numbers are not equal");
            return false;
        }
    }
}

Before continuing we must clarify an important detail: the nature of this/self or generically of the object recursion. Objects are records that are able to reference themselves through recursion, frequently by using a special variable called this or self. It is crucial to understand that self is not spawned magically from the depths of the compiler: as some languages make more explicit, it is a variable implicitly passed among the parameters of each method.

This means that, semantically, there is virtually no difference between using the dot notation and explicitly handling the object reference with an independent function.

num.eq(othernum); // This is like writing eq(num, othernum)

In fact, the dot notation we are so used to is just that — a notation. It holds no special significance other than highlighting the subject of the method. A method is therefore no more special than any function, and it’s the object definition that ties them together.

So, returning to our example, the eq method of the Number class has type Number->Number->bool despite only showing one parameter.
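Java itself offers a small window on this reading. The sketch below uses a stripped-down Number (no printing) and a hypothetical SelfAsArgument driver class: an unbound method reference to eq has a two-argument shape, with the receiver as the first parameter.

```java
import java.util.function.BiPredicate;

// Stripped-down version of the Number class from the article
class Number {
    final int n;

    Number(int n) {
        this.n = n;
    }

    boolean eq(Number other) {
        return this.n == other.n;
    }
}

class SelfAsArgument {
    // The unbound reference Number::eq takes the receiver explicitly,
    // so it fits a (Number, Number) -> boolean shape
    static final BiPredicate<Number, Number> EQ = Number::eq;

    public static void main(String[] args) {
        Number a = new Number(1);
        Number b = new Number(1);
        System.out.println(a.eq(b));       // dot notation
        System.out.println(EQ.test(a, b)); // same call, receiver passed as an argument
    }
}
```

Both lines print true: the second call is the first one with self moved among the parameters.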

Now, let’s leverage OOP tools and create an enhanced version of the Number class, adding a field to keep track of the number’s color.

class ColorNumber extends Number {
    String color;

    ColorNumber(int n, String color) {
        super(n);
        this.color = color;
    }

    // Takes a ColorNumber, unlike Number.eq, which takes a Number
    boolean eq(ColorNumber other) {
        if (this.n == other.n && this.color.equals(other.color)) {
            System.out.println("The two ColorNumbers are equal");
            return true;
        } else {
            System.out.println("The two ColorNumbers are not equal");
            return false;
        }
    }
}

Just arbitrary clutter for the sake of example. Then, we create a couple of instances of such classes taking advantage of subtypes.

Number number = new Number(1);
ColorNumber cnumber = new ColorNumber(1, "black");
number.eq(cnumber);
cnumber.eq(number);

Looks harmless enough. number is declared as a Number object and cnumber is a ColorNumber, as it should be.

The first comparison is pretty straightforward, as we try to compare a Number with a ColorNumber: the latter is a subtype of the former so it can be used in its stead. The method eq from the parent class is invoked, and the function call prints “The two Numbers are equal” as expected.

The second one is more interesting. We invoke the method eq from the ColorNumber class with a Number as argument; the problem is that the subclass in question has no such method. We redefined eq in terms of ColorNumber objects alone, and Number is not a subtype of ColorNumber. It should not be possible to compile this code; and yet it works with no issues. But what does it print?

“The two Numbers are equal”

So the eq method of the Number class was called, even if cnumber is a ColorNumber reference that indeed contains a ColorNumber.

This is because, even if it looked like it, we never overrode the eq method from the parent class: we merely overloaded it. ColorNumber contains two methods with the same name but different argument types. That is why cnumber is treated like a Number: the only option for the compiler is to cast it to its supertype and then use the method with type Number->Number->bool.
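The dispatch can be checked with a trimmed-down sketch of the two classes. As an assumption made for the sake of the example, eq here returns the name of the implementation that ran instead of a boolean, and OverloadDemo is a hypothetical driver class.

```java
class Number {
    final int n;

    Number(int n) {
        this.n = n;
    }

    // Returns a label instead of a boolean (illustrative assumption)
    String eq(Number other) {
        return "Number.eq";
    }
}

class ColorNumber extends Number {
    final String color;

    ColorNumber(int n, String color) {
        super(n);
        this.color = color;
    }

    // An overload, not an override: the parameter type differs.
    // Annotating this method with @Override would be a compile error.
    String eq(ColorNumber other) {
        return "ColorNumber.eq";
    }
}

class OverloadDemo {
    public static void main(String[] args) {
        Number number = new Number(1);
        ColorNumber cnumber = new ColorNumber(1, "black");
        Number alias = cnumber; // same object, seen through the parent type

        System.out.println(number.eq(cnumber));  // Number.eq
        System.out.println(cnumber.eq(number));  // Number.eq
        System.out.println(cnumber.eq(cnumber)); // ColorNumber.eq
        System.out.println(alias.eq(cnumber));   // Number.eq
    }
}
```

The last line is the telling one: the object is a ColorNumber and so is the argument, yet the parent's method runs, because the overload is picked at compile time from the declared type of alias.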

This behavior may seem obvious to some, but for me it was very confusing. I worked under the impression that I was replacing the method eq, not adding an extra case. Then again, if it really had been an override, I could not have used the ColorNumber type as argument because of the way function subtyping works: the eq in ColorNumber is not a subtype of the same method in the parent class.

I have tested this use case in several Object Oriented Programming languages and found that the results are mostly coherent: Java, C++, C#, Scala and Kotlin all behave in the same way. One notable exception is Dart, which requires a more precise approach: when defining eq for the inheriting class, it does not allow reusing the same name without overriding, which in turn requires contravariant arguments. Running the same example yields an error of this sort:

Which is exactly what you would expect if you were really substituting the method with a subtype. It turns out objects are simpler than I had previously imagined, which makes me wonder if basing entire languages on extending records with overloaded methods is really worth it.

Playing Tricks on Your Self

It does not end here. Suppose we want to abide by Dart’s rules when creating a new eq method: to override the original, it must be a subtype and have its arguments contravariant with respect to the parent method. Here it means that other shall have type Number.

I have mentioned before the interpretation of the special variable this/self as an implicit first argument of every method. This means that the real type of the original eq method is not Number->bool, but Number->Number->bool. In the same way, the overriding method as we define it has type ColorNumber->Number->bool, which, due to the first argument (the normally implicit self), is not contravariant and therefore not a subtype of the former method. Self for the subclass is of type ColorNumber; yet even Dart happily runs this code: it ignores this imprecision to keep up appearances and stops considering self as a parameter.
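The same rule can be sketched in Java, where @Override forces the parameter list to match the parent's. The classes are trimmed down (no printing), OverrideDemo is a hypothetical driver, and answering false for a plain Number is my own assumption, not something the example dictates.

```java
class Number {
    final int n;

    Number(int n) {
        this.n = n;
    }

    boolean eq(Number other) {
        return this.n == other.n;
    }
}

class ColorNumber extends Number {
    final String color;

    ColorNumber(int n, String color) {
        super(n);
        this.color = color;
    }

    // A real override: same parameter type as the parent method.
    // Inside the body `this` is a ColorNumber, while `other` is
    // only known to be a Number.
    @Override
    boolean eq(Number other) {
        if (other instanceof ColorNumber) {
            ColorNumber c = (ColorNumber) other;
            return this.n == c.n && this.color.equals(c.color);
        }
        return false; // assumption: a plain Number never equals a ColorNumber
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        Number number = new Number(1);
        ColorNumber cnumber = new ColorNumber(1, "black");
        Number alias = cnumber;

        System.out.println(number.eq(cnumber)); // true: Number.eq compares n only
        System.out.println(cnumber.eq(number)); // false: ColorNumber.eq runs
        System.out.println(alias.eq(number));   // false: dispatch follows the object
    }
}
```

The explicit parameter is now handled as the rule demands, but the implicit one is not: the body freely reads this.color, which only works because self has silently become a ColorNumber.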

What this all means is that, in theory, you can never override object methods, because they all rely on recursive access through self, and the implicit parameter cannot be a supertype of itself without the redefinition losing all of its usefulness. All you can do is add new functions and overload them with the same name, giving the illusion of substituting behavior.

Does this constitute a problem in practice? Well, programs written in the OOP languages I have listed run every day, arguably without issues. The fact that I rant and complain about Java’s shaky theoretical foundations does not prevent it from working mostly as expected. Still, it would be better to rely on a solid background theory than this.

A second interpretation

When I described self as an implicit argument I deliberately left out another possible explanation for object recursion, one that does not meet my personal favor.

Including self in a method’s arguments is the solution that makes the fewest assumptions: methods are just functions called on an object, the dot notation is just a notation.

Another approach is to consider it a special “global” variable that is instantiated with the object itself by the constructor. Upon calling Number() or ColorNumber() every occurrence of this is bound to the respective type and reference, so that this is a sort of special field of the object and need not be contravariant, but covariant, to make a valid subtype.

Is there any difference in practice between the two approaches? In general, global variables and shared references are bad form, and they lack flexibility. When self is passed as an argument, adding new methods at runtime is as simple as plugging new fields into a record: just create a function that takes the self reference as its first parameter. If instead self is implemented as some magic variable, you will not be allowed to create self-referencing functions to be injected into objects.
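To make the first approach concrete, here is a rough sketch of an object as a plain record of fields and functions, where every function receives self explicitly. Rec, RecordDemo, number and addEq are all hypothetical names, and the untyped Object values are a shortcut to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// A bare-bones "object": a record of fields and of functions
// that take the record itself as their first argument
class Rec {
    final Map<String, Object> fields = new HashMap<>();
    final Map<String, BiFunction<Rec, Object, Object>> methods = new HashMap<>();

    // Method call: look the function up and pass self explicitly
    Object call(String name, Object arg) {
        return methods.get(name).apply(this, arg);
    }
}

class RecordDemo {
    // Builds a record with a single field n
    static Rec number(int n) {
        Rec r = new Rec();
        r.fields.put("n", n);
        return r;
    }

    // Plugs an eq method into an existing record at runtime
    static void addEq(Rec r) {
        r.methods.put("eq", (self, other) ->
                self.fields.get("n").equals(((Rec) other).fields.get("n")));
    }

    public static void main(String[] args) {
        Rec a = number(1);
        Rec b = number(1);
        addEq(a); // the method is added after construction
        System.out.println(a.call("eq", b)); // prints "true"
    }
}
```

Nothing here is special to eq: any function with self as its first parameter can be injected the same way, which is exactly what a magic this variable would rule out.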

One exemplary feature that is lost with this approach is the decorator pattern. This blog post explains how having self as an argument in Python enables its exquisite use of decorators. Python is not statically typed, so it cares not about contravariant arguments and can live happily with the explicit self argument.

Conclusion

This debacle was sparked by an example seen in a college class about programming languages. I then proceeded to chase what I thought was a fatal flaw in the OOP paradigm; perhaps unsurprisingly, this turned out to be an imprecision mixed with a matter of opinion and perspective: for reference, “Inheritance is not subtyping” and “Inheritance is subtyping” are technical papers discussing the two opposite sides. Still, I count those listed here as some of the many idiosyncrasies of the OOP family and yet another reason to steer clear of the approach.

Mattia Maldini
Written by Mattia Maldini

Computer Science Master from Alma Mater Studiorum, Bologna; interested in a wide range of topics, from functional programming to embedded systems.