Algebraic Data Types

Published on

The Wikipedia article defines algebraic data type (ADT) as a possibly recursive sum type of product types.

Before we go on, note firstly that ADTs combine other types, but if all we have is ADTs then how can we define what we are supposed to be combining? For the rest of this post, just assume that we have the following types available (Haskell syntax):

data Void

data Unit = Unit

Product types

Our definition of ADTs mentions product types. The word product has a direct analogy with the mathematical concept of product, or multiplication, and when talking about ADTs we even use similar notation. If we have any two types A and B, we can combine them into a product type called A × B using the following algorithm:

Once we have some value product type, we can also retrieve the original values used to build it using projection operators prjL and prjR. Given some value (a, b) ∈ A × B:

Correspondence with mathematics

What does all this have to do with multiplication? Notice that the number of values in A × B is exactly the number of values in A times the number of values in B. Is this a coincidence? Let’s look at some other similarities.

Product with 0

In mathematics, if you multiply by 0, you get 0. Is there some corresponding type Z such that, if you take the product of A and Z you get Z? Yes! It’s the type Void we had before. Let’s run it through our algorithm.

To calculate Void × A:

To calculate A × Void:

In both of these cases, we ended up defining a type without any values, which is equivalent to Void.

Product with 1

In mathematics, if you multiply by 1, you get the same thing back. Is there some corresponding type I such that, if you take the product of A and I you get A? You probably guessed this by now, but it’s our type Unit from before. Let’s run it through our algorithm again, to check.

To calculate Unit × A:

But because the type Unit only has one value, the outer “for each” will only happen once, so the number of values in Unit × A is the same as the number of values in A: there is a one-to-one correspondence, which makes them equivalent.

To calculate A × Unit:

Again, because Unit only has one value, for every iteration of the outer “for each”, the inner “for each” will only happen once, so there are the same number of values in A × Unit as in A, so they are equivalent.

What’s the point of all that?

Sum types