Basic Category Theory for Programmers, Part 1: Categories

Jan 03, 2024

Category theory studies the relationships between structures. For mathematicians, this generally means algebraic structures, but it can be applied to other kinds of structures. In this series of short posts we will be looking at how category theory can be applied to a programming language.

A category is composed of objects and arrows between these objects. The arrows are also called morphisms. At a basic level, a category is a graph where the objects are the nodes of the graph and the arrows are the edges. The direction of the arrows is important. If we have an arrow from one object to another, we can't assume that the arrow also goes in the other direction.

A simple category with three objects and arrows between them.

When we define a category we have to decide what counts as objects and what counts as arrows. Once the objects and the arrows are defined, we stop caring about what they mean outside of the category. Category theory is not concerned with what the objects are composed of or with what the arrows might do. This is important because we want to compare different kinds of categories and when we do this we are only interested in whether their objects and arrows have similar relationships.

A common way to form a category from a programming language is to take the types in the language as objects and the functions between them as arrows. Note that each type is a single object. A type can have many values, but there is only one object per type in the category. For the most part we won't be concerned about the type's values.

Let's suppose we have a simple language composed of three types, String, Number and Boolean. The category for this language would have three objects. The arrows would be all the functions between the types, including all the functions that go from one type back to itself. Let's call this category PL (short for "programming language").

a) A simple programming language. b) This programming language considered as a category.

Categories have an operation called composition. Let's say we have three objects, A, B and C. If we have an arrow from A to B and another arrow from B to C, we can compose these two arrows into a single arrow that goes from A to C. Given A→B and B→C, we can form A→C. We can think of composition as the "addition" of arrows.

The arrows `f` and `g` compose into `h`.

Composition is commonly represented with the symbol ◦. If we have two functions, toString and isEmpty, their composition would be represented as isEmpty◦toString.

\(isEmpty ◦ toString\)

The result is a new function which takes a Number and returns a Boolean. This isn't a very useful function since it will always return false, but let's ignore that for now.

The order of composition is typically shown from right to left. In the example above, toString would be applied first, followed by the application of isEmpty. This can be confusing but it makes sense if you think about how this would be implemented.

Boolean isStringValueEmpty(Number x) {
  return isEmpty(toString(x));
}

Categories must satisfy a set of rules. The first rule says that composition must be associative. Given three arrows, f, g and h, the following must be true.

\((f ◦ g) ◦ h = f ◦ (g ◦ h)\)

We mentioned that compositions is sort of like addition. This rule is similar to the associative property for addition.

\((x + y) + z = x + (y + z)\)

Given three arrows, it doesn't matter which pair we compose first. The result should always be the same. Let's see how this applies to programming. Given the following three functions,

\(add3: Number → Number\)

\(isEven: Number → Boolean\)

\(toString: Boolean → String\)

we could compose them like this.

\(toString ◦ isEven ◦ add3\)

We start with a Number. add3 adds 3 to this Number. isEven returns true if the Number is even, and false otherwise. toString then converts the Boolean to a String. If we start with 2 we end up with "False". If we start with 3, we end up with "True".

Note that associativity applies to the composition of functions, not their application. The functions will always be applied from right to left. The order of application can't change because the return type for a function has to match the argument type of the next function.

Let's start composing the functions from the right. We can compose isEven and add3 like this.

Boolean isEvenAfterAddition(Number x) {
  return isEven(add3(x));
}

We then compose the new function with toString.

String stringAfterComposition(Number x) {
  return toString(isEvenAfterAddition(x));
}

Now let's compose from left to right. We start by composing toString and isEven.

String stringAfterIsEven(Number x) {
  return toString(isEven(x));
}

Then we compose the new function with add3.

String stringAfterComposition(Number x) {
  return stringAfterIsEven(add3(x));
}

We won't go over the details, but you can confirm that both functions will always return the same value given the same input even though the functions were composed in a different order.

It seems like a programming language satisfies the rule of associativity by default! But this isn't always the case. For this to work, the functions have to be pure. This means that the functions must always return the same result for the same input. A simple example of a function that doesn't do this would be one that returns a random Number. Another problem would be if the result of a function depends on global state. Since the global state may change, we may get different results for the same input. Given this requirement we can see that PL may not be able to contain all the functions in a real programming language.

We can now turn to the second rule for categories. This rule says that every object must have an identity arrow. Let's call this arrow id. An identity arrow goes from an object back to itself. But there may be many arrows that do this. We can identify the identity arrow because it has the following property.

\(f ◦ id = f \hspace{12pt} and \hspace{12pt} id ◦ f = f\)

The identity arrow works like the identity element in addition and multiplication.

\(x + 0 = x \hspace{12pt} and \hspace{12pt} 0 + x = x\)

\(x * 1 = x \hspace{12pt} and \hspace{12pt} 1 * x = x\)

What does this mean in terms of programming? The identity arrow corresponds to the function which simply returns the value it is given. For Number, the identity arrow goes from Number to Number and it returns 0 when given 0, it returns 1 when given 1, etc. Similarly for every other type in PL. In fact, if the programming language supports generic types, we could create a function that works as the identity arrow for any type.

T identity(T x) {
  return x;
}

For any type T, this function goes from T to T and it simply returns the input value.

If we compose the identity function with any other function, the new function will behave exactly like the original function. It will be as if we hadn't done any composition at all. This is similar to adding 0 or multiplying by 1. But if it doesn't do anything, why would we need an identity function? 0 doesn't do anything in addition, but it is still an important mathematical tool. Likewise, we'll see that the identity arrow is an important tool in category theory.

As mentioned above, category theory is meant to study the relationships between the structures of different categories. But we only have one category, PL, the category of the types and functions in our programming language. However, we can create new categories by considering subsets of PL.

Let's add the Array type to our language. An Array is an ordered set of objects. Some languages call this type List. Array is just another object in our category, but we can also consider it as its own category. Now we can look for interesting relationships between the Array category and our main programming language category, PL.

Suppose an Array only holds String objects. We can annotate this using angle brackets, Array<String>. Let's look at functions that go from an Array of one type to an Array of another type, such as

\(Array \langle String \rangle → Array \langle Number \rangle\)

Such a function is usually called map. Notice that there is a parallel to other functions in PL.

\(String → Number\)

In a way, the map function allows the Array category to mirror the structure of PL. We are starting to see structural relationships between these two categories. We'll explore this further in a future post. But first we'll take a closer look at what it means for a programming language to be a category.

Next: Intuitions

Jose Villegas

Discussion about this post