Basic Category Theory for Programmers, Part 1: Categories
Category theory studies the relationships between structures. For mathematicians, this generally means algebraic structures, but it can be applied to other kinds of structures. In this series of short posts we will be looking at how category theory can be applied to a programming language.
A category is composed of objects and arrows between these objects. The arrows are also called morphisms. At a basic level, a category is a graph where the objects are the nodes of the graph and the arrows are the edges. The direction of the arrows is important. If we have an arrow from one object to another, we can't assume that the arrow also goes in the other direction.
When we define a category we have to decide what counts as objects and what counts as arrows. Once the objects and the arrows are defined, we stop caring about what they mean outside of the category. Category theory is not concerned with what the objects are composed of or with what the arrows might do. This is important because we want to compare different kinds of categories and when we do this we are only interested in whether their objects and arrows have similar relationships.
A common way to form a category from a programming language is to take the types in the language as objects and the functions between them as arrows. Note that each type is a single object. A type can have many values, but there is only one object per type in the category. For the most part we won't be concerned about the type's values.
Let's suppose we have a simple language composed of three types, String
, Number
and Boolean
. The category for this language would have three objects. The arrows would be all the functions between the types, including all the functions that go from one type back to itself. Let's call this category PL
(short for "programming language").
Categories have an operation called composition. Let's say we have three objects, A
, B
and C
. If we have an arrow from A
to B
and another arrow from B
to C
, we can compose these two arrows into a single arrow that goes from A
to C
. Given A→B
and B→C
, we can form A→C
. We can think of composition as the "addition" of arrows.
Composition is commonly represented with the symbol ◦
. If we have two functions, toString
and isEmpty
, their composition would be represented as isEmpty◦toString
.
The result is a new function which takes a Number
and returns a Boolean
. This isn't a very useful function since it will always return false
, but let's ignore that for now.
The order of composition is typically shown from right to left. In the example above, toString
would be applied first, followed by the application of isEmpty
. This can be confusing but it makes sense if you think about how this would be implemented.
Boolean isStringValueEmpty(Number x) {
return isEmpty(toString(x));
}
Categories must satisfy a set of rules. The first rule says that composition must be associative. Given three arrows, f
, g
and h
, the following must be true.
We mentioned that compositions is sort of like addition. This rule is similar to the associative property for addition.
Given three arrows, it doesn't matter which pair we compose first. The result should always be the same. Let's see how this applies to programming. Given the following three functions,
we could compose them like this.
We start with a Number
. add3
adds 3
to this Number. isEven
returns true
if the Number
is even
, and false
otherwise. toString
then converts the Boolean
to a String
. If we start with 2
we end up with "False"
. If we start with 3
, we end up with "True"
.
Note that associativity applies to the composition of functions, not their application. The functions will always be applied from right to left. The order of application can't change because the return type for a function has to match the argument type of the next function.
Let's start composing the functions from the right. We can compose isEven
and add3
like this.
Boolean isEvenAfterAddition(Number x) {
return isEven(add3(x));
}
We then compose the new function with toString
.
String stringAfterComposition(Number x) {
return toString(isEvenAfterAddition(x));
}
Now let's compose from left to right. We start by composing toString
and isEven
.
String stringAfterIsEven(Number x) {
return toString(isEven(x));
}
Then we compose the new function with add3
.
String stringAfterComposition(Number x) {
return stringAfterIsEven(add3(x));
}
We won't go over the details, but you can confirm that both functions will always return the same value given the same input even though the functions were composed in a different order.
It seems like a programming language satisfies the rule of associativity by default! But this isn't always the case. For this to work, the functions have to be pure. This means that the functions must always return the same result for the same input. A simple example of a function that doesn't do this would be one that returns a random Number
. Another problem would be if the result of a function depends on global state. Since the global state may change, we may get different results for the same input. Given this requirement we can see that PL
may not be able to contain all the functions in a real programming language.
We can now turn to the second rule for categories. This rule says that every object must have an identity arrow. Let's call this arrow id
. An identity arrow goes from an object back to itself. But there may be many arrows that do this. We can identify the identity arrow because it has the following property.
The identity arrow works like the identity element in addition and multiplication.
What does this mean in terms of programming? The identity arrow corresponds to the function which simply returns the value it is given. For Number
, the identity arrow goes from Number
to Number
and it returns 0
when given 0
, it returns 1
when given 1
, etc. Similarly for every other type in PL
. In fact, if the programming language supports generic types, we could create a function that works as the identity arrow for any type.
T identity(T x) {
return x;
}
For any type T
, this function goes from T
to T
and it simply returns the input value.
If we compose the identity function with any other function, the new function will behave exactly like the original function. It will be as if we hadn't done any composition at all. This is similar to adding 0
or multiplying by 1
. But if it doesn't do anything, why would we need an identity function? 0
doesn't do anything in addition, but it is still an important mathematical tool. Likewise, we'll see that the identity arrow is an important tool in category theory.
As mentioned above, category theory is meant to study the relationships between the structures of different categories. But we only have one category, PL
, the category of the types and functions in our programming language. However, we can create new categories by considering subsets of PL
.
Let's add the Array
type to our language. An Array
is an ordered set of objects. Some languages call this type List
. Array
is just another object in our category, but we can also consider it as its own category. Now we can look for interesting relationships between the Array
category and our main programming language category, PL
.
Suppose an Array
only holds String
objects. We can annotate this using angle brackets, Array<String>
. Let's look at functions that go from an Array
of one type to an Array
of another type, such as
Such a function is usually called map
. Notice that there is a parallel to other functions in PL
.
In a way, the map
function allows the Array
category to mirror the structure of PL
. We are starting to see structural relationships between these two categories. We'll explore this further in a future post. But first we'll take a closer look at what it means for a programming language to be a category.