Basic Category Theory for Programmers, Part 4: Basic Abstract Algebra
In our common sense idea of numbers we imagine that they're all the same. We don't think there is anything essentially different between 3
, -3
and 0.3
. We also don't imagine there is anything odd about the number zero
. But these ordinary ideas are already informed by mathematical theory. It wasn't always obvious that zero
is a number, that numbers could be negative, or that numbers don't have to be whole.
The operations we perform on numbers also seem natural. We can add numbers and subtract them. We can multiply them and divide them. But even in our common sense idea there is already a problem. We can't apply division to all numbers since we can't divide by zero
. Other problems come up if we restrict what counts as a number. Let's say negative numbers are not permitted. We can still add numbers without restriction, but there are restrictions on subtraction. Once we get to zero
we can't subtract anymore. Now let's say that only whole numbers are allowed. This places restrictions on division since the result of this operation is now required to also be a whole number. We can still divide 4
by 2
, but we can no longer divide 3
by 2
.
The different mathematical operations only work with certain sets of numbers. In mathematics we aren't required to assume that all numbers are the same. Instead we can put them in different sets and then we can see which operations are valid for each set.
In abstract algebra, an algebraic structure is defined as a set of elements and an operation that can be performed on all the elements of this set. This can be represented with brackets around the set of elements and the operation. If the set of elements is all numbers and the operation is addition, we can represent it like this, {Numbers,+}
.
We have been discussing numbers, but algebraic structures can also be about other things. This may seem odd at first since all the operations we have discussed are performed on numbers. Arithmetic studies numbers, but mathematics is much broader. It includes shapes, logic, sets, and many other things. In abstract algebra we can forget that algebraic structures may be related to numbers. By doing this we can see that structures in arithmetic can have properties which are similar to structures which are not arithmetical. This essay is about programming languages. Eventually we will see how numbers and arithmetic operations are related to programming types and functions.
Let's start by looking at the different kinds of numbers. Mathematics has a long tradition of names and symbols for the different sets of numbers. Natural numbers (N) include all the positive whole numbers (which may or may not include zero
), integers (Z) include all the natural numbers plus negative whole numbers, rational numbers (Q) include all numbers which can be expressed by a fraction with integers as the numerator and denominator, such as 2/3
, and real numbers (R) include all rational numbers plus all fractions, not just those expressible with integers. There are other sets, but we don't have to discuss them for now.
We can now be more specific about what kind of number set we are referring to for a pariticular algebraic structure. The important point is that the operation in the algebraic structure needs to be valid for all the elements in that particular set of numbers. We can create an algebraic structure made up of natural numbers and addition {N,+}
, but the natural numbers and subtraction is not a valid structure. We cannot subtract 3
from 1
and get another natural number. Negative numbers are not part of the set of natural numbers. Likewise we can have the integers and multiplication {Z,*}
, but not division since we cannot divide by zero
and division will not always result in an integer.
This isn't very interesting yet. All we have done is divide numbers into different categories and group them with an operator. But we can now start adding conditions to the structures.
A common rule in mathematics is associativity. An operation is associative if the order in which it is performed is not important. Given the operation x+y+z
, it doesn't matter which pair of values we add first, x
and y
or y
and z
, since addition is associative. In the end the result will be the same. This is expressed in the following equation.
Multiplication is also associative. At this level, the two structures are the same. We have a set of elements and an operation that is associative. It doesn't matter that the operation is addition or multiplication. An algebraic structure that follows the rule of associativity is called a semigroup.
Let's consider one more condition. Let's say that the set of elements must have one element that doesn't change the other element when an operation is applied to both. We'll use the symbol i
to represent it. We can express this condition with the following equations.
This element is called the identity element ("i" is short for "identity") and when we add this condition on top of associativity we get an algebraic structure called a monoid. Both addition and multiplication have identity elements. For addition, the identity element is 0
and for multiplication it is 1
.
This means that our two algebraic structures {N,+}
and {N,*}
are also monoids. To express that the identity element is an element of the algebraic structure we can include it inside the bracket notation, {N,+,0}
and {N,*,1}
.
We can extend this idea to programming types. Programming languages have a Number
type in one form or another and they have addition and multiplication functions. So programming languages come automatically equipped with monoids. But they're not the only monoids in a programming language.
We can typically append one String
to another. Some programming languages even use the same symbol for this operation as the one for addition.
The empty string, ""
, is the identity element for this operation since the following is true for every String
x
.
Notice that not every append function will form a monoid. The function needs to be associative. Let's suppose that the operation trims trailing spaces after concatenating the Strings
. The order in which the operations are executed would then have different results. The following equations show boxes instead of spaces for clarity.
This is an example where using mathematics as a guide can be useful when we design a programming language. If we use +
to represent the "addition" of Strings
, it would be a good idea if the operation behaves like addition in arithmetic. We want the operation to be associative and it should have an identity element.
Arrays
are another example of a type with an "addition" operation.
The identity element for Array
is an empty Array
.
We can specify the String
and Array
monoids using the same bracket notation as we did for addition and multiplication, {String,+,””}
and {Array,+,[]}
.
Turning now to category theory, we can ask, is a category an algebraic structure? We could use the objects in a category as the elements of an algebraic structure, but what could we use as the algebraic operation? There is no meaningful sense in which we can "add" the objects in a category. But remember that categories do come equipped with an operation, the composition of arrows. In fact, we mentioned that composition is like the "addition" of arrows.
We may be able to create an algebraic structure using arrows as the elements and arrow composition as the operation. In the first post of this series, we mentioned that composition is associative. It is one of the requirements for a category. This means we can at least form a semigroup.
A monoid requires an identity element, but a category has too many identity elements. We are using arrows as the elements and every object in a category must have an identity arrow. We can solve the problem by reducing the number of objects in a category to one. A category with a single object will form a monoid! All the arrows go from the single object back to itself, their composition is associative, and we have an identity element which is the object's identity arrow. Using bracket notation, we might roughly describe this monoid like this, {arrows,composition,id}
.
Any object on its own forms a monoid in a category. However, this categorical definition loses the monoidal properties that were specific to Number
, Array
and String
. We are no longer dealing with the addition of Numbers
or appending Strings
or Arrays
. More generally, we are no longer using objects as the elements of the monoid. Instead, a categorical monoid "adds" arrows through composition. It is a very general monoid that has little to do with our monoid types, Number
, Array
and String
. Ideally we would like to express the properties of these monoids through category theory as well. We'll look at this problem in the next post.