Swift, Algebraically: Sum Types, Product Types, Exponential Types

Part 1 of 3

Feb 24, 2024

This essay will look at how we can understand programming from a mathematical point of view. I'll be using the Swift language as an example, but the general concepts should apply to most programming languages. I will be covering some material which you may already be familiar with, such as sum types and product types. I will also be discussing other topics such as the peculiar properties of types with one or zero values. I'll wrap it up by showing how an interesting algebraic structure called a semiring fits into the picture.

Swift is composed of data types such as String, Int, Bool, Array, etc. We can think of a type as a set of values. An instance of a type represents one of these values. Bool has two values, true and false. Int on a 32-bit platform has over 4 billion values. String and Array theoretically have an infinite amount of values, but in reality they are constrained by memory and implementation details.

The number of values in a set is called the set's cardinality. When I refer to the size of a type, I will be referring to the cardinality of its set of values.

Swift supports different kinds of types. Some basic kinds are structures, classes, enumerations, tuples and functions. A few types such as Int and Character are simple – they can’t be broken down into component types. Structures and classes are composed of properties which are themselves separate types. A tuple is an ordered set of two or more types. A tuple may also be empty, but it cannot contain a single type. A tuple with a single type would be redundant since it would be equivalent to the type it contains.

Structures, classes and tuples serve different purposes within a programming environment, but they are equivalent from a mathematical point of view. Each instance of these types contains an instance of each of its properties. I'll be discussing structures and tuples for the most part, but keep in mind that the mathematical concepts are the same for all three types.

Collections, such as Array and String, can hold a variable number of instances of a certain type. This is the reason why their size is theoretically infinite. Theoretically, we can always add more Characters to a String.

Now let's look more carefully at the idea that a type is a set of values. Simple types such as Int and Character have a predefined number of values. For example, Int32 is the set of integers from -2,147,483,648 to 2,147,483,647.

\(\{ -2147483648, ... -4, -3, -2, -1, 0, 1, 2, 3, 4 ..., 2147483647 \}\)

It is a little harder to determine what the set of values for Character is, but we can at least show some of them,

\(\{ a, b, c, ... A, B, C, ... 1, 2, 3, ... !, @, \#, ... 😀 🙁 😉, ... \}\)

Bool is an odd type. It is a structure but it has no properties and it only has two values.

\(\{ true, false \}\)

In some ways it is more like an enumeration, but we can’t switch over it.

Sum and Product Types

An enumeration contains a number of different cases and each case represents a value. The number of values for an enumeration corresponds to the number of its cases. This isn't exactly true if we consider associated types, but we'll get to that later.

Enumerations are called sum types because we can calculate their number of values by summing the number of their cases.

When we have a composite type we have to consider the number of values in each property. But to determine the size of a composite type, we can't just add the number of values in each property. For example, let's say we have a Color enumeration with three cases and a structure which includes this enumeration and a Bool.

enum Color {
  case blue
  case red
  case yellow
}

struct Style {
  let color: Color
  let available: Bool
}

If we add the values in Color and Bool we get 5. But Style actually has six possible values.

\((.blue, \hspace{4pt} true)\)

\((.blue, \hspace{4pt} false)\)

\((.red, \hspace{4pt} true)\)

\((.red, \hspace{4pt} false)\)

\((.yellow, \hspace{4pt} true)\)

\((.yellow, \hspace{4pt} false)\)

A value of a composite type is a combination of values, one for each of its properties. So the size of a composite type is the number of possible combinations of its property values.

Composite types are called product types because we calculate their number of values by multiplying the number of values in their component types.

Let's come back to associated types in enumerations. Without associated types, each case in an enumeration is a single value. If we add an associated type we have to take the size of this type into account. Suppose we have the following enumeration,

enum PartyDecoration {
  case balloon(Color)
  case confetti
  case streamer
}

confetti and streamer are ordinary cases. Each is a single value. But the balloon case has an associated Color type. Color is also an enumeration. It has three simple cases, so it has three values. To get the total number of values for PartyDecoration we need to include the number of values in the enumerated cases including the sizes of associated types. So PartyDecoration has five values, 1 + 1 + 3 = 5. Its set of possible values looks like this.

\(.balloon(.blue)\)

\(.balloon(.red)\)

\(.balloon(.yellow)\)

\(.confetti\)

\(.streamer\)

Exponential Types

Functions seem out of place in this discussion, but they are also types. A function is a type with two parts, a set of parameters and a return type. To simplify things, let's pretend the set of parameters is a single type. If there is more than one parameter, we can think of them as a single tuple parameter.

Since we're talking about types as sets of values, it would be helpful to go over the mathematical definition of a function. A function in mathematics is a mapping between one set and another. The first set is called the domain and the second set is the codomain of the function. A function maps every element in the domain to some element in the codomain. Every element in the domain must get mapped, but a function may not necessarily cover every element in the codomain.

For example, we could have a function which takes a Color value and returns a Bool.

func isBlue(_ color: Color) -> Bool {
  switch color {
  case .blue: return true
  case .red: return false
  case .yellow: return false
  }
}

The domain is Color and the codomain is Bool. This function maps each element of Color to an element in Bool. For blue it returns true and it returns false for all other colors. This isn't the only function that goes from Color to Bool. We could also have the following.

func isNotRed(_ color: Color) -> Bool {
  switch color {
  case .blue: return true
  case .red: return false
  case .yellow: return true
  }
}

We can now ask, how many possible functions are there that go from Color to Bool? How many mappings are there from the set of values in Color to the set of values in Bool? We can show each mapping as a set of tuples where the first element of the tuple is a Color value and the second element is a Bool value.

\(\{ (.blue, \hspace{4pt} true), \hspace{4pt} (.red, \hspace{4pt} \hspace{4pt} true), \hspace{4pt} (.yellow, \hspace{4pt} true) \}\)

\(\{ (.blue, \hspace{4pt} true), \hspace{4pt} (.red, \hspace{4pt} true), \hspace{4pt} (.yellow, \hspace{4pt} false) \}\)

\(\{ (.blue, \hspace{4pt} true), \hspace{4pt} (.red, \hspace{4pt} false), \hspace{4pt} (.yellow, \hspace{4pt} true) \}\)

\(\{ (.blue, \hspace{4pt} true), \hspace{4pt} (.red, \hspace{4pt} false), \hspace{4pt} (.yellow, \hspace{4pt} false) \}\)

\(\{ (.blue, \hspace{4pt} false), \hspace{4pt} (.red, \hspace{4pt} true), \hspace{4pt} (.yellow, \hspace{4pt} true) \}\)

\(\{ (.blue, \hspace{4pt} false), \hspace{4pt} (.red, \hspace{4pt} true), \hspace{4pt} (.yellow, \hspace{4pt} false) \}\)

\(\{ (.blue, \hspace{4pt} false), \hspace{4pt} (.red, \hspace{4pt} false), \hspace{4pt} (.yellow, \hspace{4pt} true) \}\)

\(\{ (.blue, \hspace{4pt} false), \hspace{4pt} (.red, \hspace{4pt} false), \hspace{4pt} (.yellow, \hspace{4pt} false) \}\)

All the functions which go from Color to Bool

We have eight possible mappings. There are eight functions which go from Color to Bool. We showed each function as a set of tuples. The first element of the tuple is the function argument and the second element is the result. The first elements are always going to be the same for each function since we have to cover all the values in the domain. We can remove the domain elements to make it easier to read.

\(true \hspace{12pt} true \hspace{12pt}true\)

\(true \hspace{12pt} true \hspace{12pt} false\)

\(true \hspace{12pt} false \hspace{12pt} true\)

\(true \hspace{12pt} false \hspace{12pt} false \)

\(false \hspace{12pt} true \hspace{12pt} true\)

\(false \hspace{12pt} true \hspace{12pt} false\)

\(false \hspace{12pt} false \hspace{12pt} true\)

\(false \hspace{12pt} false \hspace{12pt} false\)

We can now see that what we were looking for were all the possible permutations of three Bool values. Since Bool has two possible values, the answer is 2 x 2 x 2 = 8, or 2^3. The "three" comes from the size of the parameter type. So more generally, if a parameter type A has m values and a return type B has n values, there are n^m functions that go from A to B.

Function types are called exponential types because we can calculate their number of values by raising the size of the return type to the power of the size of the parameter type. Notice that the parameter size is the exponent. This can be confusing since parameters come first in the definition of a function.

Note the difference between functions and function types. A function is a single mapping from a domain to a codomain. A function type, considered as a set of values, consists of all the possible functions that map from the domain to the codomain. Also note that function names are irrelevant. There may be many functions which do the same mapping. For the purposes of this discussion we can consider them all as the same function.

Next: Zero and One

Jose Villegas

Discussion about this post