Graphs and C-sets I: What is a graph?
$$
$$
You probably know what a graph is. A graph looks this:
Formally, a simple graph G = (V,E) is a set of vertices V equipped with an edge relation E: a binary relation E on V that is symmetric (E(v,u) whenever E(u,v), for all vertices u,v \in V) and irreflexive (E(v,v) for no vertex v \in V). In a simple digraph the symmetry axiom is dropped, so that the edges are directed. Graphs and digraphs are basic objects in discrete mathematics, are the source of fundamental data structures in computer science, and are ubiquitous throughout the scientific and engineering fields.
Despite this, the concept of a simple graph is often unsatisfactory for both theoretical and practical reasons. From an algebraic viewpoint, the category of simple graphs is surprisingly badly behaved.1 Moreover, experience shows that there is not a single concept of graph suitable for all applications, but rather that graphs belong to a rich family of related concepts. Graphs in scientific and engineering applications tend to have additional structure that is either ignored or shoehorned into the graph model through ad hoc encodings. The AlgebraicJulia ecosystem, and the field of applied category theory generally, aim to define a flexible, unified framework for scientific modeling using algebraic structures, subsuming and vastly extending the graph data model.
Through the course of this blog we will encounter a diverse cast of algebraic structures generalizing the concept of a graph. In this post we start at the very beginning, with how category theorists understand graphs, how graphs are an example of a general class of structures called “\mathsf{C}-sets,” and how \mathsf{C}-sets can be used as data structures in Julia through Catlab.jl.
Graphs
For a category theorist, a graph G consists of two sets, a vertex set G(V) and an edge set G(E), together with two functions G(\operatorname{src}), G(\operatorname{tgt}): G(E) \to G(V). Every edge e \in G(E) in the graph has a source vertex e \cdot \operatorname{src}:= G(\operatorname{src})(e) and a target vertex e \cdot \operatorname{tgt}:= G(\operatorname{tgt})(e). Thus, a graph is directed, may have multiple edges between two vertices, and may have loops, making it what a graph theorist would call a “multidigraph.” Moreover, the edges are explicitly identified as the elements of an edge set, rather than being pairs of vertices. While this may seem strange, we will gradually see that it has many advantages.
In Catlab, the definition of a graph is captured by the schema:
using Catlab.CategoricalAlgebra
@present SchGraph(FreeSchema) begin
V::Ob
E::Ob
src::Hom(E,V)
tgt::Hom(E,V)
end
The names Ob
and Hom
are short for “object” and “morphism,” interpreted as sets and functions in this context. Having defined the schema for graphs, the module Graphs
in Catlab creates a Julia data type for graphs like this:2
Let’s create a graph using the generated interface.
Code
E | src | tgt |
---|---|---|
1 | 1 | 2 |
2 | 2 | 3 |
3 | 2 | 3 |
4 | 3 | 3 |
As the visualization shows, edges 2, 3, and 4 are all incident to vertex 3 via the target map:
As a convenience, the Graphs
module wraps the generated methods with an interface inspired by Graphs.jl, defining methods such as nv
, ne
, add_vertex!
, add_edge!
, inneighbors
, and outneighbors
. For example, the following code reduces to the code constructing the graph g
above:
Graphs as C-sets
Graphs are an example of a class of structures called \mathsf{C}-sets. This was just illustrated by code and will now be explained mathematically.
\mathsf{C}-sets are defined relative to a small category \mathsf{C}. In this context, the category \mathsf{C} is interpreted as a schema, describing the kinds of parts in the structure and the subpart relations between them. For example, the schema for graphs is the category \mathsf{Sch}(\mathsf{Graph}) presented by:
You may be used to thinking about large categories, such as the category \mathsf{Set} of all sets and functions, but the category \mathsf{Sch}(\mathsf{Graph}) is very small! It has exactly two objects, E and V, and two non-identity morphisms, \operatorname{src}: E \to V and \operatorname{tgt}: E \to V. The interpretation is that there are two kinds of parts, of type V and E. Parts of type V have no subparts, while parts of type E have two subparts of type V, called \operatorname{src} and \operatorname{tgt}.
Given a small category \mathsf{C}, a \mathsf{C}-set is a functor X: \mathsf{C} \to \mathsf{Set}.3 A \mathsf{C}-set X therefore consists of, for every object c \in \mathsf{C}, a set X(c), and for every morphism f: c \to d in \mathsf{C}, a function X(f): X(c) \to X(d), subject to the functoriality axioms. In particular, when \mathsf{C} = \mathsf{Sch}(\mathsf{Graph}), a \mathsf{C}-set is exactly a graph as previously defined.
Symmetric graphs
The formalism of \mathsf{C}-sets would be rather heavy if all we wanted to do was define a graph, but many other structures, graph-like and otherwise, can be defined as \mathsf{C}-sets (Spivak 2009). As a second example, we consider symmetric graphs, closely related to what graph theorists call “(undirected) multigraphs.”
The schema for symmetric graphs is the category \mathsf{Sch}(\mathsf{SGraph}) generated by the objects and morphisms
subject to the equations expressed by the commutative diagram
or equivalently the equations
\operatorname{inv}\operatorname{⨟}\operatorname{src}= \operatorname{tgt}, \qquad \operatorname{inv}\operatorname{⨟}\operatorname{tgt}= \operatorname{src}, \qquad \operatorname{inv}^2 = 1_E.
A symmetric graph is a \mathsf{Sch}(\mathsf{SGraph})-set. Unpacking this, a symmetric graph is a graph G together with an orientation-reversing involution on edges, G(\operatorname{inv}): G(E) \to G(E), which in turn means that
e \cdot \operatorname{inv}\cdot \operatorname{src}= e \cdot \operatorname{tgt}, \qquad e \cdot \operatorname{inv}\cdot \operatorname{tgt}= e \cdot \operatorname{src}, \qquad e \cdot \operatorname{inv}\cdot \operatorname{inv}= e, \qquad \forall e \in G(E).
Intuitively, a symmetric graph assigns every edge to another edge in the opposite direction, effectively making the edges undirected.
In Catlab, the schema for symmetric graphs is defined by extending the schema for graphs, and then the new schema is used to create a Julia data type for symmetric graphs:4
@present SchSymmetricGraph <: SchGraph begin
inv::Hom(E,E)
compose(inv,src) == tgt
compose(inv,tgt) == src
compose(inv,inv) == id(E)
end
@acset_type SymmetricGraph(SchSymmetricGraph, index=[:src])
Similarly to graphs, the Catlab Graphs
module provides a convenient Graphs.jl-style interface for symmetric graphs. The interface takes care of adding edges in pairs related by the involution, allowing the symmetric graphs to be treated as undirected graphs. For example, the symmetric 4-cycle is constructed like this:
E | src | tgt | inv |
---|---|---|---|
1 | 1 | 2 | 5 |
2 | 2 | 3 | 6 |
3 | 3 | 4 | 7 |
4 | 4 | 1 | 8 |
5 | 2 | 1 | 1 |
6 | 3 | 2 | 2 |
7 | 4 | 3 | 3 |
8 | 1 | 4 | 4 |
Symmetric graphs are visualized as undirected graphs, where the drawn edges are labeled by directed edge pairs.
In the next installment we’ll see how more exotic notions of graph are realized as \mathsf{C}-sets and data structures in Catlab.
References
Footnotes
Under the standard definition of a simple graph, the category of simple graphs does not even have a terminal object! If the definition is relaxed to allow or even require loops, the new categories of simple graphs are better behaved—yet still not as well behaved as the category theorist’s notion of graph.↩︎
The keyword argument
index = [:src,:tgt]
requests that the source and target functions of the graph be indexed, allowing efficient lookup of the edges incident to a vertex as source or target. The implementation of \mathsf{C}-sets in Catlab is the topic of another post.↩︎A category theorist is more likely to call a \mathsf{C}-set a copresheaf on \mathsf{C}. The dual concept is a presheaf on \mathsf{C}, which is a functor \mathsf{C}^{\mathrm{op}} \to \mathsf{Set}. The name “\mathsf{C}-set” is short for a category action by \mathsf{C} (Reyes, Reyes, and Zolfaghari 2004), which generalizes a group action or G-set.↩︎
For efficiency, only the source map is indexed via the keyword argument
index = [:src]
. The target map can be indexed on the fly by composing the source index with the edge involution.↩︎