
In this blog post, we will look at how AlgebraicJulia can bring new tools to mathematical programming, using JuMP.

Mathematical programming, and more specifically, linear (LP) and mixed-integer (MIP) programming, is a staple of process optimization in dozens of industries, including electricity distribution, public transportation planning, airline scheduling, employee shift scheduling, supply chain planning, and manufacturing. For all but the simplest models, it is a necessity for users to be able to generate the LP/MIP problems via an algebraic modeling language. These languages allow the user to write down a model close to how one might see it presented in an operations research or management science textbook. The language then applies simplifications before passing the model, once it is in a standard form, to a dedicated solver program. There are many modeling languages, some of the best known commercial ones being GAMS and AMPL.

In all algebraic modeling languages, *sets* are a fundamental way to organize variables and constraints. For example, in the famous diet problem, the set of foodstuffs may be used to index the decision variables corresponding to the amount of each food item in the diet. Both the GAMS documentation and AMPL documentation contain significant sections related to the creation, traversal, and manipulation of sets and subsets. While the similarity of many tasks in model building to database operations has been noticed for several decades, most clearly in Fourer (1997), support for more complex operations including n-ary products and relations has remained limited, and modelers often use ad hoc techniques which increase model generation times, sometimes prohibitively. In this post, we show how to use acsets, a categorical data structure described by Patterson, Lynch, and Fairbanks (2021), and categorical operations to formally and efficiently generate mathematical programming models.
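As a concrete (stdlib-only) sketch of set-indexed variables, with made-up food names, the diet problem's decision variables can be represented as a dictionary keyed by the index set:

```
# One decision variable (amount in the diet) per element of the foods set.
# The food names here are invented for illustration.
foods = ["bread", "milk", "cheese"]
amount = Dict(f => 0.0 for f in foods)
amount["bread"] = 2.5            # assign a value to one element
length(amount) == length(foods)  # exactly one variable per set element
```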

In a blog post on the GAMS blog, a comparison was made between several open source modeling languages including JuMP, and GAMS. The initial JuMP implementation saw poor performance due to inefficient Julia code. The JuMP dev team responded with their own blog post on the JuMP website, using a fast solution based on DataFrames.jl. In this post we will see how to use tools from AlgebraicJulia to accomplish the example modeling task. The original data and code is at justine18/performance_experiment.

The model is given as:

$$
\sum_{\substack{(j,k,l,m) \,:\, (i,j,k) \in IJK \\ (j,k,l) \in JKL,\ (k,l,m) \in KLM}} x_{ijklm} \geq 0 \quad \forall i \in I, \qquad x_{ijklm} \geq 0
$$

The GAMS blog post calls subsets of Cartesian products “maps”, but we will use the standard term “relations” for subsets of n-ary products.
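To make the terminology concrete, here is a stdlib-only toy example of a relation as a subset of a binary product:

```
# A relation on I × J is just a set of tuples drawn from the full product.
I = ["i1", "i2"]
J = ["j1", "j2"]
product = Set(Iterators.product(I, J))        # all 4 pairs
relation = Set([("i1", "j1"), ("i2", "j2")])  # a "sparse" subset
issubset(relation, product)
```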

First we load some packages: `DataFrames` for data frames, `Distributions` for sampling binomial random variates, `JuMP` to set up the model, and `HiGHS` for a solver. `Catlab` and `DataMigrations` are the two AlgebraicJulia packages we will use, and `BenchmarkTools` is used for benchmarking.

```
using DataFrames
using Distributions
using JuMP, HiGHS
using Catlab, DataMigrations
using BenchmarkTools
```

The first step is to generate synthetic data according to the method from the original repo. The probability of all zeros with the given model sizes is incomprehensibly small but there is a check for it anyway. Maybe a cosmic ray will pass through your processor at a bad time.
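For a sense of scale (a stdlib computation using the sizes defined below, n = 100 and m = 20): the probability that all 100 × 20 × 20 = 40,000 Bernoulli draws with p = 0.05 come up zero is 0.95^40000, roughly 10^-891, which underflows Float64 entirely:

```
# Probability that every entry of the largest sampled vector is zero.
n, m, p = 100, 20, 0.05
logprob = n * m * m * log(1 - p)  # log of 0.95^40_000 ≈ -2051.7
exp(logprob)                      # → 0.0 (true value ≈ 1e-891, below Float64's minimum)
```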

Like the original code, the sets are vectors of strings. There are 3 relations which are “sparse” subsets of the corresponding products: $IJK \subseteq I \times J \times K$, $JKL \subseteq J \times K \times L$, and $KLM \subseteq K \times L \times M$.

```
SampleBinomialVec = function(A, B, C, p=0.05)
    vec = rand(Binomial(1, p), length(A) * length(B) * length(C))
    while sum(vec) == 0
        vec = rand(Binomial(1, p), length(A) * length(B) * length(C))
    end
    return vec
end

n = 100 # something large
m = 20  # 20

# Sets IJKLM
I = ["i$x" for x in 1:n]
J = ["j$x" for x in 1:m]
K = ["k$x" for x in 1:m]
L = ["l$x" for x in 1:m]
M = ["m$x" for x in 1:m]

# make IJK
IJK = DataFrame(Iterators.product(I,J,K))
rename!(IJK, [:i,:j,:k])
IJK.value = SampleBinomialVec(I,J,K)
filter!(:value => v -> v != 0, IJK)
select!(IJK, Not(:value))

# make JKL
JKL = DataFrame(Iterators.product(J,K,L))
rename!(JKL, [:j,:k,:l])
JKL.value = SampleBinomialVec(J,K,L)
filter!(:value => v -> v != 0, JKL)
select!(JKL, Not(:value))

# make KLM
KLM = DataFrame(Iterators.product(K,L,M))
rename!(KLM, [:k,:l,:m])
KLM.value = SampleBinomialVec(K,L,M)
filter!(:value => v -> v != 0, KLM)
select!(KLM, Not(:value))
```

As given in the original GAMS blog post, this is the naive formulation that relies on nested for loops. As remarked in the JuMP blog, this is equivalent to taking two inner joins. Another way to look at it is as finding “paths” through the relations, such that they match on common elements. Each formulation will be benchmarked.

```
@benchmark let
    x_list = [
        (i, j, k, l, m)
        for (i, j, k) in eachrow(IJK)
        for (jj, kk, l) in eachrow(JKL) if jj == j && kk == k
        for (kkk, ll, m) in eachrow(KLM) if kkk == k && ll == l
    ]
    model = JuMP.Model()
    set_silent(model)
    @variable(model, x[x_list] >= 0)
    @constraint(
        model,
        [i in I],
        sum(x[k] for k in x_list if k[1] == i) >= 0
    )
end
```

```
BenchmarkTools.Trial: 10 samples with 1 evaluation.
Range (min … max): 524.072 ms … 543.738 ms ┊ GC (min … max): 2.33% … 2.33%
Time (median): 533.789 ms ┊ GC (median): 2.31%
Time (mean ± σ): 534.375 ms ± 6.926 ms ┊ GC (mean ± σ): 2.42% ± 0.61%
▁▁ ▁ █ ▁ ▁ █ ▁
██▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁█▁▁▁▁▁▁█▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁█ ▁
524 ms Histogram: frequency by time 544 ms <
Memory estimate: 253.95 MiB, allocs estimate: 6603758.
```

The JuMP blog authors used a version based on two inner joins to vastly improve computation speed.

```
ijklm_df = DataFrames.innerjoin(
    DataFrames.innerjoin(IJK, JKL; on = [:j, :k]),
    KLM;
    on = [:k, :l],
)

@benchmark let
    ijklm = DataFrames.innerjoin(
        DataFrames.innerjoin(IJK, JKL; on = [:j, :k]),
        KLM;
        on = [:k, :l],
    )
    model = JuMP.Model()
    set_silent(model)
    ijklm[!, :x] = @variable(model, x[1:size(ijklm, 1)] >= 0)
    for df in DataFrames.groupby(ijklm, :i)
        @constraint(model, sum(df.x) >= 0)
    end
end
```

```
BenchmarkTools.Trial: 3506 samples with 1 evaluation.
Range (min … max): 1.214 ms … 4.442 ms ┊ GC (min … max): 0.00% … 45.74%
Time (median): 1.331 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.423 ms ± 365.885 μs ┊ GC (mean ± σ): 4.65% ± 10.30%
▆█▆▃▁ ▁
▇▅█████▇▆▆▆▅▅▁▁▁▄▄▁▃▅▃▃▁▁▄▅▅▇▇▆▅▅▁▁▁▁▁▁▁▁▁▁▁▃▃▄▅▅▆▆▇▅▆▆▆█▆▆ █
1.21 ms Histogram: log(frequency) by time 3.17 ms <
Memory estimate: 1.84 MiB, allocs estimate: 17137.
```
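The speedup comes from the join strategy: instead of scanning all pairs of rows, an inner join builds a hash index on the join key and probes it once per row. A minimal stdlib-only sketch of this idea on toy tuples (not the DataFrames.jl implementation):

```
# Join IJK with JKL on the (j, k) columns using a hash index.
IJK = [("i1","j1","k1"), ("i2","j1","k2")]
JKL = [("j1","k1","l1"), ("j1","k2","l2"), ("j2","k1","l3")]

# index: (j, k) => list of l values appearing with that key in JKL
index = Dict{Tuple{String,String},Vector{String}}()
for (j, k, l) in JKL
    push!(get!(index, (j, k), String[]), l)
end

# probe: one hash lookup per IJK row instead of a full scan of JKL
joined = [(i, j, k, l) for (i, j, k) in IJK for l in get(index, (j, k), String[])]
# joined == [("i1","j1","k1","l1"), ("i2","j1","k2","l2")]
```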

Acsets (Attributed C-Sets) are a categorical data structure provided in the ACSets.jl library, and imported and extended with machinery from applied category theory in Catlab.jl. One of the advantages of using acsets and categorical machinery in general is that they have a natural graphical presentation, which in many cases is very readable and illuminating. If this is your first exposure to acsets, a gentle introduction can be found at Graphs and C-sets I: What is a graph?.

We use `@present` to make a schema for the acset which will store the data. Note that each “set” has turned into an object in the schema, and that the relations are also objects. There are projections (homs) from the relations into the sets which are involved in each relation.

```
@present IJKLMSch(FreeSchema) begin
    (I,J,K,L,M,IJK,JKL,KLM)::Ob
    IJK_I::Hom(IJK,I)
    IJK_J::Hom(IJK,J)
    IJK_K::Hom(IJK,K)
    JKL_J::Hom(JKL,J)
    JKL_K::Hom(JKL,K)
    JKL_L::Hom(JKL,L)
    KLM_K::Hom(KLM,K)
    KLM_L::Hom(KLM,L)
    KLM_M::Hom(KLM,M)
end

Catlab.to_graphviz(IJKLMSch, graph_attrs=Dict(:dpi=>"72",:ratio=>"expand",:size=>"8"))
```

Using `@acset_type` will programmatically generate a data type and methods specific to the provided schema. We then use `@acset` to build a data instance on our schema.

```
@acset_type IJKLMData(IJKLMSch, index=[:IJK_I,:IJK_J,:IJK_K,:JKL_J,:JKL_K,:JKL_L,:KLM_K,:KLM_L,:KLM_M])

ijklm_acs = @acset IJKLMData begin
    I = n
    J = m
    K = m
    L = m
    M = m
    IJK = nrow(IJK)
    IJK_I = [parse(Int, i[2:end]) for i in IJK.i]
    IJK_J = [parse(Int, j[2:end]) for j in IJK.j]
    IJK_K = [parse(Int, k[2:end]) for k in IJK.k]
    JKL = nrow(JKL)
    JKL_J = [parse(Int, j[2:end]) for j in JKL.j]
    JKL_K = [parse(Int, k[2:end]) for k in JKL.k]
    JKL_L = [parse(Int, l[2:end]) for l in JKL.l]
    KLM = nrow(KLM)
    KLM_K = [parse(Int, k[2:end]) for k in KLM.k]
    KLM_L = [parse(Int, l[2:end]) for l in KLM.l]
    KLM_M = [parse(Int, m[2:end]) for m in KLM.m]
end
```

The critical thing the JuMP devs did to speed things up was to replace the for loops with two inner joins to get the “paths” through the relations. How can we do this with acsets? One way is to execute a conjunctive query on the acset to get the same result. This is described in the blog post C-sets for data analysis: relational data and conjunctive queries.

```
connected_paths_query = @relation (i=i, j=j, k=k, l=l, m=m) begin
    IJK(IJK_I=i, IJK_J=j, IJK_K=k)
    JKL(JKL_J=j, JKL_K=k, JKL_L=l)
    KLM(KLM_K=k, KLM_L=l, KLM_M=m)
end

Catlab.to_graphviz(connected_paths_query, box_labels=:name, junction_labels=:variable, graph_attrs=Dict(:dpi=>"72",:size=>"3.5",:ratio=>"expand"))
```

While the blog post should be consulted for a complete explanation, the conjunctive query is expressed using an undirected wiring diagram (UWD), which is visualized above. Nodes (labeled ovals) in the UWD correspond to tables (primary keys) in the acset. Junctions (labeled dots) correspond to variables. Ports, which are unlabeled in this graphical depiction, are where wires connect junctions to nodes; these correspond to columns of the table they are connected to. Outer ports, which are wires that run “off the page”, are the columns of the table that will be returned as the result of the query. Conceptually, rows returned from the query come from filtering the Cartesian product of the tables (nodes) such that variables in columns match according to ports that share a junction.
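The last sentence can be sketched with stdlib code on toy relations: take the product of the tables and keep the rows where columns wired to the same junction agree (this mirrors the naive formulation; the actual query engine evaluates it far more efficiently):

```
# Toy integer-valued relations standing in for the acset tables.
IJK = [(1, 1, 1), (2, 1, 2)]
JKL = [(1, 1, 1), (1, 2, 2)]
KLM = [(2, 2, 1)]

# Filter the Cartesian product: variables sharing a junction must be equal.
result = [(i, j, k, l, m)
    for (i, j, k) in IJK
    for (j2, k2, l) in JKL
    for (k3, l2, m) in KLM
    if j == j2 && k == k2 == k3 && l == l2]
# result == [(2, 1, 2, 2, 1)]
```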

The JuMP blog post notes that while the data frames version doesn’t resemble the nested summation, it is arguably just as readable, especially if the columns were related to the process being modeled. We suggest that the acsets version is also just as readable, if not more so, as the data schema and query diagram directly represent the data that parameterizes the optimization model. Furthermore, because the schema of the acset is known at compile time, incorrect queries (or other operations) on acsets will be caught as compile-time errors.

The query can then be evaluated on the specific acset instance. We can confirm that both the acsets and data frame methods return the same number of rows.

```
ijklm_query = query(ijklm_acs, connected_paths_query)
size(ijklm_query) == size(ijklm_df)
```

`true`

Now we can go ahead and see how fast the acsets version is. The fact that the acsets-based query is right on the tails of the `DataFrames` version is a performance win for the acsets library, as it is a generic conjunctive query engine across very general data structures (i.e., acsets are in general much more complex than a single data frame, due to the presence of multiple tables connected via foreign keys).

```
@benchmark let
    ijklm = query(ijklm_acs, connected_paths_query)
    model = JuMP.Model()
    set_silent(model)
    ijklm[!, :x] = @variable(model, x[1:size(ijklm, 1)] >= 0)
    for df in DataFrames.groupby(ijklm, :i)
        @constraint(model, sum(df.x) >= 0)
    end
end
end
```

```
BenchmarkTools.Trial: 2759 samples with 1 evaluation.
Range (min … max): 1.549 ms … 9.245 ms ┊ GC (min … max): 0.00% … 20.42%
Time (median): 1.685 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.808 ms ± 446.566 μs ┊ GC (mean ± σ): 5.04% ± 10.74%
▂██▆▃ ▁▁▁ ▁
▅▅██████▅▅▄▆▅▅▄▁▃▅▄▄▁▃▁▁▁▃▁▃▁▃▄▃▁▆▅▅▄▆▄▁▁▁▁▁▁▁▃▁▃▆▇█████▇▆▆ █
1.55 ms Histogram: log(frequency) by time 3.38 ms <
Memory estimate: 2.74 MiB, allocs estimate: 23500.
```

While the query execution in the previous section is already quite useful for practical application, it does have one downside: the type of the return object is a `DataFrame`. While this is appropriate for many cases, one of the benefits of acsets is that, from one point of view, they are in-memory relational databases, and are therefore capable of representing data more complex than can be expressed in a single table. Therefore, it would be nice if one could execute a *data migration* from one type of acset, where “type” means the schema, to another that is able to carry along further data we need. For more details on data migration, please see Evan Patterson’s Topos Institute colloquium talk “Categories of diagrams in data migration and computational physics”.

In this context, if the “set” objects ($I$, $J$, etc.) were further connected to other tables, say a list of suppliers, a list of materials, or even a process graph of downstream work, it would be inconvenient at least if we lost that relational information during a query. In that case, we’d really want to return *another acset* on a different schema that is precisely the right shape for what we want to do.

In this simple case, the schema we want is shown below. We’ll think of the object `IJKLM` as being mapped to the subset of the n-ary product that has those “paths” through the relations we are seeking.

```
@present IJKLMRelSch(FreeSchema) begin
    (IJKLM,I,J,K,L,M)::Ob
    i::Hom(IJKLM,I)
    j::Hom(IJKLM,J)
    k::Hom(IJKLM,K)
    l::Hom(IJKLM,L)
    m::Hom(IJKLM,M)
end

@acset_type IJKLMRelType(IJKLMRelSch)

Catlab.to_graphviz(IJKLMRelSch, graph_attrs=Dict(:dpi=>"72",:ratio=>"expand",:size=>"3.5"))
```

Now we formulate the data migration, using AlgebraicJulia/DataMigrations.jl. While we will not be able to rigorously explain data migration here, the idea is that if one has a $C$-Set (an instance of data on schema $C$) and wants to migrate it to a $D$-Set, a data migration functor needs to be specified.

Here, $C$ is our schema `IJKLMSch` and $D$ is `IJKLMRelSch`. The functor is a mapping from $D$ to the category of diagrams on $C$. Each object in $D$ gets assigned a diagram in $C$, and morphisms in $D$ get assigned to contravariant morphisms of diagrams.

```
M = @migration IJKLMRelSch IJKLMSch begin
    IJKLM => @join begin
        ijk::IJK
        jkl::JKL
        klm::KLM
        i::I
        j::J
        k::K
        l::L
        m::M
        IJK_I(ijk) == i
        IJK_J(ijk) == j
        JKL_J(jkl) == j
        IJK_K(ijk) == k
        JKL_K(jkl) == k
        KLM_K(klm) == k
        JKL_L(jkl) == l
        KLM_L(klm) == l
        KLM_M(klm) == m
    end
    I => I
    J => J
    K => K
    L => L
    M => M
    i => i
    j => j
    k => k
    l => l
    m => m
end;
```

A diagram is itself a functor $J \to C$, where $J$ is (usually) a small category, pointing at some instance of that shape in $C$. We can plot the largest diagram, the one that the object `IJKLM` is mapped to. Note the similarity to the conjunctive query visualized as a UWD previously. In particular, note that “relation” elements must agree upon the relevant “set” elements via their morphisms. The object in `IJKLMSch` that each object in the diagram corresponds to is given by the text after the colon in the relevant node.

```
F = functor(M)
to_graphviz(F.ob_map[:IJKLM],node_labels=true)
```

Because of the simplicity of the schema `IJKLMRelSch`, the contravariant morphisms of diagrams simply pick out the object in each diagram associated with the source of the morphism. Likewise, the natural transformation part of each morphism of diagrams simply selects each object's identity morphism.

We run the data migration to move data from the schema `IJKLMSch` to `IJKLMRelSch` using the function `migrate`, and check that the result has the same number of records as the other methods.

```
ijklm_migrate_acset = migrate(IJKLMRelType, ijklm_acs, M)
nparts(ijklm_migrate_acset, :IJKLM) == size(ijklm_query,1)
```

`true`

Once again, let’s benchmark. The data migration is slightly slower than the conjunctive query method, but data migrations can express a much richer language of data manipulation than conjunctive queries are capable of.

```
@benchmark let
    ijklm = migrate(IJKLMRelType, ijklm_acs, M)
    model = JuMP.Model()
    set_silent(model)
    @variable(model, x[parts(ijklm, :IJKLM)] >= 0)
    for i in parts(ijklm, :I)
        @constraint(model, sum(x[incident(ijklm, i, :i)]) >= 0)
    end
end
end
```

```
BenchmarkTools.Trial: 1380 samples with 1 evaluation.
Range (min … max): 3.212 ms … 10.213 ms ┊ GC (min … max): 0.00% … 18.52%
Time (median): 3.347 ms ┊ GC (median): 0.00%
Time (mean ± σ): 3.621 ms ± 654.179 μs ┊ GC (mean ± σ): 6.28% ± 10.98%
▃▇██▇▅▄▂ ▁ ▁ ▂▂▁▁ ▁
█████████▇▅▅▄▅▅▁▁▅▁▄▅▁▅▁▄▁▁▄▁▁▄▁▁▁▄▁▁▁▄▄▅▆██▇████████▆▆█▇▇▆ █
3.21 ms Histogram: log(frequency) by time 5.36 ms <
Memory estimate: 7.33 MiB, allocs estimate: 42773.
```
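The `incident` call in the benchmark above is a preimage lookup: it returns the rows of `IJKLM` whose `i` column equals a given value. As a stdlib-only sketch of what the acset's index makes fast (toy column data):

```
# A toy :i column of an IJKLM-like table; incident(acs, v, :i) amounts to
# looking up a precomputed preimage index like this one.
i_col = [1, 2, 1, 3]
preimage = Dict{Int,Vector{Int}}()
for (row, v) in enumerate(i_col)
    push!(get!(preimage, v, Int[]), row)
end
get(preimage, 1, Int[])  # → [1, 3]: the rows with i == 1
```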

To summarize, acsets and the tools for working with them provided by AlgebraicJulia can greatly ease the development of complex mathematical programming models. Relations, unions, and other concepts from set theory are elegantly generalized by category theory, and can help develop more correct, formal, and expressive models. While the optimization “model” examined in this article is fairly abstract, we will investigate models of more practical interest in future installments on the blog.

I would like to thank Kevin Carlson for his assistance regarding DataMigrations.jl.

Fourer, Robert. 1997. “Database Structures for Mathematical Programming Models.” *Decision Support Systems* 20 (4): 317–44.

Patterson, Evan, Owen Lynch, and James Fairbanks. 2021. “Categorical Data Structures for Technical Computing.” https://doi.org/10.32408/compositionality-4-5.

Part of the AlgebraicJulia vision for scientific computing is that a scientific model should be piece of data that can be inspected, analyzed, passed between programming languages, and saved in a database.

In order to do this, we need to make sure that different languages can load and save the models.

One way to do this would be to define a data type for “all scientific models”, and then implement that data type in each programming language we care about. But this is clearly ridiculous; there is no one data type that can encompass every single scientific model. Moreover, often we want to specify that we only want a certain type of scientific model.

Another approach would be to manually implement, for each type of scientific model, types in every language we care about. However, this is an $n \times m$ problem, where $n$ is the number of languages and $m$ is the number of types of scientific models. Moreover, it is error-prone (because there are subtle differences between the type systems of different languages), and would be a massive drag on rapid iteration; any new type of model or change to a modeling framework would need to be implemented across many different languages.

The better way to do it would be to define your types *once*, in a language-agnostic way, and then generate the types in each language automatically along with serialization/deserialization code. This sort of system has been done before: see

The relevant XKCD is, of course,

So why make a new one? Well, I want to support ACSets natively, as many of our scientific models are built on top of them (Patterson, Lynch, and Fairbanks 2021). And it seemed that modifying an existing system would be more work than building a new one from scratch. But more importantly, I find that building this kind of thing from scratch gives you a much better picture of the kind of design decisions that go into this, and thus if I end up trying to modify a pre-existing one later down the line I’ll have a better idea of how to go about it.

One difference between InterTypes and these other formats is that I don’t intend (at least at first) to have a custom serialization format that goes along with it. The main feature of InterType is to generate the data structures in each programming language. Currently we have serialization/deserialization to JSON (including 64-bit integer support!), but we could also support other serialization formats. The whole point of InterTypes is that you shouldn’t have to think about the underlying serialization details. Along with this, from an intertype specification one can generate a JSONSchema file that describes the JSON produced by the automatically generated serialization.

WARNING: InterTypes is *alpha-quality* software, and not only are there certainly bugs but also the interface to it may change radically.

The core of InterTypes is the intertype schema, which declares a collection of types that can refer to one another. An intertype schema is a file ending in `.it`. Currently, we use the Julia parser to parse the `.it` file, and we use Julia to generate code for all languages. However, we hope in the future to produce a standalone binary that will parse the `.it` file and generate the code in other languages. So although `.it` files look like Julia, most features of Julia will not work in an intertype schema; for instance, you cannot define functions in an intertype schema, or refer to types that are defined outside of an intertype schema.

There are 4 fundamental building blocks of InterTypes.

- Primitive types:
  - `Int32`/`Int64`/`UInt32`/`UInt64` for integer numbers. We have 32 bit and 64 bit integers, because only 32 bit integers are safe to put in JSON numbers; 64 bit integers must be put in JSON strings.
  - `Bool` for booleans.
  - `Float64` for floating point numbers.
  - `String` for strings.
  - `Symbol` for symbols. In languages that don’t have symbols, this is the same as `String`.
  - `Vector{T}` for representing sequences (arrays and lists) of type `T`.
  - `Binary` for sequences of raw bytes without a numeric interpretation. In Julia, this maps to `Vector{UInt8}`, but we think of it as a binary blob rather than a sequence of values.
  - `Dict{K,V}` for representing dictionaries with key type `K` and value type `V`.

- Structs. A struct has a list of fields and each field has a name and a type. This looks like:

```
struct Point2D
    x::Float64
    y::Float64
end
```

- Sum types, also known as “tagged unions”. A sum type has a list of variants, and each variant is a record containing fields. This looks like:

```
@sum Op begin
    Plus
    Mul
end

@sum Term begin
    Constant(val::Float64)
    App(op::Op, arg1::Term, arg2::Term)
end
```

- ACSets. ACSets are handled a little differently than they work in Julia, in order to paper over the fact that I have yet to fully figure out Python’s (and pydantic’s, which is the validation/serialization framework) support for generic types. When you declare the schema, you have to specify a concrete type for every AttrType. Then, when you declare an instance of the schema, you do not get a generic instance like you do in Julia; you get an instance with attribute types fixed to the supplied types. This looks like:

```
struct EdgeData
    name::Symbol
    length::UInt64
end

@schema SchGraph begin
    (E,V)::Ob
    (src, tgt)::Hom(E, V)
end

@schema SchWeightedGraph <: SchGraph begin
    Weight::AttrType(EdgeData) # note that we provide a type here
    weight::Attr(E, Weight)
end

@abstract_acset_type AbstractGraph

@acset_type EDWeightedGraph(SchWeightedGraph,
    generic=WeightedGraph, index=[:src, :tgt]) <: AbstractGraph
```

This is equivalent to the Julia code

```
struct EdgeData
    name::Symbol
    length::UInt64
end

@schema SchGraph begin
    (E,V)::Ob
    (src, tgt)::Hom(E, V)
end

@schema SchWeightedGraph <: SchGraph begin
    Weight::AttrType # note that there is no type here
    weight::Attr(E, Weight)
end

@abstract_acset_type AbstractGraph

@acset_type WeightedGraph(SchWeightedGraph, index=[:src, :tgt]) <: AbstractGraph
const EDWeightedGraph = WeightedGraph{EdgeData}
```

However, in the Python code no data structure with the name `WeightedGraph` is produced; only `EDWeightedGraph`. This is because the Python and Julia ACSets code were written pre-intertype, so their handling of attrtypes wasn’t fully compatible, and we had to get something working; hopefully in the future Python and Julia will be more congruous. This is a good first issue for someone familiar with types in Python/pydantic!

To use an intertype schema, one “declares an intertype module” like so:

`@intertypes "weightedgraph.it" module weightedgraph end`

Then `weightedgraph` is a module that contains an export for each type defined in `weightedgraph.it`. It also contains a `Meta` variable, which stores the parsed intertype definition. This can then be used to write out generated Python code, via

`generate_python_module(weightedgraph, ".")`

which writes a Python file called `weightedgraph.py` in the current directory. This Python file imports both `acsets` and `intertypes`, so in order to use it one must have the py-acsets library installed, and also a copy of `intertypes.py`, which can be produced with

`write("intertypes.py", InterTypes.INTERTYPE_PYTHON_MODULE)`

In a similar manner, a JSONSchema definition for the JSON produced by intertypes can be produced with

`generate_jsonschema_module(weightedgraph, ".")`

which writes a JSONSchema file called `weightedgraph_schema.json` in the current directory. This file has a JSONSchema `def` for each type in the intertype definition file.

Intertype modules can refer to one another. For instance, we could write another file called `twoweightedgraphs.it` with contents:

```
struct TwoWeightedGraphs
    g1::weightedgraph.EDWeightedGraph
    g2::weightedgraph.EDWeightedGraph
end
```

and then import it like:

```
@intertypes "twoweightedgraphs.it" module twoweightedgraphs
import ..weightedgraph
end
```

In fact, `weightedgraph.it` and `twoweightedgraphs.it` could be in completely different packages; as long as the first package exports the `weightedgraph` Julia module, this will work fine.

For more examples of how to use intertypes, it would probably be best to refer to the test file.

There are a lot of directions I’m excited to take intertypes in.

First of all, I need to write more documentation beyond this blog post.

After that, I plan to add support for Scala, TypeScript, and Rust. Scala in particular would solve a lot of problems between AlgebraicJulia and Semagrams, so I’m going to tackle that next.

Thirdly, I’d like to think about integrating GATlab with InterType, so that scientific models with algebraic expressions in them can be first-class.

Longer down the road, I want to investigate combinatorial data structures beyond acsets, as laid out in array systems and combinatorial data structures via finite existential types, and also think about *structured version control* for intertypes in line with chit.

But finally, I want to use intertypes to make the vision at the beginning a reality, a vision where scientific models can be passed around between programming languages and stored in databases. This is beyond a technical vision; this is a *social* vision; I hope to reshape how people think about scientific models. If you are interested in this, please reach out on the category theory zulip, julia zulip, localcharts, or github issues.

Patterson, Evan, Owen Lynch, and James Fairbanks. 2021. “Categorical Data Structures for Technical Computing.” https://doi.org/10.32408/compositionality-4-5.


*Note: the code in this post is not kept in sync with the latest developments of AlgebraicRewriting.jl. Please see the official documentation for examples of code that runs with the latest versions of AlgebraicJulia libraries.*

Scientists and engineers are often interested in representing the state of the world, $X$. We might decide to encode this as the set of possible instances of a class,^{1} or as the set of terms of an algebraic data type,^{2} or as possible databases on a schema (i.e. $C$-sets). We are also often interested in encoding the transition $X \to X$, a relationship between one state of the world and the *next* one. If we can communicate this relationship to a computer, then we can execute simulations to see how hypothetical states of the world evolve over time. The question we explore in this post is: **what is a good syntax for representing these kinds of functions?** Some virtues that we will care about:

- It should be easy for an engineer to *construct* such functions from simpler ones.
- It should be easy to implement the simulator in a clean, elegant way that does not have complicated edge cases to handle.
- The syntax should be *general*, such that we can use the same code to perform, e.g., chemistry, robotics, or epidemiology simulations.
- It should be natural to generalize the output of the simulation (e.g. to a set of possible outcomes, or to a probability distribution over outcomes) without changing much code.
- The structure of the representation should be transparent and introspectable:
  - We can write programs to check whether properties of interest hold (e.g. will the program terminate on all inputs?).
  - Most importantly, if our representation of the world is fundamentally changed (suppose the world is better modeled by $Y$, rather than $X$), we can *automatically* convert our $X$-simulator to a $Y$-simulator, given some high-level description of the relationship between $X$ and $Y$.

It is difficult to obtain many of these properties when one’s syntax for $X$ is an arbitrary datatype and one’s syntax for the transition relationship is just an arbitrary function in one’s programming language. However, we propose that the syntax of **directed wiring diagrams** organizing **rewrite rules** makes these properties more tractable. Although this post will focus on engineering, we can use category theory to describe in more detail what it really *means* to be a good syntax and show how we can interpret our directed wiring diagrams as agent-based models.

This schema (with objects: Wolf, Sheep, Edge, Vertex, and attribute types: Dir and Eng) characterizes states of the world where there are wolves and sheep moving around a directed graph.^{3} These animals have a position and direction, as well as an energy level. Furthermore, the vertices have grass growing, which is also represented by an energy level (let `grass=0` mean the grass is ready to eat, while `grass=n` means there are $n$ days left for the grass to grow). The legend on the right shows how we informally represent instances of this schema, although for simplicity we neglect to visually depict the direction of edges or animals. Using Catlab.jl, we can define this schema:

```
using Catlab, AlgebraicRewriting

@present SchLV <: SchGraph begin
    (Sheep,Wolf)::Ob
    spos::Hom(Sheep, V); wpos::Hom(Wolf, V)
    (Dir,Eng)::AttrType
    grass::Attr(V, Eng)
    seng::Attr(Sheep, Eng); weng::Attr(Wolf, Eng)
    dir::Attr(E, Dir)
    sdir::Attr(Sheep, Dir); wdir::Attr(Wolf, Dir)
end;

@acset_type LV_Generic(SchLV) # Dir and Eng are abstract
const LV = LV_Generic{Symbol, Int} # Dir is a Symbol, Eng is an Int
const yLV = yoneda_cache(LV) # precompute the 'generic' sheep/wolf/grass etc.
```

Rewrite rules are good candidates for simple building blocks, as they can express interesting dynamics while nevertheless being easier objects to work with than general purpose code.^{4} A rewrite rule has the data of a partial map $L \rightharpoonup R$, which says that, for any match of the pattern $L$ into one’s world of interest, a possible way for the world to update is to replace the matched copy of $L$ with $R$. For example, the rule below says that, if a wolf and sheep are in the same location, the wolf can eat the sheep and gain its energy units.

Or, in code:

```
# pattern we're looking for: wolf + sheep on the same V
L = @acset_colim yLV begin s::Sheep; w::Wolf; spos(s)==wpos(w) end
# what we're replacing with: just a wolf
R = @acset_colim yLV begin w::Wolf end
# Get the ID's of the relevant energy variables
L_wolf, L_sheep, R_wolf = [only(x).val for x in [L[:weng], L[:seng], R[:weng]]]
# Combine into a rule: L <- R -> R
wolf_eat = Rule(homomorphism(R,L), id(R);
    expr=(Eng=Dict(R_wolf => engs -> engs[L_wolf]+engs[L_sheep]),))
```

As another rule, consider two sheep which reproduce if the vertex they share is grassy, *unless* there is a wolf within striking distance. We can add to the data of a rewrite rule a Negative Application Condition (NAC), which embeds the matched pattern $L$ into a forbidden pattern $N$, with the implied semantics: the rule cannot be applied at a given match of $L$ if that match extends to a match of $N$.

Or, in code:

```
# Pattern we're looking for: sheep + sheep on green grass (i.e. grass = 0)
L = @acset_colim yLV begin
    (s1,s2)::Sheep;
    spos(s1) == spos(s2); grass(spos(s1)) == 0
end
# Replacing with: new sheep facing North @ 5 eng
R = @acset_colim yLV begin
    (s1,s2,s3)::Sheep;
    spos(s1) == spos(s2); grass(spos(s1)) == 0
    spos(s2) == spos(s3); seng(s3) == 5; sdir(s3) == :N
end
# Negative application condition
N = @acset_colim yLV begin
    (s1,s2)::Sheep; w::Wolf; e::E
    spos(s1) == spos(s2); grass(spos(s1)) == 0
    src(e) == wpos(w); tgt(e) == spos(s1)
end
NAC = AppCond(homomorphism(L, N; monic=true), false)
# Overall rule
sheep_reprod = Rule(id(L), homomorphism(L,R; monic=true), ac=[NAC])
```

Rewrite rules are nifty ways of expressing some basic data operations (merging, copying, deleting, adding) and basic logic in a graphical / combinatorial way, i.e. requiring only *data* rather than general purpose *code*. But one quickly hits limits in expressivity for what can be accomplished by a single rewrite rule, and there remains the problem of how one structures the execution of many rewrite rules in an organized fashion.

If we view the rewrite rule as a system that can be entered and exited in various ways, we might draw it like this:

During a simulation, we imagine that the ‘world state’ moves along wires in a diagram like above and is possibly altered by the boxes it passes through. For a deterministic simulation, after entering a box, we will then exit exactly one of the box’s outwires. Although the above box is an example of something that can alter the world state living on the wire, it has no *internal* state itself. In contrast, we can consider control flow boxes which cannot affect the state of the world yet can have their own state and choose which outwire to exit through as a function of their internal state and the world state.

For example, we can make a `repeat3` box which sends the world out the first wire the first three times it is run, and subsequently outputs on its second wire. Another box, `coin`, flips a pseudorandom coin to decide where its output goes. The following diagram communicates a program which first flips a coin, and (if heads) it attempts three times to apply the Reprod rule, ending immediately if at any point it is successfully applied.
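To make the distinction concrete, here is a minimal sketch of such control-flow boxes: they carry *internal* state and choose an out-wire, but pass the world state through untouched. (Written in Python for illustration, since this structure is language-agnostic; the class names and port numbering are ours, not AlgebraicRewriting's API.)

```python
import random

class Repeat3:
    """Control-flow box with internal state: sends the world out wire 0
    the first three times it is run, and out wire 1 afterwards."""
    def __init__(self):
        self.count = 0

    def __call__(self, world):
        self.count += 1
        out_wire = 0 if self.count <= 3 else 1
        return out_wire, world  # world state passes through unchanged

class Coin:
    """Control-flow box that pseudorandomly picks its out-wire."""
    def __init__(self, p=0.5, seed=0):
        self.rng = random.Random(seed)
        self.p = p

    def __call__(self, world):
        return (0 if self.rng.random() < self.p else 1), world

r3 = Repeat3()
wires = [r3("state")[0] for _ in range(5)]
# the first three runs exit wire 0, and every run after that exits wire 1
```

Note that neither box inspects `world` at all; this is what distinguishes control-flow boxes from the rewrite boxes, which alter the world state but have no internal state of their own.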

There are many tools at our disposal to construct wiring diagrams like these using Catlab and AlgebraicRewriting. These include calling `A⋅B` to compose diagrams (which match head-to-tail) in sequence, and calling `A⊗B` to compose diagrams in parallel. Another powerful tool, which constructs the above diagram in one step, uses the fact that *acyclic* wiring diagrams can be expressed very naturally using a simple programming-like syntax, where variables correspond to wires, 'calling a function' corresponds to feeding wires into a box, and the special syntax `[x₁,...,xₙ]` corresponds to merging wires together:

```
mk_sched(
# 1 *looped* argument
(r3_loop,),
# 1 normal argument
(coin_in,),
# Bind these names to boxes / wiring diagrams defined elsewhere
(C = coin, R3 = repeat3, R = sheep_reprod),
# Construct an acyclic wiring diagram
quote
coin_yes, coin_no = C(coin_in)
R3_input = [r3_loop, coin_yes]
repeat3_looping, repeat3_finished = R3(R3_input)
reprod_suc, reprod_fail = R(repeat3_looping)
output = [reprod_suc, repeat3_finished, coin_no]
# The 1st argument is fed back into `r3_loop`.
return (reprod_fail, output)
end)
```

Our desired program was not, in fact, acyclic, but by asking for the first $n$ outputs to be looped back into the first $n$ inputs (where $n$ is the length of the first argument to `mk_sched`), we have a general strategy for making diagrams with loops.

Simulations are often **agent-based**: instead of thinking of the world state as a monolithic thing and rewrite rules as functions that operate on the entire world state, we think of the world state as a collection of agents operating in a shared environment. Updates are *relative to a particular agent* performing the update.

Thus, we must now consider a state *and* a particular choice of agent as living on wires, and we must think of rewrite rules as executed "from the perspective of that agent". Let's be more precise about what an agent is: given a world state $X$, we want to pick out a particular substructure of $X$, meaning our agents are actually maps $A \to X$ *into* $X$, where $A$ describes the *shape* of the agent. Note that the empty acset, denoted $\varnothing$, picks out nothing in particular and corresponds to our earlier perspective of a monolithic view of the entire world, $\varnothing \to X$.

So how does this change our notion of a rewrite rule $L \rightharpoonup R$? We are no longer rewriting a state $X$ but rather a state with an agent, $A \to X$. We need an extra map $A \to L$ to show, given the agent, how it must relate to our pattern. Furthermore, we need an agent (possibly different) from which to exit the rule application, given by a map $B \to R$. Adding this data to the `wolf_eat` rule transforms it from "Some wolf eats some sheep" into "*This* wolf eats some sheep".

Most often, the incoming and outgoing agents for a rewrite rule are the same. An example where this is not possible is the rule that says "*This* sheep starves if its energy reaches $0$."

`Query` box

The *trajectory* of the world state while executing a program looks like this:

This means it's possible to take an agent $A \to X_i$ and interpret it in the 'current' state of the world, $X_j$, by composing it with the partial maps between $X_i$ and $X_j$. This updating of an agent could result in a map which is total (i.e. the agent 'survived' the update process) or partial, which would happen if some part of the agent was deleted between step $i$ and step $j$. We take advantage of this ability to update agents with the last major kind of box, the **Query** box.

These are yellow boxes which execute a subroutine for each agent of a particular shape. Once this set of matches is found, the query box stores them in its internal state. Each time we return through the second in port, we pop off the next agent in the queue and exit the second out port. Once this queue is empty, we exit out the first port with our original agent. (There is an edge case to consider: what if, while executing the actions of all the sub-agents from the query, our original agent is deleted? In this case we exit a third output door, which has no agent.)
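The queue-based behavior described above can be sketched in a few lines. This is an illustrative Python sketch, not AlgebraicRewriting's implementation; the port numbering and the toy world representation (a dict of agent lists) are our own assumptions.

```python
class Query:
    """Sketch of a Query box: finds all agents of a given shape, queues
    them, and dispatches one per pass. Port conventions (illustrative):
    popping the queue exits port 1 with the next sub-agent; an empty queue
    exits port 0 with the original agent; if the original agent has been
    deleted in the meantime, exit port 2 carrying no agent."""
    def __init__(self, find_agents):
        self.find_agents = find_agents  # function: world -> list of agents
        self.queue = []
        self.original = None

    def enter_first(self, world, agent):
        # First entry: record our agent and compute all matches.
        self.original = agent
        self.queue = list(self.find_agents(world))
        return self.next_step(world)

    def next_step(self, world):
        # Second in-port: pop the next sub-agent, if any remain.
        if self.queue:
            return (1, self.queue.pop(0))
        # Queue exhausted: did our original agent survive?
        if self.original in world["sheep"]:
            return (0, self.original)
        return (2, None)  # original agent was deleted: exit with no agent

world = {"sheep": ["s1", "s2"]}
q = Query(lambda w: w["sheep"])
port, agent = q.enter_first(world, "s1")  # dispatches the first sheep
```

Calling `next_step` repeatedly then dispatches the remaining sheep and finally exits port 0 with the original agent.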

For example, we may wish to organize our wolf-sheep-grass model as a `while` loop wrapping three `for` loops, doing some actions per sheep, per wolf, and per vertex.

In the figure above, ‘Daily’ refers to another program which performs the actions that are common to both sheep and wolves (e.g. rotating, moving, starving). These are originally defined for just sheep, but by observing the symmetry of wolves and sheep in our schema, we can use a data migration to automatically translate a Sheep Daily program into a Wolf Daily program.

We may consider other ways of changing our agent type other than the `Query` box. A simple one is the **Weakening** box, which is specified by a morphism between agent shapes, $f : A \to B$. It converts $B$-agents into $A$-agents without changing the state of the world, because we can precompose this morphism with an agent $B \to X$. In particular, for any agent shape $A$ we can use the weakening given by $\varnothing \to A$ to discard focus on the current agent.

Here is a sort of idealized characterization of 2-D drawings of wiring diagrams:

- A wiring diagram can be one of some collection of primitive icons, e.g.:
- a perfectly horizontal wire
- a pair of crossing wires
- an element from a primitive set of boxes with various in ports and out ports

- A wiring diagram can be made by placing simpler ones side by side

- A wiring diagram can be made by placing simpler ones one on top of the other.

The rigid, grid-like diagrams generated by the above rules can be understood as in correspondence with certain expressions in a mathematical theory. Note that the theory comes with its own notion of equality, *and* the image has its own notion of equality too: we consider images "up to planar isotopy", which makes rigorous the notion that it doesn't matter if you deform the precise locations of things so long as you preserve the same connectivity. Under very special circumstances we can obtain a *coherence theorem* for a class of wiring diagrams and a particular theory: this says that all equations of the theory correspond to equations of diagrams (soundness) and all equations of diagrams are a consequence of the theory axioms (completeness). This allows us to relax how rigidly our diagrams are depicted while still retaining the formality of terms in a rigorous, mathematical theory.

For example, consider the correspondence of wiring diagrams (without feedback loops) to a theory of symmetric monoidal categories (SMCs) with monoidal unit $I$ and a supply of monoids.

Once we have interpreted our wiring diagram as a morphism in a certain kind of category, applied category theory seeks to interpret the morphism in some real world context, provided that real context can be given the structure of a category (including the satisfaction of required axioms). For a certain toy model of programs with control flow, the third column describes how one can interpret the mathematical expressions in the setting of programs, such that compositions of diagrams can rigorously be interpreted as compositions of programs.

To show the correspondence of the first two columns in action, consider the below composite wiring diagram, which takes for granted a morphism $f$. Dotted lines are added to visually guide you in parsing the various icons, which are composed vertically and horizontally.

Thanks to a coherence theorem, we can rest easy knowing that the diagram has an unambiguous formal meaning, even if some of the lines are a bit wiggly and don’t *exactly* match one of our icons.

Some great follow-up reading for what it means for these diagrams to be *formal* are Baez and Stay (2011) and Selinger (2011). Also, a great application of graphical languages is pedagogically described in Pawel Sobocinski’s blog Graphical Linear Algebra.

To connect the above ideas to the graphical language explored in the bulk of this post: the syntax of directed wiring diagrams (*with* feedback loops) can be given the semantics of a *traced SMC*, though unpacking what that means theoretically is beyond the scope of this post. In a programming-like setting (the final column in the table above), this corresponds to `while`-loop style iteration.

In Brown and Spivak (2023), a theoretical underpinning of this work is proposed. The primary contributions are formulating a general theory of discrete dynamical systems, making progress towards showing that the relevant category is traced monoidal, and then showing how this informs the implementation of the graph rewriting programs described in this post. This involves *enriched* category theory and the language of polynomial functors, and is parameterized by a polynomial monad. It is beyond the scope of this post to rigorously explicate this formalism, which is gently introduced in Niu and Spivak (n.d.). However, to provide some intuition for how, in practical ways, this formalism informed the design of the software infrastructure that represents and executes these graphical programs: consider a box `swap-x2` which has two inputs. Initially, it behaves by swapping the inputs, but once the left inport has been entered twice it behaves like a pair of parallel wires.

The math dictates that what actually should go into this box is a “behavior tree”, where the branching of the tree is dictated by the number of input wires. For every input, we specify how the behavior changes by pointing to a new behavior tree, whose nodes indicate how the input is transformed into outputs.
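A behavior tree like this can be sketched directly: each node is a function that, given an in-port, returns an out-port together with the next behavior. This is an illustrative Python sketch of `swap-x2` (the encoding is ours, not the library's):

```python
def swap_behavior(swaps_left):
    """A node of a behavior tree for the swap-x2 box. For each in-port,
    it returns the out-port the token leaves on and the behavior the
    box transitions to next."""
    def step(in_port):
        if swaps_left > 0:
            # still in 'swap' mode: port 0 -> 1 and port 1 -> 0;
            # entering the left port (0) uses up one remaining swap
            nxt = swap_behavior(swaps_left - 1 if in_port == 0 else swaps_left)
            return 1 - in_port, nxt
        # swaps exhausted: behave as parallel wires forever
        return in_port, swap_behavior(0)
    return step

b = swap_behavior(2)
outs = []
for port in [0, 0, 0, 0]:  # enter the left port four times
    out, b = b(port)
    outs.append(out)
# the first two entries are swapped; afterwards inputs pass straight through
```

The point is that the box's entire (infinite) behavior is ordinary *data*, a tree branching over in-ports, rather than opaque code.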

All of the rewriting primitives (e.g. `Rewrite`, `ControlFlow`, `Query`, `Weaken`) are implemented in this generic manner, meaning just one implementation (i.e. behavior trees) needs to be written, rather than many special cases for language primitives. Furthermore, the language can be extended, with the formalism allowing us to not worry about unforeseen consequences that follow from *ad hoc* extension of the language - we have a precise specification for what structure a new primitive needs to satisfy.

The setting of polynomial functors is so expressive that the interfaces of boxes, their internals, their infinite behaviors, and monadic effects on their outputs are all encompassed in the same formalism, which leads to a cleaner and more general implementation than what would have been arrived upon if the problem were approached head-on in a traditional engineering style.

The virtues of this approach largely come from the fact that neither code nor arbitrary functions are in principle required to specify our dynamical system. Rather than arbitrary functions (which are hard to manipulate and reason about) being the input, we automatically generate our simulator program from static, graph-like data. While we happen to do this in Julia, this could in principle be done in any language, because our understanding of a graph rewriting program is mathematical rather than dependent on implementation details. In collaborations involving multiple teams, it can be helpful that the data of these programs can be serialized in a language-agnostic way.

I have also personally found it immensely useful to be able to apply functorial data migration (see Spivak (2012)), e.g. swapping sheep and wolves, to allow high level reuse of programs in different contexts, which can be done cleanly and rigorously when one’s model is expressed via combinatorial data rather than low-level code.

Although there was not space to describe how all of the virtues listed in the introduction are facilitated by this formalism, I hope you're intrigued by the possibility of structuring your model of the world's dynamics in this formalism. Furthermore, I hope this case study in applied category theory encourages you to look into how other kinds of scientific or engineering tasks can be made more scalable and maintainable through these kinds of abstractions.^{5}

Baez, John, and Mike Stay. 2011. *Physics, Topology, Logic and Computation: A Rosetta Stone*. Springer.

Brown, Kristopher, Evan Patterson, Tyler Hanks, and James Fairbanks. 2023. “Computational Category-Theoretic Rewriting.” *Journal of Logical and Algebraic Methods in Programming*, 100888. https://doi.org/10.1016/j.jlamp.2023.100888.

Brown, Kristopher, and David I. Spivak. 2023. “Dynamic Tracing: A Graphical Language for Rewriting Protocols.” https://arxiv.org/abs/2304.14950.

Niu, Nelson, and David I Spivak. n.d. “Polynomial Functors: A General Theory of Interaction.”

Selinger, Peter. 2011. “A Survey of Graphical Languages for Monoidal Categories.” *New Structures for Physics*, 289–355.

Spivak, David I. 2012. “Functorial Data Migration.” *Information and Computation* 217 (August): 31–51. https://doi.org/10.1016/j.ic.2012.05.001.

E.g., if we live in an object-oriented paradigm:

```
class WorldModel {
  int NumIDs() const; // id's for objects in the world
  int LoadRobot(const string& fn);
  int AddRobot(const string& name, Robot* robot=NULL);
  void DeleteRobot(const string& name);
};

int main(int argc, const char** argv) {
  WorldModel world;        // create a world
  double dt = 0.1;         // between printouts
  while (sim.time < 5) {   // run the simulation
    sim.Advance(dt);       // move the sim fwd
    sim.UpdateModel();     // update the world
    cout << sim.time << '\t' << world.robots[0]->q << endl;
  }
  return 0;
}
```

↩︎

E.g., if we live in a functional paradigm:

```
data Particle = Particle { idx :: Index, pos :: Position, vel :: Velocity }
type Model = [Particle]

simulate :: Display                    -- Window config
         -> model                      -- Model
         -> (model -> Picture)         -- Draw function
         -> (Float -> model -> model)  -- Update function
         -> IO ()
```

↩︎

This is inspired by Netlogo’s wolf-sheep predation model. A more faithful reproduction of that model’s dynamics is found in the docs of AlgebraicRewriting.jl; in this post, we’ll focus on showcasing a more diverse set of AlgebraicRewriting’s features.↩︎

For more background and computational details, see Brown et al. (2023). See also Angeline’s description on this blog.↩︎

This work came about through conversations very much akin to the example conversations between the domain expert (me) and ACT expert (David Spivak) in David’s What Are We Tracking talk, so I encourage you to watch that to get inspiration for other potential fruitful collaborations with mathematicians.↩︎


A central data structure in AlgebraicJulia is the acset, short for *attributed $\mathsf{C}$-set*. For background on acsets, see this blog post, or the original paper (Patterson, Lynch, and Fairbanks 2021). $\mathsf{C}$-sets store *combinatorial* data, which consists of sets of indistinguishable objects (such as the vertices of a graph) related by functions. Acsets extend this to include *non*-combinatorial data, which consists of things with intrinsic meaning such as integers, strings, and so on.

To recall notation, in order to give a category of acsets we first give a *schema* $S$, a profunctor from an entity category $\mathsf{E}$ to an attribute category $\mathsf{A}$. Then an acset is a copresheaf on the collage $\mathsf{Coll}(S)$ of $S$ with a fixed restriction to $\mathsf{A}$ giving the attribute types, for which reason we'll call this restriction the "typing map." The morphisms of acsets thus fix everything in $\mathsf{A}$ exactly. For example, a morphism between $\mathbb{R}$-weighted graphs $G \to H$ can in principle send any vertex in $G$ to any vertex in $H$, but we are *required* to send an edge of weight $r$ in $G$ to an edge of weight $r$ in $H$.

But for many purposes this is too restrictive. This post is about a generalization of attribute types that allows us to have special attribute values, called *variables*, which can freely map to concrete values. A major motivation for developing this generalization is to support rule-based rewriting for acsets. We're implementing these ideas today in our library AlgebraicRewriting.

As an example scenario, suppose we are trying to program a robot to assemble objects of various shapes from raw materials. Our first job in modeling this scenario is to pick a schema to represent the state of the world, as perceived by the robot. We’ll use a schema for *mechanical linkages*, which we’ll model as an attributed symmetric graph, where edges represent links using a length attribute, and vertices represent positions using a coordinate attribute.^{1}

As an illustrative rewriting rule, let’s suppose that, whenever our robot sees a rectangular figure, it is allowed to tear the top off and fold the remaining open shape into an isosceles triangle. We can draw this fact as a rewriting rule; however, a well-formed acset requires us to put concrete attribute values for all the lengths and positions:

Having been forced to pick particular values, we are then unable to apply the rule to any rectangles of different dimensions (or located at different coordinates). The notion of a variable which can map into an arbitrary concrete value will allow us to write the rule we intended to write.

*Varacsets*, short for “variable-equipped acsets”, allow for a supply of distinguishable *variables* from attribute types, which can be mapped to constants under acset morphisms. This should leave our rewrite rule looking something like this (with position variables omitted for brevity):

By allowing a set of distinguishable variables, which can explicitly be equal or not equal to each other, rather than merely extending the attribute type to include a wildcard element $\ast$, we can model a *parallelogram* with the leftmost acset, rather than an arbitrary quadrilateral.
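The difference between distinguishable variables and a wildcard can be made concrete in a few lines. Here is an illustrative Python sketch (the `?`-prefix convention for variables is our own, not AlgebraicJulia's representation) of matching a list of attribute values against a pattern:

```python
def match_attrs(pattern, target):
    """Match a pattern of attribute values against concrete values.
    Variables (strings starting with '?') may map to any concrete value,
    but each variable must map consistently everywhere it occurs --
    unlike a wildcard, which would match independently each time.
    Returns the variable assignment, or None if there is no match."""
    assignment = {}
    for p, t in zip(pattern, target):
        if isinstance(p, str) and p.startswith("?"):
            if p in assignment and assignment[p] != t:
                return None  # same variable forced to two different values
            assignment[p] = t
        elif p != t:
            return None      # concrete values must match exactly
    return assignment

# Side lengths of a parallelogram: opposite sides share a length variable.
parallelogram = ["?a", "?b", "?a", "?b"]
match_attrs(parallelogram, [3, 5, 3, 5])  # matches: {'?a': 3, '?b': 5}
match_attrs(parallelogram, [3, 5, 4, 5])  # no match: not a parallelogram
```

With a single wildcard element in place of `?a` and `?b`, both targets would match, and we could only express "arbitrary quadrilateral."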

Although we can visualize these varacsets as undirected graphs, varacsets on any schema can be viewed as a set of tables in its database representation:

Although acsets contain the same data as relational databases, they are much more powerful because we understand them as forming a category and can perform categorical constructions. Thus it's important to understand how varacsets form a category in which we can find homomorphisms and compute (co)limits.

Let's figure out what the category of varacsets looks like more precisely. Given a schema $S$, we start with a typing map $T$, which gives the values of each attribute type. Our full varacset should augment this data with combinatorial data over $\mathsf{E}$ and a supply of variables over $\mathsf{A}$, plus the actual attribute functions. Furthermore, we want to end up in a category where the morphisms fix the constants of each attribute type but move variables freely. This defines a category well enough, but its nature isn't very clear.

In the original acsets paper, the authors proved that the category of acsets with attribute types determined by $T$ is equivalent to a slice category of copresheaves over a certain functor closely related to $T$. This characterization is very helpful in seeing that the category of acsets with fixed typing is friendly (indeed, a topos), but it's not computationally usable, because the functor being sliced over (a certain right Kan extension) is generally something badly infinite. But the most interesting computations we can do in AlgebraicJulia, such as acset homomorphism search, depend on looking at acsets whose values over $\mathsf{E}$ are finite sets. In any case, this trick can't be readily imported to our situation, because the analog of that functor would depend on the variables present in each particular acset and not only on the schema.

While the collaginess of the schema sure makes us want to think of an acset as being *over* something in some sense, this is largely an illusion. What's much more straightforward to formalize is a view of an acset as *under* a relative of $T$: namely, the functor $\hat T$ which adds empty values over $\mathsf{E}$ to extend $T$ to all of $\mathsf{Coll}(S)$.

Indeed, the *coslice* category $\hat T / [\mathsf{Coll}(S), \mathsf{Set}]$ is awfully close to the category of acsets. Specifically, if you restrict to the full subcategory of the coslice spanned by maps from $\hat T$ which are *invertible* over $\mathsf{A}$, then this is equivalent to the category of acsets as defined above. Indeed, maps in the coslice have to leave the image of $\hat T$, i.e. our attribute values, put, but are just any old natural transformations on the rest of the functor, i.e. our combinatorial data.

For varacsets, we don't want a coslice object which is *iso* over $\mathsf{A}$, though. Instead, since we want to be able to add variables to our attribute types, we'd better relax at least to a mono. And that's really all you have to do! The key observation is that *variables* in the attribute types behave precisely like combinatorial data: we can permute them around however we want without changing the meaning of an acset. This is basically the category we have implemented in AlgebraicJulia today: the full subcategory of the coslice spanned by monomorphisms.

The nicest definition of a category of varacsets with attribute types given by $T$, though, is simply the *entire coslice category* without the monicity restriction. One can imagine applications of the case where the coslice morphism is non-monic; for instance, such a varacset on the weighted graph schema would allow us to mix ordinary weighted graphs, with edges weighted in the real line, with "angled graphs", with edges weighted in the circle. In this case, the component of the coslice structure map at the weights object would be the canonical projection from the line to the circle.

Is this good? Maybe! In any case, the full coslice category has much better mathematical properties than the mono-coslice category. We will show in the next section precautions we can take if we wish to maintain the monicity restriction in practice.

We can also define *wandering variables* as elements over some attribute type that are not in the image of any attribute function. Varacsets with no wandering variables have some good properties; for example, only if $X$ has no wandering variables can we enumerate the morphisms out of $X$. (In database theory lingo, we could say that such an acset has all its variables in the *active domain*.)

To illustrate, when each weight variable in a weighted graph $X$ has an associated edge, the regular acset homomorphism search algorithm (which finds all compatible assignments of edges and vertices) will determine where the edge weights must be sent via the naturality condition associated with the weight attribute. However, a wandering variable's image is not determined by this constraint, and in principle it could map to *any* weight variable in the codomain as well as any concrete value. This means there would be an infinite number of morphisms. We'll see further below that the category of acsets without any wandering variables is better-behaved than you might expect.

The full coslice is complete and cocomplete, and more: it's not a topos anymore, but it *is* the category of models of a multi-sorted algebraic theory, which is pretty good.^{2} The monos-only category won't even be cocomplete, since for instance pushouts will screw up mono-ness. In contrast, the full coslice has colimits computed mostly as in $[\mathsf{Coll}(S), \mathsf{Set}]$, that is, levelwise; you just have to replace coproducts with pushouts under $\hat T$, and similarly for other colimits over non-connected diagrams.

In terms of varacsets, passing to the full coslice category allows us to consider something like the pushout of a span whose legs send a variable-weight edge to edges of two different constant weights.

In the pushout, we just won't be able to distinguish those two weights from each other anymore. If they don't *need* to be distinguished, then this is great! That said, we generally only expect to be computing colimits that don't require gluing together terms of the type of constants in an attribute, and so far we do not support any means of representing the resulting infinite sets with some elements marked as equivalent. So, with the caveat that we throw a runtime error in case concrete attribute values get identified, the colimits coming from the full coslice are doing exactly what we want.
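The gluing behavior on attribute values, including the runtime error, can be sketched with a tiny union-find. This is an illustrative Python sketch under our own conventions (variables are strings starting with `?`; everything else is a concrete constant), not AlgebraicJulia's implementation:

```python
def glue(pairs, values):
    """Identify attribute values along the given pairs, as in a pushout.
    Variables merge freely (with each other or with constants), but
    identifying two distinct concrete constants raises an error,
    mirroring the runtime check described in the post."""
    parent = {v: v for v in values}

    def find(v):  # union-find root lookup
        while parent[v] != v:
            v = parent[v]
        return v

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        a_var = isinstance(ra, str) and ra.startswith("?")
        b_var = isinstance(rb, str) and rb.startswith("?")
        if not a_var and not b_var:
            raise ValueError(f"cannot identify constants {ra} and {rb}")
        # point a variable at the other representative
        if a_var:
            parent[ra] = rb
        else:
            parent[rb] = ra
    return {v: find(v) for v in values}

glue([("?x", 3)], ["?x", 3])            # fine: the variable becomes 3
# glue([("?x", 3), ("?x", 4)], ["?x", 3, 4])  # raises: 3 and 4 identified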

Varacsets are an unusual category in which limits are trickier than colimits. To be sure, the limits in a coslice of $[\mathsf{Coll}(S), \mathsf{Set}]$ are computed exactly as in $[\mathsf{Coll}(S), \mathsf{Set}]$, that is, levelwise. However, these limits have some issues, semantically. For instance, suppose you take the *product* of two weighted graphs, with weights valued in $\mathbb{R}$. Then the product in the coslice is going to have its weights valued in $\mathbb{R} \times \mathbb{R}$, which is a whole different kind of situation!

While this is categorically-correct behavior, it’s not what we want in practice. Instead, we’d really like to make a construction that’s at least product-*like* that does not change the datatypes of attributes. It’s actually harder than you might guess to find a category in which to do this. The mono-coslice category we discussed earlier has the same products as the full coslice, and the need to allow for variables stops us thinking about the iso-coslice.

For further comparison, in the original slice model of acsets, the products are also a bit questionable: a product in a slice is a pullback in the base, so you end up getting a construction such that, for example, the “product” of two weighted graphs only gets those edges in the product of the underlying graphs that have the *exact same weight* under both projections! The user often won’t want to throw away so much information just to get an object projecting nicely onto the two given objects.

So the product in the coslice has the advantage of not throwing away information, but the disadvantage of changing the attribute types every time you take a product, while the product in the slice reverses these advantages and disadvantages. We’d like to at least have the option of stabilizing the situation, so that we can take products or do something similar in a category of acsets with *fixed* attribute types.

Given two varacsets $X$ and $Y$ on a schema $S$, the product-like thing you can build from them in AlgebraicJulia is constructed as follows, in elementary terms:

- Replace $X$ and $Y$ with their *abstractions* $\tilde X$ and $\tilde Y$: copresheaves on $\mathsf{Coll}(S)$ (or varacsets with empty typing functor) that move every constant attribute value occurring in $X$ and $Y$ to a new variable. (So if three distinct weights occur in $X$, then $\tilde X$ will have three new variable weights representing those values.)
- Take the product of $\tilde X$ and $\tilde Y$ over $\mathsf{E}$, and provide it with variables over $\mathsf{A}$ for every pair of attribute values in the two factors.
- Add the original $T$'s constants back into the attribute types in this product-ish object to get a varacset over $T$.

Let's call the resulting construction the "faux product." For instance, if $X$ and $Y$ are weighted graphs, then the faux product has underlying graph the product of the underlying graphs of $X$ and $Y$, with $\mathbb{R}$ as the type of constant weights, and variable weights on *every* pair of edges from $X$ and $Y$. These variables coincide if and only if the weights of their two edges coincide.
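Restricted to edge weights alone, the faux product is easy to sketch. Below is an illustrative Python sketch under our own encoding (edges as `(name, weight)` pairs, variables as `?`-prefixed strings); it is not AlgebraicJulia's implementation:

```python
from itertools import product

def faux_product_weights(xs, ys):
    """Sketch of the faux product on edge weights: every pair of edges
    gets a *variable* weight, and two pairs share the same variable
    iff their underlying pairs of concrete weights coincide."""
    var_of = {}  # (weight in X, weight in Y) -> variable id
    out = []
    for (e1, w1), (e2, w2) in product(xs, ys):
        v = var_of.setdefault((w1, w2), f"?v{len(var_of)}")
        out.append(((e1, e2), v))
    return out

X = [("e1", 2.0), ("e2", 2.0)]
Y = [("f1", 7.0)]
faux_product_weights(X, Y)
# both edge pairs carry the SAME variable, since both weight pairs are (2.0, 7.0)
```

The user is then free to *evaluate* each variable afterwards, e.g. by summing or multiplying the two underlying weights stored in `var_of`.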

We can extend this to limits more generally, such as the following example of stratification, which more closely matches our geometric intuitions for how products of graphs ought to work.

This faux product gets us most of what we wanted. While it's odd to have turned all the weights into variables, the user can *evaluate* the weights in the faux product in any way they like after the construction; for instance, by taking the sum or product of the weights from the two factors, or even by the tautological evaluation to a pair in $\mathbb{R} \times \mathbb{R}$. Since what evaluation, if any, will be appropriate depends on the application at hand, this is a good place for the general code to leave off.

It turns out that the faux product has a pretty pleasing categorical description. In general, let $i : \mathsf{D} \to \mathsf{C}$ exhibit a coreflective subcategory, that is, $i$ is fully faithful and left adjoint to a coreflector $r : \mathsf{C} \to \mathsf{D}$. If $\mathsf{C}$ has finite products, then there's always an induced monoidal structure on $\mathsf{D}$ given by $x \otimes y = r(i(x) \times i(y))$, with $r(1)$ as the unit. That is, take the product, and then hit it with the idempotent comonad $i \circ r$ associated to the coreflective subcategory.^{3}

In our case, the construction above can be modeled by defining the *abstraction functor* that sends a varacset $X$ on schema $S$ with typing $T$ to the copresheaf $\tilde X$ on $\mathsf{Coll}(S)$ that coincides with $X$ over $\mathsf{E}$ and that, over an attribute type, has one combinatorial piece of data for each value of that attribute achieved by some element of $X$.

Note that $\tilde X$ always lacks *wandering variables*. And in fact, while abstraction is not a right adjoint when valued in all copresheaves, it *is* a right adjoint when we take the codomain to be the full subcategory spanned by those copresheaves with no wandering variables. (Limits in this subcategory take limits in $[\mathsf{Coll}(S), \mathsf{Set}]$ and then throw away wandering variables.) The left adjoint to abstraction simply sends a copresheaf to the varacset with the same data, equipped with the typing $T$. This is fully faithful (but only because the copresheaf lacks wandering variables!) and so we're in the situation of the previous paragraph.

That is, to summarize, we have an idempotent comonad on the category of varacsets on $S$ with typing $T$, sending a varacset to one with the same set of constants as $T$ at every attribute type, but with all the attribute *values* drawn from $T$ replaced with new variables. Then the faux product defined above is precisely this comonad applied to the levelwise product.^{4}

Varacsets, originally born out of practical engineering demands for rewriting with attributes, are put on much more stable footing by working out the above theoretical considerations. Just as acsets were not implemented as a slice category, varacsets demonstrate that the right mathematical model doesn't have to be isomorphic to your implementation: understanding varacsets as a coslice (despite implementing them as a database) is helpful for informing us which operations do or do not make sense to perform on varacsets.

Although we focused on rewriting rules in this post, there are lots of other ways to apply attribute variables, e.g. when modeling systems in which some attribute data is unknown. This allows us to $\Sigma$-migrate data with attributes (via a left Kan extension), as variables give us a universal way to assign attribute values to new data. We're excited to push varacsets in other directions like this in the future!

Note that the length of an edge must equal the Euclidean distance between its source and target to give a semantically valid instance of this schema. This can only be handled perfectly via a Cartesian schema, so we’ll ignore the constraints for now.↩︎

I really just mean to say the forgetful functor is monadic, which is easy to check from the monadicity theorem. But you can explicitly realize such a theory by augmenting $\mathsf{Coll}(S)$, seen as a theory in the sad logic of a plain category (so, only unary operations), with a terminal object and maps from it giving nullary operations for every constant attribute value.↩︎

That this is a monoidal structure follows from the natural isomorphism ↩︎

Limits of other shapes can be constructed in an entirely analogous way: abstract the diagram, take the limit in and then apply the adjoint of abstraction to get your typing functor back.↩︎


We begin this post by talking about open dynamical systems. Open dynamical systems have been studied extensively within applied category theory (see Myers (2022)), and have also been a part of AlgebraicJulia for a while, via AlgebraicDynamics and its accompanying blog post.

In this post, I sketch out an approach to doing open dynamical systems *symbolically*. This means that the vector fields for open dynamical systems are given by symbolic expressions, rather than arbitrary Julia functions.

This has the following advantages.

- Expressions can be serialized and stored in a database.
- Expressions can be compared for equality modulo the laws of a theory.
- The execution of expressions can be optimized in different ways, or translated into other formats.
- Expressions can have metadata attached.
- Expressions can be input from non-Julia programs.
- Expressions can be displayed in a nice format to the user.

In my previous blog post, I lay out the mathematics behind symbolic representation of functions between spaces. Here I now apply that theory to the specific case of open dynamical systems.

We start out by reviewing open dynamical systems.

In this section we use the word “space” without defining what a space is. This is because, in a certain sense, we are agnostic as to the exact definition of space. All we really care about is that the space supports some notion of derivative, since we are doing continuous-time dynamical systems. In order of increasing generality, a space could mean:

- for some
- open subsets of
- a manifold
- a ring
- an object of any tangent bundle category

Pick whichever level of generality you are comfortable with, and mentally replace every time I say “space” with that choice. I’ll generally give examples with ; if you know about the more general types of spaces then you should know enough to generalize what I am saying.

The important thing is that any space supports a notion of *tangent space at a point *, denoted by , and *tangent bundle* denoted by . The tangent bundle consists of all of the tangent spaces put together, i.e. as sets

If , then and .

A **vector field** on a space consists of a **section of the tangent bundle** , i.e. a function such that for all . This gives a tangent vector at each point; you can visualize this by searching “vector field” on Google Images.

Finally, I will behave like a physicist in that I will not overburden myself with saying precisely how differentiable all my functions are. If a function needs to be differentiable, then you can assume that it is.

With these preliminaries out of the way, we can start talking about dynamical systems.

**Definition 1** A **closed dynamical system** consists of:

- A space , called the
**state space** - A vector field , with for all .

We often refer to as the “dynamics” of the system. Applied mathematicians and physicists might be more used to seeing such a system written using the equation

However, as a mathematical object, the “data” of this equation is simply the function .
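To make this concrete, here is a minimal Julia sketch of a closed dynamical system on a Euclidean state space, together with an explicit-Euler simulator. The names `ClosedSystem` and `euler` are my own for illustration, not part of AlgebraicDynamics.

```
# A closed dynamical system: just a vector field v on the state space,
# here represented as a function Rⁿ -> Rⁿ.
struct ClosedSystem
    v::Function
end

# Approximate the flow of the vector field with explicit Euler steps.
function euler(sys::ClosedSystem, x0::Vector{Float64}, dt::Float64, steps::Int)
    x = copy(x0)
    for _ in 1:steps
        x .+= dt .* sys.v(x)
    end
    x
end

# Exponential decay dx/dt = -x: the state shrinks toward zero.
decay = ClosedSystem(x -> -x)
euler(decay, [1.0], 0.01, 1000)  # a value close to zero (≈ exp(-10))
```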

**Example 1** Given any Petri net (where is the set of species and is the set of transitions), with fixed positive rates , there is a closed dynamical system , where

- is given by the mass action formula

In the real world, systems are rarely closed; other parts of the world have influence on the system. Classically, we might capture this with the equation

where is some other variable. However, this only captures part of “openness”; this system might also affect other systems, or we may only observe some part of a system. So we also have an “output” equation

Our system might affect other systems through .

We now state this more formally.

**Definition 2** An **open dynamical system** consists of three spaces and two functions. The spaces are:

- A
**state space** - An
**input space** - An
**output space**

The functions are:

- , such that .

**Example 2** Given a Petri net , there is an open system with

and

- given by mass action with rate parameters
- given by the identity

**Example 3** An open dynamical system with , is a closed dynamical system.
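As a rough Julia sketch (again with illustrative names, not the AlgebraicDynamics API), an open dynamical system pairs a vector field `update`, which depends on an input, with a `readout` map:

```
# An open dynamical system: update(x, u) is the vector field at state x
# given input u, and readout(x) is the observed output.
struct OpenSystem
    update::Function
    readout::Function
end

# Drive the system with an input signal u(t) via explicit Euler,
# then observe the final state.
function simulate(sys::OpenSystem, x0, u, dt, steps)
    x = copy(x0)
    for i in 1:steps
        x .+= dt .* sys.update(x, u((i - 1) * dt))
    end
    sys.readout(x)
end

# A leaky integrator dx/dt = u - x whose output is its state:
# under a constant input, it settles near that input value.
leaky = OpenSystem((x, u) -> u .- x, x -> x)
simulate(leaky, [0.0], t -> 1.0, 0.01, 2000)
</imports>
```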

In this section, we learn how to compose dynamical systems with a construction from category theory called *lenses*. If you have learned about lenses before from functional programming, you may have the preconceived notion that lenses are a way of accessing nested fields in a data structure. However, here we are using lenses in a very different way; it might be useful to just treat “lens” as a new word. The two uses of lenses are formally the same, but have a different feel.

Lenses are morphisms in a certain category, so before we can define lenses we have to define the objects in that category.

**Definition 3** An **arena** consists of a space and a **bundle** over it, which is a space and a map .^{1}

We think about an arena as a generalized specification of the inputs/outputs of a system. At each output , there are allowed inputs . We write an arena as . In a special case, the “output” of a system is its state, and the “input” is the direction that you tell it to go in, i.e. a tangent vector.

**Example 4** Given any space , there is an arena , where sends to (recall that ).

**Example 5** Given any two spaces and , there is an arena , where is the projection . This is known as a **simple arena**, because the inputs don’t depend on the outputs at all; they are the same everywhere. A lens between two simple arenas is known as a **simple lens**.

Lenses are then a way of mapping *outputs forward* and *inputs backward*.

**Definition 4** Suppose that and are arenas. Then a **lens** between them consists of a function , and then for every , a function . We visualize this as

**Example 6** An open dynamical system is a lens of the form

It takes a bit of unpacking here to see exactly why this is true. The forwards map is the same, but the backwards map is in a slightly different form. Namely, if we unpack the definition of a lens, then we have for every , . This looks different from our earlier definition of open dynamical system, but we recover that earlier definition when we recall that , because is a trivial bundle. Thus, for every we have a map . We can then rearrange the parameters to get , with the additional condition that , which is precisely what we said an open dynamical system was!

There are two ways of composing lenses: composing in parallel and in series. We start with composition in parallel, because that has a more immediate application in terms of open dynamical systems.

**Definition 5** Given two arenas and (with projections and ), there is an arena defined by

where the map is defined by .

**Definition 6** Given two lenses

their **parallel composite** is given by the following lens.

The functions and have types

and are defined via

**Example 7** We can use parallel composites to take two open dynamical systems and “run them in parallel”. That is, suppose that we have open dynamical systems

Then we can make an open dynamical system

This corresponds to the ODE consisting of two equations:

and output , . This may seem like a “trivial” operation, but the first step to making two open dynamical systems interact is to produce this parallel composite; we then use serial composition to make the systems interact.

Serial composition of lenses, roughly speaking, “just composes the backwards and forwards maps”. But let’s spell that out in more detail.

**Definition 7** Suppose that we have the following setup of arenas and lenses:

We can compose them to form

On the bottom, the function is just the composite of and . Then on top, we have

**Example 8** Let , , , be spaces, and suppose that we have two open dynamical systems

We can then make a lens

where sends to the single element , and sends the single element to .

When we compose and in parallel, and then compose in series with , we get

This is a closed dynamical system, which can be written as a pair of coupled ODEs in the following way:

The point is that after composing in parallel, composing with further lenses can couple two systems together.
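The lens operations above are easy to prototype. Here is an informal Julia sketch of lenses with parallel and serial composition, used to couple two one-dimensional systems in the style of Example 8 (all names here are my own, not from an AlgebraicJulia package):

```
# A lens: a forward map on outputs, and a backward map that carries
# inputs back, possibly depending on the current output-side point.
struct Lens
    get::Function
    put::Function
end

# Parallel composite (Definition 6): act componentwise on pairs.
parallel(l1::Lens, l2::Lens) = Lens(
    a -> (l1.get(a[1]), l2.get(a[2])),
    (a, b) -> (l1.put(a[1], b[1]), l2.put(a[2], b[2])),
)

# Serial composite (Definition 7): compose the forward maps, and thread
# the backward maps through the intermediate forward value.
serial(l1::Lens, l2::Lens) = Lens(
    a -> l2.get(l1.get(a)),
    (a, c) -> l1.put(a, l2.put(l1.get(a), c)),
)

# Two systems dx/dt = u - x (get = readout, put = vector field), coupled
# by a lens that feeds each system's output to the other's input.
s1 = Lens(x -> x, (x, u) -> u - x)
s2 = Lens(x -> x, (x, u) -> u - x)
couple = Lens(y -> nothing, (y, _) -> (y[2], y[1]))
closed = serial(parallel(s1, s2), couple)
closed.put((1.0, 3.0), nothing)  # the coupled vector field: (2.0, -2.0)
```

The composite lens has trivial output and no remaining input, which is exactly the shape of a closed dynamical system.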

In the first blog post in this series, we learned how to make a “symbolic category of spaces” via algebraic theories, and the mantra “algebra is dual to geometry”.

We are now going to use this to build symbolic models of dynamical systems. Essentially, this boils down to “do the constructions of the above section starting from a symbolic category of spaces”, but there’s some elaboration that should take place.

Let’s start by taking the simplest algebraic theory: the theory with one type and no operations. The category of finitely presented algebras of this theory is just . So then the question becomes, what are lenses in ?

For the sake of brevity, we will only cover simple lenses. This will suffice to do dynamical systems with basic state spaces, because .

A simple arena is something of the form , with the injection map . This is because the products and projections of arenas become coproducts and injections when we dualize.

Then, when we dualize our recipe for a simple lens, we get that a simple lens in from to consists of a function along with a function .

Such a lens is also known as a **wiring diagram with one box**.

In this wiring diagram:

- is the set of output ports for the inner box
- is the set of input ports for the inner box
- is the set of output ports for the outer box
- is the set of input ports for the outer box

We can see that each element of (there is only one in this case) is assigned an element in , and each element of is assigned an element in .

A model of this algebraic theory is just a set. Suppose that is such a model, i.e. a set. Then from a lens in ,

where , , we get a lens in

where , and are given by precomposition.

Now suppose that we work in a richer algebraic theory. For example, suppose that we work with the theory of -algebras. For simplicity, let’s just consider free -algebras. The category of finitely generated free -algebras has as objects finite sets, and a morphism from to consists of a polynomial for each .

A lens in the dual category to this from to consists of a polynomial for every , and a polynomial for every . You can think of as a description of “how to compute” the variable , given values for all the variables in , i.e. it’s a dual map . Likewise is a description of “how to compute” the variable , given values for the variables in and the variables in , i.e. it’s a dual map .

**Example 9** The lens for mass-action semantics corresponding to a Petri net can be written in such a form, because the ODEs corresponding to mass-action semantics have polynomial right hand sides.

Just like with wiring diagrams, if we take any model of the theory of -algebras, for instance , then we can turn any lens in the opposite category of free -algebras into an actual geometric lens, by interpreting a map as a map to get:

It is no great stretch from here to consider similar constructions where we allow more operations than just multiplication, addition, and scalar multiplication. For instance, we could also allow . The most extreme extension of this is to allow any smooth function as an operation; models of such a theory are known as -rings, and are very nice from a mathematical standpoint (though completely impractical to represent on a computer).

The complications start to come when we consider

- Non-free finitely presented algebras, such as , which allow us to look at spaces that are not just Euclidean space
- Dependent lenses, where the map in the arena is not just a projection. This allows us to look at non-trivial vector bundles.

This delves farther into algebraic geometry than we wish to go in this blog post, but the interested reader is encouraged to look into the subject themselves.

One thing I want to emphasize in this post is that there were very few “choices” to make. It was some work to make this all fit together and work out the details, but starting from the premise of “I want to do symbolic dynamical systems” and knowing the two slogans

- algebra is dual to geometry
- open dynamical systems are lenses

was sufficient to come up with this. This is one of the things I like about category theory; sometimes it really can feel like discovery not invention, because once you know what you are looking for, there’s often a canonical way of doing things.

I’m excited to be working on implementing this, and I hope to talk about that in a future post where I will talk about Gatlab, a rewrite of the core of Catlab to be more “morphism-first” with respect to computer algebra.

Myers, David Jaz. 2022. *Categorical Systems Theory*. https://github.com/DavidJaz/DynamicalSystemsBook.

There is also an additional technical requirement on arenas, which is that the bundle is **locally trivial**. This requirement will not be relevant for the level of detail we work in here, so we will not go into precisely what this means.↩︎


A traditional approach to algebraic geometry requires a course in commutative algebra, a great deal of patience for abstract nonsense, and then an innate love for elliptic curves. However, once one has scaled these imposing walls of what seems like pure math for its own sake, one finds a subject which has a great deal of algorithmic and philosophical merits.

Algebraic geometry is both the seed of the revolution in category theory developed by Grothendieck, and also is at the core of the algorithms in computer algebra that power systems like Mathematica, SAGE, or Macaulay2.

In this post, I aim to “pull back the curtain” on some of the things behind this wall of commutative algebra, and show why it is worth studying even if you don’t really care about elliptic curves or toric varieties or any of the other strange creatures that typically pull people into algebraic geometry from pure math. I will instead motivate the development of algebraic geometry from the perspective of someone trying to develop a symbolic algebra system.

There is also a point to all of this apart from pedagogy; this is a warmup for some new ideas for which a background in algebraic geometry is needed.

Computer algebra as generally construed is centered around the manipulation of syntactic expressions that are intended to represent mathematical formulas.

We might represent such an expression with a *tree*, where each node is either a function symbol, a variable, or a constant. For instance, the expression would be represented as:

In a lisp, we would write down this tree as

`(+ (* a a) (* 4 b))`

In this notation, each matched pair of parentheses denotes a subtree, where the value at the root of the subtree is the first thing in the parentheses, called the **head**, and the rest of the entries denote the subtrees attached to that root, called the **arguments**. We also allow numbers and symbols as arguments, which are the leaf nodes of the tree. We can express this in Julia with the following data structure.

```
struct SymExpr
    head::Any
    args::Vector{Union{SymExpr, Symbol}}

    function SymExpr(head, args=SymExpr[])
        new(head, args)
    end
end

ex = SymExpr(:+, [SymExpr(:*, [:a, :a]), SymExpr(:*, [SymExpr(4), :b])])
```

`SymExpr(:+, Union{Symbol, SymExpr}[SymExpr(:*, Union{Symbol, SymExpr}[:a, :a]), SymExpr(:*, Union{Symbol, SymExpr}[SymExpr(4, Union{Symbol, SymExpr}[]), :b])])`

Disclaimer: the code in this post will be optimized for brevity and clarity, not necessarily maintainability or usability. If you want to use the techniques in this post, there is a substantial amount of work to get something production-quality, including but not limited to parsing a nicer syntax and error-checking all computations.

Now, there are a great number of meaningless expressions that one can write down using the previous data structure. Generally, the first thing one would want to know about an expression is “does this make sense”? Of course, sense-making is relative. So the real question is “does this expression make sense relative to a certain signature”, and in order to answer that, we must define what a signature is. The rest of this section is based on a field of math called “universal algebra”, a good reference for which is Goguen (2021).

A (single-sorted) **algebraic signature** consists of a set , along with a function . We call the elements of **function symbols**.

The signature of **rings** consists of , with , , , .

We say that an expression is well-formed with respect to the signature if the head of is , has arguments, and all of those arguments are also well-formed, or if is just a symbol, representing a variable. The tree in Figure 1 is *not* well-formed with respect to because there are numbers in it, however the left subtree is.

In order to deal with numbers, we add a nullary function symbol for every number.

The signature of **-algebras** consists of , and for , for .

With this signature, Figure 1 is well-formed.
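Using the `SymExpr` type from above, checking well-formedness against a signature is a short recursion. Here a signature is represented simply as a dictionary from function symbols to arities, and, following the convention of this second signature, bare numbers are accepted as nullary constants; this is an informal sketch, not library code.

```
# Symbols are variables, hence always well-formed; numbers are treated
# as nullary function symbols; otherwise the head must be in the
# signature with the right arity, and the arguments must be well-formed.
well_formed(v::Symbol, sig::Dict) = true
function well_formed(e::SymExpr, sig::Dict)
    if e.head isa Number
        isempty(e.args)
    else
        haskey(sig, e.head) &&
            length(e.args) == sig[e.head] &&
            all(a -> well_formed(a, sig), e.args)
    end
end

sig = Dict{Any,Int}(:+ => 2, :* => 2, :- => 1)
well_formed(ex, sig)  # true: a*a + 4b uses only +, *, and a constant
```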

One of the most natural things that one might want to do with an expression is *evaluate it*. However, we have allowed variables in our expressions; what does it mean to evaluate a variable? The answer is that we must talk about evaluating an expression in a **context**, which is an assignment of variables to values. But what is a value? One definition for a value would simply be a real number. But we can be more general than that.

If is a signature, a **model** of consists of a set along with a function for every .

We can evaluate an expression in any model, if we have an assignment of the variables in that expression to values in that model.

For a fixed , the set of real matrices is a model of the signature of -algebras, where is interpreted as matrix addition, is interpreted as matrix multiplication, and is interpreted as the matrix where is the identity matrix.

Evaluation is a recursive function, which is most naturally expressed with code.

```
using LinearAlgebra

abstract type Model{T} end

# Whenever we have a Model, we assume that we have a function of type
# interpret(m::Model{T}, f, args::Vector{T})::T

# evaluating a symbol just looks it up in the context
function evaluate(v::Symbol, ::Model{T}, ctx::Dict{Symbol, T}) where {T}
    ctx[v]
end

# evaluating an expression first evaluates the arguments, and then applies the
# interpretation of the head to those arguments
function evaluate(e::SymExpr, m::Model{T}, ctx::Dict{Symbol, T}) where {T}
    args = map(arg -> evaluate(arg, m, ctx), e.args)
    interpret(m, e.head, args)
end

struct MatrixModel <: Model{Matrix{Float64}}
    n::Int
end

# Here is the implementation of interpret for MatrixModel;
# a number f is interpreted as f times the n×n identity matrix
function interpret(m::MatrixModel, f, args)
    if typeof(f) <: Number
        f * Matrix{Float64}(I, m.n, m.n)
    elseif f == :+
        args[1] + args[2]
    elseif f == :*
        args[1] * args[2]
    end
end

evaluate(ex, MatrixModel(2), Dict(:a => [2. 0; 1 1], :b => [0 1.; 1. 0]))
```

```
2×2 Matrix{Float64}:
 4.0  4.0
 7.0  1.0
```

We are now going to embark onto some category theory. But if you feel lost, know that the material that we are going to cover in the next sections is all just elaborations of the above 20 lines of code. If you understand what that code is doing, then just stare down the category theory until you can see that it’s doing the exact same thing.

We can get evaluation of expressions “for free” once we set up some categorical technology. We start by defining a category of models for a signature.

Given a signature , there is a category whose objects are models of and whose morphisms are functions such that for each with and each ,

There is a functor that sends a model to its underlying set. This functor has a left adjoint that sends a set to the model of , where an element of is a well-formed expression with respect to with variables only from the set .

The adjointness condition is that for any model , maps in correspond bijectively to morphisms in . A map is what we called a context before; it assigns elements of to values in . Then the map simply sends an expression with free variables in to its evaluation with context .

Thus, evaluation is given by the adjoint transpose!

Now, whenever you have an adjunction there is a monad and comonad associated with it. In this case, we care about the monad, which is given by . This monad takes a set and returns the set of well-formed expressions .

If is a monad on a category , then a **monad algebra**^{1} of consists of an element and a morphism , such that the following diagrams commute

The algebras of form a category , where a morphism from to is a map such that the following commutes.

Whenever you have a monad coming from an adjunction , there is a functor from to which sends to the algebra , where is the map given by applying the adjoint transpose to the identity .

In the case of our monad , this says that for any model there is a function . That is, we can take an expression where the “variables” are elements of , and evaluate that expression directly.

This gives another way of evaluating an expression with variables. We can take an expression , then use functorality of to map across it and get an expression , and then evaluate this to get an element of .

If we have an arbitrary -algebra, then we can use a similar trick to evaluate expressions in a context as well. This implies that -algebras are like models of , and in fact this is exactly right! The functor from to is an equivalence. being a model of is equivalent to being able to evaluate well-formed expressions with respect to in . This is because the monad algebra laws ensure that the algebra is determined solely by its action on the expressions which are non-recursive, i.e. which consist of a function symbol applied to values.

It turns out that there are general conditions under which this is true; that for an adjunction , is equivalent to the category of algebras; this is known as a monadicity theorem.

We’re now going to move on, and return to the computer algebra story. But the main takeaway that you should get from this section is that “models of a theory” and “algebras of a monad” are intimately connected; “models of a theory” tell you how to evaluate the function symbols, and “algebras of a monad” tell you how to evaluate arbitrary expressions, but the monad laws say that this evaluation is “generated” by just looking at the function symbols.

Now that we can evaluate expressions, the next thing one might want to do is construct functions using these expressions.

Suppose that one has an expression in signature with free variables contained in the set , and that is a model of . Then given a context , or in other words, an element of , we can evaluate with context to get an element of .

Thus, there is a function from to for any model . We can therefore think of an expression in variables as a “symbolic -ary” operation.

But what if we want to consider functions ? This is the same as many maps . Thus, we need an expression in for every element of , i.e. a function . It’s counterintuitive that the morphism is going in the opposite direction, but it comes directly from how we might write down such a function. I.e., if , then

Thus, the *dual* of the Kleisli category of is the category of multivariate symbolic functions for a signature . Let’s unpack that. The Kleisli category of has as objects sets, and a morphism from to is a function . So the dual has as objects sets, and a morphism from to is a function . By what we showed earlier, for any model of , there is a functor from to , which sends to the map given by evaluating each expression for each with context .

Composition in this Kleisli category corresponds to *substitution*. I.e., if

and

then the composite is defined by

So, for instance, if is the signature of -algebras, then there is a functor from to , which sends to . This turns symbolic functions into real functions.
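Substitution is easy to write down for the `SymExpr` type from the first section (an informal sketch; in particular the name `substitute` is mine):

```
# Substitute expressions for variables. Variables not in the mapping
# are left alone; function symbols are untouched. This is exactly
# composition of symbolic maps, i.e. Kleisli composition.
substitute(v::Symbol, sub::Dict) = get(sub, v, v)
substitute(e::SymExpr, sub::Dict) =
    SymExpr(e.head, [substitute(a, sub) for a in e.args])

# Plugging a ↦ a + b into a*a + 4b yields (a+b)*(a+b) + 4b.
substitute(ex, Dict(:a => SymExpr(:+, [:a, :b])))
```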

What are the advantages of working with symbolic functions instead of regular functions? Well, for one, they make it possible in the first place to work with functions at all! Ultimately, we are never working with “real functions” on a computer; we’re always working implicitly with symbolic functions.

Explicitly working with symbolic functions has other benefits however. Classically, one benefit is that we can compute the derivative of symbolic functions easily. Note that we can do this not just with polynomials; we can throw in function symbols , , into our signature, and do all the exact same tricks as we’ve developed up until now! It might annoy algebraists, but who cares.

But the main benefit is that we might be able to find simpler representations of our functions, “canceling out” a large chunk of unnecessary computation so that our functions run faster in less memory.

In order to talk about this, however, we need to add *laws* to the picture.

An **algebraic theory** consists of a signature along with a collection of tuples which we call **laws**.

The theory of monoids has a signature with and , along with laws

A **model** of a theory consists of a model of the signature , such that for all , for all , the evaluation of with context is the same as the evaluation of with context .

Other examples of algebraic theories include the algebraic theories of groups and the algebraic theory of rings. Additionally, for any ring , there is an algebraic theory of rings along with maps , which is given by adding a nullary function symbol for every element of , along with appropriate laws about how those elements add and multiply with each other.

A similar trick can be done to get an algebraic theory of *modules* over a ring ; take the theory of abelian groups, and then add a unary function symbol for every element of representing scalar multiplication by that element. If happens to be a field, then we get an algebraic theory of vector spaces. Note that there is no algebraic theory of fields, because multiplicative inverse is not defined on zero; but there is an algebraic theory of vector spaces, because we don’t need to worry about universally quantifying division; it is just implicit in how the unary function symbols interact.

The exact same story that we told for signatures can be told for theories. I.e., there is an adjunction between the category of models for a theory and , and this adjunction induces a monad on . The left adjoint sends a set to the *free model* on that set. This is given by taking the model of the signature given by the well-formed expressions with variables taken from , and then essentially just quotienting out until the laws hold. This models our ability to “rewrite” terms in a theory; the term can be rewritten to . We might prefer the first one if we are trying to minimize multiplications, and the second one if we are trying to “normalize”.

In the case of -algebras, the free functor sends a set to a -algebra which is easy to describe; it sends it to the -algebra of polynomials with variables taken from and coefficients in .

The canonical -algebra is just . Thus, we can think of a map as a “symbolic” map from to . In this particular case, it is just a polynomial map, but I say symbolic to emphasize that in the more general case when we aren’t talking about -algebras, we think about this as a symbolic map.

But by the adjunction, a map is the same as a map . We now arrive at the punchline of this section: maps of -algebras from to can be thought of as symbolic maps from to .

Thus, the *dual* of the category of free -algebras can be understood as the category of multivariate symbolic functions. We can see that this result should be interpreted more generally; the dual of the category of free models of an algebraic theory can be understood as a category of multivariate symbolic functions.

But why stop at free models? What about more general models? How might we interpret the dual of the category of algebraic theories as a category of “spaces and symbolic functions between them”? That is the subject of the next section.

In this section, we discuss symbolic functions between spaces which are more interesting than just . For the sake of concreteness, we will continue to work with -algebras, because these are more traditional than other algebraic theories, but a similar story could be told with other algebraic theories.

This is more traditional algebraic geometry material, and thus there are a wealth of references which cover similar material. For a classical algebraic geometry approach, the reader can refer to Eisenbud and Harris (2000), and a more modern approach relying more heavily on category theory can be found in Vakil (2017).

The zero set of a function is the set

The 2-sphere is the zero set of the function given by

As a historical note, the definition of manifold in differential geometry used to be precisely zero sets of smooth functions.

Categorically speaking, a zero set of a function is given as the *equalizer* of the maps

Recall that the equalizer is the universal object with a map such that . If we are working in a good category of spaces^{2}, this is given precisely by the zero set of , with appropriate structure.

Now, let’s compute this equalizer in our category of multivariate symbolic functions. A symbolic function is a -algebra homomorphism . We want to compute the “equalizer” of that function with the zero function (quiz: what is the zero function symbolically?). Because algebra is dual to geometry, this turns into computing the coequalizer in the category of free -algebras. But this category doesn’t have all coequalizers! Fortunately, there’s a convenient category hanging around that *does* have coequalizers: the category of all -algebras.

The coequalizer of with in the category of all -algebras is the quotient

This is the set of polynomials in , “modded out” by the equivalence relation that if there exist polynomials such that

Why does this make sense? Recall that is an expression in . When we assign values to the such that , then in fact, evaluates to . So if we are staying in the zero set of , it makes sense to rewrite to .

Consider the 2-sphere again. Following our construction, the -algebra that we would assign to the 2-sphere is . In this -algebra, the polynomial is equal to . This makes sense, because as functions on the 2-sphere, always has the same value as , just like always has the same value as . So modding out by is basically stating the assumption that we are living on the zero-set of .

In general, models of algebraic theories can always be expressed as quotients of free models. So this gives us an interpretation of the dual of the category of all models; it represents subspaces of product spaces, and symbolic functions between them.

We will finish with an observation. In a category with a terminal object , we think of maps as points of . In the category of -algebras, there is an *initial object*, namely . So for a general -algebra , we might think of maps as “points” of .

In the case of , a map is generated by where are sent. So such a map is precisely an element of . If we instead take , then a map is again generated by where are sent. However, we also need to send to . So such a map is precisely an element , where . That is, it is an element of the zero set of . So the “points” of the “symbolic space” are just the points of the actual space!

The contravariant functor from the category of -algebras to is called ; it takes an -algebra to the “set of points” in the space that the -algebra represents. This functor is contravariant because (once more, say it with me) algebra is dual to geometry!


We defined the signature of -algebras before. It turns out that these two uses of the word “algebra” are actually compatible in a certain sense, but we will not get into exactly why right now.↩︎

For reasons we won’t go into here, the category of manifolds is *not* good.↩︎


*This post is cross-posted at the Topos Institute blog.*

“Engineers are not the only professional designers. Everyone [or thing] designs who devises courses of action aimed at changing existing situations into preferred ones.” – Herbert Simon (Simon 1988)

It’s breakfast time! You wake up and walk to your kitchen and notice a loaf of bread, a knife, a raw egg (in its shell), a skillet, and a stove burner sitting on the counter. You’re hungry and your preferred state of existence is to, instead, have an egg sandwich sitting on your counter. You are saddened by the situation, but feel empowered to change it! You compare what you have and what you want, recall what cooking skills you have, and devise the following steps:

- Slice the bread twice with a knife
- Put the slices of bread on a plate
- Put the skillet on the stove burner
- Crack the egg on the skillet
- Wait until the egg is cooked
- Put the egg on a slice of bread
- Close the bread

In this example, and in all planning problems, you can notice three conceptual notions: a plan, a planner, and a planning problem. The plan is the sequence of steps you devised. The planner is your cognitive reasoning activity. And the planning problem is the comparison of the states and the knowledge of skills you have. Planning is fairly automatic for most everyday tasks, so much so that we rarely think about these distinctions or even acknowledge that we’re doing any planning. However, if we wish to transition this activity to a computer, making all three concepts computable is necessary.

Automated planning is the domain of artificial intelligence (AI) aimed at identifying a *sequence of actions*, or a *plan*, that changes the current state of the world to a preferred state, namely one that meets some goal criteria. A planner takes a planning problem and produces a plan. One choice, the most common, is to construct a language syntax that can accommodate the semantics of actions, action requirements, and action effects. It’s then up to the planner to devise its own syntax and semantics for how to interpret and manage this information and present a plan. For example, in practice, architectures that involve planning usually call on a PDDL planner (Ghallab et al. 1998) as an external service. The way they manage and update data about the world is handled independently by either translating the plan steps into database updates, or (more likely) re-sampling the world during or after the plan is executed. Evidently, having differing languages could result in conflicts and reduced interoperability between planners, databases, and plan consumers.

Another choice would be to define a common abstraction, modeling both syntax and semantics, for the planner, the planning problem, and the plan, to reduce friction between representations. In this blog post, I will explain how a category-theoretic method called double-pushout (DPO) rewriting in the category of copresheaves can operationalize these concepts.

Rewrite rules are the atomic operations that translate data from one state to another. A rewrite rule contains three parts: an **input**, a **keep**, and an **output**. The **input** portion describes the types of things and relationships that are required before the rule can be applied to a world state. The **output** portion describes the types of things and relationships that exist in the world after the rule has been applied. The **keep** portion describes the types of things that remain consistent between the input and the output. For example, if we want to design a rule about slicing a piece of bread, we might want an input to contain a loaf of bread and a knife. At the end of this action, what we would want is a loaf of bread, a slice of bread, and a knife. We also want to say that the loaf of bread and knife we end up with is the *same* loaf of bread and knife we started with.

The terms “input” and “output” are intentionally reminiscent of pre-conditions and post-conditions/effects in the traditional planning literature. However, the term “keep” is a novel concept that was introduced to track entities that persist between the two states. “Why,” you might wonder, “can’t I just construct a map directly from the input to the output?” Interestingly, the **keep** portion gives us useful information about what elements have *permission* to disappear. If I had an element in the input that did not appear in the output, a map directly from the input to the output would force me to assign that element to something (assuming our maps are total) which would not be conceptually accurate. However, if it did not exist in my **keep**, then I would not need to account for it in my output state and it would be free to disappear. In automated planning, the frame problem is concerned with how to axiomatically account for information that remains unchanged. While this method does not exactly provide a set of axioms, it does provide a mechanism to declare what things remain unchanged when a rule is applied.
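As a deliberately simplified sketch of this anatomy, the snippet below models a world as a multiset of entity types (a Dict of counts) rather than a full categorical structure, so relationships are ignored; the `Rule` struct and all rule and entity names are invented for illustration, not part of any framework:

```julia
# Toy model: a world state is a multiset (Dict of counts) of entity types.
const MSet = Dict{Symbol,Int}

struct Rule
    input::MSet   # required before the rule can fire (pre-conditions)
    keep::MSet    # entities that persist across the rewrite
    output::MSet  # present after the rule fires (effects)
end

# Entities in the input but absent from the keep have permission to be
# deleted; entities in the output but absent from the keep are created.
# (Assumes keep is contained in both input and output.)
pos(m::MSet) = filter(p -> p.second > 0, m)
deleted(r::Rule) = pos(mergewith(-, r.input, r.keep))
created(r::Rule) = pos(mergewith(-, r.output, r.keep))

# The bread-slicing rule: the loaf and knife persist; a slice appears.
slice_bread = Rule(
    MSet(:loaf => 1, :knife => 1),
    MSet(:loaf => 1, :knife => 1),
    MSet(:loaf => 1, :knife => 1, :slice => 1),
)
```

Here nothing is deleted — the loaf and knife persist via the keep — while the slice is created, mirroring the prose description of `:slice_bread` above.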

Now I’ve not said anything about the nature of the **input**, **keep**, and **output** portions of my rules. What are they? Graphs? Sets? Manifolds?! Well, formally, a rule is a span in the category of copresheaves, or C-sets.

Let C be a small category, which we think of as a *schema*. A *C-set*, also called a *copresheaf on C*, is a functor from the schema to the category Set. The schema is a category whose objects are types and whose morphisms describe “is-a” and other functional relationships between types. You can consider it to be a denotational semantics for ontologies. The category Set is the category of sets and functions. Thus, a C-set is a functor that sends types to sets and type relationships to functions. C-sets are a simple model of *categorical databases* (Spivak 2012) and have a full-featured implementation in Catlab.jl (Patterson, Lynch, and Fairbanks 2021).

Morphisms of C-sets are natural transformations between functors. With this definition, for any schema C, there is a category of C-sets and their homomorphisms.
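To make this concrete in Catlab.jl, here is a minimal, hypothetical fragment of a kitchen schema in the spirit of the post’s `SchDB` (which is not reproduced here). The morphism name `bread_loaf_is_food` appears later in the `:slice_bread` rule; the remaining object and morphism names are assumptions for illustration:

```julia
using Catlab, Catlab.Theories, Catlab.CategoricalAlgebra

# A hypothetical kitchen schema fragment: objects are types,
# morphisms are "is-a" relationships between types.
@present SchKitchen(FreeSchema) begin
    (Entity, Food, Kitchenware, BreadLoaf, Knife)::Ob
    food_is_entity::Hom(Food, Entity)
    ware_is_entity::Hom(Kitchenware, Entity)
    bread_loaf_is_food::Hom(BreadLoaf, Food)
    knife_is_ware::Hom(Knife, Kitchenware)
end
@acset_type Kitchen(SchKitchen)

# A C-set on this schema assigns a set to each object and a function to
# each morphism. Here: one loaf, which is a food, which is an entity.
world = Kitchen()
e = add_part!(world, :Entity)
f = add_part!(world, :Food, food_is_entity=e)
add_part!(world, :BreadLoaf, bread_loaf_is_food=f)
```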

To take an example, we can examine a rule that tells us what happens when we `:slice_bread`. More precisely, this block of code is constructing a span of copresheaves by taking colimits of representable functors. A proof of this statement can be found in (Mac Lane 1978, chap. III.7). The important thing to know is that a representable functor keeps track of all the morphisms that are involved with an object $c$; such a functor is often denoted $\mathrm{Hom}(c, -)$.

In this code, you can see that there are three objects, `I`, `O`, and `K`, that define an assignment map between things like `Knife` and `knife` (note the difference in capitalization).

```
:slice_bread => @migration(SchDB, begin
    I => @join begin
        loaf::BreadLoaf
        knife::Knife
    end
    O => @join begin
        loaf::BreadLoaf
        slice::BreadSlice
        food_in_on(bread_slice_is_food(slice)) == food_in_on(bread_loaf_is_food(loaf))
        knife::Knife
    end
    K => @join begin
        loaf::BreadLoaf
        knife::Knife
    end
end),
```

In particular, we can see that for our input we have a functor that sends the objects explicitly as follows:

The more useful aspect of this functor, however, is the implicit assignment of morphisms and other objects. `BreadLoaf` and `Knife` are involved in morphisms whose codomains involve other seemingly hidden objects like `Food`, `Kitchenware`, and `Entity` (described in `SchDB`). Being a representable functor, `I` has the important role of accounting for the assignments of these morphisms and objects in the target category, Set. Because of this, implicit inputs (pre-conditions) and outputs (effects) are managed automatically, provided the schema sufficiently encodes relationships to other objects. The handling of *implicit* pre-conditions and effects corresponds to the *qualification problem* and the *ramification problem* of the frame problem, respectively (Ghallab, Nau, and Traverso 2004).

In this category, you also have the ability to glue things together by declaring objects as being equal, as is done in the line:

`food_in_on(bread_slice_is_food(slice)) == food_in_on(bread_loaf_is_food(loaf))`

The gluing can be thought of as a colimit in the category of copresheaves. The result of this particular gluing can be seen below.

Now that we have a sense of how to construct rules, we can see how to use them to derive new world states.

As we’ve seen, rules are represented by spans in the category of C-sets. In the setting of automated planning, we think of these rules as actions in our plan that transform aspects of a world from one state to another. The *world state* is just another object in our category of copresheaves. In planning, an action can only be applied to the world if the pre-conditions, or inputs, are met in the world state. Therefore, in our framework, we consider a rule to be *applicable* if there exists a monomorphism from the rule input to the world state in question. The term *monomorphism* refers to a generalization of an injective function and is denoted by a hooked arrow.

This applicability criterion will be useful in deciding which rules should be considered when building out a task plan.
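In Catlab, checking applicability amounts to a monomorphism search. As a small stand-in for the kitchen world (whose schema is not reproduced here), the sketch below runs the same kind of check on directed graphs, which are themselves C-sets; the pattern and world are invented for illustration:

```julia
using Catlab, Catlab.CategoricalAlgebra, Catlab.Graphs

# Rule "input": a single directed edge, the pattern we must find.
pattern = Graph(2)
add_edge!(pattern, 1, 2)

# World state: a path 1 → 2 → 3 plus an isolated vertex 4.
world = Graph(4)
add_edges!(world, [1, 2], [2, 3])

# The rule is applicable iff a monomorphism pattern ↪ world exists.
matches = homomorphisms(pattern, world; monic=true)
is_applicable = !isempty(matches)
```

There are two monic matches here, one for each edge of the path; an empty result would mean the rule cannot fire in this world.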

Once we’ve decided that a rule is applicable, how can we use it to induce changes in our world states? Well, we know that the rule is a statement about what should be in the world after the rule is applied and what should remain consistent between the input world state and the resulting world state. We also know that the category of copresheaves is an elementary topos, which in particular gives us the freedom to take limits and colimits. This is a convenient fact given that we have spent much of this exposition talking about spans. So to resolve changes in our world based on a rule, we can use the **double pushout (DPO) rewriting** method (Ehrig, Pfender, and Schneider 1973; Brown et al. 2022).

The general procedure of DPO rewriting is:

- Find a pushout complement
- Complete the left pushout
- Complete the right pushout

A few notes on pushout complements: a pushout complement is a map that manages the deletion of entities. Because the match is a monomorphism, the pushout complement is unique up to isomorphism, if it exists. Extra conditions, the *identification* and *dangling* conditions, are needed to ensure that the pushout complement exists.

Finding the map that gives the *match* of the rule to the world state is done using backtracking search. More information about this procedure can be found in the Catlab documentation on finding C-set homomorphisms.

We can demonstrate what this might look like using our `:slice_bread` rule and our chosen world state (a well-equipped kitchen!) in the following cartoon.

You can see, in the top-left, a depiction of the information we have about our world and our rule. In our world, we have a refrigerator, a loaf of bread, a pear (?), a knife, a bowl, and a skillet. We want to take the action of `:slice_bread`. Using DPO, we can first identify a map from the rule-keep to the world-keep that is the pushout complement. Recall that a pushout glues the target objects of the span along its apex. Therefore, we can check that our chosen pushout complement produces our world-input. Once we’ve determined this is valid, we can construct the pushout on the right as normal. From this we get a world with all the same entities, plus a slice of bread participating in some relationship (gray line) to the loaf of bread. Note: the abbreviated light blue maps in the image should send each entity to “itself” in the world-input and the world-output.

Now that we know how to choose rules and apply rules, we can begin planning. Recall that a *plan* is a sequence of actions that change an initial world state into one that satisfies some goal criteria. To set up a planning problem in our framework, we need to point out two objects in the category of copresheaves: the initial state and the goal state. We can then consider a plan to be a sequence of rules that, when applied to the initial state, constructs an object such that a monic map exists from the goal state to that object. We borrow from successful methods in the field of automated planning to implement a forward search algorithm with backtracking (Ghallab, Nau, and Traverso 2004, chap. 4). The exit criterion involves checking whether a monic map exists from the goal into the current world state.

Pseudocode for this planning algorithm is as follows:

**Algorithm**: Forward Planning with Backtracking

**Procedure**: ForwardPlan(world *w*, goal *g*, rules `r`, rule usage `r_usage`, rule limits `r_limits`, plan `p`)

1. (*Exit criteria*) **If** a monomorphism from *g* into *w* exists:
    - 1a. **Return** plan `p`
2. Initialize the applicable rules list, `applicable`
3. **For** `rule` in `r` **do**:
    - 3a. Get the input object of `rule`
    - 3b. Check if a monomorphism from the input into *w* exists
    - 3c. **If** one exists, append `rule` to `applicable`
4. (*Backtrack criteria*) **If** `applicable` is empty, **ThrowException** “No applicable rules!”
5. **For** `a` in `applicable` **do**:
    - 5a. (*Backtrack criteria*) **If** `r_usage[a]` >= `r_limits[a]`, “Rule limit reached!”, **continue**
    - 5b. *w′* = DPO(*w*, representable(`a`))
    - 5c. Append `a` to `p`
    - 5d. ForwardPlan(*w′*, *g*, `r`, `r_usage`, `r_limits`, `p`)

The deliberate type-setting choice is made to show which data is mathematically rigorous (that in math notation) and which data is a heuristic or workaround not derived from the categorical formalism (that in verbatim font). In particular, we say that *w*, *g*, and *w′* are objects in the category of copresheaves.
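The pseudocode can be sketched end-to-end in plain Julia if we approximate world states as multisets of entity types (Dicts of counts) and replace the DPO step with count bookkeeping. Everything here — the `Rule` struct, rule names, entity names — is illustrative rather than part of any framework, and this sketch returns `nothing` when the search is exhausted instead of throwing:

```julia
const MSet = Dict{Symbol,Int}

struct Rule
    input::MSet   # pre-conditions
    keep::MSet    # persists across the rewrite
    output::MSet  # effects
end

pos(m::MSet) = filter(p -> p.second > 0, m)
applicable(r::Rule, w::MSet) = all(get(w, k, 0) >= v for (k, v) in r.input)
satisfied(goal::MSet, w::MSet) = all(get(w, k, 0) >= v for (k, v) in goal)

# Stand-in for the DPO step: delete input-minus-keep, add output-minus-keep.
function apply(r::Rule, w::MSet)
    del = pos(mergewith(-, r.input, r.keep))
    add = pos(mergewith(-, r.output, r.keep))
    pos(mergewith(+, mergewith(-, w, del), add))
end

# Forward search with backtracking and per-rule usage limits.
function forward_plan(w::MSet, goal::MSet, rules::Vector{Pair{Symbol,Rule}},
                      limits::Dict{Symbol,Int},
                      usage::Dict{Symbol,Int}=Dict(first(p) => 0 for p in rules),
                      plan::Vector{Symbol}=Symbol[])
    satisfied(goal, w) && return plan                    # exit criterion
    for (name, r) in rules
        applicable(r, w) || continue                     # 3b: no match
        usage[name] >= get(limits, name, typemax(Int)) && continue  # 5a
        usage[name] += 1; push!(plan, name)              # 5c
        result = forward_plan(apply(r, w), goal, rules, limits, usage, plan)
        result !== nothing && return result              # 5d
        pop!(plan); usage[name] -= 1                     # backtrack
    end
    return nothing
end

# Slice the loaf twice to satisfy a two-slice goal, within a usage limit.
kitchen = MSet(:loaf => 1, :knife => 1)
slice = Rule(MSet(:loaf => 1, :knife => 1),
             MSet(:loaf => 1, :knife => 1),
             MSet(:loaf => 1, :knife => 1, :slice => 1))
plan = forward_plan(kitchen, MSet(:slice => 2),
                    [:slice_bread => slice], Dict(:slice_bread => 3))
```

Without the `limits` check the slicing rule would remain applicable forever, which is exactly the non-termination issue discussed below.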

As with other planning algorithms, this method is subject to issues related to cycles and non-termination. These occur in scenarios where a rule might be applicable indefinitely because the world specification does not capture a way of destructing an object when a rule is applied. For example, in the present model, slicing the bread does not reduce the bread loaf in any manner, which means our planner could potentially slice the bread loaf infinitely many times. The integration of attributed C-sets, or “acsets”, is in the future of this framework, which would allow users to specify attributes for each entity. This structure would provide a well-defined way to do arithmetic and other manipulations with the attributes, which could help keep track of resource limits. For now, we use an ad hoc method: a `rule_limit` dictionary that specifies the maximum number of times a rule can be applied in a plan.

You might be wondering what the benefits of all this formalism are when there already exist working systems for automated planning. In fact, current practice suffers from a number of limitations that I think a categorical point of view can help address.

**A method for propagating implicit pre-conditions and effects**. As mentioned earlier, the frame problem is concerned with accounting for implicit world conditions in light of explicit ones. And, as we saw, tracking implicit effects (the so-called ramification problem) and implicit pre-conditions (the qualification problem) is taken care of by our use of functors from the schema to Set. This gives rule designers the freedom to model only the changes that are most important and trust that related changes will be dragged along.

**A common abstraction for actions and events**. Existing planners are not able to handle external events. In this framework, actions and events are things of the same type, namely rewrite rules for categorical databases. This means that we can support two modes of operation within a dynamic planning environment: we can (a) apply a rewrite rule that captures some external event and update our current state of the world, or we can (b) search for a plan between the current world state and the goal. This shared abstraction gives us the ability to take in new information and conduct planning without having to state a new planning problem.

**A more structured language than first-order logic**. For practitioners trying to use automated planning in applications, expressing a planning problem in terms of first-order logic formulas may feel awkward and unstructured. The propositional atoms that comprise these formulas require careful modeling of the semantics with little guidance. For example, some atoms could be `breadloaf_on_table=True`, `slice_on_table=False`, and `knife_in_hand=True`; however, they could also be `breadloaf_sitting_on_table=True` and `knife_in_left_hand=True`, which could serve an equivalent purpose depending on how the actions use these atoms. The ability to capture knowledge under the guidance of an ontology, or schema, provides a more natural way of expressing conditions of the world for planning problems.

**A way to handle hierarchy and concurrency**. There is currently no way to handle equivalences between permutations of actions that are independent of each other. Forward and backward planning assume a totally ordered sequence of actions. Plan-space planning allows a partially ordered set of plans, but does not handle actions that depend on one another. Current planners also do not have ways of dealing with simultaneously occurring actions (Brachman and Levesque 2004, sec. 15.3.1). Expressing a plan, planner, and planning problem using a categorical language would provide a gateway to other structures like operads (hierarchy) and monoidal categories (concurrency).

Despite these benefits, significant work for us remains. This post described a way of stating a planning problem in a categorical way and adapting an existing planning algorithm to work with this abstraction. However, we have still not explained how to describe a plan in a more structured way, beyond just a sequence of rules. Furthermore, we would like to investigate how category theory could help in devising planning algorithms, in particular hierarchical planners. If you have any ideas, please feel free to share your thoughts below!

Brachman, Ronald, and Hector Levesque. 2004. *Knowledge Representation and Reasoning*. Elsevier. https://doi.org/10.1016/b978-1-55860-932-7.x5083-3.

Brown, Kristopher, Evan Patterson, Tyler Hanks, and James Fairbanks. 2022. “Computational Category-Theoretic Rewriting.” In *Graph Transformation*, 155–72. Springer International Publishing. https://doi.org/10.1007/978-3-031-09843-7_9.

Ehrig, H., M. Pfender, and H. J. Schneider. 1973. “Graph-Grammars: An Algebraic Approach.” In *14th Annual Symposium on Switching and Automata Theory (Swat 1973)*. IEEE. https://doi.org/10.1109/swat.1973.11.

Ghallab, Malik, Adele E. Howe, Craig A. Knoblock, Drew McDermott, Ashwin Ram, Manuela M. Veloso, Daniel S. Weld, and David E. Wilkins. 1998. “PDDL — the Planning Domain Definition Language.”

Ghallab, Malik, Dana Nau, and Paolo Traverso. 2004. *Automated Planning*. Elsevier. https://doi.org/10.1016/b978-1-55860-856-6.x5000-5.

Mac Lane, Saunders. 1978. *Categories for the Working Mathematician*. Springer New York. https://doi.org/10.1007/978-1-4757-4721-8.

Simon, Herbert A. 1988. “The Science of Design: Creating the Artificial.” *Design Issues* 4 (1/2): 67. https://doi.org/10.2307/1511391.

Spivak, David I. 2012. “Functorial Data Migration.” *Information and Computation* 217 (August): 31–51. https://doi.org/10.1016/j.ic.2012.05.001.


One can view the task of modeling as trying to capture a phenomenon in terms of structure (*what* exists) as well as properties that hold of that structure, expressed in the form of logical constraints. The chase is an algorithm for enforcing logical constraints that trades off expressivity against computational tractability. We will give a high-level description of the algorithm, explore the language of constraints it allows us to enforce, and describe the details of its implementation in Catlab.jl. The applications below will show that this tradeoff is well-suited for tasks in scientific computing and knowledge representation.

Let’s create a simple model of chemistry with atoms, molecules, and bonds. Atoms have a functional relationship to molecules, `mol`: every `Atom` relates to exactly one `Molecule` (`mol` can be thought of as a *function* from `Atom` to `Molecule`). Bonds are a kind of entity that can exist between pairs of atoms. We can relate a bond to its two atoms via functional relationships `b₁` and `b₂`. If we write that two atoms are bonded, then implicitly we mean that there exists a bond $b$ and atoms $a_1$ and $a_2$ such that $b_1(b) = a_1$ and $b_2(b) = a_2$. Intuitively, bonds ought to satisfy the following properties:

- Bonds are symmetric: for every bond $b$, there is a bond $b'$ with $b_1(b') = b_2(b)$ and $b_2(b') = b_1(b)$.
- Atoms in different molecules can’t be bonded: $\mathrm{mol}(b_1(b)) = \mathrm{mol}(b_2(b))$ for every bond $b$.
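Before any category theory, the two properties can be sanity-checked in plain Julia on a throwaway representation (an atom-to-molecule assignment plus a set of directed bond pairs, mirroring the `mol`, `b₁`, `b₂` names used later); the `Instance` struct and example molecules are illustrative only:

```julia
# Toy instance: mol[a] gives atom a's molecule id; bonds is a set of
# (b₁, b₂) atom pairs.
struct Instance
    mol::Vector{Int}
    bonds::Set{Tuple{Int,Int}}
end

# Bonds are symmetric: every (a, a′) must have a mirror (a′, a).
is_symmetric(inst::Instance) = all(reverse(b) in inst.bonds for b in inst.bonds)

# Atoms in different molecules can't be bonded.
same_molecule(inst::Instance) =
    all(inst.mol[b[1]] == inst.mol[b[2]] for b in inst.bonds)

# A diatomic molecule satisfies both properties; a single directed bond
# across two molecules violates both.
h₂ = Instance([1, 1], Set([(1, 2), (2, 1)]))
broken = Instance([1, 2], Set([(1, 2)]))
```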

So far we have *entities* (e.g. `Atom`, `Bond`, `Molecule`) and *functional relationships* (e.g. `mol`). Relational databases are a ubiquitous technology that has been heavily optimized to work with data of this form. In the language of relational databases, entities are viewed as *tables*, with their functional relationships viewed as *columns* (also called *foreign keys*). The structure of a model that is implemented as a relational database is given by a *schema* which lists the entities and functional relationships. This specification of schemas is typically handled by SQL, which has been incredibly successful in many domains; however, this framework leaves something to be desired in terms of declaring and enforcing properties of our model. *Ologs* (ontology logs), on the other hand, are a category-theoretic interpretation of knowledge bases, which are structurally identical to relational databases yet have a rich logic of constraints that can further encode knowledge (Spivak and Kent 2012). Being able to enforce these constraints in a computationally tractable way is a difference between databases and knowledge bases.

For example: supposing we have a database of atoms, bonds, and molecules along with a set of constraints (hereafter referred to as Σ), we can ask whether the database satisfies Σ. If not, we can imagine there being a best or nearest relative of the database that *does* satisfy Σ. The *chase* is an algorithm that repairs our data to find this other database (Benedikt et al. 2017). The next two sections respectively show how to define this notion of a “best” relative and describe the language in which constraints for the chase are specified.

Given a chase engine, what practical things can we *do*? Firstly, we have a general means of propagating information implied by the properties of a model, allowing for databases to be used as *ologs*. This will be described at the end of the post, along with other applications such as model enumeration (exploring logically-constrained search spaces). A future post will describe applications to e-graphs (equational reasoning that can be used for program optimization).

In AlgebraicJulia, we represent relational databases as C-sets.^{1} C-set homomorphisms (also called morphisms) are a natural relation to consider between two C-sets with the same schema. In the special cases of C-sets that correspond to the categories of sets and directed graphs, the notion of a C-set homomorphism lines up precisely with the notions of function and graph homomorphism, respectively. Viewing C-sets as databases, these morphisms also correspond to database homomorphisms as studied in the literature.

At a high level, a morphism X → Y is a particular way of locating X *within* Y. If a morphism exists, it is like saying there is a match of the pattern X within Y, and the morphism itself contains the data for how to locate it. Elements of X may be merged together when located within Y, but connections may never be split apart. Consider the following visualizations of two C-sets on the schema `SchChem` defined below.^{2}

```
using Catlab, Catlab.Theories, Catlab.CategoricalAlgebra

@present SchChem(FreeSchema) begin
    (Molecule, Atom, Bond)::Ob
    (b₁, b₂)::Hom(Bond, Atom)
    mol::Hom(Atom, Molecule)
end
@acset_type Chem(SchChem)

# Initialize empty C-sets
c₁, c₂, monotomic, diatomic, triatomic = [Chem() for _ in 1:5]
mols = [monotomic, diatomic, triatomic]

# Create 1, 2, and 3 atom molecules
[add_part!(x, :Molecule) for x in mols]
[add_parts!(x, :Atom, i, mol=1) for (i, x) in enumerate(mols)]
add_parts!(diatomic, :Bond, 2, b₁=[1,2], b₂=[2,1])
add_parts!(triatomic, :Bond, 6, b₁=[1,2,1,3,2,3], b₂=[2,1,3,1,3,2])

# Create the left C-set
[copy_parts!(c₁, diatomic) for _ in 1:2]

# Create the right C-set
[copy_parts!(c₂, x) for x in [triatomic, monotomic]]
```

There are homomorphisms^{3} from the left C-set (domain) to the right (codomain), one of which is indicated by coloring here:

`@assert length(homomorphisms(c₁, c₂)) == 36`

However, there is no homomorphism in the opposite direction. If we try to find one, we see that we would need an atom with a bond to itself in the codomain (otherwise, there is nowhere to send B₃₄ that satisfies the homomorphism constraint).

`@assert length(homomorphisms(c₂, c₁)) == 0`

If there exists a homomorphism from X to Y, one might say that X can be interpreted within Y. For this reason, the binary relation “there exists some homomorphism from X to Y” is crucial in defining a notion of a “best” or “closest” C-set instance in the context of data repair: we want to consider all databases which can both interpret our starting instance I (all J for which a morphism I → J exists) as well as satisfy Σ. Call this set of C-sets S. Which element U of S is the *universal* (or best) one, if any? It would have to be one such that, for all J in S, a morphism U → J also exists. Put another way, U is the minimal way of getting I to satisfy Σ: *any* modification to I to make it satisfy Σ will have to make *at least* the modifications one makes to get U.

As an example, consider the instance on the far left below that doesn’t adhere to the original constraints we provided (notably, this means we must draw bonds as directed):

```
I = Chem()
[copy_parts!(I, monotomic) for _ in 1:2]
add_part!(I, :Bond, b₁=1, b₂=2)
@assert nparts(I, :Atom) == 2
@assert nparts(I, :Bond) == 1
@assert nparts(I, :Molecule) == 2
U = chase(I, Σ) # Σ will be described in the following section
@assert nparts(U, :Atom) == 2
@assert nparts(U, :Bond) == 2
@assert nparts(U, :Molecule) == 1
```

Chasing the original instance with the symmetry rule produces the first homomorphism, and then further chasing that result with our second rule (that states bonded atoms are in the same molecule) produces the second homomorphism. Although we could push further with homomorphisms (e.g. merging the two atoms together, embedding this molecule in the context of other molecules), those further transformations would not be *universal* with respect to the original instance and .

We cannot put arbitrary logical expressions as constraints for the chase. Rather, we are limited to expressions of the form $\forall \vec{x}.\ P(\vec{x}) \Rightarrow \exists \vec{y}.\ Q(\vec{x}, \vec{y})$, where $P$ and $Q$ can contain $\wedge$, $\exists$, $\top$, $=$, and atomic formulas. This is called regular logic.^{4} Interestingly, these are precisely the constraints which can be encoded as homomorphisms from a source to a target, as we have only the ability to merge things together (with $=$) or to add new things (with $\exists$), which are the restrictions we have with homomorphisms. To demonstrate this correspondence, we will show five constraints expressed as logical formulas as well as homomorphisms.

- Our formula is: for every bond $b$, there exists a bond $b'$ such that $b_1(b') = b_2(b)$ and $b_2(b') = b_1(b)$ (bond symmetry).

- Our formula is: for every bond $b$, $\mathrm{mol}(b_1(b)) = \mathrm{mol}(b_2(b))$ (bonded atoms lie in the same molecule).

- Let us temporarily add to the schema a functional relationship which points from any bond to its symmetric dual.
- We can prevent atoms being bonded to themselves by asserting that this map is injective.
- The formula that asserts that a map $f$ is injective is: $f(b) = f(b') \Rightarrow b = b'$ for all $b, b'$.

- Our formula is:

- Again, we temporarily add to the schema an entity whose *meaning* is to identify all of the elements of Bond that have the same source and target.
- The *structure* is an entity with one functional relationship into Bond.
- The *property* is that the bond picked out by this relationship has the same source and target.
- Visually, each element of this entity will be a square, and the bond that it picks out as a self bond will be indicated with a labeled arrow.
- Once we further constrain the relationship to be injective, the entity becomes a limit for the diagram of a bond with the same source and target.

This correspondence is not a mere curiosity, as it has rather practical consequences. The language of the constraints can be implemented in the language of the model, so a chemist can express their knowledge in terms of molecules and atoms rather than adding a new flavor of logic to their vocabulary. More profoundly, although expressing knowledge in a logical syntax is incredibly powerful and general, this actually makes it *harder* to reason about and more challenging to process in an automatic way. When the same information is encoded in combinatorial (or graph-like) data, we can easily perform analyses and manipulations that are not feasible to automate for arbitrary logical formulas, mathematical expressions, or code. Some example manipulations:

1. If the schema of our C-sets changes because our model of the world updates, functorial data migration can faithfully migrate constraints into our new perspective when they are expressed as morphisms.
2. The morphism representation is more clearly visualizable, further raising the accessibility and transparency of constraints as well as making a GUI for constraints possible.
3. As will be discussed in the next section, the chase algorithm itself can be implemented in a few lines of code when thinking of constraints as morphisms, whereas it is a tedious and error-prone process to implement when working with the syntax of logical formulas.

In the database literature, logical constraints are known as “embedded dependencies” (EDs). Viewed as homomorphisms, each ED’s domain is thought of as a potential *trigger*: it is a pattern that, if matched, obligates us to do something in order to be consistent with the constraint (either merging elements together or adding new elements). If the work to be done has not yet been done, we call the trigger *active*. At a high level, then, the chase merely scans for triggers, and “fires” the active triggers to update the database. This process continues iteratively until there are no active triggers, if it terminates at all.

One strategy for computing the chase derives from Quillen’s small object argument, which is a general strategy for solving *lifting problems*, of which the chase is a special case. Given a trigger, whether or not it is active depends on whether there exists a homomorphism that makes the following triangle commute:

Our goal is to make the triangle commute while assuming no more than necessary to make this happen; this corresponds to a categorical operation called a *pushout*, which produces the best possible database for which the trigger will no longer be active. However, the new material may have activated other triggers, which then need to be fired. Thus, the chase can be implemented as an iteration of chase steps, each of which has the following sequence:

1. Compute *all* triggers of all dependencies, filtering out the inactive triggers.
2. Terminate the chase if no triggers are active.
3. Construct maps from the union of active triggers into the current database instance as well as into the union of trigger codomains.
4. Construct the next database instance via pushout.

Pushouts are a type of colimit that generalize unions of sets, providing a notion of “gluing pieces together along a common interface”. In this case, the interface is the top left corner of the square, which is used to connect the current database to the consequents of all triggers that fired in order to yield the next iteration of the database.

Given Catlab’s preexisting implementations for homomorphism finding and colimits, the code required to implement the chase matches closely to the mathematics.
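To give the flavor of that loop without Catlab’s generality, here is a toy chase in plain Julia on a stripped-down representation (molecule ids per atom plus a set of directed bond pairs). It fires the two bond dependencies — adding missing mirror bonds (an existential ED) and merging molecules of bonded atoms (an equality ED) — and is an illustrative analog, not Catlab’s implementation:

```julia
# Toy instance: mol[a] is atom a's molecule id; bonds is a set of atom pairs.
struct Instance
    mol::Vector{Int}
    bonds::Set{Tuple{Int,Int}}
end

# One chase step: fire every active trigger of the two dependencies.
# Returns true if anything changed.
function chase_step!(inst::Instance)
    changed = false
    # Existential ED: each bond (a, a′) needs a mirror (a′, a).
    for b in collect(inst.bonds)
        if !(reverse(b) in inst.bonds)
            push!(inst.bonds, reverse(b)); changed = true
        end
    end
    # Equality ED: bonded atoms must share a molecule; merge by renaming.
    for (a, a′) in inst.bonds
        m, m′ = inst.mol[a], inst.mol[a′]
        if m != m′
            inst.mol[inst.mol .== m′] .= m; changed = true
        end
    end
    changed
end

# Iterate chase steps until no trigger is active (may not terminate in general).
function chase!(inst::Instance)
    while chase_step!(inst) end
    return inst
end

# Two one-atom molecules joined by a single directed bond, as in the
# example instance earlier in the post.
inst = Instance([1, 2], Set([(1, 2)]))
chase!(inst)
```

Chasing adds the mirror bond and merges the two molecules, matching the repair described earlier (bonds: 1 → 2; molecules: 2 → 1).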

In scientific practice, data is usually recorded, whereas *knowledge* that ties data together and contextualizes it is consistently represented only informally through documentation, publications, file names, or within the behavior of obscure code. Ologs are a model of explicit knowledge representation that can be viewed in many ways (as database schemas, as ontologies, as categories). It is worthwhile to consider the pros and cons of adding formality to one’s knowledge representation. Cons are generally related to the tedium of constructing and maintaining the representation over time, whereas the pros are related to the sorts of knowledge-based tasks that can be automated in virtue of the formalization.

For example, we ask questions such as “Which reactions in my reaction network rapidly produce some molecule with a carboxyl group, according to all of the simulation software we’ve tested so far?”. How do we represent the procedure which retrieves this information? A benefit of formally-structured knowledge is that this procedure can be written in a declarative query language, which is much more interpretable than the alternative if the data were informally distributed throughout a file system. In that case, the ‘query’ would likely be an imperative function in a scripting language, which is adequate only for personal use and low-complexity scenarios.
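As a small illustration of the declarative style (using plain graphs rather than the chemistry schema, and assuming Catlab’s `@relation` macro and `query` function for conjunctive queries; the variable names are ours):

```
using Catlab.CategoricalAlgebra, Catlab.Graphs, Catlab.Programs

# Declaratively ask for all length-2 paths: pairs of vertices (x, z)
# connected by edges x → y → z, for some intermediate vertex y.
paths = @relation (start=x, stop=z) begin
    E(src=x, tgt=y)
    E(src=y, tgt=z)
end

g = path_graph(Graph, 3)  # the graph 1 → 2 → 3
query(g, paths)           # a table of (start, stop) pairs
```

The same pattern, applied to the olog’s schema, would express the carboxyl-group question above as data rather than as an imperative script.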

Let’s consider a specific example that extends the schema of molecules we saw earlier into a much larger olog of simulations of chemical kinetics. Below the dotted line we give examples of elements that might live in the sets corresponding to each type in the schema. The internal triangles and squares all happen to commute, emphasized by the green checkmarks.

Many subtle distinctions were made at the discretion of the author of this olog, such as H considered as a part of a particular reaction (has a well-defined coefficient, but doesn’t have a well-defined concentration) vs H considered as a species that appears somewhere in a chemical reaction network that has been simulated (well-defined concentration, no well-defined coefficient).

The pullback constraint in the upper right corner insists that there is precisely one reaction rate corresponding to each pair of a simulation and reaction that appears in that simulation. More complicated constraints can be built from these building blocks, and the explicit (i.e. machine enforceable, via the chase as described above) nature of these constraints is powerful. This means that if one were to add a new simulation for reaction network **Rxns**, we can automatically add elements to the reaction rate table with their foreign keys correctly filled out. It also means we can be automatically notified of a conflict if there are two reaction rates for the same reaction and simulation pair.

Here, the chase is alleviating the general formalized knowledge problem of a scientist who is afraid to interact with their model out of fear of invalidating some subtle or intricate assumptions. As AlgebraicJulia continues to build tools to facilitate working with categories and leveraging them for scientific computing, the pros of formalization become stronger and cons become weaker.

Model enumeration computes queries of the form “Show me all Petri nets up to such-and-such size” or “Show me all molecules up to such-and-such size”. This can be difficult when there are many equations at play: enumerating all *groups* of order 10 is difficult if we want to be more efficient than filtering the list of all possible binary functions (after all, there are only two such groups, the cyclic group and the dihedral group). Why might we be interested in such queries?

- Gaining evidence for a conjecture by showing it holds for *all* cases up to a fixed size.
- Finding a counterexample in the context of property-based software test suites.
- Gaining intuition for complex structures through their simplest *nontrivial* examples (e.g. the five simplest Kan extensions).
- Exploring scientific model spaces: finding a model that best fits some data within a logically-constrained space.

By modifying one aspect of the chase, we can obtain an algorithm for model enumeration. In order for the instance that the chase computes to be universal, when functional relationships force us to introduce a new element, it must be completely unconstrained. For example, if we fire an ED that introduces a new bond, then the fact that its endpoints are functional relationships to atoms means that there are two additional atoms that must be added, too. Normally, we pick fresh IDs for these atoms. During model enumeration, we can also consider other possibilities, where we try all existing IDs as well. Some of these new choices will *not* satisfy the axioms we seek to satisfy, but future chase steps will work towards correcting these. Ultimately, although we are searching through a combinatorial space of C-sets (which are merely *premodels* with no notion of EDs), the fact that we are using the chase to navigate this space allows us to spend effort finding models rather than enumerating the much larger set of premodels.
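To appreciate the pruning, it helps to see the naive baseline the table below improves on. The following pure-Julia snippet (our own illustration, independent of the chase machinery) counts semigroup structures by brute-force filtering every binary operation for associativity:

```julia
# Brute-force baseline (not the chase): count the binary operations on an
# n-element set that are associative, i.e. the semigroups with labeled
# elements. There are n^(n^2) operations to check.
function count_semigroups(n::Int)
    elems = 1:n
    total = 0
    for table in Iterators.product(fill(elems, n * n)...)
        op = reshape(collect(table), n, n)  # op[i, j] is the product i∘j
        if all(op[op[i, j], k] == op[i, op[j, k]]
               for i in elems, j in elems, k in elems)
            total += 1
        end
    end
    total
end

count_semigroups(2)  # 8 of the 16 binary operations are associative
```

Already at order 4 this loop must examine over four billion tables, which is why navigating the space with the chase pays off.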

| Semigroup order | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| # of binary operations | 1 | 16 | 19,683 | 4.3×10⁹ | 3.0×10¹⁷ |
| # of semigroups | 1 | 8 | 113 | 3,492 | 183,732 |
| # of semigroups (up to isomorphism) | 1 | 5 | 24 | 188 | 1,915 |
| # of C-sets explored to enumerate them all | 1 | 7 | 399 | 6,420 | 109,151 |

This preliminary data from our unoptimized prototype implementation shows that the search space of possible C-sets can be massively pruned using this modified chase. Being able to work with C-sets up to isomorphism is essential for this strategy to avoid duplicating work.

Regular logic is a computationally simple yet expressive subset of first-order logic. It is deeply connected to C-set homomorphisms, allowing us to perform the chase in a broad array of modeling settings and to represent properties of our models as transparent combinatorial data.

Many scientists do not presently work with formal representations of their models of their domain, opting for a mix of text files, Python scripts, and spreadsheets distributed through a filesystem, linked into a coherent whole only within the mind of the scientist. This can be convenient for the individual, but, at a larger scale, science benefits from these models being made explicit. Making them explicit can be challenging work, so there is little incentive for scientists to do it without tooling that lets individuals reap the benefits of the investment. An important step along this path is allowing scientists to state properties of their models formally and making those properties computationally tractable to enforce.

Benedikt, Michael, George Konstantinidis, Giansalvatore Mecca, Boris Motik, Paolo Papotti, Donatello Santoro, and Efthymia Tsamoura. 2017. “Benchmarking the Chase.” In *Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems*. ACM. https://doi.org/10.1145/3034786.3034796.

Bonchi, Filippo, Fabio Gadducci, Aleks Kissinger, Pawel Sobocinski, and Fabio Zanasi. 2020. “String Diagram Rewrite Theory i: Rewriting with Frobenius Structure.”

Spivak, David I., and Robert E. Kent. 2012. “Ologs: A Categorical Framework for Knowledge Representation.” Edited by Chris Mavergames. *PLoS ONE* 7 (1): e24274. https://doi.org/10.1371/journal.pone.0024274.

Databases with attributes, such as `String`- or `Float`-valued columns, are represented as attributed C-sets. These are not covered in this post.↩︎

Because these bonds are symmetric, for brevity we can visually represent them as undirected.↩︎

1 has three choices where to go (it is forced to go to 1 in the codomain) and for each of these choices, there are two choices for 3. Each molecule in the domain is matched independently, thus the total number of morphisms will be the square of matching a single molecule.↩︎

This has limitations: we do not have the ability to state that bonds must occur between *different* atoms. Beyond inequalities, disjunctions are also not directly expressible. There is a computational justification for this limitation: if two things are *not* equal, it isn’t clear which one needs to change (and what it should change to). Likewise, if we merely know we must assign one of two values to something, we don’t have a straightforward automatic procedure that can pick which one we should use.↩︎


When is one thing equivalent to another? It turns out that answering this depends on a *context*, in some informal sense. Category theory offers a language by which this can be made precise, which has implications for a variety of technical challenges encountered in scientific computing.

Familiarity with C-sets in Catlab is helpful for understanding technical aspects later on in the post; however, readers new to AlgebraicJulia will benefit from reading about the motivation, the problem of graph equivalence, and the concluding thoughts.

A common feature of both everyday conversation and esoteric arguments is the negotiation of whether or not two things can be called ‘the same’. These arguments are often fundamentally confused because it is difficult to be explicit about what we mean by ‘the same’. Consider these examples:

- If Alice is in a political argument against Bob, she can charge Bob with hypocrisy for endorsing the *same* action as one he has previously condemned. Bob will then have to make a distinction (e.g. “that action is *different* in the context of military conflict!”), and the argument progresses from there. Likewise, Alice can defend herself by claiming her action is the *same* as that which Bob has previously endorsed.
- Many debates are secretly about how to treat context: an artist taking a urinal or a Brillo soap box and presenting it *as art* will provoke controversy, given that one could argue that we have the *same* objects in our bathrooms (yet those are clearly not art). Likewise, one could argue about whether a novel written today that is word-for-word identical to *Don Quixote* could actually be a truly *different* work of art. And one could argue about whether a novel that is word-for-word identical to a meaningful work like *Hamlet* could actually be meaningless, in the context of a library that contains *every possible* book.
- Does the Star Trek teleporter kill you? Is Theseus’ ship the same ship as before?

It turns out questions of sameness and difference are also closely related to questions of *identity*: we often assert an identity of an object (e.g. a rock, a river, ourselves) across different moments in time, but we’re then forced to awkwardly say two things are “the same, but different”. This awkwardness and the ensuing confusion is the price paid in exchange for the brevity, clarity, and convenience we often gain by treating an equivalence relation *as if* it were an identity relation.

We also suffer when we are not explicit about what we mean by ‘the same’ while programming:

```
mutable struct Color
    name::String
    wavelength::Int
end

my_blue = Color("blue", 470)
my_green = Color("green", 530)
my_blue == my_green  # returns "false", as expected

your_blue = Color("blue", 470)
my_blue == your_blue # returns "false"!
```

The computer is not taking a philosophical stance; rather, it was asked to tell whether two things were equal without being told *how* to compute this. It guessed, using a default `==` method that is unsatisfactory for our use case. Given that we actually *would* like to say these two blues are the same, we can manually define an equality test method to improve things:

```
# Ask yourself: is this reasonable?
Base.:(==)(x::Color, y::Color) = x.wavelength == y.wavelength
my_blue == your_blue # now returns "true" as desired

aqua = Color("aqua", 500)
cyan = Color("cyan", 500)
cyan == aqua # returns "true", but should it?

grue = Color("grue", if (CURRENT_YEAR < 2100) 530 else 470 end)
grue == my_green # returns "true", but should it?
```

So even in technical situations, whether or not two entities are equal can be nebulous. The dilemma is this: providing explicit `==` algorithms is often undesirable because it feels tedious and mechanical, like something that *should* be automated; however, there is no unopinionated way to pick a default by looking at the data alone. A solution to this dilemma is to make it convenient to work with data *alongside* their contexts, which provide enough information for sameness to be determined algorithmically. Namely, when an entity is regarded as an object of a *category*, we obtain a formal notion of *equivalence* that, in many scenarios, captures what we intend by the `==` operator.^{1}

In Catlab.jl, we use a data structure called an attributed C-set (or ACSet) to manipulate data that is situated within a category. This generalizes a broad class of data structures, including many generalizations of graphs (e.g. directed, symmetric, reflexive), tabular data (e.g. data frames), and combinations of the two (e.g. weighted graphs, relational databases). Here, we’ll discuss the benefits of working with *isomorphism classes* of ACSets, i.e. ACSets up to isomorphism. A working implementation of the code presented below can be found at CSetAutomorphisms.jl.

A key insight of modern mathematics is that equality is often a stricter notion than what we want in practice. When reasoning about complex mathematical structures, we need to shift our perspective from *literally equal* to *interchangeable in this context*. Let’s consider directed graphs as an example:

Of course, the *pictures* of the graphs are different, but it’s crucial to distinguish graphs from images of graphs. Mathematicians regard the two graphs above as equal in the *context* of graph theory because the graphs cannot be distinguished in the *language* of graph theory. However, a computer must go beyond this language by *labeling* vertices and edges in order to efficiently represent a graph:

The three edges of this graph can be represented by two vectors of length three: `src=[1,1,2], tgt=[2,3,3]`. Should it be the same graph if we list the edges in a different order, e.g. `src=[2,1,1], tgt=[3,3,2]`? And should it be the same graph if we keep the arrow order but identify the vertices differently? We want the answers to these questions to be **yes**.

However, this is harder than it seems at first; the computer has direct access *only* to the underlying vector representations, which are not strictly equal in the above cases. In general, it is a tough computational problem (the graph isomorphism problem) to answer whether two graphs are equal in this richer, more meaningful sense because it involves searching over all possible reorderings of the labels.

The problem is even worse if we are frequently searching whether a graph is contained in some database of 1,000,000 graphs: each time we query, we’d have to solve the graph isomorphism problem up to 1,000,000 times! A solution to this problem is to find a *canonical* labeling, i.e. a specific labeling (out of all isomorphic labelings of a given graph) which is designated to be the representative. Given a method to compute this, we turn a hard problem (are two graphs isomorphic?) into an easy one (are the graphs’ canonical labelings strictly equal?). This labeling can then be used as a fingerprint to quickly check if the graph of interest is in the database: at worst, 1,000,000 string equality tests. The only challenge is to compute the canonical labeling of just our *one* graph of interest.
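The fingerprinting idea can be illustrated with a deliberately naive canonical form for tiny directed graphs. This is our own toy code: it minimizes over *all* vertex relabelings, which is exponential, whereas tools like Nauty prune this search aggressively.

```julia
# All permutations of a list of distinct integers, naively.
perms(xs) = isempty(xs) ? [Int[]] :
    [vcat(x, p) for x in xs for p in perms(setdiff(xs, [x]))]

# Naive canonical form of a digraph on vertices 1:n with edges given as
# (src, tgt) pairs: relabel by every permutation σ and keep the
# lexicographically least sorted edge list as the fingerprint.
function canonical(n::Int, edges::Vector{Tuple{Int,Int}})
    minimum(Tuple(sort([(σ[s], σ[t]) for (s, t) in edges]))
            for σ in perms(collect(1:n)))
end

# Three labelings of the same graph all get the same fingerprint:
g1 = canonical(3, [(1, 2), (1, 3), (2, 3)])
g2 = canonical(3, [(2, 3), (1, 3), (1, 2)])  # edges listed in another order
g3 = canonical(3, [(3, 1), (3, 2), (1, 2)])  # vertices relabeled
g1 == g2 == g3  # true: isomorphism testing reduces to equality of fingerprints
```

Storing the fingerprint as the database key turns each membership query into a hash lookup rather than an isomorphism search.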

Much time and effort has been spent crafting efficient algorithms to solve the canonical graph labeling problem, with the most popular software being Nauty (McKay and Piperno 2014), written in C. Most high-level languages that solve this problem do so by constructing input that is passed to the original Nauty program. However, what should be done if we care about things that aren’t graphs? For example, an ACSet representing chemical reactions, specified by the following schema:^{2}

This is the declaration in Catlab.jl (full code in CSetAutomorphisms.jl):

```
using Catlab, Catlab.Theories, Catlab.CategoricalAlgebra

@present SchRxn(FreeSchema) begin
    (Molecule, Atom, Bond)::Ob
    inv::Hom(Bond, Bond)
    atom::Hom(Bond, Atom)
    mol::Hom(Atom, Molecule)
    (Float, Num)::AttrType
    atomic_number::Attr(Atom, Num)
    coefficient::Attr(Molecule, Float)
    compose(inv, inv) == id(Bond)
end

@acset_type RxnGeneric(SchRxn)
Rxn = RxnGeneric{Float64, Int}
```

This ACSet schema defines a *category* of chemical reactions, and viewing a mathematical entity as an object of a category gives us a natural context for determining when things are equivalent. This particular category captures many domain-specific features of the data that are not captured by representing the reaction as a simple `struct` of various tuples and lists of data:

- Neither the ordering in which atoms are labeled, the ordering of atoms within bonds, nor the ordering of bonds themselves is relevant to the identity of a molecule.
- The ordering in which the reactant molecules or product molecules are listed is not relevant to the identity of the reaction: we want 2 H₂O → 2 H₂ + O₂ to be *the same* as 2 H₂O → O₂ + 2 H₂.
- The atomic numbers *are* relevant to molecule identity: CO₂ is not H₂O, because atomic number is an *attribute* rather than a piece of combinatorial data. Likewise for the coefficients on the reactants and products.
- Stoichiometric coefficients distinguish the reactants (negative coefficients) from the products (positive coefficients), which is particularly important if we wish to characterize reactions with properties such as exothermicity.

In Catlab, we can declare *both* 2 H₂O → 2 H₂ + O₂ *and* 2 H₂O → O₂ + 2 H₂ with the following code:

```
H2 = @acset Rxn begin
    Molecule = 1
    coefficient = [2.0]
    Atom = 2
    atomic_number = [1, 1]
    mol = [1, 1]
    Bond = 2
    atom = [1, 2]
    inv = [2, 1]
end

O2 = deepcopy(H2)
set_subpart!(O2, :coefficient, [1.0])
set_subpart!(O2, :atomic_number, [8, 8])

H2O = @acset Rxn begin
    Molecule = 1
    coefficient = [-2.0]
    Atom = 3
    atomic_number = [1, 1, 8]
    mol = [1, 1, 1]
    Bond = 4
    atom = [1, 3, 2, 3]
    inv = [2, 1, 4, 3]
end

rxn₁, rxn₂ = Rxn(), Rxn()
[copy_parts!(rxn₁, x) for x in [H2O, O2, H2]]
[copy_parts!(rxn₂, x) for x in [H2O, H2, O2]]
println(rxn₁ == rxn₂)              # false
println(is_isomorphic(rxn₁, rxn₂)) # true
```

Catlab’s `is_isomorphic` function offers a solution to the graph isomorphism problem that works generically for any ACSet. However, returning to the paradigm use case for canonical isomorphisms, a scientist requires a canonical labeling in order to efficiently query a large database of reactions and see if a particular reaction is inside. This problem is addressed in CSetAutomorphisms.jl, where we generalize the Nauty algorithm beyond graphs to ACSets. The brief summary below, as well as the implementation, are based on a lucid expository paper on the Nauty algorithm (Hartke and Radcliffe 2009). Formal statements can be found there for aspects of the algorithm that are shown here only by example.

Considering data up to isomorphism is easy for a mathematician (it’s just forgetting information!) but difficult for a computer, requiring a complicated algorithm. This algorithm has three main ingredients: color saturation, search tree exploration, and automorphism pruning. These will be briefly explained and shown to require only minor modifications to generalize to arbitrary ACSets, with our earlier ACSet `rxn₁` as a running example. We ignore attributes, which are discussed later, meaning that we should now think of `rxn₁` not as 2 H₂O → 2 H₂ + O₂ but as `⬤-⬤-⬤ -> ⬤-⬤ + ⬤-⬤`.

While one can think of vertex labels `1`, `2`, `3`, `4` as natural numbers, it can also be helpful to think of the labels as colors: 🔵, 🟢, 🟠, 🔴 (an ordering could come from alphabetizing). This shift is helpful when thinking about labelings as *partitions* of a set. We can *refine* a partition by making it more specialized (splitting up colors, never merging colors).

This refinement process can go no further when we reach a *singleton* partitioning (or *discrete* partitioning), where each element has its own color. Because such a coloring can be translated back into distinct natural numbers, it can be thought of as a *permutation*. While a permutation is an isomorphism of an object with itself in the category **Set**, we use the word *automorphism* to refer to the general case of an isomorphism from an object of some category to itself. When we are in a C-set category, the required data of an automorphism is a set permutation for each object in the indexing category.
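To ground the definition, here is a toy brute-force computation (our own illustration, not part of the algorithm described below) of the automorphisms of a directed 4-cycle, where an automorphism is a vertex permutation mapping the edge set onto itself:

```julia
# All permutations of a list of distinct integers, naively.
perms(xs) = isempty(xs) ? [Int[]] :
    [vcat(x, p) for x in xs for p in perms(setdiff(xs, [x]))]

# σ is an automorphism of a digraph iff it maps the edge set onto itself.
is_automorphism(σ, edges) = Set((σ[s], σ[t]) for (s, t) in edges) == Set(edges)

edges = [(1, 2), (2, 3), (3, 4), (4, 1)]  # the directed 4-cycle
autos = filter(σ -> is_automorphism(σ, edges), perms(collect(1:4)))
length(autos)  # 4: the rotations (reflections reverse the arrows)
```

The algorithm below finds the same group without enumerating all 24 permutations.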

Once we find all refinements that are automorphisms, we have identified the isomorphism class of our input C-set and can select a canonical element by taking the *lexicographic minimum*.^{3}

Below is a graphical depiction of `rxn₁`.^{4} Our starting point has no information as to what the automorphisms are, which is reflected by the fact that all three partitions are as unrefined as possible.

Color saturation is a method of refining colorings of a graph without ruling out automorphisms. For graphs, we first collect data on the “local environment” of each vertex (how many of each color is adjacent). Because we can canonically order local environment data, we can canonically recolor the vertices by their local environment. This procedure can be repeated until convergence. Color saturation brings us closer towards our goal of identifying all automorphisms, but it does not completely solve our problem (sometimes it cannot make progress at all, e.g. starting with a complete graph).
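For plain graphs, the refinement loop can be sketched in a few lines of pure Julia. This is our own minimal version of classic color refinement, not the Catlab or CSetAutomorphisms.jl implementation:

```julia
# Minimal color refinement for a digraph given by src/tgt vectors: repeatedly
# recolor each vertex by its color together with the sorted colors of its
# out- and in-neighbors, until the partition stabilizes.
function color_saturate(n, src, tgt)
    colors = ones(Int, n)
    while true
        # Local environment of each vertex, as canonically orderable data.
        env = [(colors[v],
                Tuple(sort([colors[tgt[e]] for e in eachindex(src) if src[e] == v])),
                Tuple(sort([colors[src[e]] for e in eachindex(tgt) if tgt[e] == v])))
               for v in 1:n]
        # Canonically rename the distinct environments to 1, 2, ...
        ranks = Dict(e => i for (i, e) in enumerate(sort(unique(env))))
        new_colors = [ranks[env[v]] for v in 1:n]
        new_colors == colors && return colors
        colors = new_colors
    end
end

color_saturate(3, [1, 2], [2, 3])  # path 1→2→3: all three vertices distinguished
color_saturate(2, [1, 2], [2, 1])  # a 2-cycle is symmetric: no progress possible
```

The C-set generalization replaces “colors of neighbors along edges” with “colors of images and preimages along each morphism of the schema.”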

With C-sets, there is an analogy to the “local environment” of graphs, demonstrated for the green box with the asterisk in the following figure:

Each iteration of color saturation is a refinement. The refinement of `rxn₁` has three steps, as shown below.

This is relatively cheap to compute and can greatly reduce the search space, so we color saturate not only at the beginning but also throughout the algorithm, each time we learn new information.

Suppose that color saturation has refined our partition as far as it can. How do we identify which subset of the singleton refinements are valid automorphisms? In theory, we must try all combinations of each individual color’s permutations. These are all possible ways of completely breaking each symmetry. To avoid this combinatorial explosion, the search tree approach breaks one symmetry at a time (branching on the possible ways to do this) and runs color saturation immediately, which allows us to take advantage of the graph’s connectivity structure to do most of the refinement heavy lifting.

More precisely, our tree has graph colorings as nodes, and edges to child nodes are “artificial distinctions”: all possible refinements that break the symmetry of a *particular* color (which can be chosen canonically) by adding *one* new color.^{5} Finding all of the *leaves* of this tree is our goal: these are all possible singleton colorings that are valid automorphisms.

These ideas straightforwardly generalize to C-sets, where we maintain colorings for each component, rather than just for vertices. We still break symmetry for just one color (of a single component) when branching in the search tree.

The search tree can be extremely large, even when applying color saturation whenever possible. Thankfully, because we explore the search tree depth-first, we can use the leaves we’ve seen already to sometimes determine that there is no new information below a certain node, and thus its children do not need to be explored.

Referring to the previous figure for intuition: our initial split effectively differentiated the two diatomic molecules. In each branch, much work remains to be done in order to reach a singleton coloring; however, we have an intuition that there will be some redundant work if we consider these two branches as fully independent (after all, we know we’re making a distinction between two molecules that are truly interchangeable). This intuition can be formalized by the following rule:

In this image, corresponds to our initial color saturation result, and and are the two results for our initial branching. In this scenario, we have fully explored the search tree beneath . The black nodes have already been explored and we’re contemplating branching on node . So far we have discovered automorphisms (leaf nodes) and . Despite the fact that there is an unexplored leaf node under (in blue), we can actually prune from the search tree *if* it turns out that preserves (the nearest ancestor of and ) and maps into . This is the formal specification for our intuition that the two diatomic molecules being the same means we oughtn’t have to explore both branches: the subtree under is isomorphic to the subtree under if this property holds. Note that this reasoning is purely categorical - because it is purely defined in terms of morphisms, it works equally well in ACSet categories as it works in **Grph**.

Attributes that come with an inherent ordering make this problem easier rather than harder, as the non-combinatorial data they provide can be used to distinguish elements during color saturation. All typical attributes have this property: `Int`, `Float`, `String`. In the color saturation example above, the `atomic_number` attribute would immediately allow us to distinguish the O₂ molecule from the H₂ molecule.

In the case of an attribute without an ordering to use, we can create a pseudo object with a number of elements equal to the number of distinct values found in the ACSet, e.g. if the attribute’s datatype were hypothetically not inherently ordered, we could create a pseudo object with three elements inside and then run the algorithm as normal. This is less preferable because it requires computational work to come up with a canonical labeling of the pseudo C-set object.

Viewing a particular piece of data in the context of a category provides us enough context to resolve tricky issues of determining what is effectively the same as what. Due to the generality of the applied category theory perspective, we automatically bypass the tedious process of defining `__eq__` and `__hash__` methods for each new data structure we invent.

This is an exciting development because it enables hashing C-sets, a crucial operation for many data structures (e.g. `Dict{ACSet, String}`) and algorithms. In particular, for dynamic programming algorithms, one can now incorporate C-sets into a state space and quickly check whether a state has already been seen. For example, suppose we have experimental data for H₂, O₂, and H₂O and hypothesize that `rxn₁` is the right model for this. An iterative algorithm could evaluate the proposed reaction against the data and propose modifications to `rxn₁` that would improve the quality of fit. This yields a search space of objects with intersecting paths. In such a scenario, we can greatly reduce the computations that need to be done by checking whether a newly proposed model is actually new when considered *up to isomorphism*.

Furthermore, the notion of a ‘random’ C-set instance crucially depends on which notion of equality we consider: picking a random instance of the C-set data structure (in contrast to picking a random C-set isomorphism class) will heavily overrepresent isomorphism classes with many elements. Using the isomorphism class notion of randomness is important in statistical mechanics applications and more closely matches what one means when discussing a random directed graph or a random Petri net.

As a final thought, the difficulty of implementing new algorithms often leads one to reduce one’s problem into the form of another problem which has a ready-made algorithm. Although the ACSet of reactions is a richer, more complex structure than a graph, it is possible to encode it *as a graph* such that existing Nauty code can be used to perform the computation. Call this “bringing the data to the algorithm”. This process takes time, can create large constant factors (by encoding rich structures in a simpler language), and prevents us from taking advantage of higher-order structure lost in the translation. We avoid these pitfalls by working at a higher level of abstraction: implementing an algorithm once for C-sets and letting Julia generate specialized code for each concrete data structure that is a C-set instance. In our experience, most graph algorithms (including complicated ones, such as Nauty) straightforwardly generalize to ACSets, making us optimistic about a future where we “bring the algorithm to the data”.

Hartke, Stephen G., and A. J. Radcliffe. 2009. “McKay’s Canonical Graph Labeling Algorithm.” American Mathematical Society. https://doi.org/10.1090/conm/479/09345.

McKay, Brendan D., and Adolfo Piperno. 2014. “Practical Graph Isomorphism, II.” *Journal of Symbolic Computation* 60 (January): 94–112. https://doi.org/10.1016/j.jsc.2013.09.003.

We recommend referring to this context-sensitive equality as ‘equivalence’, in contrast to ‘literal’ or ‘strict’ equality. As Eugenia Cheng advocates, “All equations are lies…or useless.”↩︎

The relation between bonds and atoms is taken from half-edge graphs, which encode symmetric relationships and also allow for the representation of ‘dangling bonds’.↩︎

For example, the graph with two vertices and an arrow going between them has two elements in the automorphism class. Because the underlying data of a C-set is essentially of type `Dict{String, Vector{Int}}`, we require a method to tell whether two elements of this type are less than, greater than, or equal to each other. One way to do this is to order the vectors (in this case, let’s use alphabetization to order the keys) in order to get two elements of type `Vector{Vector{Int}}`, which we can then compare.↩︎

Although the elements of **Molecule** (triangles), **Atom** (circles), and **Bond** (squares) live in different sets, we visualize them all in one graph via the category of elements construction.↩︎

For example, with 🔴🟡🟡🔴🟡🔴🔴, we branch on 🟡 and obtain three child nodes: 🔴🟢🟡🔴🟡🔴🔴, 🔴🟡🟢🔴🟡🔴🔴, and 🔴🟡🟡🔴🟢🔴🔴.↩︎


Every proposition is either true or false. This famous tautology is called the *law of excluded middle*. As the words “tautology” and “law” suggest, the law of excluded middle is often taken to be obviously true, and attempts to deny it are regarded as bizarre and esoteric. But it is easy enough to imagine everyday situations that challenge the obviousness of this principle. Suppose I loiter in the doorway of your office, with one foot in the room and the other in the hallway. Am I in your office, or not? The law of excluded middle says that we must regard exactly one of the two propositions as true, but the choice of which seems arbitrary. Such conundrums are typical of the “logic of space.”

In this post, we will see that any graph, and more generally any C-set, can serve as the domain of discourse of a logical system. Unless the schema satisfies certain special conditions, the logic will be intuitionistic, which means that the principle of excluded middle does not hold. Far from being paradoxical, such logics offer a natural and powerful language for reasoning about spaces whose parts are connected together.

To understand this post, it will be helpful to know the definition of a C-set (Part I) and of a morphism of C-sets (Part III). The textbook by Reyes, Reyes, and Zolfaghari, now out of print but freely available online, is recommended for further reading (M. L. P. Reyes, Reyes, and Zolfaghari 2004). Complete code for similar examples is available in the Catlab documentation.

Logic supplies rules for arguing that certain propositions hold, given other propositions that are assumed to hold. But not all propositions concern all things; a person might be “good at baseball,” but cannot be “prime”, and a number can be “prime”, but cannot be “good at baseball”. Thus, propositions are understood relative to a *domain of discourse*, the set of things of which the proposition may be true or false. Each proposition determines a subset of the domain of discourse, namely the subset of things of which the proposition is true. Conversely, any subset of the domain determines a proposition, the proposition “is in the subset.” So, once we have chosen a domain of discourse, propositions become interchangeable with subsets of the domain. Logical connectives then translate into set-theoretic operations; for example, conjunction (“and”) and disjunction (“or”) of propositions become intersection and union of subsets.
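The propositions-as-subsets dictionary can be made concrete with plain Julia sets (a toy illustration of our own; the domain and predicates are arbitrary):

```julia
# Propositions over a finite domain of discourse, represented as subsets;
# the logical connectives become set operations.
domain = Set(1:10)
prime  = Set([2, 3, 5, 7])      # the proposition "is prime"
even   = Set([2, 4, 6, 8, 10])  # the proposition "is even"

conj = intersect(prime, even)   # conjunction: "prime and even" is Set([2])
disj = union(prime, even)       # disjunction: "prime or even"
neg  = setdiff(domain, prime)   # classical negation: the complement

# In the Boolean world of sets, the law of excluded middle holds:
union(prime, neg) == domain     # true
```

The rest of the post replaces `Set` with sub-C-sets, where the complement construction is exactly what fails.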

A point of departure for categorical logic is to replace “set” or “domain” with “object in a category” and replace “subset” with “subobject.” What is a subobject? Given a category C, a **subobject** of an object X is a monomorphism A ↣ X. A monomorphism in C is, loosely speaking, a morphism in C that behaves like an injection in the category of sets. In particular, a subobject of a set X is an injective function A ↣ X, whose image picks out a subset of X.^{1}

We can now begin to generalize propositional logic from sets to $C$-sets, for any schema $C$. Fix a $C$-set $X$, which will serve as our domain of discourse. A **sub-$C$-set** of $X$ is a subobject of $X$, i.e., a monomorphism $A \rightarrowtail X$. As one might expect, a morphism of $C$-sets is monic if and only if every component is an injective function. Sub-$C$-sets of $X$ define propositions relative to the domain of discourse $X$.

Taking $C$ to be the terminal category recovers the classical world of sets, but other choices lead to less familiar logics. As usual, our running example will be graphs, where $C$ is the schema for graphs. A monomorphism $A \rightarrowtail X$ is then a graph homomorphism whose vertex and edge maps are injective, and a sub-$C$-set of $X$ is a **subgraph**.

Throughout the post, we will take the following graph to be our domain of discourse.

```
using Catlab.Theories, Catlab.CategoricalAlgebra, Catlab.Graphs
using Catlab.Graphics # hide
X = cycle_graph(Graph, 4) ⊕ path_graph(Graph, 2) ⊕ cycle_graph(Graph, 1)
add_edge!(X, 3, add_vertex!(X))
X
```

Here are two typical subgraphs, where membership in the subgraph is indicated by highlighting.

`A = Subobject(X, V=1:4, E=[1,2,4])`

`B = Subobject(X, V=[2,3,4,7,8], E=[2,3,6,7])`

Notice that an edge can belong to a subgraph only if both its source and target vertices do.

We have claimed that a $C$-set can serve as a domain of discourse and sub-$C$-sets as propositions about the domain, but we won’t have a logical system until we have a few logical connectives, such as conjunction and disjunction. How should we go about defining them? In classical logic, the logical connectives are usually taken as primitive, given by something like pure intuition, but in our generalized logic it is not as obvious how to proceed. It was the insight of Lawvere that logical operations can be derived systematically by taking adjoints of still more primitive operations. This procedure can be carried out in any category for which the needed adjoints exist.

To set this up, we need to explain how the subobjects of a given object form a preorder and how to compute adjoints in a preorder. Given an object $X$ in a category $C$, let $\mathrm{Sub}(X)$ denote the set of all subobjects of $X$. A morphism from a subobject $m: A \rightarrowtail X$ to a subobject $n: B \rightarrowtail X$ is a morphism $f: A \to B$ in $C$ such that $m = f \cdot n$. Since $n$ is monic, there exists at most one such morphism $f$, and since $m$ is monic, $f$ is monic when it exists. With this definition, $\mathrm{Sub}(X)$ becomes a thin category or preorder.^{2} We write $A \leq B$ when there exists a (unique) morphism from $A$ to $B$ in $\mathrm{Sub}(X)$. The logical interpretation of $A \leq B$ is that the proposition $A$ implies the proposition $B$.

Adjointness is a fundamental concept of category theory that is easily understood in the special case of preorders. An **adjunction** between two preorders $P$ and $Q$ consists of a pair of monotone maps $L: P \to Q$ and $R: Q \to P$ such that for any objects $x \in P$ and $y \in Q$, one has $L(x) \leq y$ if and only if $x \leq R(y)$. The “if and only if” statement is sometimes denoted by a vertical bar:

$$\frac{L(x) \leq y}{x \leq R(y)}$$

One says that $L$ is **left adjoint** to $R$, or that $R$ is **right adjoint** to $L$, and writes $L \dashv R$. Left and right adjoints are unique up to isomorphism when they exist; when the preorder is a poset, they are unique without qualification. For more about adjunctions in preorders and posets, see (Fong and Spivak 2018, chap. 1; M. L. P. Reyes, Reyes, and Zolfaghari 2004, chap. 7).

Any preorder whatsoever supports the basic operations of duplication and deletion. Let $P \times P$ be the cartesian product of the preorder $P$ with itself, so that $(x, y) \leq (x', y')$ if and only if $x \leq x'$ and $y \leq y'$, and let $1$ be the unique preorder with a single element $*$. Then the **diagonal** or **duplication** map $\Delta: P \to P \times P$ is defined by $\Delta(x) = (x, x)$, and the **deletion** map $!: P \to 1$ is defined by $!(x) = *$. Starting from these seemingly trivial operations, we shall derive all the connectives of propositional logic.

Letting $P = \mathrm{Sub}(X)$ be the preorder of subobjects of $X$, **conjunction** of propositions, a binary operation $\wedge: \mathrm{Sub}(X) \times \mathrm{Sub}(X) \to \mathrm{Sub}(X)$, is defined to be the right adjoint of the duplication map $\Delta$. For any subobjects $A$, $B$, $C$, we have the equivalences:

$$\frac{\Delta(C) = (C, C) \leq (A, B)}{C \leq A \wedge B}$$

So $C \leq A \wedge B$ if and only if $C$ is a lower bound of $A$ and $B$, which means that $A \wedge B$ is a *greatest lower bound* or *meet* of $A$ and $B$. In logical terms, the conjunction of $A$ and $B$ is the weakest proposition that implies both $A$ and $B$.

The **true** proposition, a constant $\top \in \mathrm{Sub}(X)$, is defined to be the right adjoint of the deletion map $!: \mathrm{Sub}(X) \to 1$. For any subobject $A$, we have:

$$\frac{!(A) = * \leq *}{A \leq \top}$$

So any subobject $A$ is a lower bound of $\top$, meaning that $\top$ is a *maximal* or *top* element. In logical terms, $\top$ is the proposition that is always true, i.e., the weakest of all propositions.

Dually, **disjunction** $\vee$ of propositions and the **false** proposition $\bot$ are defined to be left adjoint to the duplication and deletion maps, respectively, making $A \vee B$ the *least upper bound* or *join* of $A$ and $B$ and $\bot$ a *minimal* or *bottom* element. In logical terms, the disjunction of $A$ and $B$ is the strongest proposition that is implied by both $A$ and $B$, whereas $\bot$ is the proposition that is always false, i.e., the strongest of all propositions.
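Written out in the adjunction notation used above, the dual characterizations of join and bottom read:

```latex
\frac{A \vee B \leq C}{(A, B) \leq \Delta(C) = (C, C)}
\qquad\qquad
\frac{\bot \leq A}{* \leq \,!(A) = *}
```

The right-hand condition in the second display always holds, so $\bot \leq A$ for every subobject $A$.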

In the logic of sub-$C$-sets, all four operations have simple formulas. For any object $c$ of $C$ and sub-$C$-sets $A$ and $B$ of $X$, conjunction and disjunction satisfy

$$(A \wedge B)(c) = A(c) \cap B(c), \qquad (A \vee B)(c) = A(c) \cup B(c),$$

where, by a common abuse of notation, we are identifying a monomorphism into $X$ with its image in $X$. Similarly, true and false are the full sub-$C$-set and the empty sub-$C$-set, respectively. As a concrete example, the conjunction and disjunction of the two subgraphs $A$ and $B$ defined above are:

`A ∧ B`

`A ∨ B`

The logical operations introduced so far have straightforward pointwise formulas reducing them to classical logic, but things become more interesting when we turn to implication and negation.

The left adjoint of conjunction is, by definition, the diagonal, but we can still ask whether conjunction has a right adjoint. To be more precise, fixing a subobject $A$, we ask whether conjunction with $A$, a map $A \wedge -: \mathrm{Sub}(X) \to \mathrm{Sub}(X)$, has a right adjoint. The answer is yes whenever $X$ is a $C$-set. The adjoint operation is called **implication** and is denoted $A \Rightarrow -$. By definition, implication is characterized by the equivalence

$$\frac{A \wedge B \leq C}{B \leq (A \Rightarrow C)}$$

for all $B, C \in \mathrm{Sub}(X)$. Logically, $A \Rightarrow C$ is the weakest proposition whose conjunction with $A$ implies $C$. But recall that $A \leq C$, or the existence of a morphism $A \to C$ in $\mathrm{Sub}(X)$, already means that “$A$ implies $C$.” Thus, the proposition $A \Rightarrow C$ can be seen as “internalizing” implication within the logic itself. Indeed, implication is an example of an internal hom in a cartesian monoidal category, although that is slightly beyond the scope of this article.

As in classical logic, negation can be defined using implication: the **negation** of $A$ is $\neg A := (A \Rightarrow \bot)$. Using the adjunction, we have:

$$\frac{A \wedge B \leq \bot}{B \leq \neg A}$$

So $\neg A$ is the largest subobject whose meet with $A$ is empty, or in logical terms, the weakest proposition that is inconsistent with $A$.

Unlike conjunction or disjunction, implication and negation of sub-$C$-sets generally cannot be computed by pointwise Boolean operations. Suppose we tried to define the negation of a subgraph $A$ by including an element in $\neg A$ if and only if it is not included in $A$. In our running example of the graph $X$ and subgraph $A$, we would include edge 3 of $X$ in $\neg A$ but not include its source or target vertices, which does not define a valid subgraph.

Instead, implication and negation of sub-$C$-sets satisfy the formulas

$$x \in (A \Rightarrow B)(c) \iff \text{for all } f: c \to c' \text{ in } C,\ (Xf)(x) \in A(c') \text{ implies } (Xf)(x) \in B(c')$$

$$x \in (\neg A)(c) \iff \text{for all } f: c \to c' \text{ in } C,\ (Xf)(x) \notin A(c')$$

for every object $c$ of $C$ and element $x \in X(c)$ (M. L. P. Reyes, Reyes, and Zolfaghari 2004, Proposition 9.1.5). The reader is encouraged to write out these formulas explicitly in the case of graphs. As concrete examples, here are $A \Rightarrow B$ and $\neg A$ for the subgraphs $A$ and $B$ defined above:

`A ⟹ B`

`¬A`
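Writing out the negation formula in the case of graphs, as the text invites, where the only non-identity morphisms out of $E$ in the schema are $\mathrm{src}$ and $\mathrm{tgt}$, gives for a subgraph $A$ of $X$:

```latex
v \in (\neg A)_V \iff v \notin A_V,
\qquad
e \in (\neg A)_E \iff e \notin A_E
  \ \text{and}\ \mathrm{src}(e) \notin A_V
  \ \text{and}\ \mathrm{tgt}(e) \notin A_V.
```

This matches the earlier observation: edge 3 of $X$ is excluded from $\neg A$, even though it is not in $A$, because its endpoints lie in $A$.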

Negation of sub-$C$-sets is conservative in the sense that an element $x$ is included in $\neg A$ just when no element *reachable* from $x$ is included in $A$. This suggests that there should be a more liberal notion of negation, say $\mathord{\sim} A$, such that an element $x$ is included in $\mathord{\sim} A$ just when $x$ is reachable from some element not in $A$. We can construct the “other negation” dually to negation: as the two-fold left adjoint to the diagonal map, rather than the two-fold right adjoint.

Fix a subobject $A$ and consider disjunction with $A$, which is a map $A \vee -: \mathrm{Sub}(X) \to \mathrm{Sub}(X)$. When $X$ is a $C$-set, this map has a left adjoint, called **subtraction** and denoted $- \setminus A$. Subtraction is characterized by the equivalence

$$\frac{C \setminus A \leq B}{C \leq A \vee B}$$

for all $B, C \in \mathrm{Sub}(X)$. In logical terms, $C \setminus A$ is the strongest proposition whose disjunction with $A$ is implied by $C$. The **complement** of $A$ is then defined by $\mathord{\sim} A := (\top \setminus A)$. By the adjunction, we have:

$$\frac{\top \setminus A \leq B}{\top \leq A \vee B}$$

So $\mathord{\sim} A$ is the smallest subobject whose join with $A$ is all of $X$, or in logical terms, the strongest proposition whose disjunction with $A$ is a tautology.

Following Lawvere, we read $\neg A$ as “not-$A$” and $\mathord{\sim} A$ as “non-$A$”. For any subobject $A$, we have $\neg A \leq \mathord{\sim} A$, i.e., the complement of $A$ is at least as large as the negation of $A$.

As hinted above, subtraction and complement of sub-$C$-sets obey the formulas

$$x \in (A \setminus B)(c) \iff \text{there exist } f: c' \to c \text{ in } C \text{ and } x' \in A(c') \setminus B(c') \text{ with } (Xf)(x') = x$$

$$x \in (\mathord{\sim} A)(c) \iff \text{there exist } f: c' \to c \text{ in } C \text{ and } x' \notin A(c') \text{ with } (Xf)(x') = x$$

for all objects $c$ of $C$ and $x \in X(c)$. In our running example, the subgraphs $A \setminus B$ and $\mathord{\sim} A$ are:

`A \ B`

`~A`
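Specializing the complement formula to graphs, in the same way as for negation, gives:

```latex
e \in (\mathord{\sim} A)_E \iff e \notin A_E,
\qquad
v \in (\mathord{\sim} A)_V \iff v \notin A_V
  \ \text{or}\ v \text{ is an endpoint of some edge } e \notin A_E.
```

In particular, $\mathord{\sim} A$ may contain vertices of $A$, namely those incident to an edge outside $A$, which is exactly why the boundary $A \wedge \mathord{\sim} A$ below can be nonempty.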

The principle of excluded middle says that every proposition is either true or false: $A \vee \neg A = \top$ for any proposition $A$. What else could it be? But as we saw at the beginning of this post, if there is connectivity between the parts of our domain of discourse—if I can loiter in the doorway of your office, not quite in but not quite out—then the law of excluded middle fails. We can now be more precise about how this happens. A graph satisfies the principle of excluded middle if and only if it is **discrete**, meaning that it has no edges whatsoever. A single subgraph $A$ satisfies $A \vee \neg A = \top$ precisely when $A$ is disconnected from the rest of the graph, meaning that there are no edges connecting $A$ with its exterior. Since your office is connected to the rest of the building through the doorway, the proposition “I am in your office” will not satisfy the law of excluded middle.

The two negations of the logic of subgraphs let us answer the question of what exactly the law of excluded middle is excluding. First, we can dualize the law of excluded middle to use the complement: instead of asking whether $A \vee \neg A = \top$, we could ask whether $A \wedge \mathord{\sim} A = \bot$. It turns out that one of these conditions holds for all $A$ if and only if the other does. The latter condition, $A \wedge \mathord{\sim} A = \bot$, is a nice formulation of the law of excluded middle because it excludes a proposition (asks it to be false). The “middle” being excluded here is the intersection of $A$ and its complement $\mathord{\sim} A$. Lawvere calls this the **intrinsic boundary** of $A$:

$$\partial A := A \wedge \mathord{\sim} A.$$

The law of excluded middle can then be reformulated as: every subobject has an empty boundary.

For graphs, the boundary $\partial A$ of a subgraph $A$ consists of all vertices in $A$ that are connected to the outside of $A$.

```
∂(A) = A ∧ ~A
∂(A)
```

The logic of subgraphs offers a powerful language for constructing new subgraphs from old. The boundary operator is one example; let us see a few others.

If $A$ is a subgraph, then $\neg\neg A$ is the subgraph **induced** by $A$: the subgraph having the same vertices as $A$ but containing all the edges of the ambient graph between those vertices.

```
C = Subobject(X, V=1:4)
¬(¬C)
```

The expansion and contraction operators are also useful (G. E. Reyes and Zolfaghari 1996). The subgraph $\mathord{\sim}(\neg A)$ is $A$ but expanded by one degree outward: it includes all the edges that are incident to a vertex in $A$.

`~(¬A)`

Dually, $\neg(\mathord{\sim} A)$ is $A$ but contracted by one degree, containing exactly the edges between vertices in $A$ that are not connected to the outside of $A$.

`¬(~A)`

Iterating these two operations will expand or contract $A$ until there is nothing left to add or remove, because the resulting subgraph is disconnected from the rest of the graph.

This post has introduced the propositional logic of a bi-Heyting topos, whose primary example here is the topos of graphs, or any other topos of $C$-sets. We have seen that, while it may initially appear exotic, this generalized logic provides a natural language for reasoning about domains of discourse that exhibit connectivity. The topos of sets, a Boolean topos, exemplifies the very special case of a totally disconnected domain, where the two negations coincide and the principle of excluded middle holds.

The logic explored here is relative to a fixed domain of discourse. What if we wish to change the domain of discourse, say from one graph to another? Pursuing this line of thought ultimately leads to extending the propositional logic of a bi-Heyting topos to a full predicate logic, having existential and universal quantifiers. That will be the topic of a future post.

Fong, Brendan, and David I Spivak. 2018. “Seven Sketches in Compositionality: An Invitation to Applied Category Theory.”

Johnstone, Peter T. 2002. *Sketches of an Elephant: A Topos Theory Compendium, Volume 1*. Oxford, England: Clarendon Press.

Reyes, Gonzalo E., and Houman Zolfaghari. 1996. “Bi-Heyting Algebras, Toposes and Modalities.” *Journal of Philosophical Logic* 25 (1): 25–43. https://doi.org/10.1007/bf00357841.

Reyes, Marie La Palme, Gonzalo E. Reyes, and Houman Zolfaghari. 2004. “Generic Figures and Their Glueings: A Constructive Approach to Functor Categories.” In. https://marieetgonzalo.files.wordpress.com/2004/06/generic-figures.pdf.

^{1} To make a stronger analogy between subsets and subobjects, many authors define a subobject of an object to be an *isomorphism class* of monomorphisms into it. In practice, one tends to move freely between the two definitions as is convenient (Johnstone 2002, sec. A1.3).

^{2} If one passes to isomorphism classes of subobjects, the preorder $\mathrm{Sub}(X)$ becomes a poset.


Continuing our tour of the many flavors of graphs and their manifestation as $C$-sets, we discuss reflexive graphs, a seemingly minor variant of graphs that nonetheless has some interesting and distinctive features. A reflexive graph is a graph where every vertex has a distinguished self-loop, like this:

At first glance, the self-loops may seem an inconsequential addition. It is even customary to omit them from drawings, making reflexive graphs more or less interchangeable with graphs as *objects*. However, the *morphisms* of reflexive graphs differ significantly from those of graphs, and as a result the category of reflexive graphs behaves differently than the category of graphs. We will see that reflexive graphs are more “geometric” than graphs in various ways. For example, if two graphs discretize continuous spaces, then their product will discretize the product space only if the graphs are reflexive.

The story of reflexive graphs will give us an opportunity to discuss morphisms of $C$-sets generally, which we have not yet done in this series. To understand this post, it will be helpful to have read Part I. It is not necessary to have read Part II, which goes in a different direction.

A **reflexive graph** is a graph together with a function $\mathrm{refl}: V \to E$ that assigns to each vertex $v$ a self-loop $\mathrm{refl}(v)$ at $v$. In other words, a reflexive graph is a $C$-set, where $C$, the **schema for reflexive graphs**, is the category generated by the morphisms $\mathrm{src}, \mathrm{tgt}: E \to V$ and $\mathrm{refl}: V \to E$, subject to the equations $\mathrm{refl} \cdot \mathrm{src} = 1_V$ and $\mathrm{refl} \cdot \mathrm{tgt} = 1_V$.

In Catlab, the schema for reflexive graphs is:

```
@present SchReflexiveGraph <: SchGraph begin
  refl::Hom(V,E)
  compose(refl, src) == id(V)
  compose(refl, tgt) == id(V)
end
```

Similarly, a **symmetric reflexive graph** is a symmetric graph together with a function that assigns to each vertex $v$ a self-loop at $v$, which is fixed under the edge involution. Both reflexive graphs and symmetric reflexive graphs are included in the Catlab module for graphs (`Catlab.Graphs`).

The key difference between graphs and reflexive graphs lies in their morphisms, so let us review the notion of $C$-set homomorphism.

Given a small category $C$, a **homomorphism**, or simply a **morphism**, of $C$-sets $X$ and $Y$ is a natural transformation $\alpha: X \to Y$ between the functors $X, Y: C \to \mathbf{Set}$. Thus, a homomorphism consists of a function $\alpha_c: X(c) \to Y(c)$ for each object $c$ of $C$, such that for every morphism $f: c \to c'$ in $C$, the naturality square

$$\begin{array}{ccc} X(c) & \xrightarrow{\alpha_c} & Y(c) \\ {\scriptstyle X(f)}\downarrow & & \downarrow{\scriptstyle Y(f)} \\ X(c') & \xrightarrow{\alpha_{c'}} & Y(c') \end{array}$$

commutes. The functions $\alpha_c$ are called the **components** of the transformation.

The $C$-sets and $C$-set homomorphisms then form a category, suggestively denoted $C\text{-}\mathbf{Set}$, where homomorphisms are composed componentwise and the identity homomorphism has each component an identity function. A homomorphism of $C$-sets is an **isomorphism** if it has an inverse in $C\text{-}\mathbf{Set}$, which turns out to be equivalent to each component being an invertible function (Awodey 2010, Lemma 7.11).

The schema for graphs has only two objects, $V$ and $E$, and two morphisms, $\mathrm{src}$ and $\mathrm{tgt}$, so a graph homomorphism $\alpha: G \to H$ consists of a vertex map $\alpha_V$ and an edge map $\alpha_E$ that preserve the source and target vertices:

$$\alpha_V(\mathrm{src}(e)) = \mathrm{src}(\alpha_E(e)), \qquad \alpha_V(\mathrm{tgt}(e)) = \mathrm{tgt}(\alpha_E(e)).$$

When the target graph is a simple directed graph (has at most one directed edge between any two vertices), the edge map is completely determined by the vertex map and we recover the usual notion of simple graph homomorphism. In general, though, the edge map is a nonredundant part of the data of a graph homomorphism.
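To see this concretely, here is a minimal sketch in plain Julia (the `Gr` struct and `is_hom` function are ad hoc illustrations, not Catlab's API) of the homomorphism condition, applied to a graph with parallel edges:

```julia
# A graph as parallel vectors of edge sources and targets.
struct Gr
    src::Vector{Int}
    tgt::Vector{Int}
end

# Check the naturality condition: the vertex map αV and edge map αE
# must preserve sources and targets of every edge.
is_hom(g::Gr, h::Gr, αV, αE) =
    all(αV[g.src[e]] == h.src[αE[e]] && αV[g.tgt[e]] == h.tgt[αE[e]]
        for e in eachindex(g.src))

# Two parallel edges from vertex 1 to vertex 2 in both g and h.
g = Gr([1, 1], [2, 2])
h = Gr([1, 1], [2, 2])

# The same vertex map underlies two distinct homomorphisms,
# differing only in where they send the edges:
is_hom(g, h, [1, 2], [1, 2])  # true
is_hom(g, h, [1, 2], [2, 1])  # also true
```

Both edge maps satisfy the condition, so the edge map genuinely carries extra data once parallel edges are present.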

A reflexive graph homomorphism is a graph homomorphism that additionally preserves the map $\mathrm{refl}$, hence sends reflexive loops to reflexive loops. But there is no requirement that *only* reflexive loops be sent to reflexive loops. This has the effect of allowing reflexive graph homomorphisms to “collapse” edges onto vertices, which is not allowed in graph homomorphisms unless the target graph happens to have self-loops “by accident.” On the other hand, there is no appreciable difference between isomorphisms of graphs and isomorphisms of reflexive graphs, since the latter can only send reflexive loops to reflexive loops.

The following computational example illustrates this contrast. For any finitely presented category $C$, the functions `homomorphism` and `homomorphisms` in Catlab find one or all homomorphisms between two $C$-sets.^{1} For example, the graphs

```
using Catlab.Graphs, Catlab.CategoricalAlgebra
g = path_graph(Graph, 3)
```

and

```
h = Graph(3)
add_edges!(h, [1,2], [3,3])
h
```

are not homomorphic:

`isempty(homomorphisms(g, h))`

`true`

There are seven different homomorphisms between the corresponding reflexive graphs, whose vertex maps are shown in the table below. The rows are homomorphisms $\alpha: g \to h$, the columns are vertices of $g$, and the cells are the assigned vertices of $h$.

```
using DataFrames
g = path_graph(ReflexiveGraph, 3)
h = ReflexiveGraph(3)
add_edges!(h, [1,2], [3,3])
αs = homomorphisms(g, h)
df = rename!(DataFrame(Tuple(α[:V]) for α in αs), ["v$i" for i in 1:nv(g)])
```

```
7×3 DataFrame
α │ v1 v2 v3
───┼────────────
1 │ 1 1 1
2 │ 1 1 3
3 │ 1 3 3
4 │ 2 2 2
5 │ 2 2 3
6 │ 2 3 3
7 │ 3 3 3
```

A reflexive graph homomorphism can always collapse the entirety of $g$ onto a single vertex of $h$, which accounts for three of the homomorphisms. Of the remaining four, two homomorphisms collapse vertices 1 and 2 of $g$ and the other two collapse vertices 2 and 3. Although reflexive graph homomorphism is clearly a weaker notion than graph homomorphism, it should also be clear that it still imposes significant constraints: there are $3^3 = 27$ vertex maps from $g$ to $h$, but only 7 of them extend to homomorphisms.
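We can sanity-check the count of 7 in plain Julia (this enumeration is an ad hoc check, not Catlab's API). For these reflexive graphs, a vertex map extends to a homomorphism exactly when each edge of $g$ is sent to an edge of $h$ or collapsed onto a single vertex, using a reflexive loop:

```julia
# g is the path 1 → 2 → 3; h has non-loop edges 1 → 3 and 2 → 3.
# A pair (a, b) is "allowed" if the edge can map to a reflexive loop (a == b)
# or to one of h's non-loop edges.
allowed(a, b) = a == b || (a, b) == (1, 3) || (a, b) == (2, 3)

# Enumerate all 3^3 = 27 vertex maps and count those respecting both edges.
n_homs = count(t -> allowed(t[1], t[2]) && allowed(t[2], t[3]),
               Iterators.product(1:3, 1:3, 1:3))
n_homs  # == 7
```

The result agrees with the table computed by `homomorphisms` above.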

When graphs are viewed as presentations of metric spaces, reflexive graph homomorphisms fit better with the metric geometry than graph homomorphisms. Any graph or reflexive graph generates a Lawvere metric space^{2} where the distance between two vertices $x$ and $y$ is the length of the shortest path from $x$ to $y$. Perhaps the most natural notion of morphism between metric spaces is the nonexpansive maps: a map $f: X \to Y$ between Lawvere metric spaces is **nonexpansive** if $d(f(x), f(x')) \leq d(x, x')$ for all $x, x' \in X$. Every graph or reflexive graph homomorphism restricts to a nonexpansive map on the induced metric spaces. However, a nonexpansive map generally extends to a homomorphism only in the case of reflexive graphs (Hell and Nesetril 2004, 64). In other words, the functor from reflexive graphs to Lawvere metric spaces is full, but the functor from graphs is not.
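To make the induced metric concrete, here is a sketch in plain Julia (the `graph_dist` function and the source/target-vector representation are ad hoc, not Catlab's API) computing shortest-path distances by breadth-first search:

```julia
# Lawvere metric generated by a directed graph: d(u, v) is the length of the
# shortest directed path from u to v, or "infinity" (typemax) if none exists.
function graph_dist(src::Vector{Int}, tgt::Vector{Int}, nv::Int, u::Int, v::Int)
    INF = typemax(Int)
    dist = fill(INF, nv)
    dist[u] = 0
    queue = [u]
    while !isempty(queue)
        w = popfirst!(queue)
        for e in eachindex(src)
            if src[e] == w && dist[tgt[e]] == INF
                dist[tgt[e]] = dist[w] + 1
                push!(queue, tgt[e])
            end
        end
    end
    dist[v]
end

# Path graph 1 → 2 → 3: the metric is neither symmetric nor finite.
graph_dist([1, 2], [2, 3], 3, 1, 3)  # == 2
graph_dist([1, 2], [2, 3], 3, 3, 1)  # == typemax(Int), i.e. "∞"
```

This illustrates the footnoted point that the metric generated by a non-symmetric graph need not be symmetric or finite.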

In a category of $C$-sets, the cartesian product, or categorical product, is computed by taking the cartesian product of sets “pointwise” with respect to the objects of $C$. Specifically, the binary product of $C$-sets $X$ and $Y$ has elements $(X \times Y)(c) = X(c) \times Y(c)$ for every object $c$, and it acts on a morphism $f$ of $C$ componentwise, by $(X \times Y)(f) = X(f) \times Y(f)$. In particular, the product of graphs $G$ and $H$, possibly reflexive or symmetric, has vertex set $V(G) \times V(H)$ and edge set $E(G) \times E(H)$.

Let’s look at the product of the path graph with itself in four different categories. In the category of graphs, this product is:

```
n = 5
path = path_graph(Graph, n)
path2 = ob(product(path, path))
```

In the category of reflexive graphs, it is:

```
path = path_graph(ReflexiveGraph, n)
path2 = ob(product(path, path))
```

Note that the reflexive loops are omitted from the drawing. Although it is somewhat obscured by the drawings, the disconnected paths in the graph product correspond to the vertical paths in the reflexive graph product. The remaining edges in the latter product come from pairs $(e, e')$ where at least one of $e$ and $e'$ is a reflexive loop.

In the category of symmetric graphs, the product of a path graph with itself looks like:

```
path = path_graph(SymmetricGraph, n)
path2 = ob(product(path, path))
```

Restricted to simple undirected graphs, graph theorists call this the **direct product** or **tensor product**. The reader should consider why this example has two connected components. (It is related to the fact that the path graph is bipartite.^{3}) Finally, in the category of symmetric reflexive graphs, the product is:

```
path = path_graph(SymmetricReflexiveGraph, n)
path2 = ob(product(path, path))
```

Graph theorists call this the **strong product**. The two components in the previous drawing should be imagined superimposed on this one, comprising the horizontal and vertical paths.

Graphs and reflexive graphs also differ with respect to their generalized elements. In a category with a terminal object $1$, a **generalized element** of an object $X$ is a morphism $1 \to X$. Note that terminal objects are the nullary case of cartesian products. In a category of $C$-sets, the terminal object has a singleton set at every object of $C$. In particular, the terminal graph is a single self-loop. This means that a graph with no self-loops has no generalized elements:

```
I = ob(terminal(Graph))
g = path_graph(Graph, 5)
isempty(homomorphisms(I, g))
```

`true`

On the other hand, the generalized elements of a reflexive graph correspond exactly to its vertices, as one might expect.

```
I = ob(terminal(ReflexiveGraph))
g = path_graph(ReflexiveGraph, 5)
αs = homomorphisms(I, g)
df = DataFrame((v1=α[:V](1),) for α in αs)
```

```
5×1 DataFrame
α │ v1
───┼────
1 │ 1
2 │ 2
3 │ 3
4 │ 4
5 │ 5
```

Thus, reflexive graphs have a more geometric notion of “point” than graphs.

None of the four products considered above corresponds to what graph theorists have traditionally considered the standard product of graphs. That would be the *box product*.^{4} From the categorical viewpoint, the box product is a bit curious, and its generalization to categories is affectionately called the funny tensor product by category theorists.

Given a reflexive graph $g$, let $g_0$ be the discrete reflexive graph on the vertices of $g$, whose only edges are the reflexive loops. The **box product** of reflexive graphs $g$ and $h$, denoted $g \,\square\, h$, is defined as the pushout of the span $g \times h_0 \leftarrow g_0 \times h_0 \rightarrow g_0 \times h$ in the category of reflexive graphs:

Pushouts in a category generalize unions of sets. It is beyond the scope of this post to precisely describe pushouts of $C$-sets, but Catlab can compute them, so we can at least implement the box product and verify that it gives the desired result on path graphs.

```
using Catlab.Theories
function box_product(g::T, h::T) where T <: ACSet
  g₀, h₀ = T(nv(g)), T(nv(h))
  incl_g = CSetTransformation((V=vertices(g), E=refl(g)), g₀, g)
  incl_h = CSetTransformation((V=vertices(h), E=refl(h)), h₀, h)
  proj_g₀, proj_h₀ = product(g₀, h₀)
  ob(pushout(pair(proj_g₀ ⋅ incl_g, proj_h₀),
             pair(proj_g₀, proj_h₀ ⋅ incl_h)))
end
path = path_graph(ReflexiveGraph, n)
path2 = box_product(path, path)
```

In the category of symmetric reflexive graphs, the box product of path graphs looks like:

```
path = path_graph(SymmetricReflexiveGraph, n)
path2 = box_product(path, path)
```

Restricting the box product of symmetric reflexive graphs to a monoidal product of simple undirected graphs, we recover the box product as understood by graph theorists.

A fundamental principle of modern mathematics, and especially of category theory, is that mathematical objects should be studied through their morphisms. This principle has been exemplified by the contrast between graphs and reflexive graphs, which appear similar as objects but form importantly different categories. As we have seen, reflexive graphs are closer to metric spaces and conform better with geometric intuition than graphs. For further reading, a topos-theoretic perspective is offered by Lawvere, who argues that the topos of reflexive graphs satisfies some axioms making it a “topos of spaces” but the topos of graphs does not (Lawvere 1986). Of course, in settings far removed from geometry, ordinary graphs may be preferred for their simplicity.

Awodey, Steve. 2010. *Category Theory*. 2nd ed. Oxford Logic Guides. London, England: Oxford University Press.

Hammack, Richard, Wilfried Imrich, and Sandi Klavžar. 2011. *Handbook of Product Graphs*. CRC Press. https://doi.org/10.1201/b10959.

Hell, Pavol, and Jaroslav Nesetril. 2004. *Graphs and Homomorphisms*. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780198528173.001.0001.

Lawvere, F. William. 1973. “Metric Spaces, Generalized Logic, and Closed Categories.” *Rendiconti Del Seminario Matematico e Fisico Di Milano* 43 (1): 135–66. https://doi.org/10.1007/bf02924844.

———. 1986. “Categories of Spaces May Not Be Generalized Spaces as Exemplified by Directed Graphs.” In. http://tac.mta.ca/tac/reprints/articles/9/tr9abs.html.

^{1} The homomorphism finding procedure in Catlab uses backtracking search, which, despite being fairly simple, is still much faster than brute-force search. A future post will discuss the general correspondence between finding $C$-set homomorphisms and solving constraint satisfaction problems.

^{2} A **Lawvere metric space** is like a metric space except that the metric need not be finite, symmetric, or separate points (Lawvere 1973). The Lawvere metric generated by a graph will not be symmetric unless the graph is symmetric and will not be finite unless the graph is strongly connected.

^{3} In fact, the tensor product of two connected, bipartite, undirected graphs always has exactly two connected components (Hammack, Imrich, and Klavžar 2011, Theorem 5.9).

^{4} Some graph theorists call the box product the “cartesian product,” but this terminology should be avoided because it is not the categorical product in the category of simple undirected graphs or, as far as I know, in any category of graph-like objects.


In this series of posts on open dynamical systems, we explore various theories of composition for open dynamical systems (shown in the diagram below) and use these theories to build complex systems out of primitive ones. In the previous post, we explored directed theories of composition (shown in purple). In this post, we will focus on undirected theories of composition (shown in blue) and give examples of composing dynamical systems using the AlgebraicDynamics.jl package.

Careful bookkeeping is endemic to implementing a complicated composition pattern. Previously we applied pullback functorial data migration in order to construct complex composition patterns more easily and with less room for error. In this post, we give two more strategies, which both rely on the rich structure of the composition syntax:

1. **Hierarchical composition**, in which a primitive system can itself be a composition of systems. Hierarchical composition is extremely useful when we want to divide and conquer or when we want to build upon a composite system without starting from scratch. This strategy is only available in open composition syntaxes.
2. **Taking (co)limits of composition patterns.** This strategy is available in all of the composition syntaxes discussed so far.

We will demonstrate these two strategies as we construct a complex ecosystem with six species: rabbits, foxes, hawks, little fish, big fish, and sharks.

First we will apply hierarchical composition so that we don’t have to simultaneously study all of the interactions between the six species. We will separately model the three sub-ecosystems — land, air, and water — which may themselves be compositions of even more primitive systems. Once we have constructed these sub-systems, we will compose them according to the following pattern:

```
eco_pattern = @relation (rabbits, foxes, hawks, littlefish, bigfish, sharks) begin
  land_eco(rabbits, foxes, hawks)
  air_eco(hawks, littlefish)
  ocean_eco(littlefish, bigfish, sharks)
end
```

In other words, there aren’t two independent hawk populations for the land and air sub-ecosystems. So in the final model we identify the hawk population from the land ecosystem and the hawk population from the air ecosystem. Likewise, for little fish.

Let’s zoom in on the land ecosystem. We have five primitive systems which share variables:

1. rabbit growth: $\dot{r} = \alpha_1 r$
2. rabbit/fox predation: $\dot{r} = -\beta_1 r f$, $\dot{f} = \gamma_1 r f$
3. fox decline: $\dot{f} = -\delta_1 f$
4. rabbit/hawk predation: $\dot{r} = -\beta_2 r h$, $\dot{h} = \gamma_2 r h$
5. hawk decline: $\dot{h} = -\delta_2 h$
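When resource sharers are composed, shared variables are identified and their dynamics summed, so the composed land ecosystem should be the familiar two-predator Lotka–Volterra system (this display is a consequence of the primitive systems' definitions, written out here for concreteness):

```latex
\dot{r} = \alpha_1 r - \beta_1 r f - \beta_2 r h,
\qquad
\dot{f} = \gamma_1 r f - \delta_1 f,
\qquad
\dot{h} = \gamma_2 r h - \delta_2 h.
```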

Therefore, the desired composition pattern has five boxes and many ports and wires to keep track of. Instead of implementing this composition pattern by hand, we construct it as a pushout of two simpler composition patterns — one which represents the rabbit/fox interactions and one which represents the rabbit/hawk interactions.

For the rabbit/fox composition pattern note that there are not two independent rabbit populations — one that grows and one that gets eaten by foxes. Likewise, there are not two independent fox populations — one that declines and one that feasts on rabbits. To capture these interactions, we identify the two rabbit populations and identify the two fox populations via the following composition pattern.

```
rabbitfox_pattern = @relation (rabbits, foxes) begin
  rabbit_growth(rabbits)
  rabbit_fox_predation(rabbits, foxes)
  fox_decline(foxes)
end
```

The rabbit/hawk composition pattern is identical.

```
rabbithawk_pattern = @relation (rabbits, hawks) begin
  rabbit_growth(rabbits)
  rabbit_hawk_predation(rabbits, hawks)
  hawk_decline(hawks)
end
```

Now we construct the complete composition pattern for the land ecosystem by gluing these two composition patterns along the box corresponding to rabbit growth.

```
using Catlab.CategoricalAlgebra
# Define the composition pattern for rabbit growth
rabbit_pattern = @relation (rabbits,) -> rabbit_growth(rabbits)
# Define transformations between the composition patterns
rabbitfox_transform = ACSetTransformation((Box=[1], Junction=[1], Port=[1], OuterPort=[1]), rabbit_pattern, rabbitfox_pattern)
rabbithawk_transform = ACSetTransformation((Box=[1], Junction=[1], Port=[1], OuterPort=[1]), rabbit_pattern, rabbithawk_pattern)
# Take the pushout
land_pattern = ob(pushout(rabbitfox_transform, rabbithawk_transform))
```

Phew! After seeing the complexity of this composition pattern, we are grateful that we used a pushout instead of constructing it by hand. Lastly, we define the primitive systems and compose.

```
dotr(u,p,t) = p.α₁*u
dotrf(u,p,t) = [-p.β₁*u[1]*u[2], p.γ₁*u[1]*u[2]]
dotf(u,p,t) = -p.δ₁*u
dotrh(u, p, t) = [-p.β₂*u[1]*u[2], p.γ₂*u[1]*u[2]]
doth(u, p, t) = -p.δ₂*u
# Define the primitive systems
rabbit_growth = ContinuousResourceSharer{Float64}(1, 1, dotr, [1])
rabbitfox_predation = ContinuousResourceSharer{Float64}(2, 2, dotrf, [1,2])
fox_decline = ContinuousResourceSharer{Float64}(1, 1, dotf, [1])
rabbithawk_predation = ContinuousResourceSharer{Float64}(2, 2, dotrh, [1,2])
hawk_decline = ContinuousResourceSharer{Float64}(1, 1, doth, [1])
# Compose
land_sys = oapply(land_pattern, [rabbit_growth, rabbitfox_predation, fox_decline, rabbithawk_predation, hawk_decline]);
```

The resource sharer `land_sys` models the land ecosystem. Although it will play the role of a primitive system in our total ecosystem, we see that it is a composite itself. This structure — primitive systems themselves being composites — is the essence of hierarchical composition.

The air ecosystem is straightforwardly defined by the following resource sharer which models hawk/little fish predation. Here we don’t model the decay of hawks or the growth of little fish because these processes are already accounted for in the land and water ecosystems. The air ecosystem is a pure coupling term between the two systems.

```
dothf(u,p,t) = [p.γ₃*u[1]*u[2], -p.β₃*u[1]*u[2]]
air_sys = ContinuousResourceSharer{Float64}(2, 2, dothf, [1,2]);
```

In the previous post, we defined an ocean ecosystem as the directed composition of machines. In this aquatic food chain, sharks eat big fish and big fish eat little fish. We can turn the machine representing this ocean ecosystem into a resource sharer as follows.

```
water_sys = ContinuousResourceSharer{Float64}(3, (u,p,t) -> eval_dynamics(ocean_sys, u, [], p, t));
```

Now that we have constructed resource sharers for the three sub-ecosystems, we are ready to plug them into the established composition pattern, `eco_pattern`.

```
# Compose
eco_system = oapply(eco_pattern, [land_sys, air_sys, water_sys])
# Plot and solve
u0 = [100.0, 50.0, 20.0, 100.0, 10.0, 2.0]
params = LVector(
  α₁ = 0.3, β₁ = 0.015, γ₁ = 0.015, δ₁ = 0.7, β₂ = 0.01, γ₂ = 0.01, δ₂ = 0.5, # land params
  γ₃ = 0.001, β₃ = 0.003, # air params
  α₄ = 0.35, β₄ = 0.015, γ₄ = 0.015, δ₄ = 0.7, β₅ = 0.017, γ₅ = 0.017, δ₅ = 0.35 # water params
)
tspan = (0.0, 75.0)
prob = ODEProblem(eco_system, u0, tspan, params)
sol = solve(prob, Tsit5())
plot(sol, lw=2, label = ["rabbits" "foxes" "hawks" "little fish" "big fish" "sharks"])
```

A successful example of hierarchical composition!

To really see the benefits of hierarchical composition, let’s take a look at how we could have achieved this composition without the layered approach.

```
# Define the (flattened) composition pattern
flattened_pattern = ocompose(eco_pattern, 1, land_pattern)
# Compose
eco_sys2 = oapply(flattened_pattern, [rabbit_growth, rabbitfox_predation, fox_decline, rabbithawk_predation, hawk_decline, air_sys, water_sys]);
```

While this strategy produces the same composite system,^{3} it lacks the clarity of the hierarchical approach, which we can see from the flattened composition pattern:

You can find more composite dynamical systems (including a multi-city SIR model and a cellular automaton) in the documentation for AlgebraicDynamics.jl. As this package develops, a large class of open projects is to implement black and grey boxing functors between operad algebras, such as:

- A bridge from AlgebraicPetri to AlgebraicDynamics
- Integrators (beyond Euler's method), including ones that take advantage of hierarchical and compositional structure
- An implementation of the claim "every machine is a resource sharer and every composition of machines is a composition of resource sharers"

Baez, John C., and Blake S. Pollard. 2017. “A Compositional Framework for Reaction Networks.” https://doi.org/10.1142/S0129055X17500283.

Spivak, David I. 2013. “The Operad of Wiring Diagrams: Formalizing a Graphical Language for Databases, Recursion, and Plug-and-Play Circuits.”

Undirected wiring diagrams form an operad where operadic composition is given by a pushout, as shown in (Spivak 2013). Here you can learn more about the operadic composition of undirected wiring diagrams as well as the `@relation` macro for specifying undirected wiring diagrams.↩︎

This interpretation of composition of dynamical systems is mathematized by the cospan algebra defined in (Baez and Pollard 2017).↩︎

Functoriality of the cospan algebra implies that the resource sharers `eco_system` and `eco_sys2` have identical dynamics even though the dynamics are described by different expressions.↩︎


There is a discrepancy between how scientists conceive of their models and how they are implemented in a computer program. Informally, scientists represent their models as a composition of many primitive interactions. For example, an ecologist studying an ecosystem with one hundred species examines interactions between pairs of species and takes the full ecosystem to be a composite of these primitive systems. However, when the scientist sits down at a computer to encode the model, this modular structure is lost. The ecologist must simply write down a 100-variable ODE.

In AlgebraicDynamics.jl (see also the documentation), we introduce a modeling framework which allows the user to encode a complex dynamical system as the composite of primitive dynamical systems. Following the mathematics of operads and operad algebras, we explicitly represent the composition syntax itself as an algebraic object. While traditional modeling tools use a fixed syntax that is provided by the tool, the operadic approach provides a level-shift, where syntax is elevated to become flexible and programmable. Users can define new syntaxes adapted to their domains yet still interoperate between different syntaxes using the infrastructure built for operads.

So what is this process that builds complex systems out of primitive ones? First, start with rules for how to combine primitive systems. This *theory of composition* is an algebraic structure that specifies a composition syntax. Then, choose a pattern which follows the rules of the syntax — called a *composition pattern* — and primitive systems to compose. Lastly, the `oapply` method returns the composite system. The meat of this post is to give examples of this general approach.

The category theoretic parallel is:

- a *theory of composition* implements an operad,
- a *composition pattern* implements a morphism (also called a term or an expression) in the operad, and
- the `oapply` method implements an operad algebra.

We are interested in two distinct styles of composition: (1) composition via directed communication and (2) composition via undirected communication.^{1} We call dynamical systems that compose via directed communication *machines* and dynamical systems that compose via undirected communication *resource sharers*. A zoo of composition theories is given below with directed theories in purple and undirected theories in blue.

In the following series of blog posts, we will explore the different species of this zoo and their implementations. In this post, we begin with directed theories. We choose to demonstrate the simpler theories to make the code readable and the examples constrained. These posts also showcase several categorical features of attributed C-sets, such as functorial data migration, C-set transformations, and (co)limits of C-sets.

Machines are dynamical systems which compose via directed communication. Informally, a machine consists of five component parts:

- inputs
- states
- outputs
- dynamics for evolution
- a readout function

We will assume throughout this blog post that the evolution rule is an ODE with exogenous variables (also called a driven or forced ODE). Exogenous variables drive the dynamics but their values are determined elsewhere. For example, in the differential equation $\dot r = \alpha r - \beta r f$, the predator population $f$ is an exogenous variable which drives the system, while $\alpha$ and $\beta$ are (fixed) parameters. Following the informal definition, we implement a machine as a struct below.
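As a minimal standalone sketch of a driven ODE (the parameter names mirror the rabbit dynamics used later, but this snippet is illustrative, not part of the library), the vector field simply takes the exogenous value as an extra argument:

```julia
# A driven ODE for a rabbit population r: the predator level f is
# exogenous, supplied by whoever evaluates the vector field.
dotr(r, f, p) = p.α * r - p.β * r * f   # ṙ = αr − βrf

p = (α = 0.3, β = 0.015)                 # fixed parameters
dr = dotr(100.0, 10.0, p)                # growth rate under forcing f = 10
```

In the compositional setting developed below, the caller that supplies `f` will be the `oapply` method, which reads it off another machine's readout.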

```
struct Machine{T}
  ninputs::Int
  nstates::Int
  noutputs::Int
  dynamics::Function
  readout::Function
end

nstates(m::Machine) = m.nstates
ninputs(m::Machine) = m.ninputs
noutputs(m::Machine) = m.noutputs
```

Following the general approach, we will first define a composition syntax. Here is our first set of rules for composing machines: for each machine (locally called the receiver), each input of that machine is wired to the output of a machine (locally called the sender). The sender transmits its output to the receiver which uses the information as input to its evolution function. In the special case where the sender and receiver are the same machine, the composition induces a feedback loop.

This composition syntax is captured by the category of single-input port graphs, which is defined by the schema below.

A single-input port graph is a C-set over this schema, i.e. a functor from SchSIPortGraph to Set. Such a composition pattern consists of:

1. a set of boxes,
2. for every box, a set of in-ports and a set of out-ports, and
3. for every in-port, a unique out-port that feeds it information (the wire map).

Implemented in Catlab, we have:

```
using Catlab, Catlab.CategoricalAlgebra

@present SchSIPortGraph(FreeSchema) begin
  Box::Ob
  InPort::Ob
  OutPort::Ob
  in_port_box::Hom(InPort, Box)
  out_port_box::Hom(OutPort, Box)
  wire::Hom(InPort, OutPort)
end

@abstract_acset_type AbstractSIPortGraph
@acset_type SIPortGraph(SchSIPortGraph,
  index=[:in_port_box, :out_port_box, :wire]) <: AbstractSIPortGraph
```

A machine may fill a box^{2} of the composition pattern if:

1. the number of in-ports of the box equals the number of inputs to the machine, and
2. the number of out-ports of the box equals the number of outputs of the machine.

The following method checks if a machine fills a chosen box of a composition pattern.

```
function fills(m::Machine, d::AbstractSIPortGraph, b::Int)
  b ≤ nparts(d, :Box) || error("Trying to fill box $b, when $d has fewer than $b boxes")
  return (ninputs(m) == length(incident(d, b, :in_port_box))) &&
         (noutputs(m) == length(incident(d, b, :out_port_box)))
end
```

Now, given (1) a composition pattern and (2) a primitive machine filling each box, we can produce a new machine — the composite of the primitive machines — using the `oapply` method defined below. This composite machine has no inputs and no outputs. Its dynamics are induced by the driven dynamics of the primitive machines, where the wires define how to set the inputs of each primitive machine.^{3} We think of information of type `T` flowing along the wires.

```
using Catlab.CategoricalAlgebra.FinSets
function oapply(d::SIPortGraph, xs::Vector{Machine{T}}) where T
  for box in parts(d, :Box)
    fills(xs[box], d, box) || error("$(xs[box]) does not fill box $box")
  end
  # Disjoint unions of the primitive machines' states and outputs
  States = coproduct((FinSet∘nstates).(xs))
  Outputs = coproduct((FinSet∘noutputs).(xs))

  # Collect every box's readout into a single vector of outputs
  function internal_readout(u)
    readouts = zeros(length(apex(Outputs)))
    for box in parts(d, :Box)
      view(readouts, legs(Outputs)[box](:)) .= xs[box].readout(view(u, legs(States)[box](:)))
    end
    return readouts
  end

  # The composite vector field: feed each box the readouts along its wires
  function v(u, p, t)
    dotu = zero(u)
    readouts = internal_readout(u)
    for box in parts(d, :Box)
      inputs = view(readouts, subpart(d, incident(d, box, :in_port_box), :wire))
      vars = legs(States)[box](:)
      view(dotu, vars) .= xs[box].dynamics(view(u, vars), inputs, t)
    end
    return dotu
  end

  return Machine{T}(0, length(apex(States)), 0, v, x -> 0)
end
```
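To see concretely what `oapply` computes, here is a dependency-free sketch of the same semantics specialized to a two-box, cross-wired pattern. The box dynamics and readouts are hypothetical stand-ins for illustration, not the Catlab implementation:

```julia
# Driven dynamics for two boxes; both readouts are the identity.
dyn1(u, x, t) = 0.3u - 0.015u * x     # box 1, driven by exogenous input x
dyn2(u, x, t) = 0.015u * x - 0.7u     # box 2, driven by exogenous input x

# Composite vector field: each box's input is the other box's readout,
# exactly as the wires of the cross-wired pattern dictate.
function composite(u, t)
    readouts = (u[1], u[2])           # identity readouts of each box
    [dyn1(u[1], readouts[2], t),      # box 1 hears box 2
     dyn2(u[2], readouts[1], t)]      # box 2 hears box 1
end

du = composite([100.0, 10.0], 0.0)
```

The general `oapply` above does the same thing for any pattern: compute all readouts, route them along wires, then evaluate each box's driven dynamics.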

We check that each machine has the correct signature for the box it fills when we construct the total system, not when we run it. This validation allows modeling software to reject malformed models and gives dynamical systems modelers the advantages associated with static type checking.

A standard Lotka-Volterra predator-prey model is the composition of two machines:

1. Evolution of a rabbit population — this machine takes one input, which represents a population of predators $f$ that hunt rabbits. It has one output, which emits the rabbit population $r$. The dynamics of this machine is the driven ODE $\dot r = \alpha r - \beta r f$.
2. Evolution of a fox population — this machine takes one input, which represents a population of prey $r$ that are eaten by foxes. It has one output, which emits the fox population $f$. The dynamics of this machine is the driven ODE $\dot f = \gamma r f - \delta f$.

Since foxes hunt rabbits, these machines compose by setting the fox population to be the input for rabbit evolution. Likewise, we set the rabbit population to be the input for fox evolution. We depict this composition — both the composition pattern and the primitive machines — as

and implement this composition in Julia as

```
α, β, γ, δ = 0.3, 0.015, 0.015, 0.7
dotr(x, p, t) = [α*x[1] - β*x[1]*p[1]]
dotf(x, p, t) = [γ*x[1]*p[1] - δ*x[1]]
# Define the primitive systems
rabbit = Machine{Float64}(1,1,1, dotr, x -> x)
fox = Machine{Float64}(1,1,1, dotf, x -> x)
# Define the composition pattern
rabbitfox_pattern = SIPortGraph()
add_parts!(rabbitfox_pattern, :Box, 2)
add_parts!(rabbitfox_pattern, :OutPort, 2, out_port_box=[1,2])
add_parts!(rabbitfox_pattern, :InPort, 2, in_port_box=[1,2], wire=[2,1])
# Compose
rabbitfox_sys = oapply(rabbitfox_pattern, [rabbit, fox])
```

The machine `rabbitfox_sys` now represents the complete Lotka-Volterra model. We can approximate a trajectory of the ecosystem using the solver in DifferentialEquations.jl and plot the result.
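As a dependency-free sanity check (DifferentialEquations.jl is the tool actually used here), a forward-Euler loop over the composite Lotka-Volterra vector field looks like the following; the step size and horizon are illustrative choices:

```julia
α, β, γ, δ = 0.3, 0.015, 0.015, 0.7
v(u) = [α*u[1] - β*u[1]*u[2],         # rabbit equation, driven by foxes
        γ*u[1]*u[2] - δ*u[2]]         # fox equation, driven by rabbits

u = [100.0, 10.0]                     # initial rabbit and fox populations
h = 0.01                              # Euler step size
for _ in 1:1_000                      # integrate to t = 10
    u .+= h .* v(u)
end
```

Forward Euler is only a crude check; in practice one hands the composite machine to a proper adaptive solver such as `Tsit5()`.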

The innovation of composing dynamical systems as machines alleviates some of the bookkeeping endemic to implementing a complicated ODE. However, the mappings of ports can still be quite intricate in a composition of machines with many variables, boxes, and wires. In this section, we use directed graphs to present composition patterns more succinctly.^{4}

Recall the schema for graphs introduced in an earlier blog post:

We can transform an instance of `Graph` into an instance of `SIPortGraph` via pullback functorial data migration (see Appendix), induced by the functor from SchSIPortGraph to SchGraph defined on objects by

- Box ↦ V,
- InPort ↦ E,
- OutPort ↦ E,

and on morphisms by

- in_port_box ↦ tgt,
- out_port_box ↦ src,
- wire ↦ id_E.

Single-input port graphs induced by ordinary graphs are a restricted class of single-input port graphs. Generically, a single-input port graph allows out-port splitting, i.e. two distinct in-ports may be wired to a single out-port. This syntactic feature implements the copying of output data and is not available in the more restrictive composition syntax defined by the theory of graphs.

Consider an ocean ecosystem containing three species — little fish, big fish, and sharks — with two predation interactions: sharks eat big fish and big fish eat little fish.

In order to save ourselves the bookkeeping of ports, wires, and boxes, we can model the composition pattern as the graph `ocean_graph` and then migrate the data of `ocean_graph` to an instance of `SIPortGraph`.

```
using Catlab.Graphs: SchGraph, Graph
ocean_graph = Graph(3)
add_parts!(ocean_graph, :E, 4, src=[1,2,2,3], tgt=[2,1,3,2])
```

```
using Catlab.Theories: id
E = SchGraph[:E]
# Define the composition pattern via data migration
ocean_pattern = SIPortGraph()
migrate!(ocean_pattern, ocean_graph,
  Dict(:InPort => :E, :OutPort => :E, :Box => :V),
  Dict(:wire => id(E), :in_port_box => :tgt, :out_port_box => :src))
```

Main.Notebook.SIPortGraph {Box:3, InPort:4, OutPort:4}

| InPort | in_port_box | wire |
|---|---|---|
| 1 | 2 | 1 |
| 2 | 1 | 2 |
| 3 | 3 | 3 |
| 4 | 2 | 4 |

| OutPort | out_port_box |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 2 |
| 4 | 3 |
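Concretely, this migration is just reindexing: each wire's in-port lands on the edge's target and its out-port on the edge's source. A plain-Julia sketch of the same bookkeeping on the ocean graph's edge lists (no Catlab) reproduces the columns shown above:

```julia
src, tgt = [1, 2, 2, 3], [2, 1, 3, 2]   # edges of ocean_graph

# Pullback along Box ↦ V, InPort ↦ E, OutPort ↦ E:
in_port_box  = tgt                       # in_port_box = tgt
out_port_box = src                       # out_port_box = src
wire         = collect(1:length(src))    # wire = id on edges
```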

We can now construct a model of the total ocean ecosystem by plugging the primitive machines corresponding to little fish, big fish, and shark evolution into the composition pattern. Again, using a standard ODE solver, we can compute and plot a trajectory of this ecosystem.

```
α, β, γ, δ, β′, γ′, δ′ = 0.3, 0.015, 0.015, 0.7, 0.017, 0.017, 0.35
dotfish(f, p, t) = [α*f[1] - β*p[1]*f[1]]
dotFISH(F, p, t) = [γ*p[1]*F[1] - δ*F[1] - β′*p[2]*F[1]]
dotsharks(s, p, t) = [-δ′*s[1] + γ′*s[1]*p[1]]
# Define the primitive systems
fish = Machine{Float64}(1,1,1, dotfish, f->f)
FISH = Machine{Float64}(2,1,2, dotFISH, F->[F[1], F[1]])
sharks = Machine{Float64}(1,1,1, dotsharks, s->s)
# Compose
ocean_sys = oapply(ocean_pattern, [fish, FISH, sharks])
```

In this post, we discussed two theories of directed composition — single-input port graphs and graphs — which fit into the larger zoo of directed composition syntaxes.

The functors between these theories define contravariant transformations between composition patterns in different syntaxes. The previous section exemplified this process applied to the functor from SchSIPortGraph to SchGraph.

In the following subsections, we briefly describe the remaining theories of directed composition.

Port graphs extend single-input port graphs by allowing in-ports to accept any number of incoming wires. In application, this means we can merge information (an in-port with several incoming wires) and generate trivial information (an in-port with no incoming wires). The schema for port graphs is below.

Concretely, every port graph consists of a set of boxes. Each box has a set of in-ports and a set of out-ports. A port graph also has a set of wires which connect out-ports to in-ports.

Circular port graphs^{5} similarly consist of a set of boxes and a set of wires which connect ports, but they do not distinguish between in-ports and out-ports — a machine may both receive and emit information through any port. The functor from the theory of circular port graphs to the theory of port graphs duplicates every port, with one copy interpreted as an in-port and the other as an out-port. Circular port graphs are particularly useful for modeling systems where the composition pattern is given by a network, such as a mesh of points in a reaction-diffusion model or a network of cities in a transportation grid. Such networks often arise in physics modeling.
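The port-duplication functor can be sketched on plain data: each port of a circular box becomes a pair of ports, one read as an in-port and one as an out-port (all names here are hypothetical toy data):

```julia
ports = [:p1, :p2, :p3]                        # ports of one circular box
inports  = [Symbol(:in_,  p) for p in ports]   # one copy becomes an in-port
outports = [Symbol(:out_, p) for p in ports]   # the other an out-port
```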

The theories discussed so far generate *closed* composition patterns because they produce composite machines that have no inputs and no outputs. The composite machine is closed in the sense that it cannot interact with other machines.

In AlgebraicDynamics.jl, we implement two *open* theories for directed composition: (1) directed wiring diagrams and (2) open circular port graphs, along with corresponding `oapply` methods.

In the next blog post of this series, we will investigate undirected theories for composition of dynamical systems as well as the strength of open composition syntaxes.

Let $F: C \to D$ be a functor between categories $C$ and $D$. Then $F$ contravariantly induces a map from $D$-sets to $C$-sets by precomposition, which maps a $D$-set $X: D \to \mathrm{Set}$ to the $C$-set $X \circ F: C \to \mathrm{Set}$. We call this transformation pulling back along $F$. We saw an example of migrating the data of a half-edge graph to the data of a symmetric graph and vice versa^{6} in a previous post.
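Precomposition can be sketched very plainly: represent the object part of a functor F and a D-set X as dictionaries, and pull back by composing lookups (the schemas and tables here are hypothetical toy data):

```julia
# Object part of a functor F: C → D between toy schemas.
F = Dict(:Box => :V, :InPort => :E, :OutPort => :E)

# A D-set X assigns a finite set (here a range) to each object of D.
X = Dict(:V => 1:3, :E => 1:4)

# Pulling back along F: the C-set is the composite X ∘ F.
XF = Dict(c => X[d] for (c, d) in F)
```

This is exactly the bookkeeping that `migrate!` performs, extended to the morphism data as well.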

DeVille, Lee, and Eugene Lerman. 2012. “Dynamics on Networks of Manifolds.” https://doi.org/10.3842/SIGMA.2015.022.

Libkind, Sophie. 2020. “An Algebra of Resource Sharing Machines.”

Schultz, Patrick, David I. Spivak, and Christina Vasilakopoulou. 2016. “Dynamical Systems and Sheaves.”

Vagner, Dmitry, David I. Spivak, and Eugene Lerman. 2014. “Algebras of Open Dynamical Systems on the Operad of Wiring Diagrams.”

A third style of composition allows both directed and undirected communication. The composition theory for this style of communication is shown in orange above. We call dynamical systems that compose via both directed and undirected communication *resource sharing machines* (Libkind 2020).↩︎

What does "filling a box" correspond to in the operadic setting? Consider the operad algebra defined in (Schultz, Spivak, and Vasilakopoulou 2016). A box corresponds to a type in the operad, and a machine can fill the box if it corresponds to an element of the set that the algebra assigns to that type.↩︎

The method `oapply(d::SIPortGraph, xs::Vector{Machine{T}})` implements the operad algebra defined in (Schultz, Spivak, and Vasilakopoulou 2016) — see also (Vagner, Spivak, and Lerman 2014) — restricted to the special case of morphisms whose codomain is the box with no incoming or outgoing wires. Such morphisms are equivalent to single-input port graphs. A generalization of the algebra to the operad of directed wiring diagrams is implemented in full in AlgebraicDynamics.jl.↩︎

This strategy is inspired by the composition of dynamical systems defined in (DeVille and Lerman 2012).↩︎

A note on the nomenclature. We draw the boxes of port graphs as squares with ports on the top edge indicating the in-ports and the ports on the bottom edge indicating the out-ports. Since there is no such distinction between ports of a circular port graph, we draw the boxes of a circular port graph as circles. While port graphs are established terminology in the computer science literature, circular port graphs are not.↩︎

In that particular example, we were able to migrate the data in both directions because the functor was invertible.↩︎


So far on the blog we have motivated C-sets mainly as a unifying abstraction that encompasses graphs, wiring diagrams, Petri nets, and other graph-like objects. But what makes attributed C-sets so useful is that they are a joint generalization of *two* essential data structures in computer science: graphs, as we have seen, but also data frames, a mainstay in any modern environment for data analysis.

A **data frame** represents a table of data as a set of named columns, where different columns may have different data types. Here is an excerpt from a data frame with four columns, provided by the `Sitka` dataset in the R package MASS.

395×4 DataFrame (370 rows omitted)

| Row | size | time | tree | treat |
|---|---|---|---|---|
| | Float64 | Int64 | Int64 | Cat… |
| 1 | 4.51 | 152 | 1 | ozone |
| 2 | 4.98 | 174 | 1 | ozone |
| 3 | 5.41 | 201 | 1 | ozone |
| 4 | 5.9 | 227 | 1 | ozone |
| 5 | 6.15 | 258 | 1 | ozone |
| 6 | 4.24 | 152 | 2 | ozone |
| 7 | 4.2 | 174 | 2 | ozone |
| 8 | 4.68 | 201 | 2 | ozone |
| 9 | 4.92 | 227 | 2 | ozone |
| 10 | 4.96 | 258 | 2 | ozone |
| 11 | 3.98 | 152 | 3 | ozone |
| 12 | 4.36 | 174 | 3 | ozone |
| 13 | 4.79 | 201 | 3 | ozone |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 384 | 4.28 | 227 | 77 | control |
| 385 | 4.54 | 258 | 77 | control |
| 386 | 4.5 | 152 | 78 | control |
| 387 | 4.8 | 174 | 78 | control |
| 388 | 5.28 | 201 | 78 | control |
| 389 | 5.83 | 227 |