Catlab Refactor III: General categorical plumbing

code

Diving into the Cats module - some conceptual differences between Catlab v0.16 and v0.17 for modeling categories, functors, (co)limits, and more.

Author

Kris Brown

Published

May 16, 2025

This is a sequel post to Catlab refactor I and Catlab refactor II. Check those out first!

In this series, we are taking the major code refactor of Catlab, from v0.16 and v0.17, as an opportunity to highlight some core concepts in Catlab. I also will be mentioning some places where the structure of Catlab has changed, but readers who are just concerned with the present and future of Catlab can ignore these remarks.

This post discusses the Cats module, which is intended to support working with categories in general (in contrast to specific categories, such as \mathsf{Set}). This corresponds to the files Categories.jl, FinCats.jl, FreeDiagrams.jl, Limits.jl, Diagrams.jl, CommutativeDiagrams.jl, and Subobjects.jl in old Catlab’s CategoricalAlgebra module.

A common theme of the refactor is that interfaces (which are implicit in idiomatic Julia code) are being made explicit: in doing so, we are forced to think more clearly about what we mean when representing some mathematical concept as a data structure or operation in software. While sometimes making the interface explicit requires no substantial change to the underlying codebase beyond reorganization, at other times it leads us to additions and refinements to Catlab’s design.

ThCategory versus ThCategoryExplicitSets

Old Catlab had an abstract type Category{Ob,Hom,Size<:CatSize}, where the third type parameter allowed for type dispatch to treat finite categories differently. This previous design has some odd features. Although one always has access to the Julia types of a Category’s objects and morphisms, one couldn’t get specific set of objects or morphisms. Consistent with this, the only way to build a infinite Category{Ob,Hom} from scratch was to construct a TypeCat{Ob,Hom}, but we obviously want to be able to work in categories where, e.g., the objects aren’t all of the terms of a particular type. This is improved upon by making a new theory:

@theory ThCategoryExplicitSets <: ThCategory begin
  Set′::TYPE{AbsSet}
  ob_set()::Set′
  hom_set()::Set′
end

ThCategoryExplicitSets.Meta.@wrapper Category

Thus the new Catlab Category is a wrapper type for models of the theory of categories (as shown in our first post) which also provide an explicit set for their obs and homs. Although Julia requires us, when implementing methods like id, to define the method for the entire input Ob type, we will regard such methods as only defined on elements of the Ob type which are elements of ob_set().

Beyond TypeCat and OppositeCat, Catlab now has many different ways to construct a Category. For example, CoproductCat interprets a vector of categories as its coproduct:

struct CoproductCat{O,H}
  cats::Vector{Category}
  function CoproductCat(cats::Vector{Category})
    ΣOb, ΣHom = SumSet(ob_set.(cats)), SumSet(hom_set.(cats))
    ΣObType, ΣHomType = eltype[ΣOb](), eltype[ΣHom]()   
    new{ΣObType, ΣHomType}(cats)
  end
end

Base.getindex(x::CoproductCat, i::Int) = x.cats[i]

Note above that SumSet is an implementation of ThSet which has elements of type TaggedElem{I<:Val,V}, where I is a type-level tag and V is the type of the wrapped value. By default, these tags are just integers, e.g. the type of SumSet([SetOb(Int), SetOb(String)]) is Union{TaggedElem{Val{1}, Int}, TaggedElem{Val{2}, String}}. Because SumSet is a model for ThSet, which provides a nullary method eltype, we can get the type of any particular s::SumSet via eltype[s]().

We can see how the each operation of the implementation untags values, does a computation in one of its component categories, and then retags the result.

@instance ThCategoryExplicitSets{O,H} [model::CoproductCat{O,H}] where {O,H} begin 

  id(x::O)::H =
    TaggedElem(id(model[gettag(x)], getvalue(x)), gettag(x))

  dom(x::H)::O 
    TaggedElem(dom(model[gettag(x)], getvalue(x)), gettag(x))

  codom(x::H)::O =
    TaggedElem(codom(model[gettag(x)], getvalue(x)), gettag(x))

  function compose(x::H,y::H)::H
    tx, ty = gettag(x), gettag(y)
    tx == ty || error("Cannot compose $x $y")
    TaggedElem(compose(model[tx], getvalue(x), getvalue(y)), tx)
  end
  
  ob_set()::AbsSet = SetOb(SumSet(ob_set.(model.cats)))

  hom_set()::AbsSet = SetOb(SumSet(hom_set.(model.cats)))

end

Finitely-generated categories

To give a FinCat requires us to additionally provide a finite number of morphism generators and a means of interpreting each generator as a Hom.¹

@theory ThFinCat <: ThCategoryExplicitSets begin
  Gen(src::Ob,tgt::Ob)::TYPE; 
  to_hom(f::Gen(a,b))::Hom(a,b) ⊣ [(a,b)::Ob]
  FSet′::TYPE{FinSet}
  gen_set()::FSet′
end
  
ThFinCat.Meta.@wrapper FinCat

This theory was designed to unify the interface for FinCatPresentation, FinCatGraph, and symbolic models in old Catlab. The code surface of these constructs was not very uniform: little code was written for FinCats generically because implementations could not be switched interchangeably without breaking things.

Old Catlab defined FinCat{Ob,Hom} to be Category{Ob,Hom,FinCatSize}, which had the property of forcing every FinCat to be a Category implicitly. As discussed in the last post, while this is sometimes convenient, other times it is important to be capable of making the distinction. In our new approach, FinCats and Categories have different interfaces, though we can trivially implement the Category interface given any implementation of the FinCat interface by forgetting about gen_set and to_hom.²

Models for ThFinCat which come from old Catlab include DiscreteFinCat, FinCatGraph, FinCatPresentation, and OpFinCat. We add to this PathCat (a model of a ThFinCat-like theory except we assume Path=Hom) and PreorderCat. PreorderCat is an interesting edge case because it has Gen = Hom, as both generators and homs are just pairs of objects. This revealed major ambiguities in old Catlab’s design which needed to be fixed because they use type dispatch to do different things depending on whether one is working with a generator, a hom, or a hom given as a path of generators.

FinFunctors

One of the largest breaking changes in the refactor is how FinFunctor and FinDomFunctor are handled. This is a consequence of the additional structure that FinCat has, as described above. In old Catlab, certain Julia types were assumed to be either generators, objects, or homs, such that type dispatch could determine what to do when a FinFunctor received arbitrary data (whether in the process of defining the FinFunctor or applying an existing one to some data). So calling F(x) would do something different whether it received an Ob-like Julia type vs a Hom-like type. Even if one is explicit with hom_map(F, x), it still relied on dispatch to distinguish generator-like arguments, path-like arguments, and composite morphisms. However, in general one’s category could have the same Julia type for these different roles, so this code was only correct because the few instances of FinCats did not have any such overlap.

This means when defining a FinFunctor or FinDomFunctor, one should declare whether one is sending generators to generators, paths, or homs. A new homtype keyword argument allows us to specify this:

C = FinCat(parallel_arrows(Graph, 2))

h = @acset Graph begin V=4; E=4; src=[1,1,2,3]; tgt= [2,3,4,4] end
D = FinCat(h)

F = FinFunctor((V=[1,4], E=[[1,3], [2,4]]), C, D; homtype=:list)
G = FinFunctor((V=[3,4], E=[4,4]), C, D; homtype=:generator)

The models of ThFinDomFunctor include CompositeFinDomFunctor, FinDomFunctorMap, IdentityFinDomFunctor, and OppositeFinDomFunctor. Although we could define a separate theory ThFinFunctor, instead we opt to make a FinFunctor be a FinDomFunctor whose codomain happens to be a FinCat as there is no difference in the interface, even at the level of input and return types to the operations of the theory (in both cases, generators are sent to homs in the codomain).

What really is a free diagram?

Free diagrams in Catlab refer to DiscreteDiagram, ComposableMorphisms, ParallelMorphisms, Multispan, Multicospan, BipartiteGraph, and FreeDiagram³. In old Catlab these are a bunch of related data structures which shared some overlapping interfaces. This is now made more explicit in the refactored code, where we are forced to confront this section’s titular question.

Mathematically, a free diagram is a functor out of a small (free) category. For us, this means FreeDiagram is like a functor but with a domain Graph (viewed as a free category). However, we don’t worry about including the category structure of the codomain (e.g. a pair of composable morphisms need not carry the information for how to obtain a single composite morphism). This means in order to get the mathematical definition of a free diagram out of FreeDiagram, one must also supply a codomain Category. We opt for this simplicity because a FreeDiagram is supposed to be a syntactic object; expecting them to have the complete data of a (possibly large) category is unreasonable.

Because FreeDiagram does not know what its codomain category is (just the Julia types of its objects and morphisms), the FinDomFunctor(diagram) method is no longer supported. This was only possible because it used to be that one knew the full category of structure just from the types of obs and homs (because the ThCategory implementation was determined by type dispatch).

Now let’s look at the interface:

@theory ThFreeDiagram begin 
  V::TYPE; E::TYPE; Ob::TYPE; Hom::TYPE; FSet::TYPE{FinSet};
  src(h::E)::V;
  tgt(h::E)::V;
  obset()::FSet
  homset()::FSet
  obmap(v::V)::Ob
  hommap(e::E)::Hom
end

ThFreeDiagram.Meta.@wrapper FreeDiagram

The refactor of FreeDiagrams was more than just a cosmetic reorganization of free floating methods into the above interface. This is because old Catlab’s implementations for some of these ThFreeDiagram models did not contain enough data to implement this interface. For example, a Span in old Catlab has the data of a pair of morphisms. We can imagine V={1,2,3}, E={1,2}: the morphisms give the data of hommap, but not obmap. Old Catlab effectively derived what obmap was by presupposing that we can call (co)dom on whatever Julia type Hom is; however we just mentioned that we no longer want to presume the entire category structure of the free diagram’s codomain! After all, a term of some morphism type Foo can now have many different (co)domains, as we can have many different models of ThCategory with Foo as its morphisms. Thus our new design forces us modify the Span data structure to include the extra obmap data.⁴

Just to get a flavor of what a model of ThFreeDiagram looks like, consider:

@instance ThFreeDiagram{Int,Int,Union{Ob,Foot},Hom
                       } [model::Multispan{Ob,Hom,Foot,V,W}
                         ] where {Ob, Hom, Foot, V, W} begin
  src(::Int)::Int = 1
  tgt(x::Int)::Int = x+1
  obmap(x::Int)::Union{Ob,Foot} = @match x begin 
    1 => apex(model)
    i => feet(model)[i-1]
  end
  hommap(i::Int)::Hom = model[i]
  obset()::FinSet = FinSet(length(model)+1)
  homset()::FinSet = FinSet(length(model))
end

Limits and colimits

Limits and colimits of free diagrams were controlled in old Catlab by a complicated process which involved defining methods for particular FreeDiagram{Ob,Hom} types, dispatching on the Ob and Hom types of these diagrams. In contrast, we now have a base theory for categories with limits:⁵

@theory ThCategoryLimitBase <: ThCategoryExplicitSets begin
  Limit()::TYPE{AbsLimit}
  ob(lim::Limit)::Ob
  MSpan::TYPE{Multispan}
  apex(s::MSpan)::Ob
  cone(l::Limit)::MSpan
  apex(cone(l)) == ob(l) ⊣ [l::Limit]
end

Now for any particular kind of free diagram we can define a (co)limit theory, e.g. for ParallelMorphisms:

@theory ThCategoryWithEqualizers <: ThCategoryLimitBase begin
  ParallelDiagram()::TYPE{ParallelMorphisms}
  limit(p::ParallelDiagram)::Limit
  universal(eq::Limit, p::ParallelDiagram, s::MSpan)::(apex(s) → ob(eq))
end

ThCategoryWithEqualizers.Meta.@wrapper CatWithEqualizers

Now we can implement such a theory for any given model, e.g.

@instance ThCategoryWithEqualizers{FinSetInt, FinFunction} [model::SkelFinSet] begin

  function limit(para::ParallelMorphisms)
    @assert !isempty(para)
    f1, frest = para[1], para[2:end]
    m = length(dom(para))
    eq = FinFunction(filter(i -> all(f1(i) == f(i) for f in frest), 1:m), m)
    LimitCone(Multispan(dom[model](eq), [eq]; cat=model), FreeDiagram(para))
  end 
  
  function universal(res::AbsLimit, ::ParallelMorphisms, span::Multispan)
    ι = collect(only(cone(res)))
    h = only(span)
    FinFunction(Int[only(searchsorted(ι, h(i))) for i in dom[model](h)], length(ι))
  end 
end

With the interface implemented as above, we will not get a runtime error if we try to wrap SkelFinSet() with CatWithEqualizers.

To compare to old Catlab: we are no longer dispatching on the types of Ob and Hom which appear in the diagram, so we now able to have different (co)limits for the same type of diagram, e.g. DiscreteDiagram{Int, FinFunction}. This interface no longer offers an alg keyword (for example, the many different join algorithms for pullbacks), as the GAT formalism doesn’t have a built-in notion of optional or keyword argument. In this case, whatever the default algorithm was is the one that appears in @instance. The other methods (e.g. nested_loop_limit(cospan::Multicospan)) are still preserved in the codebase, but need to be explicitly called rather than invoked via the alg keyword.

Monoidal limits

In old Catlab, there is a macro @(co)cartesian_monoidal_instance ObType HomType which generates a bunch of methods related to monoidal categories (e.g. delete, copy, swap) on the assumption that ObType and HomType have the right (co)limit methods already defined. Because the (co)limit methods are now tied to models rather than to the types themselves, we can replace this macro with an implementation of Th(Co)CartesianCategory given any model of ThCatWithUnbiased(Co)Products.⁶ We can use a wrapped model of ThCatWithUnbiased(Co)Products as our raw model for Th(Co)CartesianCategory:⁷

@instance ThCocartesianCategory{Ob, Hom} [model::TypedCatWithCoproducts{Ob,Hom}
    ] where {Ob,Hom} begin

  function swap(A::Ob, B::Ob)
    AB = coproduct(model, A, B)
    BA = coproduct(model, B, A)
    copair(model, AB, coproj2(BA), coproj1(BA))
  end
  
  ...
end

Using ⊕ and ⊗ is not as seamless as it was before because their meaning depends on a particular choice of model. And because they are binary operators, it’s not possible to use the index notation (e.g. id[MyCat](x)) to fill in the model. This is where GATlab’s @withmodel macro is helpful:

const Grph = ACSetCategory(Graph()) # raw model

X = @withmodel TypedCatWithCoproducts(Grph) (⊗,⊕) begin 
 cycle_graph(Graph, 1) ⊗ (path_graph(Graph, 2) ⊕ cycle_graph(Graph, 4))
end

Slice categories

One of the most touted features of GATlab is that slice categories can be done in a more robust way.

struct SliceOb{ObT, HomT}
  ob::ObT
  hom::HomT
end

struct SliceC{ObT, HomT}
  cat::Any 
  over::ObT
  function SliceC(cat::Category, over)
    cat = getvalue(cat) # unwrap after confirming it's at least a Category model
    new{impl_types(cat, ThCategory)...}(cat, over)
  end
end

@instance ThCategoryExplicitSets{SliceOb{<:ObT, <:HomT}, HomT
    } [model::SliceC{ObT, HomT}] where {ObT, HomT} begin

  id(x::SliceOb{<:ObT, <:HomT}) = id[model.cat](x.ob)

  compose(f::HomT, g::HomT) = compose[model.cat](f, g)
  
  ...
end

As mentioned in the first post of this series, wrapper types put us in a tricky position when we want to use the same model in the context of multiple theories. We want the model that we are slicing over to at least support the theory of categories, but we’ll likely want to use it in other theories that extend ThCategory (e.g. ThCategoryWithPullbacks, ThCategoryWithInitial). The (co)limit methods for those will not accept a wrapped Category model, but they will accept the raw model (e.g. SkelFinSet) if it happens to have those interfaces implemented. For the reason, the SliceC data structure doesn’t store the wrapped Category model but rather just stores the raw model (which could have any type). There is definitely room for improvement in the future!

Next steps

There are other interesting aspects of refactoring the Cats module (e.g. various categories of diagrams in a category), but we’ll end this post here and continue next time with discussing how various categories related to ACSets are handled!

Footnotes

In order for the set to truly generate all morphisms, it should be the case that for every morphism there exists some sequence of generators which compose to↩︎
This pattern, “any X-model gives a Y-model” is common and can also often be expressed via a GAT morphism between theories X and Y that can functorially migrate models. This is a direction we want to move in as the GAT morphism machinery becomes more mature.↩︎
Now called FreeGraph because FreeDiagram is the wrapper type for the ThFreeDiagram interface↩︎
If no objects are provided, then the constructor will attempt to use type dispatch to figure this info out, thus recovering the old behavior.↩︎
Here we are using externally-defined Julia types AbsLimit and Multispan, rather than fully axiomatizing them in the language of GATs. This is possible but not ergonomic at the present moment: at the present we are sacrificing being explicit about what it means to be an AbsLimit or Multispan for convenience and clarity.↩︎
To be even more accurate, we should take the pushout of theories of products and terminal objects. This current implementation presumes that a model which has implemented products has already implemented terminal objects.↩︎
This would be another great application of model migration via theory morphisms once GATlab’s theory specification language becomes rich enough to specify these (co)limit constructions.↩︎