The cohomology of your programming language is boring

Answering the topos-theoretic questions no one's been stupid enough to ask.

0. Background

Topoi are used in programming languages as an abstraction for different logical systems, in order to make defining semantics easier and simplify certain formal reasoning. For example, reasoning about step-indexing can be done using a topos of time-dependent sets.

In these cases, the benefit of using a topos is that it comes with an interpretation of dependent type theory: most of the time, we can just use the familiar syntax of lambdas, pairs, etc., and it does the right thing with respect to step indexing (or whatever else we care about).

On the other hand, topos theory was originally developed by geometers, who use topoi as a setting for cohomology. Given a topos \(\mathscr{X}\) of sheaves on a space, its cohomology \(H^\ast(\mathscr{X})\) gives topological information about the space. In general, for an arbitrary topos, we can define its cohomology as \(H^i(\mathscr{X}) = H^i\mathsf{R}\Gamma(\mathbb{Z}_\mathscr{X})\), where \(\mathbb{Z}_\mathscr{X}\) is the constant sheaf on \(\mathbb{Z}\) and \(\mathsf{R}\Gamma : D(\mathscr{X}) \to D(Ab)\) is the derived functor of the global sections functor. For a topological space, this is equivalent to singular cohomology under mild assumptions. If \(\mathsf{R}\Gamma(\mathbb{Z}_\mathscr{X}) = \mathbb{Z}\), then \(\mathscr{X}\) is topologically boring: it’s weakly homotopy equivalent to a point.

But programming languages people never talk about cohomology of their topoi, because it’s not useful. This post is about doing the useless thing. I’ll assume standard facts about derived categories.

1. Extensional MLTT

The most important aspect of topoi for programming languages is that they are models of dependent type theory. This statement also has a kind of a converse: all models of extensional MLTT + certain extra features are (elementary) topoi1. So what’s the cohomology of the initial model of extensional MLTT + the extra features?

Topos: The initial model of extensional MLTT (+ extra features)
Verdict: N/A – doesn’t make sense.

Explanation. From a relative point of view, cohomology is about topoi over \(\mathsf{Sets}\). If \(f : \mathscr{X} \to \mathsf{Sets}\) is the unique geometric morphism to \(\mathsf{Sets}\), then we’re looking at \(\mathsf{R}\Gamma = \mathsf{R}f_\ast\). But any initial model of a type theory can’t be a Grothendieck topos, only an elementary topos, so it doesn’t come with a geometric morphism to \(\mathsf{Sets}\). In a sense, from the relative point of view, Grothendieck topos theory is exactly the theory of elementary topoi over \(\mathsf{Sets}\).

2. Cubical type theory

Cubical type theory has an intended semantics in cubical sets, that is, presheaves on a cube category \(\square\). Different type theories choose different cube categories – for example, BCH uses the “symmetric cube category”, CCHM uses the “de Morgan cube category”, and ABCFHL uses the “Cartesian cube category”.

Topos: Cubical sets (any of the variations), \(\widehat{\square}\).
Verdict: Topologically boring.

Proof. Every cube category contains a terminal object \(\ast\), so we have \(\Gamma(\mathcal F) = \mathcal F(\ast)\). But limits are computed pointwise in functor categories, so \(\Gamma\) is exact! This implies that all higher cohomology groups vanish, i.e., \(\mathsf{R}\Gamma(\mathbb{Z}_{\widehat{\square}}) = \mathbb{Z}\).

3. Normalization proofs

Normalization by evaluation, for any type system ranging from the simply typed lambda calculus all the way up to variations on MLTT, uses logical relations valued in presheaves on a category \(Ren\) of contexts and renamings. (What exactly \(Ren\) looks like depends on the type theory.) This ensures that the semantics are invariant under context extensions and variable renamings.

Topos: Presheaves on \(Ren\) (any of the variations), \(\widehat{Ren}\).
Verdict: Topologically boring.

Proof. The categories of contexts and renamings all have terminal objects, so \(\Gamma\) is exact just like above.

4. Step-indexing

Practical programming languages tend to have annoying features like the ability to write infinite loops. Step-indexing parameterizes the semantics with a decreasing counter \(k\) that represents the number of steps to run – now you only need to consider finitely many iterations of the loop, and better yet, you can define your semantics inductively on \(k\). This can be represented using presheaves on the poset \(\omega = \{ 0 \leq 1 \leq 2 \leq \dots\}\), called the topos of trees in this paper.

Topos: The topos of trees, \(\mathcal S\).
Verdict: Topologically boring.

Proof. Unlike the earlier presheaf topoi, this site does not have a terminal object, and so \(\Gamma\) is not exact. However, since it has an initial object \(0\), we can compute \(Hom_{\mathcal{S}}(\mathcal F, \mathbb{Z}_{\mathcal S}) = Hom(\mathcal F(0), \mathbb{Z})\). This implies that the following diagram commutes, where \(f : \mathcal{S} \to \mathsf{Sets}\) is the unique geometric morphism, and \(Sh(\mathcal{S})\) is the category of abelian sheaves on \(\mathcal{S}\):


\begin{tikzcd}
	{Ab^{\mathsf{op}}} && {Sh(\mathcal{S})^{\mathsf{op}}} \\
	\\
	&& Ab
	\arrow["{(f^\ast)^\mathsf{op}}", from=1-1, to=1-3]
	\arrow["{Hom_\mathcal{S}(-, \mathbb{Z}_\mathcal{S})}", from=1-3, to=3-3]
	\arrow["{Hom(-,\mathbb{Z})}"', from=1-1, to=3-3]
\end{tikzcd}

Taking right derived functors, we get the following diagram:


\begin{tikzcd}
	{D(\mathbb{Z})^\mathsf{op}} && {D(\mathcal{S})^{\mathsf{op}}} \\
	\\
	&& {D(Ab)}
	\arrow["{(\mathsf Lf^\ast)^\mathsf{op}}", from=1-1, to=1-3]
	\arrow["{\mathsf RHom_\mathcal{S}(-, \mathbb{Z}_\mathcal{S})}", from=1-3, to=3-3]
	\arrow["{\mathsf{R}Hom(-,\mathbb{Z})}"', from=1-1, to=3-3]
\end{tikzcd}

Since (co)limits are computed pointwise in a presheaf category, \(f^\ast\) is exact and \(\mathsf Lf^\ast(\mathbb{Z}) = \mathbb{Z}_\mathcal{S}\). Putting it all together, \[ \begin{align*} \mathsf{R}\Gamma(\mathbb{Z}_\mathcal S) &= \mathsf{R}Hom_\mathcal{S}(\mathbb{Z}_\mathcal S, \mathbb{Z}_\mathcal S) \\ &= \mathsf{R}Hom_\mathcal{S}(\mathsf Lf^\ast\mathbb{Z}, \mathbb{Z}_\mathcal S) \\ &= \mathsf{R}Hom(\mathbb{Z}, \mathbb{Z}) \\ &= \mathbb{Z}. \end{align*} \]

5. Sterling’s normalization topoi

In recent work, Jon Sterling and collaborators prove impressive metatheory results about dependent type theories (most notably normalization of cubical type theory) using topoi. They construct the topoi using gluing, so finally, we have the chance for some interesting topological behavior to happen!

Here, we start out with the category \(\mathcal T\), the syntactic model of our type theory, as well as a subcategory \(\mathcal A\) of “atomic terms” – for MLTT, \(\mathcal A = Ren\) is our category of contexts and renamings. The functor \(\rho : \mathcal A \hookrightarrow \mathcal T\) induces a functor \(\rho^\ast : \widehat{\mathcal T} \to \widehat{\mathcal A}\) by precomposition, which turns out to be the inverse image of a geometric morphism of topoi. Then we define our topos \(G\) by the following pushout diagram of topoi:

\begin{tikzcd}
	{\widehat{\mathcal A}} & {\widehat{\mathcal T}} \\
	{\mathbb{S}\times\widehat{\mathcal{A}}} & G
	\arrow[from=2-1, to=2-2]
	\arrow[hook, from=1-2, to=2-2]
	\arrow["\rho", from=1-1, to=1-2]
	\arrow["{\circ_{\widehat{\mathcal{A}}}}"', hook, from=1-1, to=2-1]
	\arrow["\lrcorner"{anchor=center, pos=0.125, rotate=180}, draw=none, from=2-2, to=1-1]
\end{tikzcd}

Topos: The normalization topos \(G\).
Verdict: Topologically boring.

Proof. Intuitively, this diagram is reminiscent of the mapping cylinder \(Mf\) of a continuous map \(f : X \to Y\), which is constructed using a similar pushout of topological spaces:


\begin{tikzcd}
	X & Y \\
	{[0,1]\times X} & Mf
	\arrow["{i_0}"', hook, from=1-1, to=2-1]
	\arrow["f", from=1-1, to=1-2]
	\arrow[from=2-1, to=2-2]
	\arrow[hook, from=1-2, to=2-2]
	\arrow["\lrcorner"{anchor=center, pos=0.125, rotate=180}, draw=none, from=2-2, to=1-1]
\end{tikzcd}

The inclusion \(Y \hookrightarrow Mf\) is a homotopy equivalence, so by analogy with spaces, we’d expect \(G\) to have the same cohomology as \(\widehat{\mathcal T}\), a presheaf category on a site with a terminal object. This is in fact what happens.

Equivalently, \(G\) can be described as the comma category \(\widehat{\mathcal{A}} \downarrow \rho^\ast\), whose objects are pairs of a presheaf \(\mathcal{F}\) on \(\mathcal{A}\), a presheaf \(\mathcal{G}\) on \(\mathcal T\), and a natural transformation \(\mathcal F \to \rho^\ast \mathcal{G}\), and whose morphisms are pairs of natural transformations in \(\widehat{\mathcal A}\) and \(\widehat{\mathcal T}\) making the obvious diagram commute. But this is equivalently presheaves on a site constructed from the following pushout of categories, where \(I = [0 \leq 1]\) is the directed interval category:


\begin{tikzcd}
	{\mathcal{A}} & {\mathcal{T}} \\
	{I \times \mathcal{A}} & \cdot
	\arrow["\rho", from=1-1, to=1-2]
	\arrow["{i_0}"', hook, from=1-1, to=2-1]
	\arrow[from=2-1, to=2-2]
	\arrow[hook, from=1-2, to=2-2]
	\arrow["\lrcorner"{anchor=center, pos=0.125, rotate=180}, draw=none, from=2-2, to=1-1]
\end{tikzcd}

This has a terminal object, the terminal object of \(I \times \mathcal A\), so again \(\Gamma\) is exact and \(\mathsf{R}\Gamma(\mathbb{Z}_G) = \mathbb Z\).

6. Sterling and Harper’s parametricity topoi

In their paper Logical Relations as Types, Jon Sterling and Bob Harper give a topos-theoretic framework for understanding parametricity in module systems. They elaborate ML to a core language with a sharp phase distinction between a dependently-typed static module system, and a simply-typed monadic runtime semantics. To support SML-style where clauses, which can only require static data to agree, a special proposition \(\P\) is introduced, under which all runtime data is considered equal. Using \(\P\), where this = that clauses can (roughly) be implemented by requiring that \(\P \Rightarrow \texttt{this} = \texttt{that}\).

Let \(\mathcal{T}\) be the syntactic model of their core language. The presheaf topos \(\widehat{\mathcal T \sqcup \mathcal T}\) has two copies of the internal language of \(\mathcal T\), so let \(\P_L\), \(\P_R\) be the two copies of the proposition \(\P\), and let \(\rho : \widehat{\mathcal T \sqcup \mathcal T} \to \mathbb{S}\) be the geometric morphism classifying the proposition \(\P_L \lor \P_R\). They construct a topos \(X\) using the following pushout of topoi:


\begin{tikzcd}
	{\widehat{\mathcal T\sqcup \mathcal T}} & {\mathbb{S}} \\
	{\widehat{\mathcal{T} \sqcup \mathcal T}\times \mathbb{S}} & X
	\arrow["\rho", from=1-1, to=1-2]
	\arrow["\bullet"', hook, from=1-1, to=2-1]
	\arrow[from=2-1, to=2-2]
	\arrow[hook, from=1-2, to=2-2]
	\arrow["\lrcorner"{anchor=center, pos=0.125, rotate=180}, draw=none, from=2-2, to=1-1]
\end{tikzcd}

Topos: The topos of phase-separated parametricity structures \(X\).
Verdict: Topologically boring.

Proof.

From Computation 5.29 in the paper, an object in \(X\) consists of the following data:

  • An object of \(\mathbb{S}\), i.e., a family of sets \(A \to B\)
  • A presheaf \((L, R)\) in \(\widehat{\mathcal{T} \sqcup \mathcal{T}}\)
  • Maps \(A \to L(\ast)\), \(A \to R(\ast)\), \(B \to L(\P)\), and \(B \to R(\P)\) making the obvious squares commute
and morphisms in \(X\) are the natural ones, i.e., families of maps making the obvious diagrams commute. So \(X\) also ends up being equivalent to the category of presheaves on a site with a terminal object, and \(\mathsf{R}\Gamma(\mathbb{Z}_X) = \mathbb{Z}\).

7. Conclusion

In conclusion, the topoi used in programming languages aren’t topologically interesting. This makes sense: programming languages researchers only care about the internal logic of topoi, and they don’t care about any geometric/topological aspects. It’s a bit like taking the QR decomposition of a QR code: sure, you can get a result, and it’s kinda funny, but don’t expect it to be useful or even meaningful.

QR decomposition of a QR code that reads “decomposition”

  1. The standard reference is Lambek and Scott, “Introduction to Higher-Order Categorical Logic”.↩︎