---
title: Building my site with monoidal categories
date: 2022-12-06
draft: true
---

Or how the right theoretical framework solved the last problem I had in the way
of incremental generation for "free": reasoning about dependencies optimally.

---

A while back I made [achille](/projects/achille), a library for building
incremental static site generators in Haskell. I'm not gonna delve into *why*
for long; if you want the full motivation, you can read the details in the
(outdated) [documentation](/projects/achille/1-motivation.html).

The point was:

- static sites are good, therefore one wants to use static site generators.
- the way to build their site becomes quite intricate and difficult to express
  with existing static site generators.
- thus one ends up making their own custom generator suited for the task.

Making your own static site generator is not very hard, but making it
*incremental* is tedious and requires some thinking.

That's the niche that [Hakyll](https://jaspervdj.be/hakyll/) tries to fill: an
embedded DSL in Haskell to specify your build rules, and compile them into a
full-fledged **incremental** static site generator. Some kind of static site
generator *generator*.

## achille, as it used to be

I had my gripes with Hakyll, and was looking for a simpler, more general way to
express build rules. I came up with the `Recipe` abstraction:

```haskell
newtype Recipe m a b = Recipe
  { runRecipe :: Context -> Cache -> a -> m (b, Cache) }
```

It's just a glorified Kleisli arrow: a `Recipe m a b` will produce an output of
type `b` by running a computation in `m`, given some input of type `a`.

The purpose is to *abstract over side effects* of build rules (such as producing
HTML files on disk) and shift the attention to *intermediate values* that flow
between build rules.

As one could expect, if `m` is a monad, so is `Recipe m a`. This means composing
recipes is very easy and dependencies *between* those are stated **explicitly**
in the code.

```haskell
main :: IO ()
main = achille do
  posts <- match "posts/*.md" compilePost
  compileIndex posts
```

``` {=html}
<details>
<summary>Type signatures</summary>
```
Simplifying a bit, these would be the type signatures of the building blocks in
the code above.
```haskell
compilePost  :: Recipe IO FilePath PostMeta
match        :: GlobPattern -> Recipe IO FilePath b -> Recipe IO () [b]
compileIndex :: [PostMeta] -> Recipe IO () ()
achille      :: Recipe IO () () -> IO ()
```
``` {=html}
</details>
```

There are no ambiguities about the ordering of build rules, and the evaluation
model is in turn *very* simple, in contrast to Hakyll with its global store and
implicit ordering.

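
To make the "glorified Kleisli arrow" remark a bit more concrete, here is a minimal sketch of how `Recipe m a` could be made a monad. This is not achille's actual code: `Context` and `Cache` are placeholder types, `double` and `example` are made-up recipes, and the cache is naively threaded from one recipe to the next, which is not how achille treats it.

```haskell
-- Placeholder types, chosen only so that the sketch compiles (assumptions).
type Context = FilePath
type Cache   = String

newtype Recipe m a b = Recipe
  { runRecipe :: Context -> Cache -> a -> m (b, Cache) }

instance Functor m => Functor (Recipe m a) where
  fmap h (Recipe f) = Recipe $ \ctx cache x ->
    fmap (\(y, cache') -> (h y, cache')) (f ctx cache x)

instance Monad m => Applicative (Recipe m a) where
  pure y = Recipe $ \_ cache _ -> pure (y, cache)
  Recipe f <*> Recipe g = Recipe $ \ctx cache x -> do
    (h, cache')  <- f ctx cache x
    (y, cache'') <- g ctx cache' x
    pure (h y, cache'')

instance Monad m => Monad (Recipe m a) where
  -- Run the first recipe, then the recipe computed from its output,
  -- both on the same input `x`.
  Recipe f >>= k = Recipe $ \ctx cache x -> do
    (y, cache') <- f ctx cache x
    runRecipe (k y) ctx cache' x

-- Hypothetical recipe: doubles its input and leaves a mark in the cache.
double :: Recipe IO Int Int
double = Recipe $ \_ cache x -> pure (2 * x, cache ++ "d")

-- The final value depends on `a` and `b`, and the code says so explicitly.
example :: Recipe IO Int Int
example = do
  a <- double
  b <- double
  pure (a + b)
```

Both steps of `example` receive the same input, so `runRecipe example "ctx" "" 3` doubles `3` twice and adds the results.
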
### Caching

In the definition of `Recipe`, a recipe takes some `Cache` as input, and
returns another one after the computation is done. This cache is simply a *lazy
bytestring*, and enables recipes to have some *persistent storage* between
runs, that they can use in any way they desire.

The key insight is how composition of recipes is handled:

```haskell
(*>) :: Monad m => Recipe m a b -> Recipe m a c -> Recipe m a c
Recipe f *> Recipe g = Recipe \ctx cache x -> do
  let (cf, cg) = splitCache cache
  (_, cf') <- f ctx cf x        -- first recipe runs on its half of the cache
  (y, cg') <- g ctx cg x        -- second recipe runs on the other half
  pure (y, joinCache cf' cg')   -- store the *updated* halves
```

The cache is split in two, and both pieces are forwarded to their respective
recipe. Once the computation is done, the resulting caches are put together
into one again.

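
Here is a small self-contained sketch of that splitting scheme. Everything in it is an assumption made for illustration: the cache is a `String` rather than a lazy bytestring, `Context` is dropped, the pair encoding used by `splitCache` and `joinCache` is a guess rather than achille's real one, and `andThen`, `counterBy` and `site` are made-up names.

```haskell
-- Stand-in cache type; the post's cache is a lazy bytestring (assumption).
type Cache = String

-- Context is left out of the recipe type to keep the sketch short.
newtype Recipe m a b = Recipe { runRecipe :: Cache -> a -> m (b, Cache) }

-- Hypothetical encoding: a composite cache is a serialized pair.
splitCache :: Cache -> (Cache, Cache)
splitCache cache = case reads cache of
  [((l, r), "")] -> (l, r)
  _              -> ("", "")  -- empty or unreadable cache: start afresh

joinCache :: Cache -> Cache -> Cache
joinCache l r = show (l, r)

-- Same shape as (*>) above, renamed to avoid clashing with the Prelude.
andThen :: Monad m => Recipe m a b -> Recipe m a c -> Recipe m a c
andThen (Recipe f) (Recipe g) = Recipe $ \cache x -> do
  let (cf, cg) = splitCache cache
  (_, cf') <- f cf x
  (y, cg') <- g cg x
  pure (y, joinCache cf' cg')

-- Toy recipe: adds `step` to the number stored in its local cache.
counterBy :: Int -> Recipe IO () Int
counterBy step = Recipe $ \cache () ->
  let prev = case reads cache of
        [(k, "")] -> k
        _         -> 0
      n = prev + step
  in pure (n, show n)

site :: Recipe IO () Int
site = counterBy 1 `andThen` counterBy 10
```

Running `site` once on an empty cache and then again on the cache it returned, each counter picks up its own previous value: the two halves never see each other's data, as long as the shape of `site` stays the same between runs.
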
This ensures that every recipe will be attributed the same local cache,
assuming the description of the generator does not change between runs. Of
course this is only true when `Recipe m` is merely used as a *selective*
applicative functor, though I doubt you need more than that for writing a
static site generator. It's not perfect, but I can say that this very simple model
for caching has proven to be surprisingly powerful.

I have improved upon it since then, in order to make sure that
composition is associative and to enable some computationally intensive recipes to
become insensitive to code refactorings, but the core idea is left unchanged.

### Incremental evaluation and dependency tracking

### But there is a but

## Arrows

I really like the `do` notation, but sadly losing this information about
variable use is bad, so no luck. If only there was a way to *overload* the
lambda abstraction syntax of Haskell to transform it into a representation free
of variable bindings...

That's when I discovered Haskell's arrows. They're a generalization of monads,
and are often presented as a way to compose things that behave like functions.
And indeed, we can define our very own `instance Arrow (Recipe m)`. There is a
special syntax, the *arrow notation*, that kinda looks like the `do` notation, so
is this the way out?

There is something fishy in the definition of `Arrow`:

```haskell
class Category k => Arrow k where
  -- ...
  arr :: (a -> b) -> a `k` b
```

We must be able to lift any function into `k a b` in order to make it an
`Arrow`. In our case we can do it, that's not the issue. No, the real issue is
how Haskell desugars the arrow notation.

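
One way to get a feel for this without reading the desugarer is a small experiment. The sketch below is purely illustrative: `Traced` is a made-up arrow that records one label per primitive step, and `step`, `labels`, `run` and `pipeline` are hypothetical names. Every use of `arr` shows up as an anonymous `"arr"` label, because all an arrow ever receives there is an opaque Haskell function.

```haskell
{-# LANGUAGE Arrows #-}
import Prelude hiding (id, (.))
import Control.Category (Category (..))
import Control.Arrow (Arrow (..), returnA)

-- Made-up arrow: a function paired with the labels of the steps it is built from.
data Traced a b = Traced [String] (a -> b)

instance Category Traced where
  id = Traced [] (\x -> x)
  Traced lg g . Traced lf f = Traced (lf ++ lg) (\x -> g (f x))

instance Arrow Traced where
  arr f = Traced ["arr"] f  -- an arbitrary function: nothing to inspect
  first (Traced l f) = Traced l (\(x, y) -> (f x, y))

-- A named primitive step, the analogue of a build rule.
step :: String -> (a -> b) -> Traced a b
step name = Traced [name]

labels :: Traced a b -> [String]
labels (Traced ls _) = ls

run :: Traced a b -> a -> b
run (Traced _ f) = f

-- `incr` does not use the output of `double`, yet both land in one flat chain.
pipeline :: Traced Int Int
pipeline = proc x -> do
  y <- step "double" (* 2) -< x
  z <- step "incr" (+ 1) -< x
  returnA -< y + z
```

Evaluating `labels pipeline` in GHCi shows the named steps interleaved with `"arr"` entries inserted by the desugaring. The exact sequence depends on the GHC version, so none is reproduced here. What matters is that the routing of `x`, `y` and `z` lives inside those opaque functions, so nothing in the resulting structure says which step depends on which.
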
...

There is a macro that is a bit smarter than Haskell's current desugarer, but not
by much. I've seen some discussions about actually fixing this upstream, but I
don't think anyone actually has the time to do this. Too few people use arrows to
justify the cost.

## Conal Elliott's `concat`

Conal Elliott wrote a fascinating paper called *Compiling to Categories*.
The gist of it is that any cartesian-closed category is a model of simply-typed
lambda-calculus. Therefore, he made a GHC plugin giving access to a magical
function:

```haskell
ccc :: Closed k => (a -> b) -> a `k` b
```

You can see that the signature is *very* similar to the one of `arr`.

A first issue is that `Recipe m` very much isn't *closed*. Another more
substantial issue is that the GHC plugin is *very* experimental. I had a hard
time running it on simple examples, and it is barely documented.

Does this mean all hope is lost? **NO**.

## Compiling to monoidal cartesian categories

Two days ago, I stumbled upon this paper by chance:.

What they explain is that many interesting categories to compile to are in fact
not closed.

No GHC plugin required, just a tiny library with a few `class`es.

There is one drawback: `Recipe m` *is* cartesian. That is, you can freely
duplicate values. In their framework, they have you explicitly insert `dup` to
duplicate a value. This is a bit annoying, but they have a good reason to do so: