first draft on achille with smc
---
title: Building my site with monoidal categories
date: 2022-12-06
draft: true
---

Or how the right theoretical framework solved, for "free", the last problem
standing in the way of incremental generation: reasoning about dependencies
optimally.

---

A while back I made [achille](/projects/achille), a library for building
incremental static site generators in Haskell. I'm not gonna delve into *why*
for long; if you want the full motivation, you can read the details in the
(outdated) [documentation](/projects/achille/1-motivation.html).

The point was:

- static sites are good, therefore one wants to use static site generators.
- the way to build one's site becomes quite intricate and difficult to express
  with existing static site generators.
- thus one ends up making their own custom generator suited for the task.

Making your own static site generator is not very hard, but making it
*incremental* is tedious and requires some thinking.

That's the niche that [Hakyll](https://jaspervdj.be/hakyll/) tries to fill: an
embedded DSL in Haskell to specify your build rules and compile them into a
full-fledged **incremental** static site generator. Some kind of static site
generator *generator*.

## achille, as it used to be

I had my gripes with Hakyll, and was looking for a simpler, more general way to
express build rules. I came up with the `Recipe` abstraction:

```haskell
newtype Recipe m a b = Recipe
  { runRecipe :: Context -> Cache -> a -> m (b, Cache) }
```

It's just a glorified Kleisli arrow: a `Recipe m a b` will produce an output of
type `b` by running a computation in `m`, given some input of type `a`.

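To make the comparison concrete, here is a minimal sketch of what is left of a recipe once the context and the cache are forgotten: a plain `Kleisli` arrow from `Control.Arrow`. The names `SimpleRecipe`, `countChars`, `describe` and `pipeline` are made up for illustration and are not part of achille.

```haskell
import Control.Arrow (Kleisli (..), (>>>))

-- Hypothetical simplification: a recipe with no context and no cache
-- is nothing more than a function `a -> m b`, i.e. a Kleisli arrow.
type SimpleRecipe m a b = Kleisli m a b

-- Two toy "build rules"; they live in IO but perform no real effects here.
countChars :: SimpleRecipe IO String Int
countChars = Kleisli (pure . length)

describe :: SimpleRecipe IO Int String
describe = Kleisli (\n -> pure (show n ++ " characters"))

-- Composing recipes is ordinary arrow composition: the output of the
-- first rule is the input of the second.
pipeline :: SimpleRecipe IO String String
pipeline = countChars >>> describe
```

Running `runKleisli pipeline "hello"` feeds the string through both rules in order. The real `Recipe` adds a `Context` and a `Cache` on top of this shape.
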
The purpose is to *abstract over side effects* of build rules (such as producing
HTML files on disk) and shift the attention to *intermediate values* that flow
between build rules.

As one could expect, if `m` is a monad, so is `Recipe m a`. This means composing
recipes is very easy and dependencies *between* those are stated **explicitly**
in the code.

```haskell
main :: IO ()
main = achille do
  posts <- match "posts/*.md" compilePost
  compileIndex posts
```

``` {=html}
<details>
<summary>Type signatures</summary>
```
Simplifying a bit, these would be the type signatures of the building blocks in
the code above.
```haskell
compilePost  :: Recipe IO FilePath PostMeta
match        :: GlobPattern -> (Recipe IO FilePath b) -> Recipe IO () [b]
compileIndex :: [PostMeta] -> Recipe IO () ()
achille      :: Recipe IO () () -> IO ()
```
``` {=html}
</details>
```

There are no ambiguities about the ordering of build rules, and the evaluation
model is in turn *very* simple, in contrast to Hakyll with its global store and
implicit ordering.

### Caching

In the definition of `Recipe`, a recipe takes some `Cache` as input, and
returns another one after the computation is done. This cache is simply a *lazy
bytestring*, and enables recipes to have some *persistent storage* between
runs, that they can use in any way they desire.

The key insight is how composition of recipes is handled:

```haskell
(*>) :: Monad m => Recipe m a b -> Recipe m a c -> Recipe m a c
Recipe f *> Recipe g = Recipe \ctx cache x -> do
  -- give each recipe its own half of the cache
  let (cf, cg) = splitCache cache
  (_, cf') <- f ctx cf x
  (y, cg') <- g ctx cg x
  -- glue the *updated* halves back together
  pure (y, joinCache cf' cg')
```

The cache is split in two, and both pieces are forwarded to their respective
recipe. Once the computation is done, the resulting caches are put together
into one again.

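Here is a small self-contained sketch of that splitting scheme. It is not achille's implementation: the real cache is a lazy bytestring, whereas this toy `Cache` is a tree, `Context` is reduced to `()`, and `andThen` and `runCount` are names invented for the example.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Toy stand-in for the cache (assumption: a tree instead of a bytestring).
data Cache = Empty | Leaf String | Pair Cache Cache
  deriving (Eq, Show)

type Context = ()

newtype Recipe m a b = Recipe
  { runRecipe :: Context -> Cache -> a -> m (b, Cache) }

-- An unknown or missing cache splits into two empty caches.
splitCache :: Cache -> (Cache, Cache)
splitCache (Pair l r) = (l, r)
splitCache _          = (Empty, Empty)

joinCache :: Cache -> Cache -> Cache
joinCache = Pair

-- Same shape as the composition operator above, under a different name
-- to avoid clashing with the Prelude.
andThen :: Monad m => Recipe m a b -> Recipe m a c -> Recipe m a c
andThen (Recipe f) (Recipe g) = Recipe $ \ctx cache x -> do
  let (cf, cg) = splitCache cache
  (_, cf') <- f ctx cf x
  (y, cg') <- g ctx cg x
  pure (y, joinCache cf' cg')

-- A recipe that counts its own runs, using nothing but its local cache.
runCount :: Monad m => Recipe m a Int
runCount = Recipe $ \_ cache _ ->
  let previous = case cache of
        Leaf s -> fromMaybe 0 (readMaybe s)
        _      -> 0
      n = previous + 1
  in pure (n, Leaf (show n))
```

Running ``runCount `andThen` runCount`` once from `Empty`, then again on the cache it returned, lets each copy of `runCount` find its own counter where it left it, without either recipe knowing about the other.
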
This ensures that every recipe is handed the same local cache from one run to
the next, assuming the description of the generator does not change in between.
Of course this is only true when `Recipe m` is merely used as a *selective*
applicative functor, though I doubt you need more than that for writing a
static site generator. It's not perfect, but I can say that this very simple model
for caching has proven to be surprisingly powerful.

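To see why the guarantee stops at applicative use, here is one way a monadic bind could thread the cache. It is only a sketch under stated assumptions: the toy `Cache` tree, the `()` context and the `runCount` recipe are invented for the example, and achille's actual instance may well differ.

```haskell
import Control.Monad (ap, liftM)
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Toy stand-ins (assumptions): the real cache is a lazy bytestring.
data Cache = Empty | Leaf String | Pair Cache Cache
  deriving (Eq, Show)

type Context = ()

newtype Recipe m a b = Recipe
  { runRecipe :: Context -> Cache -> a -> m (b, Cache) }

splitCache :: Cache -> (Cache, Cache)
splitCache (Pair l r) = (l, r)
splitCache _          = (Empty, Empty)

joinCache :: Cache -> Cache -> Cache
joinCache = Pair

instance Monad m => Functor (Recipe m a) where
  fmap = liftM

instance Monad m => Applicative (Recipe m a) where
  -- a pure value needs no storage: hand the cache back untouched
  pure y = Recipe $ \_ cache _ -> pure (y, cache)
  (<*>) = ap

instance Monad m => Monad (Recipe m a) where
  Recipe f >>= k = Recipe $ \ctx cache x -> do
    let (cf, ck) = splitCache cache
    (y, cf') <- f ctx cf x
    -- `k y` may be a different recipe on every run, depending on `y`,
    -- yet it always receives the same slot `ck`.
    (z, ck') <- runRecipe (k y) ctx ck x
    pure (z, joinCache cf' ck')

-- A recipe that counts its own runs, using only its local cache.
runCount :: Monad m => Recipe m a Int
runCount = Recipe $ \_ cache _ ->
  let previous = case cache of
        Leaf s -> fromMaybe 0 (readMaybe s)
        _      -> 0
      n = previous + 1
  in pure (n, Leaf (show n))
```

When the continuation ignores its argument, as in `do { a <- runCount; b <- runCount; pure (a + b) }`, the structure is fixed and every recipe finds its own slot again. As soon as `k` inspects `y` to choose what to run next, the slot `ck` can end up being read by a recipe that did not write it.
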
I have improved upon it since then, in order to make sure that
composition is associative and to enable some computationally intensive recipes to
become insensitive to code refactorings, but the core idea is left unchanged.

### Incremental evaluation and dependency tracking

### But there is a but

## Arrows

I really like the `do` notation, but sadly losing this information about
variable use is bad, so no luck. If only there were a way to *overload* the
lambda abstraction syntax of Haskell to transform it into a representation free
of variable bindings...

That's when I discovered Haskell's arrows. They are a generalization of monads,
and are often presented as a way to compose things that behave like functions.
And indeed, we can define our very own `instance Arrow (Recipe m)`. There is a
special syntax, the *arrow notation*, that kinda looks like the `do` notation,
so is this the way out?

There is something fishy in the definition of `Arrow`:

```haskell
class Category k => Arrow k where
  -- ...
  arr :: (a -> b) -> a `k` b
```

We must be able to lift any function into `k a b` in order to make it an
`Arrow`. In our case we can do it, that's not the issue. No, the real issue is
how Haskell desugars the arrow notation.

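As a rough illustration of the concern, here is a small `proc` block next to a hand-written approximation of what it turns into. The second definition is not GHC's exact output, only the usual textbook shape of the translation; `sugared`, `desugared`, `f` and `g` are names made up for the example.

```haskell
{-# LANGUAGE Arrows #-}
import Control.Arrow

-- Written with arrow notation: `x` is used twice, `y` once.
sugared :: Arrow k => k Int Int -> k (Int, Int) Int -> k Int Int
sugared f g = proc x -> do
  y <- f -< x
  g -< (x, y)

-- Approximate translation (assumption, not GHC's literal output):
-- all the variable plumbing is done by plain functions lifted with `arr`.
desugared :: Arrow k => k Int Int -> k (Int, Int) Int -> k Int Int
desugared f g =
      arr (\x -> (x, x))      -- duplicate the input
  >>> first f                 -- run f on one copy
  >>> arr (\(y, x) -> (x, y)) -- reorder for g
  >>> g
```

Both versions compute the same thing, but in the second one the information about which variable goes where lives inside the functions handed to `arr`, and those are black boxes as far as a `Recipe` is concerned.
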
...

There is a macro that is a bit smarter than Haskell's current desugarer, but not
by much. I've seen some discussions about actually fixing this upstream, but I
don't think anyone has the time to do it. Too few people use arrows to
justify the cost.

## Conal Elliott's `concat`

Conal Elliott wrote a fascinating paper called *Compiling to Categories*.
The gist of it is that any cartesian closed category is a model of the
simply typed lambda calculus. Therefore, he made a GHC plugin giving access to
a magical function:

```haskell
ccc :: Closed k => (a -> b) -> a `k` b
```

You can see that the signature is *very* similar to the one of `arr`.

A first issue is that `Recipe m` very much isn't *closed*. Another, more
substantial issue is that the GHC plugin is *very* experimental. I had a hard
time running it on simple examples, and it is barely documented.

Does this mean all hope is lost? **NO**.

## Compiling to monoidal cartesian categories

Two days ago, I stumbled upon this paper by chance:.

What they explain is that many interesting categories to compile to are in fact
not closed.

No GHC plugin required, just a tiny library with a few `class`es.

There is one drawback: `Recipe m` *is* cartesian. That is, you can freely
duplicate values. In their framework, they have you explicitly insert `dup` to
duplicate a value. This is a bit annoying, but they have a good reason to do so: