more work on article, wishful thinking
parent
f8b1c385f0
commit
4dd1130758
|
@ -19,55 +19,66 @@ import Achille as A
|
|||
|
||||
main :: IO ()
|
||||
main = achille $ task A.do
|
||||
-- copy every static asset as is
|
||||
match_ "assets/*" copyFile
|
||||
|
||||
-- load site template
|
||||
template <- matchFile "template.html" loadTemplate
|
||||
|
||||
-- render every article in `posts/`
|
||||
-- and gather all metadata
|
||||
posts <-
|
||||
match "posts/*.md" \src -> A.do
|
||||
(meta, content) <- processPandocMeta src
|
||||
writeFile (src -<.> ".html") (renderPost meta content)
|
||||
writeFile (src -<.> ".html") (renderPost template meta content)
|
||||
meta
|
||||
|
||||
-- render index page with the 10 most recent articles
|
||||
renderIndex (take 10 (sort posts))
|
||||
renderIndex template (take 10 (sort posts))
|
||||
```
|
||||
|
||||
|
||||
Importantly, I want to emphasize that *you* --- the library user --- neither
|
||||
have to care about nor understand the internals of [achille] in order to use it.
|
||||
You are free to ignore this post and directly go through the [user
|
||||
manual][manual] to get started!
|
||||
*Most* of the machinery below is purposefully kept hidden from plain sight. You
|
||||
are free to ignore this post and directly go through the [user manual][manual]
|
||||
to get started!
|
||||
|
||||
[manual]: /projects/achille/
|
||||
|
||||
This post is just there to document how the right theoretical framework was key
|
||||
in providing a good user interface that preserves all the desired properties.
|
||||
This article is just there to document how the right theoretical framework was
|
||||
instrumental in providing a good user interface *and yet* preserve all the
|
||||
desired properties. It also gives pointers on how to reliably overload Haskell's
|
||||
*lambda abstraction* syntax, because I'm sure many applications could make good
|
||||
use of that but are unaware that there are now ways to do it properly, *without
|
||||
any kind of metaprogramming*.
|
||||
|
||||
---
|
||||
|
||||
## Foreword
|
||||
|
||||
The original postulate is that *static sites are good*. Of course not for every
|
||||
use case, but for single-user, small-scale websites, it is a very practical way
|
||||
of managing content. Very easy to edit offline, very easy to deploy. All in all
|
||||
My postulate is that *static sites are good*. Of course not for every
|
||||
use case, but for single-user, small-scale websites, it is a very convenient way
|
||||
to manage content. Very easy to edit offline, very easy to deploy. All in all
|
||||
very nice.
|
||||
|
||||
There are lots of static site generators readily available. However each and
|
||||
every one of them has a very specific idea of how you *should* manage your
|
||||
content. For simple websites --- i.e. weblogs --- they are great, but as soon as
|
||||
you want to heavily customize the building process of your site, require more
|
||||
fancy transformations, and thus step outside of the supported feature set of
|
||||
your site generator of choice, you're in for a lot of trouble.
|
||||
every one of them has a very specific idea of how you should *structure* your
|
||||
content. For simple websites --- i.e. weblogs --- they are wonderful, but as soon
|
||||
as you want to heavily customize the generation process of your site or require
|
||||
more fancy transformations, and thus step outside of the supported feature set
|
||||
of your generator of choice, you're out of luck.
|
||||
|
||||
For this reason, many people end up not using existing static site generators,
|
||||
and instead prefer to write their own. Depending on the language you use, it is
|
||||
fairly straightforward to write a little static site generator doing everything
|
||||
you want. Sadly, making it *incremental* or *parallel* is another issue, and way
|
||||
trickier.
|
||||
fairly straightforward to write a little static site generator that does
|
||||
precisely what you want. Sadly, making it *incremental* or *parallel* is another
|
||||
issue, and way trickier.
|
||||
|
||||
That's precisely the niche that [Hakyll] and
|
||||
[achille] try to fill: use an embedded DSL in Haskell to specify your *custom* build
|
||||
rules, and compile them all into a full-fledged **incremental** static site
|
||||
generator executable. Some kind of static site generator *generator*.
|
||||
That's precisely the niche that [Hakyll] and [achille] try to fill: provide an
|
||||
embedded DSL in Haskell to specify your *custom* build rules, and compile them
|
||||
all into a full-fledged **incremental** static site generator executable. Some
|
||||
kind of static site generator *generator*.
|
||||
|
||||
[Hakyll]: https://jaspervdj.be/hakyll/
|
||||
|
||||
|
@ -78,13 +89,13 @@ is with a flow diagram, where *boxes* are "build rules". Boxes have
|
|||
distinguished inputs and outputs, and dependencies between the build rules are
|
||||
represented by wires going from outputs of boxes to inputs of other boxes.
|
||||
|
||||
The static site generator corresponding to the Haskell code above could be
|
||||
represented as the following diagram:
|
||||
The static site generator defined by the Haskell code above corresponds
|
||||
to the following diagram:
|
||||
|
||||
...
|
||||
|
||||
Build rules are clearly identified, and we see that in order to render the `index.html`
|
||||
page, we need to wait for the `renderPosts` rule to finish rendering each
|
||||
page, *we need to wait* for the `renderPosts` rule to finish rendering each
|
||||
article to HTML and return the metadata of every one of them.
|
||||
|
||||
Notice how some wires are **continuous** **black** lines, and some other wires are
|
||||
|
@ -96,9 +107,10 @@ generator.
|
|||
- files that are written to the filesystem, like the HTML output of every
|
||||
article, or the `index.html` file.
|
||||
|
||||
The first insight is to realize that the build system *shouldn't care about side
|
||||
effects*. Its *only* role is to know whether build rules *should be executed*,
|
||||
and how intermediate values get passed around.
|
||||
The first important insight is to realize that the build system *shouldn't care
|
||||
about side effects*. Its *only* role is to know whether build rules *should be
|
||||
executed*, how intermediate values get passed around, and how they change
|
||||
between consecutive runs.
|
||||
|
||||
### The `Recipe m` abstraction
|
||||
|
||||
|
@ -117,37 +129,6 @@ The purpose is to *abstract over side effects* of build rules (such as producing
|
|||
HTML files on disk) and shift the attention to *intermediate values* that flow
|
||||
between build rules.
|
||||
|
||||
As one could expect, if `m` is a monad, so is `Recipe m a`. This means composing
|
||||
recipes is very easy and dependencies *between* those are stated **explicitly**
|
||||
in the code.
|
||||
|
||||
```haskell
|
||||
main :: IO ()
|
||||
main = achille do
|
||||
posts <- match "posts/*.md" compilePost
|
||||
compileIndex posts
|
||||
```
|
||||
|
||||
``` {=html}
|
||||
<details>
|
||||
<summary>Type signatures</summary>
|
||||
```
|
||||
Simplifying a bit, these would be the type signatures of the building blocks in
|
||||
the code above.
|
||||
```haskell
|
||||
compilePost :: Recipe IO FilePath PostMeta
|
||||
match :: GlobPattern -> (Recipe IO FilePath b) -> Recipe IO () [b]
|
||||
compileIndex :: [PostMeta] -> Recipe IO () ()
|
||||
achille :: Recipe IO () () -> IO ()
|
||||
```
|
||||
``` {=html}
|
||||
</details>
|
||||
```
|
||||
|
||||
There are no ambiguities about the ordering of build rules and the evaluation model
|
||||
is in turn *very* simple --- in contrast to Hakyll, its global store and
|
||||
implicit ordering.
|
||||
|
||||
### Caching
|
||||
|
||||
In the definition of `Recipe`, a recipe takes some `Cache` as input, and
|
||||
|
@ -185,12 +166,105 @@ become insensitive to code refactorings, but the core idea is left unchanged.
|
|||
|
||||
### But there is a but
|
||||
|
||||
## Arrows
|
||||
We've now defined all the operations we could wish for in order to build,
|
||||
compose and combine recipes. We've even found the theoretical framework our
|
||||
concrete application inserts itself into. How cool!
|
||||
|
||||
I really like the `do` notation, but sadly losing this information about
|
||||
variable use is bad, so no luck. If only there was a way to *overload* the
|
||||
lambda abstraction syntax of Haskell to transform it into a representation free
|
||||
of variable bindings...
|
||||
**But there is a catch**, and I hope you've already been thinking about it:
|
||||
**what an awful, awful way to write recipes**.
|
||||
|
||||
Sure, it's nice to know that we have all the primitive operations required to
|
||||
express all the flow diagrams we could ever be interested in. We *can*
|
||||
definitely define the site generator that has been serving as example
|
||||
throughout:
|
||||
|
||||
```
|
||||
rules :: Task ()
|
||||
rules = renderIndex ∘ (...)
|
||||
```
|
||||
|
||||
But I hope we can all agree on the fact that this code is **complete
|
||||
gibberish**. It's likely *some* Haskellers would be perfectly happy with this
|
||||
interface, but alas my library isn't *only* targeted to this crowd. No, what I
|
||||
really want is a way to assign intermediate results --- outputs of rules --- to
|
||||
*variables*, that then get used as inputs. Plain old Haskell variables. That is,
|
||||
I want to write my recipes as plain old *functions*.
|
||||
|
||||
And here is where my --- intermittent --- search for a readable syntax started,
|
||||
roughly two years ago.
|
||||
|
||||
## The quest for a friendly syntax
|
||||
|
||||
### Monads
|
||||
|
||||
If you've done a bit of Haskell, you *may* know that as soon as you're working
|
||||
with things that compose and sequence, there are high chances that what you're
|
||||
working with are *monads*. Perhaps the most well-known example is the `IO`
|
||||
monad. A value of type `IO a` represents a computation that, after doing
|
||||
side-effects (reading a file, writing a file, ...) will produce a value of type
|
||||
`a`.
|
||||
|
||||
Crucially, being a monad means you have a way to *sequence* computations. In
|
||||
the case of the `IO` monad, the bind operation has the following type:
|
||||
|
||||
```haskell
|
||||
(>>=) :: IO a -> (a -> IO b) -> IO b
|
||||
```
|
||||
|
||||
And because monads are so prevalent in Haskell, there is a *custom syntax*, the
|
||||
`do` notation, that allows you to bind results of computations to *variables*
|
||||
that can be used for the following computations. This syntax gets desugared into
|
||||
the primitive operations `(>>=)` and `pure`.
|
||||
|
||||
```haskell
|
||||
main :: IO ()
|
||||
main = do
|
||||
content <- readFile "input.txt"
|
||||
writeFile "output.txt" content
|
||||
```
|
||||
|
||||
The above gets transformed into:
|
||||
|
||||
```haskell
|
||||
main :: IO ()
|
||||
main = readFile "input.txt" >>= writeFile "output.txt"
|
||||
```
|
||||
|
||||
Looks promising, right? I can define a `Monad` instance for `Recipe m a`,
|
||||
fairly easily.
|
||||
|
||||
```haskell
|
||||
instance Monad (Recipe m a) where
|
||||
(>>=) :: Recipe m a b -> (b -> Recipe m a c) -> Recipe m a c
|
||||
```
|
||||
|
||||
And now problem solved?
|
||||
|
||||
```haskell
|
||||
rules :: Task IO ()
|
||||
rules = do
|
||||
posts <- match "posts/*.md" renderPosts
|
||||
renderIndex posts
|
||||
```
|
||||
|
||||
The answer is a resolute **no**. The problem becomes apparent when we try to
|
||||
actually define this `(>>=)` operation.
|
||||
|
||||
1. The second argument is a Haskell function of type `b -> Recipe m a c`. And
|
||||
precisely because it is a Haskell function, it can do anything it wants
|
||||
depending on the value of its argument. In particular, it could very well
|
||||
return *different recipes* for *different inputs*. That is, the *structure*
|
||||
of the graph is no longer *static*, and could change between runs, if the
|
||||
output of type `b` from the first rule happens to change. This is **very
|
||||
bad**, because we rely on the static structure of recipes to make the claim
|
||||
that the cache stays consistent between runs.
|
||||
|
||||
Ok, sure, but what if we assume that users don't do bad things (we never should).
|
||||
No, even then, there is an even bigger problem:
|
||||
|
||||
2. Because the second argument is *just a Haskell function*.
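
To make the first point concrete, here is a minimal sketch using a toy `Rule` type. It is a hypothetical stand-in and not achille's actual `Recipe`; the names `Rule`, `Step`, `Bind`, `run` and `example` are all made up for illustration. The continuation passed to `Bind` looks at its argument and returns a different rule depending on it, so the list of rules that make up the build can only be discovered by running it.

```haskell
{-# LANGUAGE GADTs #-}

-- A toy description of build rules (hypothetical, NOT achille's Recipe).
data Rule a where
  -- A named rule that produces a fixed value.
  Step :: String -> a -> Rule a
  -- Monadic sequencing: the next rule is computed from the previous output.
  Bind :: Rule a -> (a -> Rule b) -> Rule b

-- Run a rule, collecting the names of the steps that were executed.
-- The continuation of a Bind is opaque: we must apply it to a value
-- before we can learn which steps come next.
run :: Rule a -> ([String], a)
run (Step name x) = ([name], x)
run (Bind r k) =
  let (ns, x) = run r
      (ms, y) = run (k x)
  in (ns ++ ms, y)

-- The shape of this build depends on the *value* produced by the first step.
example :: Int -> Rule String
example n =
  Bind (Step "countPosts" n) $ \count ->
    if count > 10
      then Step "renderPaginatedIndex" "paginated"
      else Step "renderIndex" "single page"
```

In GHCi, `fst (run (example 3))` evaluates to `["countPosts","renderIndex"]`, whereas `fst (run (example 42))` evaluates to `["countPosts","renderPaginatedIndex"]`: same program, two different graphs.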
|
||||
|
||||
## Arrows
|
||||
|
||||
That's when I discovered Haskell's arrows. It's a generalization of monads,
|
||||
and is often presented as a way to compose things that behave like functions.
|
||||
|
@ -212,31 +286,56 @@ how Haskell desugars the arrow notation.
|
|||
|
||||
...
|
||||
|
||||
There is a macro that is a bit smarter than current Haskell's desugarer, but not
|
||||
by much. I've seen some discussions about actually fixing this upstream, but I
|
||||
don't think anyone actually has the time to do this. Too few people use arrows to
|
||||
justify the cost.
|
||||
So. Haskell's `Arrow` isn't it either. Well, in principle it *should* be the
|
||||
solution. But the desugarer is broken, the syntax still unreadable to my taste,
|
||||
and nobody has the will to fix it.
|
||||
|
||||
This syntax investigation must carry on.
|
||||
|
||||
## Conal Elliott's `concat`
|
||||
## Compiling to cartesian closed categories
|
||||
|
||||
Conal Elliott wrote a fascinating paper called *Compiling to Categories*.
|
||||
The gist of it is that any cartesian-closed category is a model of simply-typed
|
||||
lambda-calculus. Therefore, he made a GHC plugin giving access to a magical
|
||||
function:
|
||||
About a year after this project started, and well after I had given up on this
|
||||
whole endeavour, I happened to pass by Conal Elliott's fascinating paper
|
||||
["Compiling to Categories"][ccc]. In this paper, Conal recalls:
|
||||
|
||||
```
|
||||
ccc :: Closed k => (a -> b) -> a `k` b
|
||||
[ccc]: http://conal.net/papers/compiling-to-categories/
|
||||
|
||||
> It is well-known that the simply typed lambda-calculus is modeled by any
|
||||
> cartesian closed category (CCC)
|
||||
|
||||
I had heard of it, that is true. What this means is that, given any cartesian
|
||||
closed category, any *term* of type `a -> b` (a function) in the simply-typed
|
||||
lambda calculus corresponds to (can be interpreted as) an *arrow* (morphism)
|
||||
`a -> b` in the category. But a cartesian-closed category crucially has no notion
|
||||
of *variables*, just some *arrows* and operations to compose and rearrange them
|
||||
(among other things). Yet in the lambda calculus you *have* to construct functions
|
||||
using *lambda abstraction*. In other words, there is a consistent way to convert
|
||||
things defined with variable bindings into a representation (CCC morphisms)
|
||||
where variables are *gone*.
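
As a small illustration of what "variables are gone" means, the sketch below writes the same function twice: once with lambda abstraction, and once using only composition, projections and pairing. It uses plain Haskell functions as the category and the standard `(&&&)` and `(>>>)` combinators from `Control.Arrow`; the names `withVars` and `pointFree` are made up for this example.

```haskell
import Control.Arrow ((&&&), (>>>))

-- With variables: ordinary lambda abstraction.
withVars :: (Int, Int) -> Int
withVars = \(x, y) -> (x + y) * x

-- Without variables: the input pair is duplicated by (&&&), one copy is
-- summed and the other projected with fst, then the results are multiplied.
pointFree :: (Int, Int) -> Int
pointFree = (uncurry (+) &&& fst) >>> uncurry (*)
```

Both definitions agree on every input, for instance `withVars (2, 3)` and `pointFree (2, 3)` both evaluate to `10`. The second one only ever mentions arrows and ways to combine them, which is exactly the kind of term that can be interpreted in a category other than Haskell functions.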
|
||||
|
||||
How interesting. Then, Conal goes on to explain that because Haskell is
|
||||
"just" lambda calculus on steroids, any monomorphic function of type `a -> b`
|
||||
really ought to be convertible into an arrow in the CCC of your choice.
|
||||
And so he *did* just that. He is behind the [concat] GHC plugin and library.
|
||||
This library exports a bunch of typeclasses that allow anyone to define instances
|
||||
for their very own target CCC. Additionally, the plugin gives access to the
|
||||
following, truly magical function:
|
||||
|
||||
[concat]: https://github.com/compiling-to-categories/concat
|
||||
|
||||
```haskell
|
||||
ccc :: CartesianClosed k => (a -> b) -> a `k` b
|
||||
```
|
||||
|
||||
You can see that the signature is *very* similar to the one of `arr`.
|
||||
|
||||
A first issue is that `Recipe m` very much isn't *closed*. Another more
|
||||
substantial issue is that the GHC plugin is *very* experimental. I had a hard
|
||||
time running it on simple examples, and it is barely documented.
|
||||
|
||||
Does this mean all hope is lost? **NO**.
|
||||
When the plugin is run during compilation, every time it encounters this specific
|
||||
function it will convert the Haskell term (in GHC Core form) for the first
|
||||
argument (a function) into the corresponding Haskell term for the morphism in
|
||||
the target CCC.
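
To give an idea of what "defining instances for your own target category" can look like, here is a hand-written sketch. The `Cartesian` class below is a hypothetical, heavily simplified interface invented for this example; the classes exported by concat are richer and named differently, and nothing here uses the plugin. The point is only that one variable-free term (`swapC`) can be interpreted both as a plain function and as a textual diagram.

```haskell
-- Hypothetical minimal interface for a cartesian category
-- (an assumption for illustration, NOT concat's actual classes).
class Cartesian k where
  idC  :: k a a
  (|>) :: k a b -> k b c -> k a c        -- left-to-right composition
  exl  :: k (a, b) a                     -- first projection
  exr  :: k (a, b) b                     -- second projection
  fork :: k a b -> k a c -> k a (b, c)   -- pairing

-- Interpretation 1: ordinary Haskell functions.
instance Cartesian (->) where
  idC        = id
  f |> g     = g . f
  exl        = fst
  exr        = snd
  fork f g x = (f x, g x)

-- Interpretation 2: a textual description of the diagram.
newtype Diagram a b = Diagram String

instance Cartesian Diagram where
  idC                          = Diagram "id"
  Diagram f |> Diagram g       = Diagram (f ++ " ; " ++ g)
  exl                          = Diagram "exl"
  exr                          = Diagram "exr"
  fork (Diagram f) (Diagram g) = Diagram ("<" ++ f ++ ", " ++ g ++ ">")

render :: Diagram a b -> String
render (Diagram s) = s

-- What a translation of \(x, y) -> (y, x) could look like: no variables left.
swapC :: Cartesian k => k (a, b) (b, a)
swapC = fork exr exl

-- Composing two swaps, still without naming any variable.
swapTwice :: Cartesian k => k (a, b) (a, b)
swapTwice = swapC |> swapC
```

Used as a function, `swapC (1 :: Int, 'a')` evaluates to `('a', 1)`; used as a diagram, `render swapC` evaluates to `"<exr, exl>"` and `render swapTwice` to `"<exr, exl> ; <exr, exl>"`. The plugin's job is to produce terms like `swapC` automatically from the lambda you actually wrote.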
|
||||
|
||||
How neat. A reliable way to overload the lambda notation in Haskell.
|
||||
The paper is really, really worth a read, and contains many practical
|
||||
applications such as compiling functions into circuits or automatic
|
||||
differentiation.
|
||||
|
||||
## Compiling to monoidal cartesian categories
|
||||
|
||||
|
|