From 6c0d4ad54ca4a983e8ee2960878cd6bd6e5038f3 Mon Sep 17 00:00:00 2001
From: flupe
Date: Sat, 10 Dec 2022 01:30:23 +0100
Subject: [PATCH] more work on achille-smc draft

---
 content/posts/achille-smc.md | 288 ++++++++++++++++++++++++++++++++---
 1 file changed, 266 insertions(+), 22 deletions(-)

diff --git a/content/posts/achille-smc.md b/content/posts/achille-smc.md
index 30f637b..6c5f86e 100644
--- a/content/posts/achille-smc.md
+++ b/content/posts/achille-smc.md
@@ -114,9 +114,6 @@ between consecutive runs.
 
 ### The `Recipe m` abstraction
 
-I had my gripes with Hakyll, and was looking for a simpler, more general way to
-express build rules. I came up with the `Recipe` abstraction:
-
 ```haskell
 newtype Recipe m a b =
   Recipe { runRecipe :: Context -> Cache -> a -> m (b, Cache) }
 ```
@@ -129,12 +126,20 @@ The purpose is to *abstract over side effects* of build rules (such as
 producing HTML files on disk) and shift the attention to *intermediate values*
 that flow between build rules.
 
+...
+
+Visual noise
+
+...
+
 ### Caching
 
-In the definition of `Recipe`, a recipe takes some `Cache` as input, and
-returns another one after the computation is done. This cache is simply a *lazy
-bytestring*, and enables recipes to have some *persistent storage* between
-runs, that they can use in any way they desire.
+In the definition of `Recipe m a b`, a recipe takes some `Cache` as input, and
+returns another one after the computation is done.
+
+This cache --- for which I'm not gonna give a definition here --- enables
+recipes to have some *persistent storage* between runs, which they can use in
+any way they desire.
 
 The key insight is how composition of recipes is handled:
 
@@ -152,18 +157,16 @@ recipe. Once the computation is done, the resulting caches are put together
 into one again.
 
 This ensures that every recipe will be handed the same local cache
---- assuming the description of the generator does not change between runs. Of
-course this is only true when `Recipe m` is merely used as *selective*
-applicative functor, though I doubt you need more than that for writing a
-static site generator. It's not perfect, but I can say that this very simple model
+--- assuming the description of the generator does not change between runs.
+It's not perfect, but I can say that this very simple model
 for caching has proven to be surprisingly powerful.
 
-I have improved upon it since then, in order to make sure that
-composition is associative and to enable some computationally intensive recipes to
-become insensitive to code refactorings, but the core idea is left unchanged.
+...
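+
+To make the cache-splitting composition described above a little more
+concrete, here is a minimal sketch of what it can look like. This is *not*
+achille's actual implementation: `splitCache` and `joinCache` are
+hypothetical helpers standing in for the real cache plumbing, which I'm
+keeping abstract here.
+
+```haskell
+data Context -- site-wide build information, as before
+data Cache   -- left abstract in this post
+
+-- Hypothetical helpers, assumed to exist:
+splitCache :: Cache -> (Cache, Cache)
+splitCache = undefined
+
+joinCache :: Cache -> Cache -> Cache
+joinCache = undefined
+
+-- Recipe as defined earlier:
+newtype Recipe m a b =
+  Recipe { runRecipe :: Context -> Cache -> a -> m (b, Cache) }
+
+-- Sequential composition: each sub-recipe only ever sees its own slice
+-- of the cache, and the two updated slices are put back together once
+-- both recipes have run.
+compose :: Monad m => Recipe m a b -> Recipe m b c -> Recipe m a c
+compose f g = Recipe $ \ctx cache x -> do
+  let (cf, cg) = splitCache cache
+  (y, cf') <- runRecipe f ctx cf x
+  (z, cg') <- runRecipe g ctx cg y
+  pure (z, joinCache cf' cg')
+```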
 
 ### Incremental evaluation and dependency tracking
 
+...
+
 ### But there is a but
 
 We've now defined all the operations we could wish for in order to build,
@@ -190,6 +193,8 @@ really want is a way to assign intermediate results --- outputs of rules --- to
 *variables*, that then get used as inputs. Plain old Haskell variables. That
 is, I want to write my recipes as plain old *functions*.
 
+---
+
 And here is where my --- intermittent --- search for a readable syntax started,
 roughly two years ago.
@@ -264,6 +269,8 @@ No, even then, there is an even bigger problem:
 
 2. Because the second argument is *just a Haskell function*.
 
+...
+
 ## Arrows
 
 That's when I discovered Haskell's arrows. They're a generalization of monads,
@@ -337,15 +344,252 @@ The paper is really, really worth a read, and contains many practical
 applications such as compiling functions into circuits or automatic
 differentiation.
 
-## Compiling to monoidal cartesian categories
-
-Two days ago, I stumbled upon this paper by chance:.
-
-What they explain is that many interesting categories to compile to are in fact
-not closed.
-
-No GHC plugin required, just a tiny library with a few `class`es.
-
-There is one drawback: `Recipe m` *is* cartesian. That is, you can freely
-duplicate values. In their framework, they have you explicitely insert `dup` to
-duplicate a value. This is a bit annoying, but they have a good reason to do so:
+...
+
+---
+
+Another year goes by, without any solution in sight. And yet.
+
+## "Compiling" to (symmetric) monoidal categories
+
+A month ago, while browsing a Reddit thread on the sad state of `Arrow`,
+I stumbled upon an innocent link buried in the depths of the replies,
+to a paper by Jean-Philippe Bernardy and Arnaud Spiwack:
+["Evaluating Linear Functions to Symmetric Monoidal Categories"][smc].
+
+[smc]: https://arxiv.org/abs/2103.06195v2
+
+And boy oh boy, *what a paper*. I haven't been able to stop thinking about it
+since then.
+
+It starts with the following:
+
+> A number of domain specific languages, such as circuits or
+> data-science workflows, are best expressed as diagrams of
+> boxes connected by wires.
+
+Well yes indeed, what I want to express in my syntax are just plain old
+diagrams, made out of boxes and wires.
+
+> A faithful abstraction is Symmetric Monoidal Categories
+> (smcs), but, so far, it hasn’t been convenient to use.
+
+Again yes, I cannot agree more. This is the right abstraction, but so far a
+terrible interface for *describing* these diagrams. But then, the kicker, a
+bit later in the paper:
+
+> Indeed, every linear function can be interpreted in terms of an smc.
+
+What. This I had never heard. And indeed it makes sense: since in
+(non-cartesian) monoidal categories you cannot duplicate objects (that is,
+there is no morphism from `a` to `(a, a)`), it is natural to restrict our
+attention to functions that use their argument exactly *once*, and that
+*have* to use it (or pass it along by returning it). Note that here we are
+talking about *linear* functions in the sense of Linear Haskell --- the
+type-theory kind of "linear", not the linear-algebra kind.
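+
+As a quick illustration of what "linear" means here (this is plain Linear
+Haskell, nothing specific to the paper yet), the first function below is
+linear, while the second is rejected because it uses its argument twice:
+
+```haskell
+{-# LANGUAGE LinearTypes #-}
+
+-- Fine: the argument is consumed exactly once.
+swapPair :: (a, b) %1 -> (b, a)
+swapPair (x, y) = (y, x)
+
+-- Rejected by GHC: 'x' is used twice, so this function is not linear.
+-- dup :: a %1 -> (a, a)
+-- dup x = (x, x)
+```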
+
+So far so good. But then, they explain how to *evaluate* any such *linear*
+Haskell function into the right SMC, **without metaprogramming**. And the
+techniques they employ to do so are some of the smartest, most beautiful
+things I've seen. I cannot recommend enough that you go read the paper to
+learn the full details. It's *amazing*, and perhaps more approachable than
+Conal's paper. It is accompanied by the [linear-smc] library, which exposes
+a very simple interface:
+
+[linear-smc]: https://hackage.haskell.org/package/linear-smc
+
+- The module `Control.Category.Constrained` exports typeclasses to declare
+  your type family of choice `k :: * -> * -> *` (in the type-theory sense of
+  type family, not the Haskell sense) as the right kind of category, using
+  `Category`, `Monoidal` and `Cartesian`.
+
+  ```haskell
+  class Category k where
+    id  :: a `k` a
+    (∘) :: (b `k` c) -> (a `k` b) -> a `k` c
+
+  class Category k => Monoidal k where
+    (×)     :: (a `k` b) -> (c `k` d) -> (a ⊗ c) `k` (b ⊗ d)
+    swap    :: (a ⊗ b) `k` (b ⊗ a)
+    assoc   :: ((a ⊗ b) ⊗ c) `k` (a ⊗ (b ⊗ c))
+    assoc'  :: (a ⊗ (b ⊗ c)) `k` ((a ⊗ b) ⊗ c)
+    unitor  :: a `k` (a ⊗ ())
+    unitor' :: Obj k a => (a ⊗ ()) `k` a
+
+  class Monoidal k => Cartesian k where
+    exl :: (a ⊗ b) `k` a
+    exr :: (a ⊗ b) `k` b
+    dup :: a `k` (a ⊗ a)
+  ```
+
+  So far so good: nothing surprising, and we can confirm that we've already
+  defined (or can define) these operations for `Recipe m`, thus forming a
+  cartesian category.
+
+- But the truly incredible bit comes from `Control.Category.Linear`, which
+  provides the primitives to construct morphisms in a monoidal category
+  using linear functions.
+
+  - It exports an abstract type `P k r a`, which represents the "output of
+    an arrow/box of type `a` in the SMC `k`".
+  - A function to convert an SMC arrow into a linear function on *ports*:
+
+    ```haskell
+    encode :: (a `k` b) -> P k r a %1 -> P k r b
+    ```
+  - A function to convert a linear function on *ports* into an arrow in
+    your SMC:
+
+    ```haskell
+    decode :: Monoidal k => (forall r. P k r a %1 -> P k r b) -> a `k` b
+    ```
+
+  There are other primitives that we're gonna ignore here.
+
+Now there are at least two remarkable things about this interface:
+
+- By keeping the type of ports `P k r a` *abstract*, and making sure that
+  the exported functions that *produce* ports also take ports *as
+  arguments*, they are able to enforce that any linear function on ports
+  written by the user **has to be built from the operations of the
+  library**.
+
+  There is virtually no way to produce a port out of thin air, other than
+  with the export `unit :: P k r ()`, and because the definition of
+  `P k r a` is *not* exported, users have *no way* to retrieve a value of
+  type `a` from it. Therefore, ports can only be *carried around*, and
+  ultimately *given as input* to arrows in the SMC that have been converted
+  into linear functions with `encode`.
+
+  I have since been told this is a fairly typical method used by DSL
+  writers to ensure that end users only ever use the allowed operations and
+  nothing more. But it was a first for me, and truly some galaxy-brain
+  technique.
+
+- The second thing is this `r` parameter in `P k r a`. This type variable
+  isn't relevant to the information carried by the port. No, its true
+  purpose is *ensuring that linear functions given to `decode` are
+  **closed***.
+
+  Indeed, the previous point demonstrated that linear functions
+  `P k r a %1 -> P k r b` can only ever be defined in terms of *variables*
+  carrying ports, or linear functions on *ports*.
+
+  By quantifying over `r` in the first argument of `decode`, they prevent
+  the function from ever mentioning ports coming from *outside* of its own
+  definition. Indeed, all operations of the library use the same `r` for
+  inputs and outputs. So if a port of type `P k r a` that was bound
+  *outside* of a linear function were used in its definition, the function
+  would *have* to use that same `r` for every port it manipulates.
+  Crucially, it could then no longer be quantified over `r`, precisely
+  because this `r` was bound outside of its definition.
+
+  I have seen this technique once before, in `Control.Monad.ST.Safe`, and
+  it's so neat.
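+
+To give a feel for what writing against this interface looks like, here is
+a tiny sketch of my own (not an example from the paper): lifting an arrow so
+that it acts on the first component of a pair, written as a linear function
+on ports. I'm using `split` and `merge`, two of the pairing primitives I
+glossed over above, and I'm omitting the `Obj k` constraints that the real
+library threads through, so take the exact signatures with a grain of salt:
+
+```haskell
+{-# LANGUAGE LinearTypes, TypeOperators #-}
+
+import Control.Category.Constrained (Monoidal, type (⊗))
+import Control.Category.Linear (P, encode, decode, merge, split)
+
+-- Lift an arrow `f` onto the first wire of a pair of wires: take the
+-- incoming port apart, apply `f` to the first wire, and merge the
+-- results back into a single port.
+first' :: Monoidal k => (a `k` b) -> (a ⊗ c) `k` (b ⊗ c)
+first' f = decode (\xy -> case split xy of
+  (x, y) -> merge (encode f x, y))
+```
+
+The body of `first' f` reads like an ordinary Haskell function, yet `decode`
+turns it back into a bona fide arrow. And had I tried to smuggle in a port
+bound *outside* of the lambda, the lambda could no longer be quantified over
+`r`, and the type-checker would reject it.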
+
+Because of the last two points, [linear-smc] ensures that the functions the
+user gives to `decode` can always be translated back into arrows, simply
+because they must be *closed* and can *only use the allowed operations*.
+Incorrect functions are simply rejected by the type-checker with "readable"
+error messages.
+
+Even though the library does the translation at runtime, **it cannot fail**.
+
+[linear-smc] is readily available as a tiny, self-contained library on
+Hackage. Because it doesn't do any metaprogramming, whether through Template
+Haskell or GHC plugins, it is very robust, easy to maintain and safe to
+depend on.
+
+The only experimental feature being used is Linear Haskell, plus some
+constraint wizardry.
+
+All in all, this seems like a wonderful foundation to stand on.
+The library sadly doesn't have an associated GitHub page, and it seems like
+nobody has heard of this paper or this approach. At the time of writing, the
+library has only been downloaded `125` times, and I'm responsible for a
+large part of that. Please give it some love and look through the paper;
+you're missing out.
+
+---
+
+But now, let's look into how to apply this set of tools and go beyond.
+
+## Reaching the destination: compiling to cartesian categories
+
+...
+
+---
+
+And here is the final destination. We've finally been able to fully overload
+the Haskell lambda abstraction syntax, yet we are still able to track the
+use of variables in order to keep generators incremental.
+
+## Conclusion
+
+If anyone has made it this far, I would like to thank you for reading
+through this post in its entirety. I quite frankly have no clue whether this
+will be of use to anyone, but I've been thinking about this for so long and
+was so happy to reach a "simple" solution that I couldn't just keep it to
+myself.
+
+Now again, I am very thankful for Bernardy and Spiwack's paper and library.
+It is to my knowledge the cleanest way to do this kind of overloading. It
+truly opened my eyes and allowed me to go a bit further. I hope the
+techniques presented here can at least make a few people aware that these
+solutions exist and can be used *painlessly*.
+
+Now as for [achille], my pet project that was the motivation for this entire
+thing: it has now reached the level of usefulness --- and low friction ---
+that I was merely dreaming of only a few years ago. Being the only user to
+date, I am certainly biased, and would probably do a bad job of convincing
+anyone that they should use it, considering the number of tools already
+available.
+
+However, if you've been using [Hakyll] and are a tad frustrated by some of
+its limitations --- as I was --- I would be very happy if you could consider
+taking [achille] for a spin. It's new, it's small, I don't know if it's as
+efficient as it could be, but it is definitely made with love (and sweat).
+
+### Future work
+
+I now consider the syntax problem to be entirely solved, but there are
+always more features that I wish for.
+
+- I haven't implemented **parallelism** yet, because it wasn't in the first
+  version of [achille] and thus wasn't a priority. But as shown in this
+  article, it should *also* come for free. I first have to learn how Haskell
+  does concurrency, then just go and implement it.
+
+- Make `Recipe m a b` into a GADT. Right now, the result of the translation
+  from functions on ports to recipes is not inspectable, because I just get
+  an opaque Haskell function. I think it would be very useful to make
+  `Recipe m a b` a GADT, where in addition to the current constructor, we
+  have one constructor for each primitive operation (the operations of
+  cartesian categories).
+
+  This should make it possible to **produce an SVG image of the diagram
+  behind every generator** made with [achille], which I find pretty fucking
+  cool.
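+
+  A rough sketch of what this could look like (hypothetical, and not what
+  [achille] currently ships; `Context` and `Cache` are as defined at the
+  top of this post):
+
+  ```haskell
+  {-# LANGUAGE GADTs #-}
+
+  -- One constructor per cartesian-category primitive, plus an escape
+  -- hatch for opaque, effectful build steps.
+  data Recipe m a b where
+    Id   :: Recipe m a a
+    Comp :: Recipe m b c -> Recipe m a b -> Recipe m a c  -- (∘)
+    Pair :: Recipe m a b -> Recipe m c d
+         -> Recipe m (a, c) (b, d)                        -- (×)
+    Swap :: Recipe m (a, b) (b, a)
+    Exl  :: Recipe m (a, b) a
+    Exr  :: Recipe m (a, b) b
+    Dup  :: Recipe m a (a, a)
+    Prim :: String  -- a label, so the box can be drawn
+         -> (Context -> Cache -> a -> m (b, Cache))
+         -> Recipe m a b
+
+  -- Producing an SVG (or interpreting the recipe) is then a plain
+  -- traversal of this tree.
+  ```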
+
+- In some rare cases, if the source code of the generator has been modified
+  between two runs, it can happen that a build rule receives as its input
+  cache the old cache of a *different* recipe, which happens to contain
+  *exactly* the right kind of information.
+
+  I haven't witnessed this often, but for now the only way to restore proper
+  incrementality is to clean the cache fully and rebuild. A bit drastic if
+  your site is big or you have computationally expensive recipes. Now that
+  the diagram is completely static (compared to the previous version using
+  monads), I think it *should* be possible to let users give *names* to
+  specific recipes, so that:
+
+  - If we want to force the execution of a *specific* recipe, ignoring its
+    cache, we can do so by simply giving its name on the CLI.
+
+  - The cache of named recipes is stored *separately* from the tree-like
+    nesting of caches, so that these recipes become insensitive to
+    refactorings of the generator source code.
+
+  I would even go as far as saying that this would be easy to implement, but
+  those are famous last words.
+
+- Actually, we can go even further. Because the diagram is static, we can
+  compute a hash at every node of the diagram. Yes, a *Merkle tree*. Every
+  core recipe must be given a distinct hash (hand-picked by me, or by
+  implementors of other recipes). Then, by convention, every recipe appends
+  its own hash to its local cache. This should entirely solve the problem of
+  rerunning *from scratch* the recipes that have changed, and *only those*.
+  If any sub-recipe of an outer recipe *has changed*, then the hash won't
+  match, and therefore the outer recipe *has* to run again.
+
+  At what point do we consider things over-engineered? I think I've been
+  past that point for a few years already.
+
+Til next time!