new post on overloading lambda

2022-12-11 10:31:49 +01:00 · 2022-12-11 10:31:49 +01:00 · 8d14f78c2c
parent 6c0d4ad54c
commit 8d14f78c2c
1 changed files with 445 additions and 0 deletions
--- a/content/posts/overloading-lambda.md
+++ b/content/posts/overloading-lambda.md
@ -0,0 +1,445 @@
+---
+title: Overloading the lambda abstraction in Haskell
+date: 2022-12-10
+draft: false
+---
+
+I've been dreaming about overloading the lambda abstraction in Haskell for a
+few years already. I am working on an EDSL with its very own notion of
+morphisms `k a b` from `a` to `b`. Thus I'm looking for a syntax that enables
+users to write said morphisms using just regular Haskell functions and
+the `\x -> ...` notation.
+
+Until 5 days ago I thought it was impossible.
+
+- Haskell's `Arrow` abstraction and associated syntactic sugar are supposed to
+  do just that, but it just doesn't work as intended. It's a
+  difficult thing to [implement properly](https://github.com/tomjaguarpaw/Arrows2/issues/1)
+  and [if proposals to fix it](https://github.com/lexi-lambda/ghc-proposals/blob/constraint-based-arrow-notation/proposals/0000-constraint-based-arrow-notation.md) have been come up, nothing concrete
+  came out of it (yet).
+
+  But even if the notation worked properly (and was convenient to use),
+  many times your internal representation of morphisms `k a b` cannot possibly embed
+  *all* Haskell functions `a -> b`. Therefore it's impossible to declare an
+  instance of `Arrow k`.
+
+- Now there is indeed this wonderful paper from Conal Elliott: ["Compiling to Categories"][ccc].  
+  In it, he recalls that any function in the simply typed lambda calculus corresponds
+  to an arrow in any cartesian closed category (CCC). What he demonstrated with the
+  [concat] GHC plugin is that this well-known result can be used to lift
+  any monomorphic Haskell function into a Haskell arrow of any user-defined Haskell CCC.
+
+  This is very cool, but also an experimental research project. It's very unstable and
+  unreliable. It's not packaged on Hackage so fairly difficult to install as a library,
+  and looks hard to maintain precisely because it's a compiler plugin.
+
+  So probably not a sound dependency to carry around when trying to develop a
+  [little, self-contained, portable library yourself](https://github.com/flupe/achille).
+
+  And again, not every category is cartesian-closed, especially if exponentials
+  *have to be plain Haskell functions* --- which [concat] enforces.
+
+[ccc]: http://conal.net/papers/compiling-to-categories/
+[concat]: https://github.com/compiling-to-categories/concat
+
+**However** --- I'm happy to report that **"overloading" the lambda abstraction is
+*completely doable***. Not only that: it is actually *very easy* and
+doesn't require any advanced Haskell feature, nor any kind of metaprogramming.
+No. library. needed. It really is the *perfect* solution.
+
+I cannot emphasize enough how powerful the approach presented here looks to me:
+
+- You can define your very own `proc` notation for the category of your choice.
+
+- Your category `k a b` only has to define operations `(&&&)`, `(.)`, and `id`.
+  In particular, you can define this notation even for things that cannot
+  provide an `Arrow` instance because of the `arr` function.
+  *You don't have to be able to embed pure Haskell functions in your category*.
+
+- And finally, this syntax is way more usable than the `proc` notation,
+  because you can manipulate morphims `k a b` of your category as *plain old
+  Haskell functions*, and can therefore compose them and apply them to variables
+  just as any other regular Haskell function, with the same syntax.
+
+It's in fact so simple that I suspect it must already be documented *somewhere*,
+but for the life of me I couldn't find anything. So here you go.
+
+## A toy DSL for flow diagrams
+
+So let's say our EDSL is supposed to encode flow diagrams, with boxes and wires.
+Boxes have distinguished inputs and outputs, and wires flow from outputs of
+boxes to inputs of other boxes.
+
+We can encode *any* flow diagram using the following operations:
+
+```haskell
+{-# LANGUAGE GADTs #-}
+
+data Flow a b where
+  -- just a wire
+  Id :: Flow a a
+  -- putting two boxes one after the other
+  Seq :: Flow a b -> Flow b c -> Flow a c
+  -- putting two boxes one next to the other
+  Par :: Flow a b -> Flow c d -> Flow (a, c) (b, d)
+  -- box that duplicates its input
+  Dup :: Flow a (a, a)
+  -- box that gets rid of its input
+  Void :: Flow a ()
+  -- box that projects on first input
+  Fst :: Flow (a, b) a
+  -- box that projects on second input
+  Snd :: Flow (a, b) b
+  -- finally, we embed any pure function into a box
+  Embed :: (a -> b) -> Flow a b
+```
+
+I think there are enough constructors here to claim it a cartesian category, but
+I didn't double check and it doesn't really matter.
+
+Now I think we can agree that although this is the right abstraction to
+internally transform and reason about diagrams, what an **awful**, **awful** way
+to write them. Even if we provided a bunch of weird infix operators for some of the
+constructors, only a few Haskellers would be able to make sense of this
+gibberish.
+
+No, what matters is that we *can* give an instance for both `Category Flow` and `Arrow Flow`: 
+
+```haskell
+import Control.Category
+import Control.Arrow
+
+instance Category Flow where id = Id ; g . f = Seq f g
+
+instance Arrow Flow where
+  arr     = Embed
+  first f = Par f id
+  second  = Par id
+  (***)   = Par
+  f &&& g = Seq Dup (Par f g)
+```
+
+We can even give a custom implementation for every `Arrow` operation.
+And yet, disappointment ensues when we attempt to use the `proc` notation:
+Haskell's desugarer for the `Arrow` syntactic sugar is *really dumb*.
+
+```haskell
+{-# LANGUAGE Arrows #-}
+
+t :: Flow (a, b) (b, a)
+t = proc (x, y) -> returnA -< (y, x)
+```
+
+If you print the term, you get something like this:
+
+```haskell
+Seq (Embed(_)) (Seq (Embed(_)) (Embed(_)))
+```
+
+We just wrote `swap`. But Haskell *doesn't even try* to use the operations we've
+just defined in `Arrow Flow`, and just lifts pure Haskell functions using `arr`. 
+The terms you get are **not inspectable**, they are black boxes. Not. good.
+
+What I'm gonna show you is how to get the following, *amazing* syntax:
+
+```haskell
+t :: Flow (a, b) (b, a)
+t = flow \(Tup x y) -> Tup y x
+```
+
+That reduces to the following term:
+
+```haskell
+t = Seq Dup (Par (Seq Id Snd) (Seq Id Fst))
+```
+
+How beautiful! *Sure*, it's not the most *optimal* representation, and a traversal 
+over the term could simplify it by removing `Seq Id`, but at the very
+least it's *inspectable*.
+
+## The solution
+
+Now my solution stems from this truly *amazing* paper from Jean-Philippe
+Bernardy and Arnaud Spiwack:
+[Evaluating Linear Functions to Symmetric Monoidal Categories][smc] (SMCs).
+
+In this paper, they explain that if CCCs are models of the simply typed lambda
+calculus, it is "well-known" that SMCs are models of the *linear* simply-typed
+lambda calculus. And thus, they explain how they are able to *evaluate* linear
+Haskell functions (as in, Linear Haskell linear functions) into arrows in any
+target SMC.
+
+They even released a library for that: [linear-smc]. I implore you to go and
+take a look at both the paper and the library, it's very *very* smart. Sadly, it
+seems to have gone mostly unnoticed.
+
+[smc]: https://arxiv.org/abs/2103.06195v2
+[linear-smc]: https://hackage.haskell.org/package/linear-smc-1.0.1
+
+I found out about this paper 5 days ago. Because my target category is
+actually cartesian (ergo, I can duplicate values), I suspected that I could
+remove almost all the complex machinery they had to employ to
+go from a free cartesian category to a monoidal one. I was hopeful that I
+could do without Linear Haskell, too.
+
+And, relieved as I am, I can tell you that yes: not only can all of this be
+simplified (if you don't care about SMCs), but everything can be implemented in
+a handful lines of Haskell code.
+
+### Interface
+
+So, the first thing we're gonna look at is the *interface* exposed to the users
+of your EDSL. These are the magic primitives that we will have to implement later.
+
+```haskell
+type Port r a
+```
+
+First there is this (abstract) type `Port r a`. It is meant to represent the
+**output** of a box (in our flow diagrams) carrying information of type `a`.
+
+Because the definition of `Port r a` is *not* exported, there is crucially *no
+way* for the user to retrieve a value of type `a` from it. Therefore, to use this
+variable in any meaningful way, they can **only** use the operations
+on ports that *you* --- the library designer --- export.
+
+And now, the two other necessary primitives:
+
+```haskell
+encode :: Flow a b -> Port r a -> Port r b
+decode :: (forall r. Port r a -> Port r b) -> Flow a b
+```
+
+- `encode` transforms a morphism from your category --- here a diagram `Flow a b` --- 
+  into a Haskell functions *on ports*. By exporting this function, you enable
+  users to apply any morphism *in your category* to the "port variables" they
+  have at their disposal. And precisely *only* those operations.
+
+- `decode` does the *reverse* translation. It takes any Haskell function *on
+  ports*, and converts it back into a morphism in your category. Well, not *any*
+  function. Only functions that are *closed* with respect to port variables.
+  That's why there is this type variable `r` in `Port r a`. Because all the
+  operations on ports exported by this interface share the same `r` between all
+  inputs and outputs, there is *no way* to use a port variable defined outside
+  of a function over ports, and still have the function be quantified over this `r`.
+
+  The same kind of trick, I believe, is used in [Control.Monad.ST][st]
+
+  [st]: https://hackage.haskell.org/package/base-4.17.0.0/docs/Control-Monad-ST.html
+
+  This is really really neat.
+
+And finally, because inputs of your diagrams are nested tuples, it's important
+to provide ways to combine ports to construct new ports carrying tuples.
+
+```haskell
+pair :: Port r a -> Port r b -> Port r (a, b)
+unit :: Port r ()
+```
+
+And... that's all you need! Really? Really!
+
+---
+
+Now of course you're free to include in your API your own morphisms converted
+into functions over ports. This would allow you to fully hide from sight your
+internal representation of diagrams. And to do so, you only need to use the
+previous primitives:
+
+```haskell
+flow = decode
+
+fst :: Port r (a, b) -> Port r a
+fst = encode Fst
+
+snd :: Port r (a, b) -> Port r b
+snd = encode Snd
+
+split :: Port r (a, b) -> (Port r a, Port r b)
+split p = (fst p, snd p)
+
+pattern Tup x y <- (split -> (x, y))
+  where Tup x y = pair x y
+
+void :: Port r a -> Port r ()
+void = encode Void
+
+box :: (a -> b) -> Port r a -> Port r b
+box f = encode (Embed f)
+
+(>>) :: Port r a -> Port r b -> Port r b
+x >> y = snd (pair x y)
+```
+
+You can see we use `PatternSynonyms` and `ViewPatterns` to define this `Tup`
+syntax that appeared earlier. Because we defined this `(>>)` operator to discard
+its first port, we can use `QualifiedDo` to have an even nicer syntax for
+sequencing ports:
+
+```haskell
+box1 :: Port r Text -> Port r (Int, Int)
+box2 :: Port r Text -> Port r Bool
+box3 :: Port r Int -> Port r ()
+box4 :: Port r Int -> Port r ()
+
+diag :: Flow (Text, Text) Bool
+diag = flow \(Tup x y) -> Flow.do
+  let Tup a b = box1 x
+  box3 a
+  box4 b
+  box2 y
+```
+
+Because we defined `(>>)` in such a way, `box1`, `box3` and `box4` *will appear*
+in the resulting diagram even though their output gets discarded.
+Were we to define `x >> y = y`, they would simply disappear altogether.
+This technique gives a lot of control over how the translation should
+be specified.
+
+Finally and most importantly, `box` is the **only** way to introduce pure
+Haskell functions `a -> b` into a black box in our diagram. For this reason, we
+can be sure that `Embed` will never show up if `box` isn't used explicitely by
+the user.
+
+### Implementation
+
+And now, the reveal: `Ports r a` are just morphisms from `r`
+to `a` in your category! *This* is the implementation of the primitives from
+earlier.
+
+```haskell
+newtype Port r a = P { unPort:: Flow r a }
+
+encode :: Flow a b -> Port r a -> Port r b
+encode f (P x) = P (f . x)
+
+decode :: (forall r. Port r a -> Port r b) -> Flow a b
+decode f = unPort (f (P id))
+
+pair :: Port r a -> Port r b -> Port r (a, b)
+pair (P x) (P y) = P (x &&& y)
+
+unit :: Port r ()
+unit = P Void
+```
+
+And indeed, if you consider `Port r a` to represent paths from some input `r`
+to output `a` in a diagram, then by squinting your eyes a bit the *meaning* (in
+your category) of this interface makes sense.
+
+- For `encode`, surely if you now how to go from `a` to `b`, and from `r` to
+  `a`, then you know how to go from `r` to `b`: that's just composition, hence
+  `(.)`.
+
+- For `decode`, you have as assumption that from any point `r` in a diagram,
+  you know how to transform a path from `r` to `a` into a path from `r` to `b`.
+  In particular, because this holds for any `r`, this holds for `a`. And you
+  know a path from `a` to `a`, it's the identity! This gets you a path from `a`
+  to `b`.
+
+It would appear `encode` and `decode` are just another instance of the Yoneda
+lemma, but I'm not a categorician so I cannot give any further explanation, my
+apologies.
+
+Still, the sheer simplicity of this technique is mesmerizing to me.
+
+---
+
+### Some things to be wary of
+
+Ok, so now we've seen how to successfully "overload" the lambda abstraction.
+But there is one quirk that I think you should be aware of.
+
+#### Preventing (too much) duplication in the diagram
+
+If you look at the definition of `split p`, you can see that the input `p` gets
+duplicated in the two output ports. If both of them are used in a given diagram, `p`
+will appear at least *twice* in this diagram. Now in the abstract setting of
+category theory, duplication in your diagram doesn't matter because of the
+equivalences and laws that equate diagrams.
+
+However, we're trying to encode *useful* things. My original motivation for
+going to this deep end was modeling flow diagrams of *effectful computations*, for
+the new version of [achille]. You could imagine changing the `Embed` constructor
+of `Flow` to the following:
+
+[achille]: /projects/achille
+
+```haskell
+Embed :: (a -> IO b) -> Flow a b
+```
+
+From then on, encoded flow diagrams are meant to be *executed* --- and even parallelised for
+free, how neat --- but I hope you see now why we should **never** duplicate any
+such `embed` box anymore. In the paper, that's why they argue it's
+important to restrict yourself to monoidal categories and linear functions,
+and make `copy` an explicit operation in your syntax.
+
+```haskell
+copy :: Port r a %1 -> (Port r a, Port r a)
+```
+
+Indeed, the type system will prevent you to use a variable more than once
+without explicitly inserting `copy`. However, everything has to be made linear
+and this has huge implications on the user-friendliness of the overall syntax.
+I've also found error messages from Linear Haskell to be pretty uninformative
+--- and `let` bindings are *not* linear, **what??**.
+
+Instead, I came up with a solution I'm fairly happy with. There is a trick: we
+introduce a new primitive.
+
+```haskell
+(>>=) :: Port r a -> (Port r a -> Port r b) -> Port r b
+```
+
+Now of course, you could just implement it as `x >>= f = f x`, but that's
+precisely what we want to *avoid*. No, instead, I force the evaluation of `x`,
+and *then* evaluate `f` applied to a box that simply returns the pre-computed
+value, *side-effect free*. So a box gets duplicated, sure, but this box does
+*nothing*. What I am now noticing while writing this down, is that introducing
+this `Bind` operator makes the diagram non-inspectable again, oops. I can still
+run it of course, but this is a problem for future me.
+
+Anyway, the benefit of this `(>>=)` operator is that it's very convenient with
+`QualifiedDo`:
+
+```haskell
+diag :: Flow FilePath Bool
+diag = flow \src -> Flow.do
+  Tup a b <- box1 src
+  box3 a
+  box4 (box2 b)
+```
+
+You just have to stop using `let` and you're good to go. What's *remarkable*
+in the syntax is that these *effectful* boxes are used as plain old **functions**.
+Therefore there is *no need* to clutter the syntax with `<$>`, `>>=`, `pure`,
+you just compose and apply regular functions (that do side effects).
+
+#### Compile vs. Evaluate
+
+Now the paper is called "Evaluating [...]" and says there is a runtime cost to
+pay for the translation. Considering how simple *this* implementation is, and
+how the target is fully first-order (well, without `(>>=)`), I wouldn't be
+surprised if GHC is actually able to fully unfold and translate to category
+morphisms at compile-time. But I'm not an expert Haskell developer by any means
+and have no clue how one would go about checking this, so if anyone does, [please tell me][me]!
+
+[me]: mailto:lucas@escot.me
+
+#### Expressive power
+
+I'm very curious to see which kind of categories you can "compile" Haskell
+functions to using this technique. I didn't expect converting to cartesian
+category's morphisms to be so straightforward, and I'm looking forward to see
+whether there is any fundamental limitation preventing us to do the same for
+CCCs.
+
+---
+
+Thank you for reading! I hope this was or will be useful to some of you. I am
+also very grateful for Bernardy and Spiwack's fascinating paper and library, it
+quite literally made my week.
+
+Till next time, perhaps when I manage to release the next version of [achille],
+using this very trick that I've been searching for in the past 2 years.