typos + text
This commit is contained in:
parent
8d14f78c2c
commit
84ea2d597b
|
@ -4,26 +4,47 @@ date: 2022-12-10
|
|||
draft: false
|
||||
---
|
||||
|
||||
I've been dreaming about overloading the lambda abstraction in Haskell for a
|
||||
few years already. I am working on an EDSL with its very own notion of
|
||||
morphisms `k a b` from `a` to `b`. Thus I'm looking for a syntax that enables
|
||||
users to write said morphisms using just regular Haskell functions and
|
||||
the `\x -> ...` notation.
|
||||
About two years ago, I started working on a little embedded domain-specific
|
||||
language (EDSL) called [achille], using Haskell. Because this EDSL has its own
|
||||
notion of *morphisms* `Recipe m a b` from `a` to `b`, I was looking for a way to
|
||||
let users write such morphisms using regular Haskell functions and the syntax
|
||||
for lambda abstraction, `\x -> ...`.
|
||||
|
||||
Until 5 days ago I thought it was impossible.
|
||||
As recently as 5 days ago, I was still convinced that there was no way to do
|
||||
such a thing without requiring a lot of engineering effort, either through GHC
|
||||
compiler plugins or with Template Haskell --- two things I'd rather stay far
|
||||
away from.
|
||||
|
||||
- Haskell's `Arrow` abstraction and associated syntactic sugar are supposed to
|
||||
do just that, but it just doesn't work as intended. It's a
|
||||
Indeed,
|
||||
|
||||
[arrow]: https://hackage.haskell.org/package/base-4.17.0.0/docs/Control-Arrow.html
|
||||
[proc]: https://www.haskell.org/arrows/syntax.html
|
||||
|
||||
- [Haskell's `Arrow` abstraction][arrow] and the related [`proc` notation][proc] are
|
||||
supposed to enable just that, but they don't work as intended. It's a
|
||||
difficult thing to [implement properly](https://github.com/tomjaguarpaw/Arrows2/issues/1)
|
||||
and [if proposals to fix it](https://github.com/lexi-lambda/ghc-proposals/blob/constraint-based-arrow-notation/proposals/0000-constraint-based-arrow-notation.md) have been come up, nothing concrete
|
||||
came out of it (yet).
|
||||
and [if proposals to fix it](https://github.com/lexi-lambda/ghc-proposals/blob/constraint-based-arrow-notation/proposals/0000-constraint-based-arrow-notation.md) have been discussed at large, nothing made it
|
||||
upstream.
|
||||
|
||||
But even if the notation worked properly (and was convenient to use),
|
||||
many times your internal representation of morphisms `k a b` cannot possibly embed
|
||||
*all* Haskell functions `a -> b`. Therefore it's impossible to declare an
|
||||
instance of `Arrow k`.
|
||||
it is quite common to work with morphisms `k a b` that cannot possibly embed
|
||||
*all* Haskell functions `a -> b`, preventing us to use this notation as we
|
||||
would like, simply because we cannot define a suitable instance of `Arrow k`.
|
||||
|
||||
- Now there is indeed this wonderful paper from Conal Elliott: ["Compiling to Categories"][ccc].
|
||||
There are *solutions* to this, like the great [overloaded] package, that
|
||||
fixes the `proc` notation and crucially makes it available to many categories
|
||||
that are not `Arrow`s. This may very well be exactly what you --- the reader --- need to
|
||||
resolve this problem.
|
||||
|
||||
However, I want to avoid making my library rely on compiler plugins at
|
||||
all cost, and think the `proc` notation, while way better than raw `Arrow`
|
||||
combinators, is still too verbose and restrictive. *I want to write lambdas*,
|
||||
and *I want to apply my morphisms just like any other Haskell function*.
|
||||
So no luck.
|
||||
|
||||
[overloaded]: https://hackage.haskell.org/package/overloaded-0.3.1
|
||||
|
||||
- There is also this wonderful paper from Conal Elliott: ["Compiling to Categories"][ccc].
|
||||
In it, he recalls that any function in the simply typed lambda calculus corresponds
|
||||
to an arrow in any cartesian closed category (CCC). What he demonstrated with the
|
||||
[concat] GHC plugin is that this well-known result can be used to lift
|
||||
|
@ -33,36 +54,38 @@ Until 5 days ago I thought it was impossible.
|
|||
unreliable. It's not packaged on Hackage so fairly difficult to install as a library,
|
||||
and looks hard to maintain precisely because it's a compiler plugin.
|
||||
|
||||
So probably not a sound dependency to carry around when trying to develop a
|
||||
[little, self-contained, portable library yourself](https://github.com/flupe/achille).
|
||||
|
||||
And again, not every category is cartesian-closed, especially if exponentials
|
||||
*have to be plain Haskell functions* --- which [concat] enforces.
|
||||
|
||||
[ccc]: http://conal.net/papers/compiling-to-categories/
|
||||
[concat]: https://github.com/compiling-to-categories/concat
|
||||
|
||||
**However** --- I'm happy to report that **"overloading" the lambda abstraction is
|
||||
**However**, 5 days ago I stumbled upon some paper by sheer luck.
|
||||
And well, I'm happy to report that **"overloading" the lambda abstraction is
|
||||
*completely doable***. Not only that: it is actually *very easy* and
|
||||
doesn't require any advanced Haskell feature, nor any kind of metaprogramming.
|
||||
No. library. needed. It really is the *perfect* solution.
|
||||
No. library. needed.
|
||||
|
||||
I cannot emphasize enough how powerful the approach presented here looks to me:
|
||||
I cannot emphasize enough how powerful the approach appears to be:
|
||||
|
||||
- You can define your very own `proc` notation for the category of your choice.
|
||||
- You can define your very own `proc` notation for *any* category you desire.
|
||||
|
||||
- Your category `k a b` only has to define operations `(&&&)`, `(.)`, and `id`.
|
||||
In particular, you can define this notation even for things that cannot
|
||||
provide an `Arrow` instance because of the `arr` function.
|
||||
*You don't have to be able to embed pure Haskell functions in your category*.
|
||||
- That is, as soon as you can provide an instance for `Category k`,
|
||||
you can *already* overload the lambda abstraction to write your morphisms.
|
||||
Even if your category isn't an instance of `Arrow` because of this `arr` function.
|
||||
*You do not have to be able to embed pure Haskell functions in your category*.
|
||||
|
||||
- And finally, this syntax is way more usable than the `proc` notation,
|
||||
because you can manipulate morphims `k a b` of your category as *plain old
|
||||
Haskell functions*, and can therefore compose them and apply them to variables
|
||||
just as any other regular Haskell function, with the same syntax.
|
||||
- The resulting syntax is way more intuitive than the `proc` notation,
|
||||
simply because you can manipulate morphims `k a b` of your category as *plain old
|
||||
Haskell functions*, and therefore compose them and apply them to variables
|
||||
just as any other Haskell function, using the same syntax.
|
||||
|
||||
- Implementing the primitives for overloading the lambda abstraction is a matter of about
|
||||
10 lines of Haskell.
|
||||
|
||||
It's in fact so simple that I suspect it must already be documented *somewhere*,
|
||||
but for the life of me I couldn't find anything. So here you go.
|
||||
but for the life of me I couldn't find anything. So here you go. I think this
|
||||
will be very useful for EDSL designers out here.
|
||||
|
||||
## A toy DSL for flow diagrams
|
||||
|
||||
|
@ -94,16 +117,19 @@ data Flow a b where
|
|||
Embed :: (a -> b) -> Flow a b
|
||||
```
|
||||
|
||||
I think there are enough constructors here to claim it a cartesian category, but
|
||||
I didn't double check and it doesn't really matter.
|
||||
There are probably enough constructors here to claim it is a cartesian category,
|
||||
but I didn't double check and it doesn't really matter here.
|
||||
|
||||
Now I think we can agree that although this is the right abstraction to
|
||||
I think we can agree that although this may very well be the right abstraction to
|
||||
internally transform and reason about diagrams, what an **awful**, **awful** way
|
||||
to write them. Even if we provided a bunch of weird infix operators for some of the
|
||||
constructors, only a few Haskellers would be able to make sense of this
|
||||
constructors, only few Haskellers would be able to make sense of this
|
||||
gibberish.
|
||||
|
||||
No, what matters is that we *can* give an instance for both `Category Flow` and `Arrow Flow`:
|
||||
What *I* want is a way to write these diagrams down using lambda abstractions
|
||||
and variable bindings, that get translated to this internal representation.
|
||||
|
||||
We *can* give an instance for both `Category Flow` and `Arrow Flow`:
|
||||
|
||||
```haskell
|
||||
import Control.Category
|
||||
|
@ -120,8 +146,9 @@ instance Arrow Flow where
|
|||
```
|
||||
|
||||
We can even give a custom implementation for every `Arrow` operation.
|
||||
And yet, disappointment ensues when we attempt to use the `proc` notation:
|
||||
Haskell's desugarer for the `Arrow` syntactic sugar is *really dumb*.
|
||||
And yet, we're left with nothing more than disappointment when we attempt
|
||||
to use the `proc` notation:
|
||||
Haskell's desugarer for the `proc` notation is *really dumb*.
|
||||
|
||||
```haskell
|
||||
{-# LANGUAGE Arrows #-}
|
||||
|
@ -139,8 +166,13 @@ Seq (Embed(_)) (Seq (Embed(_)) (Embed(_)))
|
|||
We just wrote `swap`. But Haskell *doesn't even try* to use the operations we've
|
||||
just defined in `Arrow Flow`, and just lifts pure Haskell functions using `arr`.
|
||||
The terms you get are **not inspectable**, they are black boxes. Not. good.
|
||||
Granted, morphisms `Fst` and `Snd` are not part of the interface of `Arrow`,
|
||||
so this example is a bit unfair. But if you've used the `proc`
|
||||
notation at all, you know this isn't an isolated incident, and `arr` eventually
|
||||
always shows up.
|
||||
|
||||
What I'm gonna show you is how to get the following, *amazing* syntax:
|
||||
What I'm gonna show you is how to enable the following, *straight-to-the-point*
|
||||
syntax:
|
||||
|
||||
```haskell
|
||||
t :: Flow (a, b) (b, a)
|
||||
|
@ -155,7 +187,7 @@ t = Seq Dup (Par (Seq Id Snd) (Seq Id Fst))
|
|||
|
||||
How beautiful! *Sure*, it's not the most *optimal* representation, and a traversal
|
||||
over the term could simplify it by removing `Seq Id`, but at the very
|
||||
least it's *inspectable*.
|
||||
least it's fully *inspectable*.
|
||||
|
||||
## The solution
|
||||
|
||||
|
@ -165,9 +197,9 @@ Bernardy and Arnaud Spiwack:
|
|||
|
||||
In this paper, they explain that if CCCs are models of the simply typed lambda
|
||||
calculus, it is "well-known" that SMCs are models of the *linear* simply-typed
|
||||
lambda calculus. And thus, they explain how they are able to *evaluate* linear
|
||||
Haskell functions (as in, Linear Haskell linear functions) into arrows in any
|
||||
target SMC.
|
||||
lambda calculus. And thus, they show how they are able to *evaluate* linear
|
||||
functions (as in, Linear Haskell linear functions) into arrows in any
|
||||
target SMC of your choice.
|
||||
|
||||
They even released a library for that: [linear-smc]. I implore you to go and
|
||||
take a look at both the paper and the library, it's very *very* smart. Sadly, it
|
||||
|
@ -176,20 +208,20 @@ seems to have gone mostly unnoticed.
|
|||
[smc]: https://arxiv.org/abs/2103.06195v2
|
||||
[linear-smc]: https://hackage.haskell.org/package/linear-smc-1.0.1
|
||||
|
||||
I found out about this paper 5 days ago. Because my target category is
|
||||
actually cartesian (ergo, I can duplicate values), I suspected that I could
|
||||
*This* paper was the tipping point. Because my target category is
|
||||
cartesian (ergo, I can duplicate values), I suspected that I could
|
||||
remove almost all the complex machinery they had to employ to
|
||||
go from a free cartesian category to a monoidal one. I was hopeful that I
|
||||
could do without Linear Haskell, too.
|
||||
go from a free cartesian category over a SMC to the underlying SMC.
|
||||
I was hopeful that I could ditch Linear Haskell, too.
|
||||
|
||||
And, relieved as I am, I can tell you that yes: not only can all of this be
|
||||
simplified (if you don't care about SMCs), but everything can be implemented in
|
||||
a handful lines of Haskell code.
|
||||
simplified (if you don't care about SMCs or Linear Haskell), but everything can
|
||||
be implemented in a handful lines of Haskell code.
|
||||
|
||||
### Interface
|
||||
|
||||
So, the first thing we're gonna look at is the *interface* exposed to the users
|
||||
of your EDSL. These are the magic primitives that we will have to implement later.
|
||||
of your EDSL. These are the magic primitives that we will have to implement.
|
||||
|
||||
```haskell
|
||||
type Port r a
|
||||
|
@ -199,8 +231,8 @@ First there is this (abstract) type `Port r a`. It is meant to represent the
|
|||
**output** of a box (in our flow diagrams) carrying information of type `a`.
|
||||
|
||||
Because the definition of `Port r a` is *not* exported, there is crucially *no
|
||||
way* for the user to retrieve a value of type `a` from it. Therefore, to use this
|
||||
variable in any meaningful way, they can **only** use the operations
|
||||
way* for the user to retrieve a value of type `a` from it. Therefore, to use a
|
||||
"port variable" in any meaningful way, they can **only** use the operations
|
||||
on ports that *you* --- the library designer --- export.
|
||||
|
||||
And now, the two other necessary primitives:
|
||||
|
@ -223,28 +255,29 @@ decode :: (forall r. Port r a -> Port r b) -> Flow a b
|
|||
inputs and outputs, there is *no way* to use a port variable defined outside
|
||||
of a function over ports, and still have the function be quantified over this `r`.
|
||||
|
||||
The same kind of trick, I believe, is used in [Control.Monad.ST][st]
|
||||
The same kind of trick, I believe, is used in [Control.Monad.ST][st].
|
||||
This is really really neat.
|
||||
|
||||
[st]: https://hackage.haskell.org/package/base-4.17.0.0/docs/Control-Monad-ST.html
|
||||
|
||||
This is really really neat.
|
||||
---
|
||||
|
||||
And finally, because inputs of your diagrams are nested tuples, it's important
|
||||
to provide ways to combine ports to construct new ports carrying tuples.
|
||||
`encode` and `decode` can be defined for any `Category k`.
|
||||
But if `k` is *also* cartesian, we can provide the following additional primitives:
|
||||
|
||||
```haskell
|
||||
pair :: Port r a -> Port r b -> Port r (a, b)
|
||||
unit :: Port r ()
|
||||
```
|
||||
|
||||
And... that's all you need! Really? Really!
|
||||
And... that's all you need!
|
||||
|
||||
---
|
||||
|
||||
Now of course you're free to include in your API your own morphisms converted
|
||||
into functions over ports. This would allow you to fully hide from sight your
|
||||
internal representation of diagrams. And to do so, you only need to use the
|
||||
previous primitives:
|
||||
into functions over ports. This would allow you to fully hide from the world
|
||||
your internal representation of diagrams. And to do that, you only need to use
|
||||
the previous primitives:
|
||||
|
||||
```haskell
|
||||
flow = decode
|
||||
|
@ -271,10 +304,10 @@ box f = encode (Embed f)
|
|||
x >> y = snd (pair x y)
|
||||
```
|
||||
|
||||
You can see we use `PatternSynonyms` and `ViewPatterns` to define this `Tup`
|
||||
syntax that appeared earlier. Because we defined this `(>>)` operator to discard
|
||||
You can see I use `PatternSynonyms` and `ViewPatterns` to define this `Tup`
|
||||
syntax that appeared earlier. Because I defined this `(>>)` operator to discard
|
||||
its first port, we can use `QualifiedDo` to have an even nicer syntax for
|
||||
sequencing ports:
|
||||
putting all ports in parallel and discard all but the last one:
|
||||
|
||||
```haskell
|
||||
box1 :: Port r Text -> Port r (Int, Int)
|
||||
|
@ -290,7 +323,7 @@ diag = flow \(Tup x y) -> Flow.do
|
|||
box2 y
|
||||
```
|
||||
|
||||
Because we defined `(>>)` in such a way, `box1`, `box3` and `box4` *will appear*
|
||||
Because `(>>)` is defined in such a way, `box1`, `box3` and `box4` *will appear*
|
||||
in the resulting diagram even though their output gets discarded.
|
||||
Were we to define `x >> y = y`, they would simply disappear altogether.
|
||||
This technique gives a lot of control over how the translation should
|
||||
|
@ -299,13 +332,15 @@ be specified.
|
|||
Finally and most importantly, `box` is the **only** way to introduce pure
|
||||
Haskell functions `a -> b` into a black box in our diagram. For this reason, we
|
||||
can be sure that `Embed` will never show up if `box` isn't used explicitely by
|
||||
the user.
|
||||
the user. If your category has no embedding of Haskell functions,
|
||||
or does *not* export it, it's impossible to insert them, ever.
|
||||
|
||||
### Implementation
|
||||
|
||||
And now, the reveal: `Ports r a` are just morphisms from `r`
|
||||
to `a` in your category! *This* is the implementation of the primitives from
|
||||
earlier.
|
||||
And now, the big reveal... `Ports r a` are just morphisms from `r`
|
||||
to `a` in your category!
|
||||
|
||||
**This** is the implementation of the primitives from earlier.
|
||||
|
||||
```haskell
|
||||
newtype Port r a = P { unPort:: Flow r a }
|
||||
|
@ -327,7 +362,7 @@ And indeed, if you consider `Port r a` to represent paths from some input `r`
|
|||
to output `a` in a diagram, then by squinting your eyes a bit the *meaning* (in
|
||||
your category) of this interface makes sense.
|
||||
|
||||
- For `encode`, surely if you now how to go from `a` to `b`, and from `r` to
|
||||
- For `encode`, surely if you know how to go from `a` to `b`, and from `r` to
|
||||
`a`, then you know how to go from `r` to `b`: that's just composition, hence
|
||||
`(.)`.
|
||||
|
||||
|
@ -337,11 +372,24 @@ your category) of this interface makes sense.
|
|||
know a path from `a` to `a`, it's the identity! This gets you a path from `a`
|
||||
to `b`.
|
||||
|
||||
It would appear `encode` and `decode` are just another instance of the Yoneda
|
||||
lemma, but I'm not a categorician so I cannot give any further explanation, my
|
||||
apologies.
|
||||
It would appear `encode` and `decode` being inverse of one another is
|
||||
yet another instance of the Yoneda lemma, but I'm not a categorician so
|
||||
I will not attempt to explain this any more than that, apologies.
|
||||
|
||||
Still, the sheer simplicity of this technique is mesmerizing to me.
|
||||
It kinda looks like doing CPS, but it *isn't* CPS. Quoting the paper:
|
||||
|
||||
> the encoding from `k a b` to `P k r a ⊸ P k r b` can be thought of as a
|
||||
> transformation to continuation-passing-style (cps), albeit reversed --- perhaps a
|
||||
> “prefix-passing-style” transformation.
|
||||
|
||||
Apparently it is *the dual of CPS* or something.
|
||||
|
||||
And you can double-check that in order to implement `encode` and `decode` you just
|
||||
need `id` and `(.)`. Now to allow more stuff than just composing
|
||||
morphisms as functions, more primitives can be added, like `pair` using `(&&&)`,
|
||||
and `unit` using `Void :: Flow a ()` --- which may or may not be available,
|
||||
dependending on the kind of categories you're working with.
|
||||
|
||||
---
|
||||
|
||||
|
@ -370,8 +418,8 @@ Embed :: (a -> IO b) -> Flow a b
|
|||
```
|
||||
|
||||
From then on, encoded flow diagrams are meant to be *executed* --- and even parallelised for
|
||||
free, how neat --- but I hope you see now why we should **never** duplicate any
|
||||
such `embed` box anymore. In the paper, that's why they argue it's
|
||||
free, how neat --- but I hope you see now why we should **never ever** duplicate any
|
||||
such `Embed` box anymore. In the paper, that's why they argue it's
|
||||
important to restrict yourself to monoidal categories and linear functions,
|
||||
and make `copy` an explicit operation in your syntax.
|
||||
|
||||
|
@ -414,7 +462,9 @@ diag = flow \src -> Flow.do
|
|||
You just have to stop using `let` and you're good to go. What's *remarkable*
|
||||
in the syntax is that these *effectful* boxes are used as plain old **functions**.
|
||||
Therefore there is *no need* to clutter the syntax with `<$>`, `>>=`, `pure`,
|
||||
you just compose and apply regular functions (that do side effects).
|
||||
`-<`, `returnA` and a bind for every single arrow like in the `proc` notation:
|
||||
you just compose and apply them like regular functions --- even when they are
|
||||
effectful.
|
||||
|
||||
#### Compile vs. Evaluate
|
||||
|
||||
|
@ -435,11 +485,31 @@ category's morphisms to be so straightforward, and I'm looking forward to see
|
|||
whether there is any fundamental limitation preventing us to do the same for
|
||||
CCCs.
|
||||
|
||||
#### Recursive morphisms and infinite structures
|
||||
|
||||
[Someone asked on Reddit][reddit] whether recursive morphisms can be
|
||||
written out using this syntax, *wihtout* making `decode` loop forever.
|
||||
I've thought about it for a bit, and I think precisely because this is not
|
||||
*CPS* and we construct morphisms `k a b` rather than Haskell functions,
|
||||
`decode` should not be a problem. It *should* produce recursive morphisms where
|
||||
recursive occurences are guarded by constructors. Because of laziness, looping
|
||||
is not a concern.
|
||||
|
||||
But I haven't tried this out yet. Since in [achille] recursive
|
||||
*recipes* can be quite useful, I want to support them, and so I will *definitely*
|
||||
investigate further.
|
||||
|
||||
[reddit]: https://www.reddit.com/r/haskell/comments/zi9mxp/comment/izqh96d/?utm_source=share&utm_medium=web2x&context=3
|
||||
|
||||
---
|
||||
|
||||
Thank you for reading! I hope this was or will be useful to some of you. I am
|
||||
also very grateful for Bernardy and Spiwack's fascinating paper and library, it
|
||||
quite literally made my week.
|
||||
|
||||
Feel free to go on [the r/haskell thread][thread] related to this post.
|
||||
|
||||
[thread]: https://www.reddit.com/r/haskell/comments/zi9mxp/overloading_the_lambda_abstraction_in_haskell/
|
||||
|
||||
Till next time, perhaps when I manage to release the next version of [achille],
|
||||
using this very trick that I've been searching for in the past 2 years.
|
||||
using this very trick that I had been desperately looking for in the past 2 years.
|
||||
|
|
Loading…
Reference in New Issue