flupe 2024-12-21 10:45:19 +01:00
draft: true
---
This year's [Advent of Code][AOC] has lots of 2D grids,
and makes you traverse them to find paths
of various kinds.
At some point I had to implement Dijkstra's algorithm, in Haskell.
In trying to make my implementation reusable for the following
days, I realized that Dijkstra's algorithm is actually way more general than I
remembered (or was taught)!
In short, *weights don't have to be real-valued*!

In this post, I describe a general interface for the algorithm,
such that we can implement it exactly once and use it to compute many
different things.

This article is a literate Haskell file, so feel free to [download it](/posts/haskell-dijkstra.lhs.md) and try it
for yourself! As such, let's get a few imports and language extensions out of
the way:
```hs
{-# LANGUAGE BlockArguments #-}
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE DeriveAnyClass #-}

import Control.Monad (when, foldM)
import Control.Monad.ST (ST, runST)
import Data.Ix (Ix, inRange)
import Data.Array (Array, (!), listArray)
import Data.Array qualified as Array (bounds)
import Data.Array.MArray (newArray, freeze, readArray, writeArray)
import Data.Array.ST (STArray)
import Data.Function ((&))
import Data.Functor ((<&>))
import Data.Kind (Type)
import Data.Set (Set)
import Data.Set qualified as Set
import Data.Semigroup (Sum(Sum))
```
### The shortest path problem
It's been a while since I had to use any formalism to talk about graphs proper.
So I will be using the notations from the Cormen book that I just looked up for a refresher.
Consider a weighted directed graph $G = (V, E, w)$.

- $V$ denotes the set of vertices.
- $E \subseteq V \times V$ denotes the set of edges.
- Every edge $e \in E$ has an associated positive weight $w(e) \in \mathbb{R}$, $w(e) > 0$.
We call *path* a sequence of vertices such that there is an edge between every
pair of consecutive vertices in the sequence. If we denote $\text{paths}(a, b)$
$$w(p) = \sum_{i = 1}^k{w(v_{i - 1}, v_i)}
$$
*Shortest-paths problems* ask questions along the lines of:

- What is the minimum weight from $a$ to $b$ in $G$?
- What is one path from $a$ to $b$ with minimum weight?
- Can we find *all* such paths from $a$ to $b$?

If you interpret the weight as a physical distance, this amounts to finding the
shortest trip from one vertex to the other.
Dijkstra's algorithm is an infamous technique for solving the *single-source
shortest-paths problem*: finding a shortest path from a given source vertex $s$
to *every* other vertex $v \in V$. It's essentially a generalization of breadth-first
search to (positively) weighted graphs. And it's pretty fast!
### Dijkstra's algorithm
At a high level, the algorithm maintains a set $S$ of *settled* vertices, whose minimal
weight is already known, and a priority queue $Q$ of candidate vertices, ordered by the
best weight known so far. At each step, we pop the vertex $u$ with minimal tentative
weight off the queue, settle it, and *relax* its outgoing edges: for every edge
$(u, v)$, if going through $u$ improves the best known weight of $v$, we update $v$
accordingly in the queue. Because edge weights are positive, the weight of a vertex can
no longer improve once it has been popped, so every vertex is settled exactly once.
---
## Taking a step back, generalizing

One thing to notice in the problem statement from earlier is that weights have
very little to do with real numbers. In fact, they don't have to be scalars at all!
the algorithm merely requires:
- An absorbing element $\infty$ for $\oplus$, which should also be an upper
  bound of $W$.

If we summarize, it *looks* like $(W/\approx, \oplus, 0)$ should be a *monoid*,
*totally ordered* by $\leq$ and with absorbing element $\infty$.
I think this encompasses all the properties stated above, and nothing more,
but I haven't looked that deeply into the formalism and what
mathematicians usually call these things.
The restriction that edges must have non-negative weights can simply be reworded
as weights having to be strictly larger than the identity element.
$$\forall e \in E, w(e) > 0$$
Now we can freely redefine the weight of a path:

$$
w(p) = \bigoplus_{i = 1}^k{w(v_{i - 1}, v_i)}
$$
Equipped with this toolkit, we can state the single-source shortest-path
problem again: for a given source vertex $s \in V$, how do we compute the smallest
weight achievable on a path from $s$ to any other vertex $v \in V$?
We can see that the pseudo-code for Dijkstra's algorithm remains almost
identical:
1. $S = \emptyset$
2. $Q = \emptyset$
3. **For** each vertex $u \in V$:
   - Insert $u$ into $Q$
4. **While** $Q \neq \emptyset$:
- $u = \text{Extract-Min}(Q)$
- $S = S \cup \{u\}$
   - **For** each vertex $v$ such that $(u, v) \in E$:
     *relax* the edge, i.e. if the weight of $u$ combined with $w(u, v)$ is
     smaller than the current weight of $v$, update the weight and priority
     of $v$ in $Q$.
Some interesting remarks:
- $(\cdot \oplus \cdot)$ does *not* have to be commutative!
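As a small aside, here is a standalone sketch of such a non-commutative weight (the `Trace` type and its fields are my own illustration, not from this post): the combining operation concatenates labels, so `a <> b` and `b <> a` differ as values, yet equivalence and ordering only look at the accumulated length.

```hs
-- Illustrative non-commutative weight: a total length together with the
-- sequence of labels traversed. `Trace` is a made-up name for this aside.
data Trace = Trace Int String
  deriving Show

instance Semigroup Trace where
  Trace n xs <> Trace m ys = Trace (n + m) (xs ++ ys)  -- not commutative!

-- equivalence and order ignore the labels: only the length matters
instance Eq Trace where
  Trace n _ == Trace m _ = n == m

instance Ord Trace where
  compare (Trace n _) (Trace m _) = compare n m
```

Two traces with the same length but different labels land in the same equivalence class, which is exactly the quotient $W/\approx$ described above.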
---
## Abstract Haskell interface and implementation
Now that we've figured out the building blocks that are required for the algorithm
to work, let's write this down in Haskell!
### Weights
Given the requirements on weights we established earlier,
we can try to map each of them to their corresponding Haskell counterpart.

- Weights should have an equivalence relation: that's `Eq`.
- Weights should have a total order: that's `Ord`.
- Weights should have an associative combining operation:
  that's `Semigroup`.
Sadly we're not using Agda so we can't enforce the fact that the order relation
must be compatible with the semigroup operation from inside the language.
We'll just have to be careful when defining the instances.

A `Weight` should therefore have instances for all three classes above (and
`Ord` implies `Eq` in Haskell, somehow).
```hs
class (Semigroup a, Ord a) => Weight a where
  infty :: a

  updateWeight :: a -> a -> a
  updateWeight x = const x
```
`infty` is the absorbing element of $W$. As stated earlier, it must be an upper bound of $W$.

But what is this `updateWeight` operation here? It is used to *merge equivalent weights*.
Indeed, during the execution of the Dijkstra algorithm, in the relaxation phase,
we may find that the weight of going to $v$ by passing through $u$ is equal
to the weight we have already computed for $v$.
Because we haven't decreased the weight, we shouldn't update the priority of $v$ in the queue;
however, it might still be useful to *account* for the new paths through $u$.
That's what this function is for. The only requirement for `updateWeight` is that the output should be
in the same equivalence class as its (equivalent) inputs.

$$\forall w, w' \in W \text{ s.t. } w \approx w', \quad \texttt{updateWeight}(w, w') \approx w \approx w'$$

As a convention, the first argument is the already computed weight, and the second argument
is the newly discovered (equivalent) weight along the new path(s) through $u$.
The priority queue should then update the weight of $v$ to this new value.
It won't change the priority of $v$ in the queue, nor the order of traversal,
but the new information is now accounted for.

The default implementation for `updateWeight` discards the new weight entirely.
This is quite common, say, if we only want to find "a shortest path", and not every one of them.
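To make this concrete, here is a minimal sketch of an instance, restating the class so the snippet stands alone (`Dist`, `Fin` and `Inf` are illustrative names of my own, not part of the post):

```hs
-- Restated from above so this sketch is self-contained.
class (Semigroup a, Ord a) => Weight a where
  infty :: a
  updateWeight :: a -> a -> a
  updateWeight x = const x

-- Classic scalar weights, with an explicit infinity.
-- The derived Ord puts every Fin below Inf, as required.
data Dist = Fin Int | Inf
  deriving (Eq, Ord, Show)

instance Semigroup Dist where
  Fin x <> Fin y = Fin (x + y)
  _     <> _     = Inf  -- Inf is absorbing

instance Weight Dist where
  infty = Inf
```

With this instance, `updateWeight` falls back to its default and simply keeps the already known weight.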
### Graphs
Now that we know what weights are, we need to describe what kind of graphs are suitable for our Dijkstra algorithm.
```hs
data Dijkstra i c = Dijkstra
  { bounds    :: (i, i)
  , startCost :: i -> c
  , next      :: i -> c -> [(c, i)]
  }
```
So, let's expand a bit on the fields of the interface.

- `bounds` describes the lower and upper bound of $V$.
  This is just an implementation detail: I want to store intermediate
  weights in a *mutable* array during the traversal, for efficiency purposes.
  So I need to know the size of $V$.
  If you cannot reasonably enumerate all vertices, you can drop the `bounds` field
  and use a purely-functional persistent `Map` instead in the implementation.
- `startCost` returns the initial weight we use for a given start vertex.
It must *always* be an identity element of $W$, and a lower bound of $W$.
  $$\forall s \in V, w \in W, \quad \texttt{startCost}(s) \oplus w \approx w \oplus \texttt{startCost}(s) \approx w$$
  $$\forall s \in V, w \in W, \quad \texttt{startCost}(s) \leq w$$

  Concretely, this means that rather than have a single identity $0 \in W$, we have one for *every vertex*.
  By anti-symmetry of the ordering relation they are all equivalent anyway.
  This is very useful to store information about the starting vertex in the weight.
  Say, if we're computing paths, we initially store a 0-length path containing only the starting vertex.
- And finally, the bread and butter of the graph: a *transition* function `next`.
  For any vertex $u$ and its associated weight $w$, `next u w` returns the neighbours of $u$,
  along with the weights of the edges.
  As discussed earlier, edge weights must be strictly larger than $0$.
One may wonder why we take as input the weight of $u$, and indeed it *is* weird.
Most reasonable transition functions ignore it.
But this means you can define funky graphs where the weight of an edge depends
on the minimal weight to get there from a specific source.
I *think* it is perfectly fine w.r.t the assumptions of the Dijkstra algorithm,
though knowing exactly what kind of graph this corresponds to is a bit more tedious.
I show one such example where I rely on this input weight later on.
And here we have it! A description of graphs that can serve as input for
the Dijkstra algorithm to solve the single-source shortest-path problem.
Note that this interface is completely agnostic to how we encode our graphs,
so long as we can extract a transition function from this underlying representation.
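As an illustration, here is one hypothetical way to instantiate this record for a 10×10 grid where every step has weight 1. The grid dimensions and the choice of `Sum Int` as weight are my own, and the record is restated so the snippet stands alone:

```hs
import Data.Ix (inRange)
import Data.Semigroup (Sum (Sum))

-- Restated from above so this sketch is self-contained.
data Dijkstra i c = Dijkstra
  { bounds    :: (i, i)
  , startCost :: i -> c
  , next      :: i -> c -> [(c, i)]
  }

-- A 10x10 grid with unit-weight moves in the four cardinal directions.
grid :: Dijkstra (Int, Int) (Sum Int)
grid = Dijkstra
  { bounds    = ((0, 0), (9, 9))
  , startCost = const (Sum 0)  -- the identity of (<>) for Sum
  , next      = \(x, y) _w ->  -- this transition function ignores the weight
      [ (Sum 1, v)
      | v <- [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
      , inRange ((0, 0), (9, 9)) v
      ]
  }
```

A corner cell has two in-bounds neighbours, an interior cell four; the transition function never has to mention the graph's edge set explicitly.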
### Generic Dijkstra implementation

Finally. It is time. We can implement the Dijkstra algorithm.

We first need a priority queue, with the following interface:
```hs
type PQueue :: Type -> Type
newtype PQueue a = PQueue (Set a)

singletonQ :: a -> PQueue a
singletonQ = PQueue . Set.singleton

insertQ :: Ord a => a -> PQueue a -> PQueue a
insertQ x (PQueue s) = PQueue (Set.insert x s)

minView :: PQueue a -> Maybe (a, PQueue a)
minView (PQueue s) =
  case Set.minView s of
    Just (x, s') -> Just (x, PQueue s')
    Nothing -> Nothing
pattern EmptyQ <- (minView -> Nothing)
pattern (:<) x q <- (minView -> Just (x, q))
```
```{=html}
</details>
```
I haven't tried existing implementations available on Hackage yet,
I should get around to it at some point. It also looks like I may want
a priority *search* queue, so that I can really *update* the priority for a given key.
At last, the implementation for Dijkstra's algorithm:
```hs
dijkstra :: (Ix i, Weight c) => Dijkstra i c -> i -> Array i c
dijkstra (Dijkstra{..} :: Dijkstra i c) start = runST do
  costs :: STArray s i c <- newArray bounds infty
  let zero = startCost start
  writeArray costs start zero
  let queue = singletonQ (zero, start)
  aux costs queue
  freeze costs
  where
    aux :: forall s. STArray s i c -> PQueue (c, i) -> ST s ()
    aux costs EmptyQ = pure ()
    aux costs ((uWeight, u) :< queue) = do
      uWeight' <- readArray costs u
      -- because of how our pure PQueue works,
      -- we cannot really "update" the priority of an element in the queue.
      -- instead, we just insert it again, with a lower priority.
      -- so, if the weight just popped off the queue is larger than the one
      -- already known, we've already visited the node: skip it and carry on.
      if uWeight /= uWeight' then aux costs queue else
        let
          relaxNeighbour :: PQueue (c, i) -> (c, i) -> ST s (PQueue (c, i))
          relaxNeighbour !queue (uvWeight, v) = do
            let !vWeight = uWeight <> uvWeight
            vWeight' <- readArray costs v
            case vWeight `compare` vWeight' of
              GT -> pure queue -- going through u yields a higher weight for v
              EQ -> do         -- same weight, we merge them
                writeArray costs v $ updateWeight vWeight' vWeight
                pure queue
              LT -> do         -- going through u decreases the weight of v
                writeArray costs v vWeight
                pure $ insertQ (vWeight, v) queue
        in aux costs =<< foldM relaxNeighbour queue (next u uWeight)
```
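To sanity-check the algorithm without the mutable-array machinery, here is an independent, condensed rendition of the same idea (my own simplification, not the implementation above): a persistent `Map` replaces the `STArray`, a `Set` plays the priority queue, and plain `Monoid`/`Ord` constraints stand in for `Weight`.

```hs
import Data.Map (Map)
import Data.Map qualified as Map
import Data.Set qualified as Set

-- Condensed, purely functional Dijkstra: `next` is the transition
-- function, `start` the source; returns the best known weight for
-- every reachable vertex. Stale queue entries are skipped on pop.
dijkstraMap :: (Ord i, Ord c, Monoid c)
            => (i -> c -> [(c, i)]) -> i -> Map i c
dijkstraMap next start =
    go (Map.singleton start mempty) (Set.singleton (mempty, start))
  where
    go dist q = case Set.minView q of
      Nothing -> dist
      Just ((du, u), q')
        | Map.lookup u dist /= Just du -> go dist q'  -- stale entry
        | otherwise ->
            let relax (d, s) (w, v) =
                  let dv = du <> w in
                  case Map.lookup v d of
                    Just old | old <= dv -> (d, s)  -- no improvement
                    _ -> (Map.insert v dv d, Set.insert (dv, v) s)
                (dist', q'') = foldl relax (dist, q') (next u du)
            in go dist' q''
```

On a line graph with unit weights, this recovers the expected distances from the source, which is a cheap way to test any further `Weight` instances as well.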
---
Given that we only want to find *a* shortest path, we can put paths with the
same length in the same equivalence class, and compare paths only by looking
at their length.
```hs
instance Eq (Path i) where
  Path a _ == Path b _ = a == b
```
And... that's it!
## Closing thoughts
Here we are. I hope this weekend obsession of mine was interesting to someone.
It sure was quite surprising to me that an algorithm I was taught a while back
could be applied in a more general context quite easily.
Disclaimer: I have done little to no research about whether this generalization
has been discussed at large already. I did find a few research papers on routing
algorithms over networks that give more algebraic structure to weights. I don't think
they match one to one with what I describe here, because they seemed to be interested in
more general problems. And I haven't found anything targeted at a larger non-scientific audience.
But as always, if you have any feedback, or any additional insight on what's discussed here, please *reach out*!
Feel free to react on [reddit].
[reddit]: https://reddit.com
```{=html}
<!-- TODO: pre-compile katex to MathML only -->
<!-- <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.16.3/dist/katex.min.css" integrity="sha384-Juol1FqnotbkyZUT5Z7gUPjQ9gzlwCENvUZTpQBAPxtusdwFLRy382PSDx5UUJ4/" crossorigin="anonymous"> -->