245 lines
8.1 KiB
Markdown
245 lines
8.1 KiB
Markdown
---
|
||
title: How achille works
|
||
---
|
||
|
||
### Caching
|
||
|
||
So far we haven't talked about caching and incremental builds.
|
||
Rest assured: **achille produces generators with robust incremental
|
||
builds** for free. To understand how this is done, we can simply look at the
|
||
definition of `Recipe m a b`:
|
||
|
||
```haskell
|
||
-- the cache is simply a lazy bytestring
|
||
type Cache = ByteString
|
||
|
||
newtype Recipe m a b = Recipe (Context a -> m (b, Cache))
|
||
```
|
||
|
||
In other words, when a recipe is run, it is provided a **context** containing
|
||
the input value, **a current cache** *local* to the recipe, and some more
|
||
information. The IO action is executed, and we update the local cache with the
|
||
new cache returned by the recipe. We say *local* because of how composition of
|
||
recipes is handled internally. When the *composition* of two recipes (made with
|
||
`>>=` or `>>`) is being run, we retrieve two bytestrings from the local cache
|
||
and feed them as local cache to both recipes respectively. Then we gather the two updated
|
||
caches, join them and make it the new cache of the composition.
|
||
|
||
This way, a recipe is guaranteed to receive the same local cache it returned
|
||
during the last run, *untouched by other recipes*. And every recipe is free to
|
||
dispose of this local cache however it wants.
|
||
|
||
As a friend noted, **achille** is "just a library for composing memoized
|
||
computations".
|
||
|
||
----
|
||
|
||
#### High-level interface
|
||
|
||
Because we do not want the user to carry the burden of updating the cache
|
||
manually, **achille** comes with many utilies for common operations, managing
|
||
the cache for us under the hood. Here is an exemple highlighting how we keep
|
||
fine-grained control over the cache at all times, while never having to
|
||
manipulate it directly.
|
||
|
||
Say you want to run a recipe for every file maching a glob pattern, *but do
|
||
not care about the output of the recipe*. A typical exemple would be to copy
|
||
every static asset of your site to the output directory. **achille** provides
|
||
the `match_` function for this very purpose:
|
||
|
||
```haskell
|
||
match_ :: Glob.Pattern -> Recipe FilePath b -> Recipe a ()
|
||
```
|
||
|
||
We would use it in this way:
|
||
|
||
```haskell
|
||
copyAssets :: Recipe a ()
|
||
copyAssets = match_ "assets/*" copyFile
|
||
|
||
main :: IO ()
|
||
main = achille copyAssets
|
||
```
|
||
|
||
Under the hood, `match_ p r` will cache every filepath for which the recipe was
|
||
run. During the next run, for every filepath matching the pattern, `match_ p r` will
|
||
lookup the path in its cache. If it is found and hasn't been modified since,
|
||
then we do nothing for this path. Otherwise, the task is run and the filepath
|
||
added to the cache.
|
||
|
||
Now assume we do care about the output of the recipe we want to run on every filepath.
|
||
For example if we compile every blogpost, we want to retrieve each blogpost's title and
|
||
the filepath of the compiled `.html` file. In that case, we can use the
|
||
built-in `match` function:
|
||
|
||
```haskell
|
||
match :: Binary b
|
||
=> Glob.Pattern -> Recipe FilePath b -> Recipe a [b]
|
||
```
|
||
|
||
Notice the difference here: we expect the type of the recipe output `b` to have
|
||
an instance of `Binary`, **so that we can encode it in the cache**. Fortunately,
|
||
many of the usual Haskell types have an instance available. Then we can do:
|
||
|
||
```haskell
|
||
data PostMeta = PostMeta { title :: Text }
|
||
renderPost :: Text -> Text -> Text
|
||
renderIndex :: [(Text, FilePath)] -> Text
|
||
|
||
buildPost :: Recipe FilePath (Text, FilePath)
|
||
buildPost = do
|
||
(PostMeta title, pandoc) <- compilePandocMeta
|
||
renderPost title pdc & saveAs (-<.> "html")
|
||
<&> (title,)
|
||
|
||
buildPost :: Recipe a [(Text, FilePath)]
|
||
buildPosts = match "posts/*.md" buildPost
|
||
|
||
buildIndex :: [(Text, FilePath)] -> Recipe
|
||
```
|
||
|
||
#### Shortcomings
|
||
|
||
The assertion *"A recipe will always receive the same cache between two runs"*
|
||
can only violated in the two following situations:
|
||
|
||
- There is **conditional branching in your recipes**, and more specifically,
|
||
**branching for which the branch taken can differ between runs**.
|
||
|
||
For example, it is **not** problematic to do branching on the extension of a file,
|
||
as the same path will be taken each execution.
|
||
|
||
But assuming you want to parametrize by some boolean value for whatever reason,
|
||
whose value you may change between runs, then because the two branches will
|
||
share the same cache, every time the boolean changes, the recipe will start
|
||
from an inconsistent cache so it will recompute from scratch, and overwrite
|
||
the existing cache.
|
||
|
||
```haskell
|
||
buildSection :: Bool -> Task IO ()
|
||
buildSection isProductionBuild =
|
||
if isProductionBuild then
|
||
someRecipe
|
||
else
|
||
someOtherRecipe
|
||
```
|
||
|
||
Although I expect few people ever do this kind of conditional branching for
|
||
generating a static site, **achille** still comes with combinators for branching.
|
||
You can use `if` in order to keep two separate caches for the two branches:
|
||
|
||
```haskell
|
||
if :: Bool -> Recipe m a b -> Recipe m a b -> Recipe m a b
|
||
```
|
||
|
||
The previous example becomes:
|
||
|
||
```haskell
|
||
buildSection :: Bool -> Task IO ()
|
||
buildSection isProductionBuild =
|
||
Achille.if isProductionBuild
|
||
someRecipe
|
||
someOtherRecipe
|
||
```
|
||
|
||
### No runtime failures
|
||
|
||
All the built-in cached recipes **achille** provides are implemented carefully
|
||
so that **they never fail in case of cache corruption**. That is, in the
|
||
eventuality of failing to retrieve the desired values from the cache, our
|
||
recipes will automatically recompute the result from the input, ignoring the
|
||
cache entirely. To make sure this is indeed what happens, every cached recipe
|
||
in **achille** has been tested carefully (not yet really, but it is on the todo
|
||
list).
|
||
|
||
This means the only failures possible are those related to poor content
|
||
formatting from the user part: missing frontmatter fields, watching files
|
||
that do not exist, etc. All of those are errors are gracefully reported to the
|
||
user.
|
||
|
||
### Parallelism
|
||
|
||
**achille** could very easily support parallelism for free, I just didn't take
|
||
the time to make it a reality.
|
||
|
||
## Recursive recipes
|
||
|
||
It is very easy to define recursive recipes in **achille**. This allows us to
|
||
traverse and build tree-like structures, such as wikis.
|
||
|
||
For example, given the following structure:
|
||
|
||
```bash
|
||
content
|
||
├── index.md
|
||
├── folder1
|
||
│ └── index.md
|
||
└── folder2
|
||
├── index.md
|
||
├── folder21
|
||
│ └── index.md
|
||
├── folder22
|
||
│ └── index.md
|
||
└── folder23
|
||
├── index.md
|
||
├── folder231
|
||
│ └── index.md
|
||
├── folder222
|
||
│ └── index.md
|
||
└── folder233
|
||
└── index.md
|
||
```
|
||
|
||
We can generate a site with the same structure and in which each index page has
|
||
links to its children:
|
||
|
||
```haskell
|
||
renderIndex :: PageMeta -> [(PageMeta, FilePath)] -> Text -> Html
|
||
|
||
buildIndex :: Recipe IO a (PageMeta, FilePath)
|
||
buildIndex = do
|
||
children <- walkDir
|
||
|
||
matchFile "index.*" do
|
||
(meta, text) <- compilePandoc
|
||
renderIndex meta children text >>= save (-<.> "html")
|
||
return $ (meta,) <$> getInput
|
||
|
||
walkDir :: Recipe IO a [(PageMeta, FilePath)]
|
||
walkDir = matchDir "*/" buildIndex
|
||
|
||
main :: IO ()
|
||
main = achille buildIndex
|
||
```
|
||
|
||
## Forcing the regeneration of output
|
||
|
||
Currently, **achille** doesn't track what files a recipe produces in the output
|
||
dir. This means you cannot ask for things like *"Please rebuild
|
||
output/index.html"*.
|
||
|
||
That's because we make the assumption that the output dir is untouched between
|
||
builds. The only reason I can think of for wanting to rebuild a specific page
|
||
is if the template used to generate it has changed.
|
||
But in that case, the template is *just another input*.
|
||
So you can treat it as such by putting it in your content directory and doing
|
||
the following:
|
||
|
||
```haskell
|
||
import Templates.Index (renderIndex)
|
||
|
||
buildIndex :: Task IO ()
|
||
buildIndex =
|
||
watchFile "Templates/Index.hs" $ match_ "index.*" do
|
||
compilePandoc <&> renderIndex >>= write "index.html"
|
||
```
|
||
|
||
This way, **achille** will automatically rebuild your index if the template has
|
||
changed!
|
||
|
||
While writing these lines, I realized it would be very easy for **achille**
|
||
to know which recipe produced which output file,
|
||
so I might just add that. Still, it would still require you to ask for an output
|
||
file to be rebuilt if a template has changed. With the above pattern, it is
|
||
handled automatically!
|