Using Metaprogramming for Architecting Flow in Elixir
TL;DR You can adapt the ideas presented in the last article to any kind of application. Macros can be used to build a Plug-like DSL for your specific use-case. But be careful: Use metaprogramming wisely and only where it’s a good fit. Do not try to build the “Plug for everything”.
In the last article we explored a third concept, next to `|>` and `with`, to model how data flows through our program: the Token approach.
One question immediately comes to mind:
Can I do the same in my app and use this for my own use-case?
Well, of course you can. Let’s go down the rabbit hole.
Adapting Plug for Your Use-Case
Our use-case from the previous article is the conversion of images via a Mix task: all activities in the BPMN flow chart above (the green tasks) should be pluggable.
The major properties of Plug are:
- A Plug is a module (or function) that takes a `Plug.Conn` struct and returns a (modified) `Plug.Conn` struct (see the minimal example below).
- Each request is processed by a Plug pipeline, a series of plugs that get invoked one after another.
- The `Plug.Conn` struct contains all information received in the request and all information necessary to give a response to the request.
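For reference, here is what a minimal module Plug looks like in Plug itself (module and header names are made up; note that real module Plugs implement `init/1` and `call/2`, which our simplified Step behaviour below flattens into a single `call/1`):

defmodule MyHeaderPlug do
  import Plug.Conn

  # `init/1` prepares the options at compile time (a no-op in this sketch)
  def init(opts), do: opts

  # `call/2` takes a Plug.Conn and returns a (modified) Plug.Conn
  def call(conn, _opts) do
    put_resp_header(conn, "x-hello", "world")
  end
end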
We will call our “Plugs” simply “Steps”, because they represent steps in a business process (and because naming things is hard 😄).
A Step will be defined as a module implementing the `Step` behaviour.
defmodule Converter.Step do
  # Plug also supports functions as Plugs
  # we could do that, but for the sake of this article, we won't :)
  @type t :: module

  @callback call(token :: Converter.Token.t()) :: Converter.Token.t()

  defmacro __using__(_opts \\ []) do
    quote do
      @behaviour Converter.Step

      alias Converter.Token
    end
  end
end
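The `Converter.Token` struct itself comes from the previous article. As a reminder, a minimal sketch with the fields used throughout this article (all of them visible in the example output further down) could look like this:

defmodule Converter.Token do
  # the option-related fields are filled in by ParseOptions,
  # the rest by the later Steps
  defstruct argv: nil,
            glob: nil,
            target_dir: nil,
            format: nil,
            filenames: nil,
            errors: nil,
            halted: nil,
            results: nil

  @type t :: %__MODULE__{}
end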
Next, we refactor our first activity into a Step:
defmodule Converter.Step.ParseOptions do
  use Converter.Step

  @default_glob "./image_uploads/*"
  @default_target_dir "./tmp"
  @default_format "jpg"

  def call(%Token{argv: argv} = token) do
    {opts, args, _invalid} =
      OptionParser.parse(argv, switches: [target_dir: :string, format: :string])

    glob = List.first(args) || @default_glob
    target_dir = opts[:target_dir] || @default_target_dir
    format = opts[:format] || @default_format

    %Token{token | glob: glob, target_dir: target_dir, format: format}
  end
end
This Step module is already fully functional. We can call it like this:
argv = ["my_images_dir/", "--target_dir", "my_output_dir", "--format", "png"]
token = %Converter.Token{argv: argv}
Converter.Step.ParseOptions.call(token)
# => %Converter.Token{argv: ["my_images_dir/", "--target_dir", "my_output_dir", "--format", "png"], errors: nil, filenames: nil, format: "png", glob: "my_images_dir/", halted: nil, results: nil, target_dir: "my_output_dir"}
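A nice side effect of Steps being plain modules with a single function: they are trivially unit-testable. A sketch of an ExUnit test (the test module name is assumed):

defmodule Converter.Step.ParseOptionsTest do
  use ExUnit.Case, async: true

  alias Converter.{Step, Token}

  test "parses argv into glob, target_dir and format" do
    argv = ["my_images_dir/", "--target_dir", "my_output_dir", "--format", "png"]

    token = Step.ParseOptions.call(%Token{argv: argv})

    assert token.glob == "my_images_dir/"
    assert token.target_dir == "my_output_dir"
    assert token.format == "png"
  end

  test "falls back to the defaults when no arguments are given" do
    token = Step.ParseOptions.call(%Token{argv: []})

    assert %Token{glob: "./image_uploads/*", target_dir: "./tmp", format: "jpg"} = token
  end
end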
Next, we have to be able to define the equivalent of Plug pipelines, i.e. a way to plug several Step modules together and then be able to call them like we would call a “single” Step.
Ideally, the DSL would be as clean as Plug’s:
defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions
  step Converter.Step.PrepareConversion
  step Converter.Step.ConvertImages
  step Converter.Step.ReportResults
end
That does look pretty nice. But how do we get there?
The `use Converter.StepBuilder` part is where the metaprogramming starts:
defmodule Converter.StepBuilder do
  # this macro is invoked by `use Converter.StepBuilder`
  defmacro __using__(_opts \\ []) do
    quote do
      # we enable the module attribute `@steps` to accumulate all its values;
      # this means that the value of this attribute is not reset when set a
      # second or third time, but rather the new values are prepended
      Module.register_attribute(__MODULE__, :steps, accumulate: true)

      # register this module to be called before compiling the source
      @before_compile Converter.StepBuilder

      # import the `step/1` macro to build the pipeline
      import Converter.StepBuilder

      # implement the `Step` behaviour's callback
      def call(token) do
        # we defer this call to a function, which we will generate at compile
        # time; we can't generate this function (`call/1`) directly because we
        # would get a compiler error since the function would be missing when
        # the compiler checks run
        do_call(token)
      end
    end
  end

  # this macro gets used to register another Step with our pipeline
  defmacro step(module) do
    quote do
      # this is why we set the module attribute to `accumulate: true`:
      # all Step modules will be stored in this module attribute,
      # so we can read them back before compiling
      @steps unquote(module)
    end
  end

  # this macro is called after all macros were evaluated (e.g. the `use`
  # statement and all `step/1` calls), but before the source gets compiled
  defmacro __before_compile__(_env) do
    quote do
      # this quoted code gets inserted into the module containing
      # our `use Converter.StepBuilder` statement
      defp do_call(token) do
        # we read the @steps and hand them to another function for execution
        #
        # IMPORTANT: the reason for deferring again here is that we want to
        #            keep as little complexity as possible in our generated
        #            code, in order to minimize the implicitness in our code!
        steps = Enum.reverse(@steps)
        Converter.StepBuilder.call_steps(token, steps)
      end
    end
  end

  def call_steps(initial_token, steps) do
    # to implement the "handing down" of our token through the pipeline,
    # we utilize `Enum.reduce/3` and use the accumulator to store the token
    Enum.reduce(steps, initial_token, fn step, token ->
      step.call(token)
    end)
  end
end
That seems like a lot to take in. But in the end, it’s rather trivial:
- Each call to `step/1` adds another module to the `@steps` attribute.
- Right before compiling, we generate a `do_call/1` function, which reads the accumulated Step modules from this attribute.
- A third function, `call_steps/2`, is used to actually call all the Steps. We do this to minimize the work done in the generated parts of our code (the pipe sketch below shows what this amounts to).
- Also, please note how there is no reference to `Converter.Token` in our `StepBuilder`, and how it’s just ~40 lines of code. That’s pretty cool!
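For intuition: for the `MyProcess` pipeline above, the `call_steps/2` reduction does exactly what this hand-written pipe would do:

token
|> Converter.Step.ParseOptions.call()
|> Converter.Step.ValidateOptions.call()
|> Converter.Step.PrepareConversion.call()
|> Converter.Step.ConvertImages.call()
|> Converter.Step.ReportResults.call()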
Our Mix task now looks like this:
defmodule Mix.Tasks.ConvertImages do
  use Mix.Task

  alias Converter.MyProcess
  alias Converter.Token

  # `run/1` simply calls the pipeline
  def run(argv), do: MyProcess.call(%Token{argv: argv})
end
We could also define the pipeline directly in the Mix task, in order to have everything in one place:
defmodule Mix.Tasks.ConvertImages do
  use Converter.StepBuilder
  use Mix.Task

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions
  step Converter.Step.PrepareConversion
  step Converter.Step.ConvertImages
  step Converter.Step.ReportResults

  def run(argv), do: call(%Converter.Token{argv: argv})
end
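On the command line this task runs as `mix convert_images …`. Programmatically, e.g. from a test, the same task can be invoked via Mix’s standard `Mix.Task.run/2` (the argument values are taken from the argv example above):

Mix.Task.run("convert_images", [
  "my_images_dir/",
  "--target_dir", "my_output_dir",
  "--format", "png"
])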
I really like how this provides visibility into the “business process” that our code is concerned with. This piece of code can serve as an entrypoint for new contributors, since it is not only the runtime blueprint, but it also serves as documentation.
If you’ve read this far, take a deep breath. You’re about to take the red pill.
Advanced Metaprogramming for Complex Flows
In most cases, business processes are more complicated than our example.
Even the flow of our Mix task is less trivial than we made it out to be. This diagram completely ignores the fact that the flow has at least two different outcomes: an early exit, where the given arguments cannot be validated, and a happy path, where images are found and converted successfully.
If we remodel our process based on this insight, the result looks something like this:
In order to express this change in our `MyProcess` module, we have to be able to provide a filter condition to `step/1`, which expresses under which circumstances a Step module should be called:
defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions

  # we'll provide the conditions via a keyword
  step Converter.Step.PrepareConversion, if: token.errors == []
  step Converter.Step.ConvertImages, if: token.errors == []
  step Converter.Step.ReportResults, if: token.errors == []

  # `if:` is not something Elixir provides, we'll have to implement it ourselves
  # also, we could have named this any way we wanted, `if:` just seemed obvious
  step Converter.Step.ReportErrors, if: token.errors != []
end
With this, we can model the flow from the diagram.
Let’s see how this is done.
Compiling Steps as Case-Statements
To add the dynamic conditions provided via `if:`, we have to revise our approach from the beginning and rework our `__before_compile__/1` and `step/1` macros:
defmodule Converter.StepBuilder do
  defmacro __using__(_opts \\ []) do
    # this macro remains unchanged
    quote do
      Module.register_attribute(__MODULE__, :steps, accumulate: true)

      @before_compile Converter.StepBuilder

      import Converter.StepBuilder

      def call(token) do
        do_call(token)
      end
    end
  end

  defmacro step(module) do
    quote do
      # we are now using 2-element tuples to save the steps
      # (the second element will be used to store the given conditions)
      @steps {unquote(module), true}
    end
  end

  defmacro __before_compile__(env) do
    # read steps from `env` (they are in reverse order, like before)
    steps = Module.get_attribute(env.module, :steps)

    # we are compiling the body of our `do_call/1` as a quoted expression
    body = Converter.StepBuilder.compile(steps)

    quote do
      # unlike before, we do not call another function, but rather unquote
      # the body returned by `Converter.StepBuilder.compile/1`
      defp do_call(token) do
        unquote(body)
      end
    end
  end

  def compile(steps) do
    token = quote do: token

    # we use Enum.reduce/3 like before, but this time we are compiling all
    # the calls at compile-time into multiple nested case-statements
    Enum.reduce(steps, token, &compile_step/2)
  end

  defp compile_step({step, _conditions}, acc) do
    quoted_call =
      quote do
        unquote(step).call(token)
      end

    # this is where the magic happens: we generate a case-statement for
    # each call and nest them into each other
    quote do
      case unquote(quoted_call) do
        %Converter.Token{} = token ->
          # this is where all the previously compiled case-statements are
          # inserted, thereby "wrapping" them in this new case-statement
          unquote(acc)

        _ ->
          raise unquote("expected #{inspect(step)}.call/1 to return a Token")
      end
    end
  end
end
Okay, that was fast. Here’s how the “nested case-statements” technique works:
When we read the `@steps` attribute, we get the reversed list of all steps:
Converter.Step.ReportResults
Converter.Step.ConvertImages
Converter.Step.PrepareConversion
Converter.Step.ValidateOptions
Converter.Step.ParseOptions
Using `Enum.reduce/3`, we then start with a `token` …
# NOTE: the plus sign (+) isn't code; it marks the lines added in each iteration
+ | token
… then wrap a case-statement for the first step in our list around it …
+ | case Converter.Step.ReportResults.call(token) do
+ |   %Converter.Token{} = token ->
  |     token
  |
+ |   _ ->
+ |     raise("expected Converter.Step.ReportResults.call/1 to return a Token")
+ | end
… and with each iteration of the reducer, we wrap the previous block in a new case-statement for the current step in our list …
+ | case Converter.Step.ConvertImages.call(token) do
+ |   %Converter.Token{} = token ->
  |     case Converter.Step.ReportResults.call(token) do
  |       %Converter.Token{} = token ->
  |         token
  |
  |       _ ->
  |         raise("expected Converter.Step.ReportResults.call/1 to return a Token")
  |     end
  |
+ |   _ ->
+ |     raise("expected Converter.Step.ConvertImages.call/1 to return a Token")
+ | end
At the end we get a long list of nested case-statements representing our flow:
defp do_call(token) do
  case Converter.Step.ParseOptions.call(token) do
    %Converter.Token{} = token ->
      case Converter.Step.ValidateOptions.call(token) do
        %Converter.Token{} = token ->
          case Converter.Step.PrepareConversion.call(token) do
            %Converter.Token{} = token ->
              case Converter.Step.ConvertImages.call(token) do
                %Converter.Token{} = token ->
                  case Converter.Step.ReportResults.call(token) do
                    %Converter.Token{} = token ->
                      token

                    _ ->
                      raise("expected Converter.Step.ReportResults.call/1 to ...")
                  end

                _ ->
                  raise("expected Converter.Step.ConvertImages.call/1 to ...")
              end

            _ ->
              raise("expected Converter.Step.PrepareConversion.call/1 to ...")
          end

        _ ->
          raise("expected Converter.Step.ValidateOptions.call/1 to ...")
      end

    _ ->
      raise("expected Converter.Step.ParseOptions.call/1 to ...")
  end
end
That’s a lot to take in. But in the end, it’s not that complicated:
- We call a Step and check the result via a `case` macro.
- If there is an unexpected return value, we raise an exception.
- If not, we put the result into the next Step, and so on …
Think of it as a series of assignments …
result1 =
  case step1(token) do
    %Token{} = result1 ->
      result1

    _ ->
      raise "Step1 did not work!"
  end

result2 =
  case step2(result1) do
    %Token{} = result2 ->
      result2

    _ ->
      raise "Step2 did not work!"
  end

result3 =
  case step3(result2) do
    # and so on ...
  end
… only that we nest the case-statements instead of assigning them to variables.
case step1(token) do
  %Token{} = result1 ->
    case step2(result1) do
      %Token{} = result2 ->
        case step3(result2) do
          # and so on ...
        end

      _ ->
        raise "Step2 did not work!"
    end

  _ ->
    raise "Step1 did not work!"
end
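By the way, you never have to expand these nested case-statements in your head: the generated body is an ordinary quoted expression, so you can print it at compile time with `Macro.to_string/1`. A handy (temporary) debugging addition to our `__before_compile__/1`:

defmacro __before_compile__(env) do
  steps = Module.get_attribute(env.module, :steps)
  body = Converter.StepBuilder.compile(steps)

  # print the code that will be injected into the calling module
  body |> Macro.to_string() |> IO.puts()

  quote do
    defp do_call(token) do
      unquote(body)
    end
  end
end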
Adding Conditions to Steps
Next, we want to add our conditionals:
defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions
  step Converter.Step.PrepareConversion, if: token.errors == []
  step Converter.Step.ConvertImages, if: token.errors == []
  step Converter.Step.ReportResults, if: token.errors == []
  step Converter.Step.ReportErrors, if: token.errors != []
end
We achieve this in two places. First, we add a new `step/2` macro to our `StepBuilder`:
defmacro step(module, if: conditions) do
  quote do
    # the second element of the tuple stores the given conditions
    @steps {unquote(module), unquote(Macro.escape(conditions))}
  end
end
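The `Macro.escape/1` call is easy to overlook but important: `conditions` already is a quoted expression, and we want to store that expression as a value in the module attribute, not splice it in as code at this point. A rough illustration of the difference:

ast = quote do: 1 + 2

# unquoting the AST directly splices it back in as code:
quote do: unquote(ast)
# => the call `1 + 2` (as an AST), which would be executed where it lands

# escaping first yields an expression that *builds* the AST, which is
# what we need when storing it in @steps and reading it back later:
quote do: unquote(Macro.escape(ast))
# => a quoted representation of the tuple {:+, _, [1, 2]}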
Second, we update `compile_step/2` to include the given conditions:
defp compile_step({step, conditions}, acc) do
  quoted_call =
    quote do
      unquote(step).call(token)
    end

  quote do
    # instead of just calling the Step, we are compiling the given
    # conditions into the call
    result = unquote(compile_conditions(quoted_call, conditions))

    case result do
      %Converter.Token{} = token ->
        unquote(acc)

      _ ->
        raise unquote("expected #{inspect(step)}.call/1 to return a Token")
    end
  end
end

defp compile_conditions(quoted_call, true) do
  # if no conditions were given, we simply call the Step
  quoted_call
end

defp compile_conditions(quoted_call, conditions) do
  quote do
    # we have to use `var!/1` for our variable to be accessible
    # by the code inside `conditions`
    var!(token) = token

    # to avoid "unused variable" warnings, we assign the variable to `_`
    _ = var!(token)

    if unquote(conditions) do
      # if the given conditions are truthy, we call the Step
      unquote(quoted_call)
    else
      # otherwise, we just return the token
      token
    end
  end
end
This compiles the step and conditions into a block of code, which ensures access to the current `token` and tests the given conditions with an `if` statement.
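If `var!/1` is new to you: variables bound inside `quote` are hygienic, i.e. invisible to code spliced in from elsewhere, such as our `conditions`. `var!/1` deliberately breaks hygiene. A minimal, self-contained illustration (the module name is made up):

defmodule HygieneDemo do
  defmacro hygienic do
    # this `x` lives in the macro's own scope
    quote do: x = 1
  end

  defmacro unhygienic do
    # `var!` makes `x` resolve in the caller's scope
    quote do: var!(x) = 1
  end
end

# in iex:
#   require HygieneDemo
#   x = 0
#   HygieneDemo.hygienic()
#   x  # => 0, the caller's `x` is untouched
#   HygieneDemo.unhygienic()
#   x  # => 1, the caller's `x` was rebound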
Here’s an example for the last step, `ReportErrors`, which should only be invoked if `token.errors != []`:
result =
  (
    var!(token) = token
    _ = var!(token)

    if token.errors() != [] do
      Converter.Step.ReportErrors.call(token)
    else
      token
    end
  )

case result do
  %Converter.Token{} = token ->
    token

  _ ->
    raise("expected Converter.Step.ReportErrors.call/1 to return a Token")
end
These blocks of code are then nested into each other as explained before. The generated code might seem cumbersome, but since it is generated at compile-time, you do not have to actually read it.
Icing on the Cake: Adding `cond`-like Blocks to Steps
We are now able to model our improved flow diagram. But we’re not quite there yet.
We don’t want to write the same `if:` conditional for each individual step on a path. Ideally, we want to recognize the paths from the flow diagram in our code without having to compare conditions.
To achieve this, we will add a `cond`-like syntax to our `step/1` macro:
defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions

  step do
    token.errors == [] ->
      step Converter.Step.PrepareConversion
      step Converter.Step.ConvertImages
      step Converter.Step.ReportResults

    token.errors != [] ->
      step Converter.Step.ReportErrors
  end
end
We can implement this quite simply by appending the conditions given in a block’s head to every `step/1` call inside that block.
defmacro step(do: clauses) do
  Enum.reduce(clauses, nil, fn {:->, _, [[conditions], args]}, acc ->
    # we collect all calls inside the current `->` block ...
    quoted_calls =
      case args do
        {:__block__, _, quoted_calls} -> quoted_calls
        single_quoted_call -> [single_quoted_call]
      end

    # ... and add conditions where applicable
    quote do
      unquote(acc)
      unquote(add_conditions(quoted_calls, conditions))
    end
  end)
end

defp add_conditions(list, conditions) when is_list(list) do
  Enum.map(list, &add_conditions(&1, conditions))
end

# quoted calls to our `step/1` macro look like this:
#
#     {:step, _, [MyStepModule]}
#
# so all we have to do is append the `if:` condition
#
#     {:step, _, [MyStepModule, [if: conditions]]}
#
defp add_conditions({:step, meta, args}, conditions) do
  {:step, meta, args ++ [[if: conditions]]}
end

# if we encounter any other calls, we just leave them intact
defp add_conditions(ast, _conditions) do
  ast
end
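If the `{:->, _, [[conditions], args]}` pattern looks opaque: a `do` block made up of `->` clauses is quoted as a plain list of 3-element `:->` tuples, which is exactly what we reduce over. You can convince yourself in iex:

quote do
  x > 1 -> :big
  true -> :small
end

# => a list of `:->` tuples, roughly:
#
#     [
#       {:->, meta, [[quoted_x_greater_than_1], :big]},
#       {:->, meta, [[true], :small]}
#     ]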
What this does is simply rewrite this
step do
  token.errors == [] ->
    step Converter.Step.PrepareConversion
    step Converter.Step.ConvertImages
    step Converter.Step.ReportResults

  token.errors != [] ->
    step Converter.Step.ReportErrors
end
to this
step Converter.Step.PrepareConversion, if: token.errors == []
step Converter.Step.ConvertImages, if: token.errors == []
step Converter.Step.ReportResults, if: token.errors == []
step Converter.Step.ReportErrors, if: token.errors != []
I know what you’re thinking: This is freakin’ awesome!
But it is also freakin’ scary: We just wrote a macro that rewrites other macro calls, which in turn generate new code paths via AST manipulation, all at compile-time.
With great power comes great responsibility
At this point, you won’t be surprised to hear that Elixir’s metaprogramming facilities are sometimes referred to as “sane insanity”, because you can do these insane things, but at least only at compile-time.
What I want you to take away is this:
- Write macros responsibly
- Never use a macro where a simple function call works just as well.
- Avoid excessive use of metaprogramming as it tends to make things implicit, side effects less obvious and debugging a nightmare.
- Realize that it is more important to understand the principles behind the presented ideas than to build a magically generic solution.
- Do not attempt to build the “Plug for everything”.
Build a solution tailored to your specific problem, since this is the real strength of metaprogramming: you can build a great DSL that uses conditionals, case-statements, pattern matching and guards under the hood to abstract away the most common use-case of your domain.
Plug & Phoenix do this, and the Phoenix router is a great example of how to create a meaningful DSL for the most common use-case in a large domain!
Conclusion
Building your own DSL using metaprogramming can be super beneficial.
To use our example: Once you have a diagram like this …
… and the corresponding code actually looks like this …
defmodule Converter.MyProcess do
  use Converter.StepBuilder

  step Converter.Step.ParseOptions
  step Converter.Step.ValidateOptions

  step do
    token.errors == [] ->
      step Converter.Step.PrepareConversion
      step Converter.Step.ConvertImages
      step Converter.Step.ReportResults

    token.errors != [] ->
      step Converter.Step.ReportErrors
  end
end
… you will notice positive side effects (in addition to the satisfaction of writing Elixir):
- Onboarding new team members becomes easier, because there’s a clear entrypoint for new contributors.
- Discussing upcoming changes becomes more focused, because the team can quickly get a clear picture of what should happen.
- And last but not least, reasoning about what happens, together with colleagues, managers and clients, becomes less error-prone, because everybody is talking about the same thing.
The reason for this is that the flow of the program or request or data transformation is suddenly more visible, comprehensible and documented.