Build Automation Part 1
A blog post about Ant vs. Maven concludes that “the best build tool is the one you write yourself” and the Programmer Competency Matrix has “can setup a script to build the system” as a requirement for reaching the higher levels in the “build automation” row.
I have looked at a lot of build systems myself, and while I agree that the best build system is the one you create yourself I am also a big fan of make and believe that the best approach is to use generated Makefiles.
This post is a “getting started with make”. I plan to follow up with a part 2 about how to handle auto-generated self-updating Makefiles.
Concept
The UNIX philosophy is to have small tools (commands) which solve a well defined problem. These can then be combined to build more complex systems.
While each build process is different, the common denominator is that we should be able to represent our target(s) as nodes in a directed acyclic graph where each node represents a file and each edge represents a dependency.
This is what a Makefile captures: it should be a declaration of the dependency graph, with an action per node that creates the node's file when that file is missing on disk or older than its dependencies, i.e. the nodes we can reach by following its (directed) edges.
By keeping the dependency information declarative we let make figure out which files are outdated and need to be rebuilt, and we give it the freedom to pick a rebuild strategy, which may include running jobs in parallel.
Example
To give an example let us look at the generate_keys script which is part of Sparkle and can generate a public and private key file.
The public key is extracted from the private key and the private key requires a DSA parameter file (we’ll ignore the -genkey flag to dsaparam).
So our (simple) graph looks like this:
pubkey → privkey → dsa_parameters
A Makefile “rule” is effectively one node in our graph and looks like:
«goal»: «dependencies»
	«action»
Here «goal» is the node itself, that is, the file it represents. The «dependencies» are the nodes it depends on and «action» is the command(s) to execute to generate/update the node/file (interpreted by the shell). Each action line must start with a tab character.
Using the generate_keys script as source our Makefile ends up like this:
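# The public key is extracted from the private key.
pubkey: privkey
	openssl dsa -in '$<' -pubout -out '$@'
# The private key needs the DSA parameter file (options first, as in the documented openssl synopsis).
privkey: dsa_parameters
	openssl gendsa -out '$@' '$<'
# No dependencies: only generated when the file is missing.
dsa_parameters:
	openssl dsaparam -out '$@' 2048 < /dev/urandom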
In the above I have used two variables. The variable $@ expands to the goal (i.e. the file we are generating) and $< expands to the first dependency.
If you save the above as Makefile and run make then it will generate 3 files: pubkey, privkey, and dsa_parameters. By default calling make without arguments will ensure the first goal in the Makefile is up to date. If you re-run make it should say:
make: `pubkey' is up to date.
You can also run make privkey to ensure (only) privkey is up to date (which then won’t extract the public key).
Intermediate Files
The above Makefile reproduces the script except that we are not removing the temporary dsa_parameters file after having generated the keys. We can fix this by making dsa_parameters a dependency of the fake .INTERMEDIATE goal by adding this line:
.INTERMEDIATE: dsa_parameters
If we now run make it will automatically remove the dsa_parameters file after it has been used.
We probably want to use our public key from C so let us add another goal (node) namely pubkey.h. This goal will create a C header from the pubkey file, so it will depend on it. This goal can be handled by adding the following rule:
pubkey.h: pubkey
	{ echo 'static char const* pubkey ='; \
	  sed < '$<' -e $$'s/.*/\t"&\\\\n"/'; \
	  echo ';'; } > '$@'
Perhaps not the nicest way to generate the pubkey.h file but what is nice about this is that whatever application needs to use this header can declare it as a dependency, and it will be generated when needed, including extracting the public key if not already done.
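As a minimal sketch of such a declaration, assume a hypothetical source file verify.c that includes pubkey.h (neither verify.c nor verify.o comes from Sparkle, and the cc invocation is just one plausible compile command). The rule would live in the same Makefile as the pubkey.h rule:

```makefile
# Hypothetical consumer of the generated header; verify.c and verify.o
# are made-up names used for illustration only.
verify.o: verify.c pubkey.h
	cc -c -o '$@' '$<'
```

Running make verify.o in a clean directory should then walk the whole chain: generate the DSA parameters, the private key, the public key and the header, and only then compile.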
Includes
To keep things modular we can save our Makefile as Makefile.keys and include it from our main Makefile using:
include Makefile.keys
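One thing to keep in mind is that included rules count when make picks the first goal. A minimal sketch of a main Makefile, where the goal name all is an assumption and not something Sparkle prescribes, might look like this:

```makefile
# Main Makefile (sketch). "all" is a hypothetical goal name.
# Declaring it before the include makes it the default goal; otherwise
# pubkey, the first goal in Makefile.keys, would become the default.
all: pubkey.h

include Makefile.keys
```

With this in place a plain make builds pubkey.h, while make privkey still works because the included rules are part of the same dependency graph.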
If we go back to the Sparkle distribution there is also a sign_update script which signs an update using the private key.
We can add this as another goal to our Makefile, e.g. using:
archive.sig: privkey archive.tbz
	openssl dgst -dss1 -sign privkey -out '$@' archive.tbz
Here the archive signature depends on both having a private key and an archive. The private key will be generated if it is not already there; for the archive we of course need to add another goal to create it. The archive goal will depend on our actual binary which will depend on its object files which will depend on the sources (where one source is likely going to depend on pubkey.h).
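That chain could be sketched roughly as follows. The names app, main.o and main.c are hypothetical, and the tar and cc commands are only one plausible way to package and compile:

```makefile
# Sketch only: app, main.o and main.c are made-up names.
# The archive depends on the binary ...
archive.tbz: app
	tar -cjf '$@' '$<'

# ... the binary depends on its object file ...
app: main.o
	cc -o '$@' '$<'

# ... and the object file depends on its source and the generated header.
main.o: main.c pubkey.h
	cc -c -o '$@' '$<'
```

Asking for archive.sig would then pull in everything from the DSA parameters to the compiled binary, in whatever order the graph allows.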
Phony Targets
In addition we probably want to add another goal to construct an RSS feed (or similar) which includes the archive signature and eventually we will want a deploy goal which will depend on the RSS feed and the archive. The action for this goal will likely be using scp to copy the files to the server and the goal itself will not be a file, i.e. when we run make deploy we do not expect an actual deploy file to be generated. While there is little harm in declaring a goal with actions that do not generate the file, if there actually is a deploy file which is newer than the dependencies of the deploy goal we risk getting:
make: `deploy' is up to date.
To avoid this we make the fake goal named .PHONY depend on deploy, similar to what we did with the .INTERMEDIATE goal:
.PHONY: deploy
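Put together, a deploy goal might look like the sketch below. The feed file name, the user, the host and the remote path are all placeholders invented for this example:

```makefile
# Sketch: feed.rss, user@example.com and the remote path are hypothetical.
deploy: feed.rss archive.tbz
	scp feed.rss archive.tbz user@example.com:/var/www/updates/

# deploy names an action rather than a file, so always run it on request.
.PHONY: deploy
```

Because deploy is phony, make deploy copies the files every time it is invoked, but the feed and the archive are still only rebuilt when they are out of date.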
Closing Words
This post is just a mild introduction to make. I have deliberately picked something that does not involve building C sources as the example to show that make is a versatile tool.
Whenever you have a set of actions that need to be run in a specific order then consider if a Makefile can capture the dependency graph.
When you do write a Makefile aim for having a rule only do one thing. For example imagine we are writing a manual and store each chapter as Markdown. Rather than do something like this:
chapter.html: header.html chapter.mdown footer.html
	{ cat header.html; \
	  Markdown.pl < chapter.mdown; \
	  cat footer.html; } > '$@'
We can instead do:
chapter.html: header.html cache/chapter.html footer.html
	cat > '$@' $^
cache/chapter.html: chapter.mdown
	Markdown.pl < '$<' > '$@'
The new $^ variable expands to all the dependencies.
There are a few reasons to favor this approach. In this concrete example we have the advantage of not needing to pipe all the chapters through Markdown.pl if we change the header or footer. But in general it just makes things more flexible, easier to re-use goals, faster to restart a failed build, it may improve the number of jobs that can run in parallel, etc.
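To make the benefit concrete, here is a sketch for a manual with two chapters. The chapter names intro and usage and the manual goal are assumptions for illustration, and the mkdir line is added on the assumption that the cache directory may not exist yet:

```makefile
# Sketch: "manual", "intro" and "usage" are hypothetical names.
manual: intro.html usage.html
.PHONY: manual

# Cheap step: wrap the cached chapter body in header and footer.
intro.html: header.html cache/intro.html footer.html
	cat > '$@' $^

usage.html: header.html cache/usage.html footer.html
	cat > '$@' $^

# Expensive step: only rerun when the Markdown source itself changes.
cache/intro.html: intro.mdown
	mkdir -p cache
	Markdown.pl < '$<' > '$@'

cache/usage.html: usage.mdown
	mkdir -p cache
	Markdown.pl < '$<' > '$@'
```

After editing header.html only the two cat actions need to run again, and since the two conversions do not depend on each other, something like make -j2 manual leaves make free to run them at the same time.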