Build Automation Part 1

A blog post about Ant vs. Maven concludes that “the best build tool is the one you write yourself” and the Programmer Competency Matrix has “can setup a script to build the system” as a requirement for reaching the higher levels in the “build automation” row.

I have looked at a lot of build systems myself, and while I agree that the best build system is the one you create yourself, I am also a big fan of make and believe that the best approach is to use generated Makefiles.

This post is a “getting started with make”. I plan to follow up with a part 2 about how to handle auto-generated self-updating Makefiles.

Concept

The UNIX philosophy is to have small tools (commands) which each solve a well-defined problem. These can then be combined to build more complex systems.

While each build process is different, the common denominator is that we should be able to represent our target(s) as nodes in a directed acyclic graph where each node represents a file and each edge represents a dependency.

This is what a Makefile captures: it declares the dependency graph and gives each node an action that creates the corresponding file when that file is missing or older than its dependencies, i.e. the nodes we can reach by following its (directed) edges.

By keeping the dependency information declarative, we let make figure out which files are outdated and need to be rebuilt, and we give it the freedom to pick a rebuild strategy, which may include running jobs in parallel.

Example

To give an example let us look at the generate_keys script which is part of Sparkle and can generate a public and private key file.

The public key is extracted from the private key and the private key requires a DSA parameter file (we’ll ignore the -genkey flag to dsaparam).

So our (simple) graph looks like this:

pubkey → privkey → dsa_parameters

A Makefile “rule” is effectively one node in our graph and looks like:

«goal»: «dependencies»
	«action»

Here «goal» is the node itself, that is, the file it represents. The «dependencies» are the nodes it depends on and «action» is the command(s) to execute to generate/update the node/file (interpreted by the shell).

Using the generate_keys script as source our Makefile ends up like this:

pubkey: privkey
	openssl dsa -in '$<' -pubout -out '$@'

privkey: dsa_parameters
	openssl gendsa '$<' -out '$@'

dsa_parameters:
	openssl dsaparam -out '$@' 2048 < /dev/urandom

In the above I have used two variables. The variable $@ expands to the goal (i.e. the file we are generating) and $< expands to the first dependency.

If you save the above as Makefile and run make then it will generate three files: pubkey, privkey, and dsa_parameters. By default, calling make without arguments will ensure the first goal in the Makefile is up to date. If you re-run make it should say:

make: `pubkey' is up to date.

You can also run make privkey to ensure (only) privkey is up to date (which then won’t extract the public key).
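
To make the two variables concrete, this is what the pubkey rule looks like with $< and $@ expanded by hand. It is only a sketch for comparison, not something to add to the Makefile:

```make
# Same as the pubkey rule above, with the variables written out:
# $< is the first dependency (privkey), $@ is the goal (pubkey).
pubkey: privkey
	openssl dsa -in 'privkey' -pubout -out 'pubkey'
```

The variables save us from repeating file names, so renaming a goal only requires touching the first line of its rule.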

Intermediate Files

The above Makefile reproduces the script except that we are not removing the temporary dsa_parameters file after having generated the keys. We can fix this by making dsa_parameters a dependency of the fake .INTERMEDIATE goal by adding this line:

.INTERMEDIATE: dsa_parameters

If we now run make it will automatically remove the dsa_parameters file after it has been used.

We probably want to use our public key from C, so let us add another goal (node), namely pubkey.h. This goal creates a C header from the pubkey file, so it depends on it. It can be handled by adding the following rule:

pubkey.h: pubkey
	{ echo 'static char const* pubkey ='; \
	  sed < '$<' -e $$'s/.*/\t"&\\\\n"/'; \
	  echo ';'; } > '$@'

Perhaps not the nicest way to generate the pubkey.h file (the $'...' quoting is a bash feature, so the rule assumes make runs its actions with a shell that supports it), but what is nice about this is that whatever application needs to use this header can declare it as a dependency, and it will be generated when needed, including extracting the public key if not already done.
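
As a minimal sketch of such a declaration, assume a hypothetical source file verify.c that contains #include "pubkey.h" (the file name is made up for illustration). Its object file then lists the header as a dependency:

```make
# Hypothetical: verify.c is assumed to #include "pubkey.h".
# Listing pubkey.h here makes make generate it (and pubkey,
# and privkey) before the compiler runs.
verify.o: verify.c pubkey.h
	$(CC) $(CFLAGS) -c -o '$@' '$<'
```

Running make verify.o on a clean checkout would then walk the whole chain down to dsa_parameters before compiling.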

Includes

To keep things modular we can save our Makefile as Makefile.keys and include it from our main Makefile using:

include Makefile.keys
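
One thing to keep in mind is that make picks the first goal it reads as the default, and goals from an included file count as well. A sketch of a main Makefile that keeps its own default goal could look like this (the all goal and its dependency are assumptions for the example):

```make
# Main Makefile (sketch). Declared before the include so that
# "all" rather than "pubkey" is what a plain "make" builds.
all: pubkey.h

# Pull in the key-related rules (pubkey, privkey, pubkey.h, ...).
include Makefile.keys
```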

If we go back to the Sparkle distribution there is also a sign_update script which signs an update using the private key.

We can add this as another goal to our Makefile, e.g. using:

archive.sig: privkey archive.tbz
	openssl dgst -dss1 -sign privkey archive.tbz > '$@'

Here the archive signature depends on both a private key and an archive. The private key will be generated if it is not already there; for the archive we of course need to add another goal. That goal will depend on our actual binary, which will depend on its object files, which in turn depend on the sources (one of which is likely going to depend on pubkey.h).
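
That chain could be sketched roughly as follows. The names myapp and main.c are placeholders for a real project, and the tar invocation is just one way to produce a .tbz file:

```make
# Hypothetical project: a single binary built from main.c,
# which is assumed to #include "pubkey.h".
archive.tbz: myapp
	tar -cjf '$@' '$<'

myapp: main.o
	$(CC) -o '$@' main.o

main.o: main.c pubkey.h
	$(CC) $(CFLAGS) -c -o '$@' '$<'
```

With these rules in place, make archive.sig has everything it needs to go from sources and a fresh key pair to a signed archive.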

Phony Targets

In addition we probably want to add another goal to construct an RSS feed (or similar) which includes the archive signature, and eventually we will want a deploy goal which depends on the RSS feed and the archive. The action for this goal will likely use scp to copy the files to the server, and the goal itself will not be a file, i.e. when we run make deploy we do not expect an actual deploy file to be generated. While there is little harm in declaring a goal whose actions do not generate the file, we do risk getting this message:

make: `deploy' is up to date.

That happens if there actually is a deploy file which is newer than the dependencies of the deploy goal. To avoid this we make the fake goal named .PHONY depend on deploy, similar to what we did with the .INTERMEDIATE goal:

.PHONY: deploy
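
The deploy rule itself might then look like the sketch below. The feed name appcast.xml, the user, the host, and the remote path are all placeholders, and the rule that generates the feed is left out:

```make
# Hypothetical feed file and server location; adjust to taste.
# No file named "deploy" is created, hence the .PHONY line above.
deploy: appcast.xml archive.tbz
	scp appcast.xml archive.tbz user@example.com:/var/www/updates/
```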

Closing Words

This post is just a gentle introduction to make. I have deliberately picked an example that does not involve building C sources to show that make is a versatile tool.

Whenever you have a set of actions that need to be run in a specific order then consider if a Makefile can capture the dependency graph.

When you do write a Makefile, aim to have each rule do only one thing. For example, imagine we are writing a manual and store each chapter as Markdown. Rather than do something like this:

chapter.html: header.html chapter.mdown footer.html
	{ cat header.html; \
	  Markdown.pl < chapter.mdown; \
	  cat footer.html; } > '$@'

We can instead do:

chapter.html: header.html cache/chapter.html footer.html
	cat > '$@' $^

cache/chapter.html: chapter.mdown
	Markdown.pl < '$<' > '$@'

The new $^ variable expands to all the dependencies.

There are a few reasons to favor this approach. In this concrete example we avoid piping every chapter through Markdown.pl again when only the header or footer changes. More generally, small rules make the Makefile more flexible: goals are easier to reuse, a failed build restarts faster, and more jobs may be able to run in parallel.
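
To hint at how this scales to a whole manual, here is a sketch using static pattern rules, which are a GNU make feature. The chapter names are made up, and header.html and footer.html are assumed to exist:

```make
# Hypothetical chapter list; each name needs a matching .mdown file.
CHAPTERS = intro usage faq
PAGES    = $(CHAPTERS:=.html)
CACHED   = $(CHAPTERS:%=cache/%.html)

all: $(PAGES)

# Wrap each cached chapter body in the shared header and footer.
$(PAGES): %.html: header.html cache/%.html footer.html
	cat $^ > '$@'

# Convert one chapter; only re-run when its Markdown source changes.
$(CACHED): cache/%.html: %.mdown
	mkdir -p cache
	Markdown.pl < '$<' > '$@'

.PHONY: all
```

Since the cached chapters do not depend on each other, something like make -j4 is free to convert several of them at once, and editing header.html only re-runs the cheap cat step.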
