This afternoon, on a hunch, I wrote a short C++ program to compute Fibonacci numbers. Hardly a remarkable accomplishment, but here’s the catch: by the time this program begins executing, its result has already been computed.
Just to be sure we’re speaking the same language, you may want to review the mathematical definition of the Fibonacci sequence. We define the infinite series according to the following rules:
Thus , and likewise
,
,
, &c. My program takes
ARG as a compile-time parameter and prints ARG at execution time. For example:
> g++ -DARG=40 fib.cc -o fib40
> ./fib40
102334155
A slightly more roundabout way of compiling programs with g++ is to generate the assembly code and then assemble it:
> g++ -S -DARG=40 fib.cc
> g++ fib.s -o fib40
> ./fib40
102334155
Think of the intermediate file fib.s as containing a version of the original program stripped of all abstractions. Each line corresponds to a single machine instruction, and all that remains is to translate this barely human-readable shorthand into the binary language the processor understands.
Now, I’m sure you’ll agree that is a pretty peculiar number, a number you wouldn’t expect to find lurking inside
fib.s unless my earlier claim were true—unless the result of the fib40 program really had been computed before fib40 began executing. It would be like finding your social security number written in blood on the bathroom mirror at the scene of a serial murder. You’d want answers, and you’d want them right away.
> grep -n 102334155 fib.s
22: movl $102334155, 4(%esp)
OH MY GOD THERE IT IS!! Okay, okay, just—everybody calm down. We’ll get out of here together.
Writing such a program requires a solid grasp of C++ templates and the concept of partial specialization, two key ingredients of a general technique called template metaprogramming. Reading the code will be easier if you have any knowledge of these topics, but the basic structure of the code mirrors the mathematical definition, which I trust you understand. Puzzle through the syntax if you have to:
#include <iostream>
// Recursive case:
template <int N> struct Fib {
enum { val = Fib<N-1>::val
+ Fib<N-2>::val };
};
// Base cases:
template <> struct Fib<0> { enum { val = 0 }; };
template <> struct Fib<1> { enum { val = 1 }; };
int main() {
std::cout << Fib<ARG>::val << std::endl;
return 0;
}
Compile:
> time g++ -DARG=40 fib.cc -o fib40
g++ -DARG=40 fib.cc -o fib40 0.23s user 0.05s system 97% cpu 0.291 total
And run:
> time ./fib40
102334155
./fib40 0.00s user 0.00s system 80% cpu 0.002 total
As before, . No surprise there. If we changed
ARG to 10, we would get 55 instead of 102334155, but I’ll let you try that on your own. Notice the benefit of performing computation at compile time: the command time ./fib40 reports nearly instantaneous execution time, because the fib40 program really is just printing a precomputed value.
If you’ve seen Fibonacci numbers computed recursively before, you know how costly the computation becomes as the input grows large, so it may surprise you that the compilation of fib40 takes so little time: about 0.23 seconds, and probably only a fraction of that devoted to computing , considering the overhead of compilation.
(If you haven’t seen the math before, I’ll illustrate briefly: the number of subterms that contribute to the final sum is on the order of
, thanks to the binary branching in the recursive case, so you’d expect computing
to take roughly twice as long as computing
. Even if
took all 0.23 seconds, therefore,
would still take something like
as long, about two tenths of a picosecond, which is an absurdly short amount of time.)
We can estimate the contribution from overhead by timing the compilation of fib0:
> time g++ -DARG=0 fib.cc -o fib0
g++ -DARG=0 fib.cc -o fib0 0.23s user 0.05s system 96% cpu 0.291 total
Again, 0.23s. This means the computation is no more visible amidst the noise of compilation than the computation of
. How is this possible? Is
g++ exempt from the laws of the universe? Is it cheating? What’s going on here?
My hunch, it seems, was correct: the compile-time computation of Fibonacci numbers takes linear rather than exponential time, because Fib<N>::val once computed never needs to be recomputed. In order for the expression Fib<40>::val even to compile, g++ must first find a way to compile Fib<39>::val and Fib<38>::val, and in order to compile Fib<39>::val it has to know about Fib<38>::val and Fib<37>::val. In the exponential implementation, is needed twice and so it is computed twice, a massively redundant waste.
In my implementation this double computation is not only unnecessary; it’s impossible, since C++ forbids redeclaration of Fib<38>. The recursion, if you can call it that, happens backwards, because the subterms F<N-1>::val and F<N-2>::val are guaranteed to be available already when each struct F<N> is declared. The compiler has to figure out an order of declarations that ensures this availability, but that’s not my problem.
Critics of template metaprogramming often complain about the convoluted syntax, and they certainly have a point. If I’ve demonstrated anything here, I hope I’ve persuaded you that template metaprogramming has its place, sometimes. I took the simplest, most natural approach in my implementation (subject to the constraint that all computation should happen at compile time), an approach almost identical in structure to the incredibly inefficient solution taught as a cautionary example in introductory algorithms courses, an approach I should have known better than to use. And yet the compiler found a way to make this simplistic implementation run efficiently. That’s what template metaprogramming is all about: using the compiler’s own algorithms to solve problems whose answers are the same every time the program executes.
I was planning to end this post by playing a little trick on the compiler, but I think the trick deserves a post of its own. Perhaps you can anticipate it: how could you change the Fibonacci program so that it takes exponential time to generate the same machine code?

Mozilla is paying you too much if this is what you do on their time.. Keith is a big fan of this template meta-programming stuff. I've only found limited use for it, but I'm not so l33t.
found your blog! (via adam hahn's)
c++ templates are truly evil incarnate.
have you seen D's "static" functions? does this sort of thing really nicely.
Hey now. I did this across a couple of evenings, not at work. And template metaprogramming gives me the same sort of enjoyment that I get from Karel the Robot, if that tells you anything.