by Carl Burch, Hendrix College, September 2012
Over the last few decades, compiler researchers have made much progress toward compiling and optimizing functional languages to translate to efficient code on computers which are, after all, imperative in nature. But the most important optimization remains one of the oldest: tail recursion elimination. Understanding this simple optimization can enable you to take advantage of it, resulting in more efficient code when necessary.
A function is tail-recursive if its
recursive call is the final operation performed in order to
compute the return value. An example of such a function is the
the traditional contains
function.
contains n [] = False
contains n (x:xs) = if x == n then True else contains n xs
This is tail-recursive because the recursive call's return value is returned directly.
Tail-recursive functions are important because they can be
optimized into loop form: Rather than make a whole new function
call, we can simply alter the parameters in memory and jump
back to the top of the function. At left below, we have the naïve
translation of contains
into an imperative language,
while at right is the optimized version illustrating the
removal of tail-recursion.
boolean contains(int n, ListNode xs) {
if (xs == null) return false;
else if (xs.getValue() == n) return true;
else return contains(n, xs.getNext());
}
boolean contains(int n, ListNode xs) {
while (true) {
if (xs == null) return false;
else if (xs.getValue() == n) return true;
else xs = xs.getNext();
}
}
Since each function call consumes both additional space and additional time, the removal of tail recursion by the optimizer increases the performance of a function significantly — and it eliminates the possibility that the recursive function could overflow memory as it tries to remember variable values in each recursive call.
The elimination of tail recursion is not exclusive to functional language compilation: It is a standard optimization in imperative language compilers also. It is more important for functional languages, though, because they cannot have loops (since loops rely on a termination condition that will only hold true once the state changes) and must rely on recursion to express repetition of a process. It is so important, in fact, that the designers of Scheme actually wrote into the language definition a requirement that all genuine Scheme language systems must eliminate tail recursion.
To take advantage of this optimization, functional language
programmers often rewrite
functions so that they are tail recursive.
Below are two functions that are not intuitively tail-recursive
— one to determine the length of the list, and one to return
the reversal of a list — and the rewritten tail-recursive version.
(The intuitive length
function may initially appear
tail-recursive since the recursive call is the last item
appearing in the function definition. It is not tail-recursive,
though, because the return value of the recursive call is not
the return value for the function — it still must have 1
added to it. (Since integer addition is associative, an
aggressive compiler might conceivably optimize the length
function to the tail-recursive version automatically. I do not
know of compilers that do this, however.)
length [] = 0
length (_:xs) = 1 + length xs
length list = len list 0
where len [] n = n
len (x:xs) n = len xs (n + 1)
reverse [] = []
reverse (x:xs) = reverse xs ++ [x]
reverse list = rev list []
where rev [] ret = ret
rev (x:xs) ret = rev xs (x:ret)
Notice how in both cases, the rewritten version involves creating a new function that takes an additional parameter that “accumulates” values as the function recurses downwards. This is a standard trick in rewriting functions to be tail-recursive.
The reverse
example is particularly interesting.
The intuitive version takes O(n²) time since appending a value
to a list takes O(n) time, and this is done for each
recursive call (1 + 2 + … + n = O(n²)).
The rewritten version, however, works very differently.
The below illustrates it at work.
reverse [2,3,4]
= rev [2,3,4] []
= rev [3,4] [2]
= rev [4] [3,2]
= rev [] [4,3,2]
= [4,3,2]
In fact, it takes only O(n) time to reverse the list, a dramatic speedup, even neglecting the tail-recursion elimination that would also be possible.
Java, by the way, does not optimize tail-recursion. That was a conscious and controversial design decision. They decided to omit it in order to do a “proper” recording of the stack trace on a run-time error and during debugging. But of course it means that some methods run more slowly than they otherwise would, and that they lead to a StackOverflowError when they could be compiled so as to use the stack very lightly.