Wrapping my head around Tail Recursion and TCO
A brief look into the methods used for optimising recursive functions
I’ve recently been going down the functional programming rabbit hole, and one concept which has been pretty fascinating to learn about is tail recursion (along with the optimisations it can provide).
Learning about it has also been a surprisingly nice way to get exposed to a bunch of different concepts including memory management as well.
In this post I’ll explain recursion and tail recursion from first principles, and go over some actual implementations on OCaml.
To be clear: I’m not some kind of authority on this subject, but you may find it useful to learn it from someone only a few steps ahead of where you are now. I’ll leave some other solid resources at the end of the post you can check out, too.
First quest on the agenda: recursion.
Recursion
Before we get into anything else let’s first talk about recursion.
A recursive function is a function which calls itself in its own function body. Usually, there is a base case (the condition which causes the function to terminate), and the recursive case: the function call.
A simple example that is commonly used to demo recursion is something like calculating a factorial. A factorial of 5, for example, is 5 * 4 * 3 * 2 * 1.
A recursive factorial function in OCaml might work like this:
let rec factorial n =
if n = 0 then 1
else n * factorial (n - 1)In this code, you:
Define a recursive function, which takes one argument (a number, n)
If the number passed in is 0, we just return 1
If the number isn’t 0, we call factorial again (going down by one each time) and multiply it by the previous number
This is relatively straight-forward if you’re familiar with recursion, and if not you may want to brush up on some other resources first.
So, what’s actually wrong with this function? Why would we want to update it from just a normal recursive function to a tail recursive one?
How this function actually works in memory is where we start running into problems (and can at least optimise it to some extent). This requires taking a closer look at how this program runs, and the issues that can arise.
Recursion, Computer Memory, and You
Time for a little detour into how this factorial function actually runs on your computer. This is going to be an any % speedrun, high-level overview of it. It’s not a comprehensive look, so keep that in mind.
So, what exactly does this mean?
Running your program
When you run a program on your computer, your operating system (OS) allocates it some memory, and loads it into RAM.
Your OS then provides you as a developer a neat little thing called virtual memory, which is an abstraction over it.
The virtual memory for your program has different areas which handle different parts of data and instructions in your program. Some for example handle static variables in your program, others handle dynamic memory (such as lists which might grow or shrink in size).
The one we are interested in is called the stack.
Stackin’ frames 🥞
Stack memory is an area dedicated to basically one thing for your program: handling function calls. Whenever your program runs a function, its local data is added to the stack in something called a stack frame.
At this point you may probably be wondering about this tangent I’m going on, and thinking “why does all of this matter, and what on Earth does it have to do with recursion?”
Coming back to the factorial function, let’s think about this in terms of memory and recursion.
The factorial function calls itself over, and over, and over. Each time it is called, it creates a new stack frame, which is added to stack memory. The recursive call will build up on the stack like this:
This doesn’t look so bad, for a factorial of 5. But what if you want to calculate the factorial of 962? If you’re a gamer, you’re probably well aware your RAM (and by proxy stack memory) is not infinite.
If the stack grows too large, or your recursive function doesn’t terminate, you can run out of stack memory, your program will crash, and you’ll get an infamous “stack overflow” error.
This is well-documented and understood these days, so mitigations exist in many programming languages. Instead, it will keep adding stack memory and you might get some other kind of problem instead (e.g. an integer overflow).
But if you look at this recursive stack call in general, it also just doesn’t look terribly efficient either. You have to build up all of these stack frames, and nothing is actually computed until the final recursive call when everything unravels.
Each function call stacks up and none of them ever “pop”, so they just continue growing. This can eat away at memory on your computer and cause poor performance. In terms of algorithmic complexity you can think of it as O(n).
So, finally onto why tail recursion is so based and function-pilled and how it can help optimise recursive functions (with some help from the compiler).
Introducing Tail Recursion
Tail recursion is a method or writing recursive functions in such a way that to avoid certain stack overflow errors, and can also provide a way for the compiler to optimise your recursive functions.
So if you can find a way around this and make it so each recursive function call actually terminates, you’ll pretty much never run out of stack space. The way to do this is by making sure the recursive call is always the last thing that happens.
But wait: didn’t we already do that?
This might look like the case with the first recursive function, but it’s not. The else statement calls the factorial function, and the n * multiplication has to wait until all of the function chains unravel.
To make it tail recursive, you have to move this multiplication part inside of the function call. This is usually done with the help of another argument, or sometimes another function called an auxiliary function (basically a helper function).
Here’s an example of a tail-recursive factorial function in OCaml:
let rec tail_factorial acc n =
if n = 0 then acc
else tail_factorial (acc * n) (n - 1)This time, you define a recursive function with takes an accumulator (which will initially be 0), and the number to calculate the factorial for.
The key difference here is the acc function “accumulates”, i.e. it keeps a running value of the multiplication before the recursive call happens.
What’s so ingenious about this is that although recursive, it now completely eliminates this stack of recursive calls and turns into something more akin to a “flat” loop.
In essence, the recursive call generates a stack frame, runs the calculation, and immediately returns so you’re back to a single stack frame, and repeats.
Pretty neat.
Further Reading
A lot of my understanding of recursion and tail recursion draws explanations presented in:
You may find the explanations there beneficial.
Learning about FP and tail recursion has been a joy for me at least, so I hope I could share that energy with you, fellow human.
Thanks for reading.
If you want to read more posts on functional programming, PL development, and my self-learning approach to these things, feel free to subscribe.








