The secret to understanding recursion

[Image: nested Russian dolls]

Recursion is the process of solving a problem in terms of smaller versions of the same problem. Since the problem gets smaller each time, the process eventually terminates in a problem (the “base case”) that can be solved directly. Be sure of three things:

  1. The problem gets smaller each time.
  2. You include a solution for the base case.
  3. Each case is handled correctly.

That’s really all there is to it.
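
For example, here is what the three checkpoints look like in a minimal C sketch:

    /* Factorial of n, with the three checkpoints marked. */
    unsigned long factorial(unsigned n) {
        if (n == 0)                       /* 2. the base case, solved directly */
            return 1;
        return n * factorial(n - 1);      /* 1. a smaller problem; 3. the case handled correctly */
    }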

But what about visualizing how the code runs? You don’t have to. And that’s the point: sometimes a recursive solution is easier precisely because you don’t have to understand in detail how it executes. Graham explains in his Lisp book why tracing the code execution is unnecessary, if not harmful.

Students learning about recursion are sometimes encouraged to trace all the invocations of a recursive function on a piece of paper. … This exercise could be misleading: a programmer defining a recursive function usually does not think explicitly about the sequence of invocations that results from calling it. … How do you visualize all those invocations? You don’t have to.

22 thoughts on “The secret to understanding recursion”

  1. That’s interesting, but I’m not convinced: while it’s true that people already comfortable with recursion don’t explicitly think about the sequence of invocations in a recursive call, things may be quite different for beginning programmers.

    I seem to recall a certain amount of disbelief after encountering recursion for the first time. Maybe it’s a matter of how different people learn, but I remember that seeing a small but complete trace of a recursive call did wonders to convince me that the whole thing would work. It was only after seeing the guts of how it worked that I felt comfortable with the more formal “inductive” explanation.

  2. Kate, you have a good point. Maybe an instructor should give a couple of examples that trace the execution to help persuade the skeptical, then point out that worrying about such details may hold you back in the long run.

  3. I was initially inclined to disagree here, too, but I wonder if a difference of background is behind the disagreement. I’m guessing that you were already a mathematician when you learned computer programming. I wonder if that accounts for your greater comfort with viewing a recursive function as a concise description of an algorithm, without having initially felt the need to see the full series of invocations that leads to its result.

  4. Isn’t it also important to note that recursion has to reduce the problem set using the same algorithm on each set?

    That’s the defining characteristic, but it’s also a good rule of thumb that when the recursive algorithm is too complex you might want to look for a non-recursive solution. Or when it is too costly, you might want to look for shortcuts outside your recursion. And don’t forget that recursion is sometimes good up to a point. You can recursively reduce a set to the point where it can be managed more easily (e.g. in memory) with a direct method, such as a linear search.
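
    As a sketch of that last idea, a recursive binary search over a sorted array might hand off to a direct linear scan once the range is small enough (the cutoff value here is illustrative):

    #include <stddef.h>

    #define CUTOFF 8  /* illustrative threshold; tune for the data at hand */

    /* Recursive binary search over sorted a[lo..hi); falls back to a
       direct linear scan once the range is small. Returns an index or -1. */
    long search(const int *a, long lo, long hi, int key) {
        if (hi - lo <= CUTOFF) {            /* small enough: use the direct method */
            for (long i = lo; i < hi; i++)
                if (a[i] == key) return i;
            return -1;
        }
        long mid = lo + (hi - lo) / 2;      /* otherwise recurse on half the range */
        if (a[mid] == key) return mid;
        if (a[mid] < key)  return search(a, mid + 1, hi, key);
        return search(a, lo, mid, key);
    }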

  5. It’s obvious: The secret to understanding recursion is knowing how to combine understanding of the first half of recursion with the understanding of the second half.

  6. Recursion is an elegant means of describing an algorithm, but it hides the assumption of unbounded memory that running-time-efficient algorithms make.

  7. Visualizing the trace doesn’t always help to understand the *correctness* of an algorithm, but I find it incredibly helpful to understand the *complexity* of an algorithm… and that’s clearly important in many scenarios.

    Typically visualizing the trace doesn’t help in terms of correctness when the algorithm is extremely declarative – and recursive solutions are often very declarative, in my experience. Often the recursive form mirrors how you would *describe* the problem in the first place, so it’s “obviously” correct.
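
    A tiny sketch of that: the maximum of a nonempty array reads almost exactly like its own description, “the larger of the first element and the maximum of the rest.”

    #include <stddef.h>

    /* The code mirrors the description of the problem. */
    int max_of(const int *a, size_t n) {
        if (n == 1) return a[0];           /* one element: it is the maximum */
        int rest = max_of(a + 1, n - 1);   /* the maximum of the rest */
        return a[0] > rest ? a[0] : rest;  /* the larger of the two */
    }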

  8. When I introduced students to the notion of recursion or, for that matter, induction, I appealed to the following visualization.

    Imagine a very (very) long line of people in which you take part. You are told that everybody in the line is responsible for solving the problem of a specific size. Furthermore, every person is lazy but cooperative. Specifically, you are lazy but cooperative.

    How would you perform your task of solving the problem of a specific size? Reduce the problem to problems of smaller size and delegate them to the people responsible for those sizes.
    How they solve their problems is their problem; I do not really care. But in the end I get results from the participants, and I can assemble them to present my answer.

    For me this is far more powerful than tracing the code.
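
    The picture translates almost directly into code. In this sketch, each “person” is one call, responsible for an array of length n: keep one element, delegate the remaining n-1 to the next person in line, then assemble.

    #include <stddef.h>

    /* Each call keeps one element for itself and delegates the rest. */
    long line_sum(const int *a, size_t n) {
        if (n == 0) return 0;                  /* the last person has nothing to do */
        return a[0] + line_sum(a + 1, n - 1);  /* delegate, then assemble */
    }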

  9. I also like to keep in mind that positive integers have a recursive definition; in other words, many recursive processes can be replaced with numeric processes, which can take advantage of hardware optimized for handling numbers.

    Also, when those numeric processes would lose too much information to be useful, it’s sometimes useful to think in terms of inductive processes (build from the base case until reaching the target) instead of recursive processes (descending from the target until you reach your base case).

    And, of course, precomputing a lookup table and/or memoizing can also be useful supplements to recursion (or sometimes can even replace recursion).

    And, of course, some languages have optimizations to replace some forms of recursion with something a bit more efficient (tail recursion, I’m looking at you). But structuring an algorithm so that this kind of optimization applies can sometimes have non-obvious implications for beginners (implications that can be better understood by working out on paper the steps that must be carried out to complete the operations, or, sometimes equivalently, by rephrasing to use induction or iteration instead of recursion).
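
    A sketch of the tail-call point, assuming a compiler that actually performs the optimization (the C standard does not require it, though GCC and Clang typically do it at -O2):

    /* Accumulator-passing factorial: the recursive call is the last thing
       the function does, so an optimizing compiler can reuse the current
       stack frame and effectively compile this into a loop. */
    unsigned long fact_acc(unsigned n, unsigned long acc) {
        if (n == 0) return acc;
        return fact_acc(n - 1, n * acc);   /* tail call: no work left after it */
    }

    /* usage: fact_acc(10, 1) == 3628800 */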

  10. We have trouble tracing recursion because we are lazy; the whole mathematical world has demonstrated its laziness since Mandelbrot showed us the power of recursion.

    If you want to trace a recursion you need to change your linear view of things into a fractal view (as Benoit said).

    I spent my last 9-10 years trying to figure out how God was planning reality if he was a programmer (in fact I think he was, if he even exists).

    The result: a programming language based on fractals, since I changed my point of view.

    And you know what? I needed to create debugging tools to follow the path, with great success, and with Python.

    I don’t agree with you that recursion is hard to follow; again, you need to fully change your mind from linear to fractal to achieve this.

    The good news: as soon as you make the EFFORT you will understand that all our science is linear, and linearity is a detail of fractality, so everything we know about our reality is still true but incomplete, as it becomes a small part of the whole system.

  11. Jonathan Decker

    With recursion, I think that while the high-level concept takes time to grasp, the implementation is almost always an application of a stack data structure. I’ve seen students avoid function-stack recursion in favor of needlessly complex, overly specialized iterative implementations because those are more intuitive given what they’ve already learned about sequential programming.

    I wonder if it would be better to show students a stack-based implementation of a recursive method, before showing them the high-level pseudo-code and introducing the general theory. This approach may seem backwards, but it might help avoid the misdirection attributable to surprising students with a new paradigm.
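
    Such a presentation might look something like this sketch: factorial computed with an explicit stack, which is essentially what the call stack does behind the scenes.

    /* Push the pending multipliers on the way “down”, then pop and
       combine them in the order the returns would have happened. */
    unsigned long factorial_stack(unsigned n) {
        unsigned stack[64];           /* ample: the product overflows long before 64 levels */
        int top = 0;
        while (n > 0)                 /* the descent: record the pending work */
            stack[top++] = n--;
        unsigned long result = 1;     /* the base case */
        while (top > 0)               /* the unwinding: combine the results */
            result *= stack[--top];
        return result;
    }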

  12. The only thing to watch out for is the space complexity incurred by recursive calls. It can be dangerous to process even a moderately large data structure (e.g. a 10,000-level-deep, unbalanced binary tree) using recursion; it is easy to run out of stack space.

    Some compilers/runtimes help by optimizing such calls away; some don’t. As good craftsmen, we need to know our tools well.
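
    One standard workaround, sketched here for counting the nodes of a binary tree, is to move the stack from the C runtime onto the heap, where it can grow as deep as the structure demands:

    #include <stdlib.h>

    struct node { struct node *left, *right; };

    /* Iterative traversal with a heap-allocated stack: a 10,000-level-deep
       unbalanced tree can no longer overflow the fixed-size call stack. */
    size_t count_nodes(const struct node *root) {
        size_t count = 0, cap = 1024, top = 0;
        const struct node **stack = malloc(cap * sizeof *stack);
        if (!stack) return 0;
        if (root) stack[top++] = root;
        while (top > 0) {
            const struct node *n = stack[--top];
            count++;
            if (top + 2 > cap) {               /* grow the heap stack as needed */
                const struct node **grown = realloc(stack, 2 * cap * sizeof *stack);
                if (!grown) break;             /* out of memory: stop early */
                stack = grown;
                cap *= 2;
            }
            if (n->left)  stack[top++] = n->left;
            if (n->right) stack[top++] = n->right;
        }
        free(stack);
        return count;
    }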

  13. I have noticed several “the only thing you need to know” comments here. I think that all of these comments carry an implicit “to accomplish ___” clause. And, perhaps, this sort of implicit clause should be made explicit.

    Consider, for example, this C:

    /* Adds x and y one unit at a time, handling every sign combination. */
    int recadd(int x, int y) {
        int sum;
        if (x > 0) {
            x--;
            if (y > 0) {
                y--;
                sum = recadd(x, y);
                sum++;
                sum++;
                return sum;
            } else if (y < 0) {
                y++;
                return recadd(x, y);
            } else {
                x++;
                return x;
            }
        } else if (x < 0) {
            x++;
            if (y > 0) {
                y--;
                return recadd(x, y);
            } else if (y < 0) {
                y++;
                sum = recadd(x, y);
                sum--;
                sum--;
                return sum;
            } else {
                x--;
                return x;
            }
        } else {
            return y;
        }
    }

    This routine satisfies various “all you need” statements I have seen here. It’s also silly — you can accomplish the result of executing this function using a single machine operation on any modern machine, and without using any stack.

    All too often, recursive implementations fall into this trap: they do things the hard way where a little appreciation of the purpose of the code could accomplish the same thing non-recursively. Sometimes you have to push back into the requirements — often people will over-enthusiastically include requirements which are not about accomplishing something useful but instead are about forcing the machine to behave a certain way. Sometimes you need a different data structure. Sometimes recursion is actually useful. Sometimes …

    Anyways, if you have time, one of the first things you should do, especially when given a recursive requirement, is try to take a step back and see the big picture. If you can understand the issue you can often eliminate a lot of useless code…

  14. IMHO the first requirement is a simplification derived from dynamic-programming strategies.

    Let’s consider this problem as an illustration: a system of linear equations Ax = b with A = A^T > 0; for a given EPS, find y such that ||Ay − b|| < EPS (the stop condition), using the conjugate gradient method (CGM). The method’s definition is correct and can be made recursive (it’s weird, but it works); the recursion finishes in a finite number of steps, and items 2 and 3 are satisfied. But the problem doesn’t get smaller each time if one considers it independently of the recursion.

    My version would be:
    1. Reduce the problem to a problem of the same type;
    2. Include a solution for the base case;
    3. Ensure that the process converges in finite time.
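
    In code, that pattern is recursion on a problem of the same size, where termination comes from a convergence test rather than from a shrinking input. A generic sketch (a simple fixed-point iteration, not a full CGM implementation):

    #include <math.h>

    /* Each call refines the current estimate; the “base case” is a stop
       condition, not a smaller problem. Here: solve cos(x) = x. */
    double fixed_point(double x, double eps) {
        double next = cos(x);              /* one refinement step */
        if (fabs(next - x) < eps)          /* stop condition reached */
            return next;
        return fixed_point(next, eps);     /* same problem, better estimate */
    }

    /* usage: fixed_point(1.0, 1e-12) is approximately 0.739085 */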

  15. Alessandro Martin

    IMO, as soon as one studies recursion in LISP (if properly done), it’s understood.

  16. I recall in 9th grade English class (circa early 1970s) an admonition by the teacher that “A definition (of a word) should not include the word you’re defining.” I bit my lip (not even quite knowing how wise it was to do so!), but I had recently read one of Martin Gardner’s Mathematical Games columns in Scientific American that was surely about recursion (I don’t remember any of the column except the following): The word ancestor is neatly and concisely defined as:

    A mother or a father or an ancestor of a mother or father.
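
    That definition transcribes directly into code. A sketch (the struct and the names in it are illustrative):

    #include <stdbool.h>
    #include <stddef.h>

    struct person {
        const struct person *mother, *father;   /* NULL when unknown */
    };

    /* “A mother or a father or an ancestor of a mother or father”:
       the function is the definition, almost word for word. */
    bool is_ancestor(const struct person *a, const struct person *p) {
        if (p == NULL) return false;
        if (p->mother == a || p->father == a) return true;   /* a mother or a father */
        return is_ancestor(a, p->mother)                     /* or an ancestor of the mother */
            || is_ancestor(a, p->father);                    /* or of the father */
    }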

  17. Functional programming only works when the size of the interface doesn’t blow up. The way it manages that is to put the interface in the stack. Back before object orientation, structured programming that managed to have high cohesion and low coupling had to send large data structures from the top of the stack all the way to the bottom of the stack. Or, you resorted to static variables. OO got rid of that. With functional programming, the only place for the data structure to go is into the stack. That doesn’t mean it has to be confusing.

    Recursion does the same thing.

    There should never be any requirements that say use a stack or use recursion or use functional programming. Yes, anything done iteratively can be done recursively. That’s not a battle I want to fight.

    Sum=Add(a,b):
        if a=1 then return(b+1)
        else return(Add(a-1,b)+1)
        end
    end Add

    Trace that stack:
    Add(5,3)
    Add(4,3)
    Add(3,3)
    Add(2,3)
    Add(1,3) returns 4
    returns 5
    returns 6
    returns 7
    returns 8

    a=1 is the invariant.

    That should do it, so you can stop tracing stacks. Yes, make sure your invariant exists and is tested each time.

    Beware of the off-by-one error. Yeah, I made one and had to fix it.

  18. Although doing actual traces of the execution of a recursive algorithm may not always be a good idea, having an idea of the execution flow can help one understand issues like the process growing without control, and ways to avoid that.
    For example, let’s consider the Fibonacci sequence, which is naturally defined recursively:
    F(0) = 0; F(1) = 1
    F(n) = F(n-1) + F(n-2) for all n > 1
    It is clear when you visualize it that even for small values of n this becomes a call tree, which can grow very large very fast.
    Now, by looking at the call tree you can see that many calls are repeated across the branches, which can be addressed by caching partial results so they can be looked up later without having to recalculate a branch. (Yes, I know functional programmers may want to kill me because of the side effects, but I’m just trying to illustrate the point.)
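
    A sketch of that caching idea in C (the table size and the zero sentinel are illustrative; the caller must keep n below MAXN):

    #include <stdint.h>

    #define MAXN 94              /* F(93) is the largest Fibonacci number that fits in a uint64_t */
    static uint64_t memo[MAXN];  /* 0 means “not yet computed”; safe, since F(n) > 0 for n >= 1 */

    /* The naive recursion recomputes the same subtrees over and over;
       caching each result collapses the exponential call tree so that
       each distinct value is computed only once. */
    uint64_t fib(unsigned n) {
        if (n < 2) return n;     /* base cases F(0) = 0, F(1) = 1 */
        if (memo[n] == 0)
            memo[n] = fib(n - 1) + fib(n - 2);
        return memo[n];
    }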

  19. Well, a programmer needs to know something about how the internals work. The question is, what is enough to be able to implement solutions without having to visualize the entire stack?
