Recursion That Works: Patterns, Pitfalls, and Memoization
Learning Objectives
- ✓ Decompose problems into recursive subproblems with clearly defined base cases and reduction steps
- ✓ Trace the call stack of a recursive function to predict its behavior and identify bugs
- ✓ Apply functools.lru_cache and manual dictionary-based memoization to eliminate redundant computation in recursive functions
- ✓ Implement cycle detection and depth limiting for recursive graph traversal
- ✓ Convert recursive solutions to iterative equivalents using explicit stacks when stack depth becomes a constraint
- ✓ Recognize the three core recursion patterns (accumulate-and-return, divide-and-conquer, backtracking) and select the appropriate one for a given problem
The Bug That Started This Lesson
In 2016, I crashed a staging server at a startup with one function call. We had a social graph, maybe 50,000 users, and someone asked me to write a "people you might know" feature. Simple, right? Find friends of friends. I wrote a recursive function in about twelve lines, deployed it to staging, and within seconds the Python process ate all available memory and died. RecursionError: maximum recursion depth exceeded. The function had visited the same users thousands of times because social graphs have cycles, and I had no base case to stop the madness. It was a Wednesday afternoon, and I spent the rest of that day learning what I am about to teach you.
Last lesson, you traced pointers through linked lists and built an LRU cache. You learned that every pointer mutation needs careful bookkeeping, that sentinel nodes eliminate edge cases, and that OrderedDict handles most real-world LRU scenarios better than a hand-rolled doubly linked list. The pointer-tracing skill you developed there is exactly what you need now. A recursive call stack is, at its core, a chain of function frames linked together. When you traced prev and next pointers through nodes, you were doing the same mental work that recursion demands: tracking where you are, where you came from, and where you are going next.
Today we are going to break recursion down into a mechanical process, not a mystical art. I have seen too many students treat recursion like some kind of magic. It is not. It is a function that calls itself, with a condition that tells it when to stop. That is it. The hard part is not understanding recursion conceptually. The hard part is getting three things right: the base case, the reduction step, and handling repeated work. We will nail all three, and by the end of this lesson, your NetProbe project will have recursive friend-of-friend discovery with depth limits and memoization.
💡 Key Insight: Recursion is not about cleverness. It is about decomposition: breaking a problem into a smaller version of itself, solving that, and combining the results. If you can identify "the smaller version," you can write the recursion.
What We Are Building
By the end of this lesson, you will have a discover command in NetProbe that finds connection chains between users. Type discover alice 3 and it will show you everyone reachable from Alice within 3 hops, the exact path to reach them, and it will do this efficiently by memoizing shared subnetworks. Here is what the output looks like:
netprobe> discover alice 3
Discovering connections from 'alice' (max depth: 3)...
alice -> bob (1 hop)
alice -> bob -> carol (2 hops)
alice -> bob -> carol -> dave (3 hops)
alice -> charlie (1 hop)
alice -> charlie -> dave (2 hops)
alice -> charlie -> dave -> eve (3 hops)
Found 5 unique users reachable within 3 hops.
Cache stats: 4 subnetwork computations saved by memoization.
This is the same problem that bit me in 2016. But this time, we will build it correctly: with cycle detection, depth limits, and memoization to avoid recomputing overlapping subnetworks. The same pattern powers LinkedIn's "People You May Know," Facebook's friend suggestions, and any system that does bounded graph exploration.
The Anatomy of Recursion: Base Case, Recursive Case, and the Call Stack
Every recursive function has exactly two parts, and if you get either one wrong, your program either loops forever or gives wrong answers. The base case is the condition under which the function stops calling itself and returns a concrete value. The recursive case is where the function calls itself with a smaller or simpler version of the input. That is the entire mental model. No exceptions.
Think of it like Russian nesting dolls (matryoshka). You open the biggest doll, and inside is a slightly smaller doll. You open that one, same thing. You keep going until you reach the smallest doll that does not open. That smallest doll is your base case. Each "opening" step is the recursive case. The key insight is that every recursive call must move you closer to the base case. If it does not, you have an infinite loop disguised as a function.
Let me show you the simplest possible recursion so you can see the skeleton. Factorial is the classic example, not because it is useful (you would never compute factorial recursively in production), but because the structure is so clean you can see the bones:
def factorial(n):
    if n <= 1:                       # Base case: stop here
        return 1
    return n * factorial(n - 1)      # Recursive case: smaller problem
Now here is what actually happens in memory when you call factorial(4). Python pushes a new frame onto the call stack for each recursive call. Remember from Lesson 3 when we talked about stacks being LIFO? The call stack is literally a stack. Each function call pushes a frame, and each return pops one. factorial(4) pushes a frame, which calls factorial(3), which pushes another frame, and so on until factorial(1) hits the base case and starts returning. The frames then unwind in reverse order, exactly like popping from a stack.
| Call | Stack Depth | Action | Return Value |
|---|---|---|---|
| factorial(4) | 1 | Calls factorial(3) | Waiting... |
| factorial(3) | 2 | Calls factorial(2) | Waiting... |
| factorial(2) | 3 | Calls factorial(1) | Waiting... |
| factorial(1) | 4 | Base case hit | Returns 1 |
| factorial(2) | 3 | Has result from below | Returns 2 * 1 = 2 |
| factorial(3) | 2 | Has result from below | Returns 3 * 2 = 6 |
| factorial(4) | 1 | Has result from below | Returns 4 * 6 = 24 |
This table reveals something critical: the maximum stack depth equals the input value. Call factorial(1000) and you need 1000 frames on the stack. Call factorial(10000) and Python will refuse. This is not a theoretical concern. It is the exact wall I hit in 2016, and it is the wall you will hit if you are not careful with graph traversal.
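To see this wall for yourself, here is a minimal sketch (the `countdown` helper is made up for illustration) that exhausts Python's default frame budget:

```python
import sys

def countdown(n):
    """Recurse until n reaches 0 -- one stack frame per call."""
    if n == 0:
        return 0
    return countdown(n - 1)

print(sys.getrecursionlimit())   # typically 1000 by default

countdown(500)                   # fine: well under the limit

try:
    countdown(10_000)            # far deeper than the default limit
except RecursionError as err:
    print("Crashed:", err)
```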
⚠️ Common Pitfall: Forgetting the base case, or writing a base case that is never reached. If your recursive case does not make the problem strictly smaller, the base case will never trigger. This is the #1 recursion bug. Every single time.
🤔 Think about it: In the factorial table above, what would happen if we changed the base case from n <= 1 to n == 0? Would factorial(-1) terminate?
Answer:
No. factorial(-1) would call factorial(-2), which calls factorial(-3), and so on forever (until Python hits its recursion limit). The value never reaches 0 because we are decrementing from a negative number. This is why n <= 1 is safer than n == 0. Defensive base cases prevent unexpected inputs from causing infinite recursion. Always ask yourself: "Can my input somehow skip past the base case?"
💡 Key Takeaway: Every recursion is a stack of waiting function calls. The base case is what starts the chain of returns. If you can trace the stack by hand for a small input, you understand the recursion.
Stack Overflow in Practice: Why Python Says No
Python has a hard recursion limit, and unlike C or Java, there is no tail-call optimization to save you. By default, sys.getrecursionlimit() returns 1000. That means if your recursive function goes deeper than roughly 1000 frames, Python raises a RecursionError. You can increase this limit with sys.setrecursionlimit(), but that is almost always the wrong answer. Let me explain why.
Each function frame on the call stack consumes real memory. A typical Python stack frame takes somewhere between 200 and 800 bytes depending on the number of local variables, plus whatever objects those variables reference. At 1000 frames with, say, 500 bytes each, that is about 500 KB. Push it to 100,000 frames and you are at 50 MB just for stack frames. Push it to 1,000,000 and you might crash the process. I have seen engineers "fix" recursion depth issues by calling sys.setrecursionlimit(10**6) and then wondering why their production container got OOM-killed at 3 AM.
⚠️ Common Pitfall: sys.setrecursionlimit(999999) is a band-aid, not a solution. You are telling Python "let me use more stack memory" without fixing the underlying problem. In production, the depth of your recursion should be bounded by your problem, not by how much stack space you can beg for.
Here is the myth I want to bust right now.
MYTH BUSTER: "Common belief: Recursion is inherently slow and should be avoided in Python. Reality: Recursion itself is not slow. What kills performance is redundant computation and unbounded depth." The myth persists because people write naive recursive Fibonacci (which is exponentially slow due to recomputation, not due to recursion) and then conclude that recursion is the problem. The actual problem is doing the same work over and over. Fix that with memoization, and recursive Fibonacci becomes linear. The recursion was never the bottleneck.
Some languages optimize tail recursion, where the recursive call is the very last operation in the function. Scheme, Erlang, and some configurations of Scala will reuse the current stack frame for tail calls, effectively turning recursion into a loop. Python explicitly does not do this. Guido van Rossum wrote a blog post in 2009 explaining why: it destroys stack traces, making debugging harder, and Python values readability and debuggability over clever optimization. I agree with this decision. In Python, if your recursion might go deep, convert it to iteration. We will cover exactly how to do that later in this lesson.
Deep Dive: Why doesn't Python have tail-call optimization?
Guido van Rossum's argument boils down to three points. First, tail-call optimization (TCO) eliminates stack frames, which means tracebacks become useless for debugging. If a recursive function fails 500 calls deep, you want to see all 500 frames to understand what went wrong. With TCO, you would see only the current frame. Second, Python's dynamic nature makes it hard to guarantee that a call is truly in tail position at compile time. Decorators, generators, and context managers can all insert hidden work after what looks like a final call. Third, Python already has for and while loops that handle iteration cleanly. TCO would let people write Scheme-style code in Python, which goes against the language's philosophy of "one obvious way to do it." If you need deep recursion in Python, the correct approach is either memoization (to reduce depth) or explicit conversion to iteration (to eliminate the stack entirely).
Let me show you the failure-first pattern to make this concrete. Here is what happens when you naively recurse through a graph:
❌ The crash:
def find_all_friends(graph, user):
    """Finds all friends recursively - BROKEN."""
    friends = set()
    for neighbor in graph.get(user, []):
        friends.add(neighbor)
        friends.update(find_all_friends(graph, neighbor))  # No cycle check!
    return friends
If Alice is friends with Bob and Bob is friends with Alice, this function bounces between them forever until the stack explodes. This is exactly what I did in 2016. The graph had cycles everywhere, and my function had no idea it was visiting the same nodes repeatedly.
✅ The fix: visited set and depth limit:
def find_all_friends(graph, user, depth=3, visited=None):
    """Finds friends with cycle detection and depth limit."""
    if visited is None:
        visited = set()
    if depth == 0 or user in visited:  # Two base cases!
        return set()
    visited.add(user)
    friends = set()
    for neighbor in graph.get(user, []):
        friends.add(neighbor)
        friends.update(find_all_friends(graph, neighbor, depth - 1, visited))
    return friends
Notice we now have two base cases: depth exhausted and cycle detected. Graph recursion almost always needs both. The depth limit bounds the stack size (no more relying on sys.setrecursionlimit), and the visited set prevents infinite cycles. This is the pattern. Learn it. Use it every time.
📌 Remember: For graph/tree recursion, always ask two questions: "What stops the depth?" and "What stops the cycles?" If you cannot answer both, your recursion is broken.
🤔 Think about it: In the fixed version above, should the visited set be passed by reference (as it is now) or should each recursive call get a copy? What changes about the behavior?
Answer:
It depends on what you want. Passing by reference (current approach) means once a node is visited via any path, it is globally "done." This is efficient and prevents redundant work, but it means you only discover the first path to each node. If you want to discover ALL paths (not just all reachable nodes), you need to pass a copy so that different branches can independently visit the same node. For our NetProbe use case (finding reachable users), shared visited set is correct. For path enumeration, you would use visited.copy() or add/remove on entry/exit (backtracking). The choice depends on the question you are answering.
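As a sketch of the path-enumeration alternative, here is a hypothetical `all_paths` helper that uses the add-on-entry/remove-on-exit (backtracking) approach so different branches can independently revisit the same node:

```python
def all_paths(graph, start, goal, path=None):
    """Enumerate every simple path from start to goal (backtracking sketch)."""
    if path is None:
        path = [start]
    if start == goal:
        return [path[:]]              # found one complete path; save a copy
    paths = []
    for neighbor in graph.get(start, []):
        if neighbor not in path:      # per-path cycle check, not a global visited set
            path.append(neighbor)     # choose
            paths.extend(all_paths(graph, neighbor, goal, path))
            path.pop()                # un-choose (backtrack)
    return paths

graph = {"alice": {"bob", "charlie"}, "bob": {"dave"},
         "charlie": {"dave"}, "dave": set()}
print(sorted(all_paths(graph, "alice", "dave")))
# [['alice', 'bob', 'dave'], ['alice', 'charlie', 'dave']]
```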
💡 Key Takeaway: Python's recursion limit exists for your protection. The correct response is to bound your recursion depth algorithmically (depth parameter, visited set), not to raise the limit.
Memoization: The Cache That Makes Recursion Fast
Memoization is the single most important optimization for recursive functions, and it works by storing results you have already computed. The word comes from "memo," as in writing yourself a note. When a recursive function is about to compute something, it first checks: "Have I computed this before?" If yes, return the cached result. If no, compute it, store it, then return it. This is conceptually identical to the LRU cache you built in Lesson 4 using OrderedDict, but applied at the function level.
The classic example that makes memoization's impact undeniable is Fibonacci. Without memoization, recursive Fibonacci is catastrophically slow. fib(40) makes over 300 million function calls because it recomputes the same subproblems over and over. fib(5) calls fib(3) twice, fib(2) three times, and fib(1) five times. The time complexity is O(2^n), which is exponential. With memoization, each value is computed exactly once, and subsequent calls are O(1) lookups. The complexity drops to O(n). That is the difference between "takes hours" and "takes microseconds."
| Approach | fib(10) Calls | fib(30) Calls | fib(40) Calls | Time Complexity |
|---|---|---|---|---|
| Naive recursion | 177 | 2,692,537 | 331,160,281 | O(2^n) |
| With memoization | 19 | 59 | 79 | O(n) |
| Iterative | 10 iterations | 30 iterations | 40 iterations | O(n) |
That table should shock you. Going from 331 million calls to 79 is not a small optimization. It is the difference between a function that cannot run and a function that finishes before you blink. This is why memoization is not an "advanced technique." It is a fundamental tool that you should reach for whenever you see overlapping subproblems.
Python gives you memoization for free with functools.lru_cache. Remember that LRU cache from Lesson 4? lru_cache is the same concept (least recently used eviction) applied as a decorator. You put @lru_cache(maxsize=None) above your function, and Python automatically caches every unique set of arguments and their return values. maxsize=None means unbounded cache; set it to a number to cap memory usage.
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
But lru_cache has a limitation that matters for graph problems: it only works with hashable arguments. Sets, lists, and dicts are not hashable, which means you cannot pass a visited set as an argument to a cached function. For our NetProbe graph traversal, we need manual memoization with a dictionary. This is not harder, just more explicit.
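One common workaround, when the data is a set but you do not need to mutate it inside the cached function, is to convert it to a hashable frozenset at the call boundary. A small sketch (the `reachable_count` function is invented for illustration):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def reachable_count(frozen_visited, depth):
    """Illustrative only: both arguments are hashable, so lru_cache works."""
    return len(frozen_visited) + depth

# Passing a plain set here would raise TypeError: unhashable type: 'set'.
print(reachable_count(frozenset({"alice", "bob"}), 2))  # 4 (computed)
print(reachable_count(frozenset({"bob", "alice"}), 2))  # 4 (cache hit: equal frozensets)
print(reachable_count.cache_info().hits)                # 1
```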
Manual memoization is just a dictionary you manage yourself. Before computing, check if the key exists. If so, return the cached value. If not, compute, store, and return. The pattern looks like this:
cache = {}

def expensive_computation(key):
    if key in cache:
        return cache[key]
    result = actually_compute(key)  # the real work
    cache[key] = result
    return result
This is the same pattern that powers React's useMemo, Redis caching layers at Netflix, and Webpack's module resolution cache. The principle is universal: if a computation is pure (same inputs always produce same outputs) and expensive, cache it. The only question is the eviction strategy. lru_cache evicts least recently used entries. Manual caches often have no eviction (fine for bounded problems) or time-based expiry.
💡 Key Insight: Memoization works only on pure functions, those with no side effects whose output depends solely on their input. If your function reads global state, modifies external data, or depends on time, caching its results will give you stale or wrong answers. This is the same principle behind why the LRU cache in Lesson 4 needed invalidation when the graph changed.
Deep Dive: lru_cache vs manual dict cache - when to use which
Use @lru_cache when: (1) all arguments are hashable (ints, strings, tuples of hashable things), (2) you want automatic LRU eviction, (3) you want built-in .cache_info() for debugging (shows hits, misses, size), and (4) the function signature is simple. Use a manual dict cache when: (1) arguments include unhashable types (sets, lists, dicts), (2) you need custom cache keys (e.g., caching by a subset of arguments), (3) you need to invalidate specific entries (not just clear the whole cache), or (4) you want the cache to persist across function boundaries (e.g., shared between multiple functions). In our NetProbe project, we use a manual cache because graph traversal involves visited sets and we want explicit control over invalidation when the graph is modified. For standalone mathematical recursions like Fibonacci, lru_cache is the obvious choice.
🤔 Think about it: If you use @lru_cache on a method of a class instance (e.g., self.compute(n)), what happens? Will two different instances share the same cache?
Answer:
This is a subtle gotcha. lru_cache caches based on all arguments, and self is an argument. Since different instances are different objects, they get separate cache entries. But here is the trap: self must be hashable for lru_cache to work, which it is by default (Python objects hash by id). However, this means the cache holds a reference to self, preventing garbage collection of the instance. If you create many short-lived instances with cached methods, you have a memory leak. The solution for Python 3.8+ is functools.cached_property for single-value caches, or to use __hash__ and __eq__ carefully. Google's internal Python style guide explicitly warns about this pattern.
💡 Key Takeaway: Memoization transforms exponential recursion into linear recursion by eliminating redundant computation. Use @lru_cache for simple cases and manual dict caches when you need control.
Recursive Patterns for Trees and Graphs
Now that you understand the mechanics, let me show you the three recursive patterns that solve 90% of real-world problems. These are not textbook abstractions. These are the patterns I have used repeatedly in production code at Google and two startups. Once you recognize them, recursion stops feeling like guesswork.
Pattern 1: Accumulate and Return
This is the "gather results" pattern, and it is the most common one you will use. You traverse a structure, collecting results as you go, and return the accumulated collection. Our friend-of-friend discovery uses this pattern. The key is that each recursive call adds its local findings to a growing result set that percolates back up through the return values.
Think of it like a tree of managers in a company. You ask the CEO "who works here?" The CEO asks each VP, each VP asks their directors, each director asks their managers, and so on. Each person returns their direct reports plus everything that came back from below. The CEO ends up with a complete roster. The base case is a leaf employee with no reports.
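The manager analogy maps directly to code. Here is a minimal sketch (the `roster` function and `org` data are invented for illustration) of accumulate-and-return over an org tree stored as an adjacency dict:

```python
def roster(reports, person):
    """Accumulate-and-return: gather everyone below `person` in the org tree."""
    everyone = set()
    for report in reports.get(person, []):   # base case: no reports -> empty loop
        everyone.add(report)                 # local finding
        everyone |= roster(reports, report)  # plus everything from below
    return everyone

org = {
    "ceo": ["vp_eng", "vp_sales"],
    "vp_eng": ["dir_a"],
    "dir_a": ["ic_1", "ic_2"],
}
print(roster(org, "ceo"))   # all five people below the CEO
print(roster(org, "ic_1"))  # set() -- a leaf employee is the base case
```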
Pattern 2: Divide and Conquer
Split the problem in half, solve each half recursively, combine the results. Merge sort is the textbook example, but this pattern appears everywhere: binary search (we will cover it in Lesson 7), quicksort, and even how React reconciles virtual DOM trees. The key property is that the subproblems do not overlap, which means memoization is unnecessary. Each call does its own unique work.
The critical difference from Pattern 1 is the split. In accumulate-and-return, you fan out to all children. In divide-and-conquer, you deliberately split into two (or a fixed number of) subproblems. This division is what gives you O(n log n) performance in sorting algorithms, compared to O(n^2) for approaches that do not divide.
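As a sketch of the pattern, here is merge sort with the divide, conquer, and combine steps labeled:

```python
def merge_sort(items):
    """Divide and conquer: split, sort each half, merge the sorted halves."""
    if len(items) <= 1:              # base case: 0 or 1 items is already sorted
        return items[:]
    mid = len(items) // 2            # divide: split the problem in half
    left = merge_sort(items[:mid])   # conquer: independent, non-overlapping halves
    right = merge_sort(items[mid:])
    merged, i, j = [], 0, 0          # combine: merge two sorted lists
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

Because the two halves never share elements, no subproblem is ever solved twice, which is exactly why memoization buys nothing here.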
Pattern 3: Backtracking
Try a choice, recurse, undo the choice if it does not work, try the next one. This is the pattern behind solving Sudoku, generating permutations, and finding all paths in a graph. Backtracking is what you get when you combine recursion with the insight that some branches of exploration are dead ends. It is also the most expensive pattern because in the worst case you explore every possible combination.
I want to be opinionated here: backtracking is the pattern most often misused by interview candidates. They reach for it when a greedy approach or dynamic programming would work. Backtracking is your last resort, not your first instinct. If the problem has optimal substructure (the optimal solution to the whole contains optimal solutions to parts), use dynamic programming. If a local best choice is always globally best, use greedy. Only backtrack when you genuinely need to explore all possibilities.
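Here is the choose/explore/un-choose skeleton in its purest form, generating permutations (a sketch, not a production implementation -- `itertools.permutations` exists for real use):

```python
def permutations(items):
    """Backtracking: choose an item, recurse, un-choose, try the next."""
    results = []
    chosen = []

    def backtrack():
        if len(chosen) == len(items):  # base case: a full permutation built
            results.append(chosen[:])
            return
        for item in items:
            if item in chosen:         # dead end: item already used
                continue
            chosen.append(item)        # choose
            backtrack()                # explore
            chosen.pop()               # un-choose (backtrack)

    backtrack()
    return results

print(len(permutations([1, 2, 3])))  # 6, i.e. 3! -- factorial growth in action
```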
| Pattern | When to Use | Time Complexity | Memoization Helpful? | Example |
|---|---|---|---|---|
| Accumulate and Return | Gathering all items matching criteria | O(n) for trees, varies for graphs | Yes, if subproblems overlap | Friend-of-friend, file tree walk |
| Divide and Conquer | Problem splits cleanly into independent halves | O(n log n) typical | No, subproblems are unique | Merge sort, binary search |
| Backtracking | Need to explore all valid combinations | O(2^n) or O(n!) worst case | Sometimes (then it is DP) | Sudoku, path finding, permutations |
That table reveals an important insight: memoization bridges the gap between accumulate-and-return and backtracking. When a backtracking problem has overlapping subproblems (you solve the same subproblem from different paths), adding memoization transforms it into dynamic programming. This is exactly what happens with our NetProbe graph traversal: different users share overlapping subnetworks, and memoizing those shared computations turns redundant work into cache hits.
📌 Remember: When you recognize overlapping subproblems in a recursive solution, that is your signal to add memoization. It is also the signal that you might be looking at a dynamic programming problem, which we will dig into more deeply as this course progresses.
🤔 Think about it: In Lesson 1, we learned that list.append() is amortized O(1) because Python over-allocates the backing array. How does this relate to the accumulate-and-return pattern, where we often append results to a list inside a recursive function?
Answer:
It means that the accumulation step itself is efficient. Each results.append(item) inside the recursion is amortized O(1), so the overhead of building up the result list does not add an extra factor to your time complexity. If your recursion visits n nodes and appends once per visit, the total accumulation cost is O(n), not O(n^2). This would be different if you were using results.insert(0, item) (which is O(n) per call, as Lesson 1 warned), or if you were concatenating lists with + (which creates a new list each time). Use append, not insert(0, ...) and not list1 + list2, inside recursive accumulation.
💡 Key Takeaway: Most recursive problems fit one of three patterns. Recognizing which pattern applies is more valuable than knowing how to write the recursion, because the pattern tells you the time complexity, whether memoization helps, and how to structure the base case.
Converting Recursion to Iteration: When the Stack Is Not Your Friend
Sometimes recursion is the wrong tool, and the right move is to convert it to an explicit loop with your own stack. This is not a philosophical preference. It is a practical necessity in Python when your input can be arbitrarily deep. Remember, Python gives you about 1000 stack frames by default. If you are crawling a graph with 100,000 nodes, recursion will crash. An explicit stack using a list (remember Lesson 3's stack pattern: append to push, pop to remove) has no such limit.
The conversion is mechanical, and I mean that literally. Every recursive function can be converted to an iterative one by replacing the implicit call stack with an explicit data structure. For depth-first traversal, you use a stack (LIFO). For breadth-first traversal, you use a queue (FIFO, and you already know collections.deque from Lesson 3). The recursive version and the iterative version visit nodes in the same order. They are logically identical. The only difference is who manages the stack: Python's runtime, or you.
Here is the mechanical recipe for converting recursive DFS to iterative DFS:
- Replace the function's parameters with items pushed onto a stack.
- Replace the recursive call with stack.append(next_item).
- Replace the base case check with a condition inside the loop.
- The loop runs until the stack is empty (equivalent to all recursive calls returning).
Let me show the wrong-then-right pattern to make this visceral. Say you have a recursive function that finds all users within N hops:
❌ Recursive (crashes at depth > 1000):
def find_within_hops(graph, start, max_depth):
    results = {}
    def dfs(user, depth, path):
        if depth > max_depth or user in results:
            return
        results[user] = path[:]
        for neighbor in graph.get(user, []):
            dfs(neighbor, depth + 1, path + [neighbor])
    dfs(start, 0, [start])
    return results
✅ Iterative (no depth limit):
def find_within_hops(graph, start, max_depth):
    results = {}
    stack = [(start, 0, [start])]  # Explicit stack replaces call stack
    while stack:
        user, depth, path = stack.pop()
        if depth > max_depth or user in results:
            continue
        results[user] = path[:]
        for neighbor in graph.get(user, []):
            stack.append((neighbor, depth + 1, path + [neighbor]))
    return results
The structure is nearly identical. The if becomes continue instead of return. The recursive call becomes stack.append(...). The function parameters become tuple elements on the stack. That is the whole transformation. Once you see this mapping, you can convert any DFS recursion to iteration in under five minutes.
At Google, our internal Python style guide had an informal rule: if your recursion can exceed 100 frames, convert it to iteration. This was not about performance, it was about reliability. A recursive function that works in testing (small graphs) and crashes in production (large graphs) is a ticking time bomb. An iterative version with an explicit stack never hits the recursion limit because it uses heap memory (your list), not stack memory. And heap memory is bounded only by the machine's RAM, which is usually gigabytes, not kilobytes.
⚠️ Common Pitfall: When converting recursion to iteration, people often forget to push items in reverse order if they want to maintain the same traversal order. stack.pop() removes from the end, so if you push children left-to-right, you will visit them right-to-left. For our NetProbe project, this does not matter (we care about reachability, not order), but for tree serialization or sorted traversal, it matters a lot.
Deep Dive: When should you keep recursion instead of converting?
Keep recursion when: (1) the maximum depth is provably small (e.g., a balanced binary tree of 1 million nodes has max depth 20), (2) the recursive code is significantly clearer than the iterative equivalent (tree operations are often more readable as recursion), or (3) you are writing code that will not run in production (scripts, data analysis, prototyping). Convert to iteration when: (1) input size is unbounded or user-controlled, (2) you are in a production service where crashes are unacceptable, (3) you need to pause and resume traversal (iteration with an explicit stack lets you save state trivially, recursion does not). In our NetProbe project, we actually have a natural depth limit (the max_depth parameter), so recursion is fine for moderate depths. But we will provide the iterative version too, because good engineers plan for growth.
🤔 Think about it: If you convert a recursive BFS to an iterative one, would you use a stack or a queue as your explicit data structure? Why?
Answer:
A queue (FIFO). BFS explores all nodes at distance 1 before distance 2, distance 2 before distance 3, and so on. This level-by-level exploration requires FIFO ordering: the first node you discover at a given depth should be the first one you process. A stack (LIFO) gives you DFS, which dives deep first. This is exactly the distinction from Lesson 3: deque with popleft() for BFS (queue behavior), list with pop() for DFS (stack behavior). The data structure you choose for the explicit container determines the traversal order.
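A minimal sketch of that iterative BFS, assuming the same adjacency-dict graph shape used throughout this lesson (the `bfs_within_hops` name is invented for illustration):

```python
from collections import deque

def bfs_within_hops(graph, start, max_depth):
    """Iterative BFS: a queue (FIFO) gives level-by-level exploration."""
    results = {start: 0}             # user -> hop distance from start
    queue = deque([(start, 0)])      # (user, depth) pairs
    while queue:
        user, depth = queue.popleft()  # popleft = FIFO = BFS (Lesson 3)
        if depth == max_depth:
            continue                 # depth limit: do not expand further
        for neighbor in graph.get(user, set()):
            if neighbor not in results:  # visited check prevents cycles
                results[neighbor] = depth + 1
                queue.append((neighbor, depth + 1))
    return results

graph = {"alice": {"bob"}, "bob": {"carol", "alice"},
         "carol": {"dave"}, "dave": set()}
print(bfs_within_hops(graph, "alice", 2))
# {'alice': 0, 'bob': 1, 'carol': 2} -- dave is 3 hops away, beyond the limit
```

Swapping `popleft()` for `pop()` (and deque for a plain list) turns this same loop into the DFS version.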
💡 Key Takeaway: Converting recursion to iteration is a mechanical transformation: replace the call stack with an explicit stack (or queue), replace recursive calls with pushes, and replace base-case returns with continues. The result is safer, handles arbitrary depth, and is logically equivalent.
Putting It All Together: The Complete Code
Below is the comprehensive, runnable script that ties together everything from this lesson. It demonstrates naive recursion (and its failure), memoized recursion, iterative conversion, and integrates with our NetProbe project's graph structure. Read the comments carefully. Every section header marks a distinct concept from this lesson.
import sys
from functools import lru_cache
from collections import OrderedDict
# --- Section 1: Demonstrate the recursion limit ---
print("=== Python Recursion Limit ===")
print(f"Default recursion limit: {sys.getrecursionlimit()}")
# Output: Default recursion limit: 1000
# --- Section 2: Fibonacci - naive vs memoized ---
def fib_naive(n):
    """Compute Fibonacci WITHOUT memoization - exponentially slow."""
    if n <= 1:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

call_count = 0

def fib_counted(n):
    """Same as naive but counts calls to show redundant work."""
    global call_count
    call_count += 1
    if n <= 1:
        return n
    return fib_counted(n - 1) + fib_counted(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Compute Fibonacci WITH lru_cache - linear time."""
    if n <= 1:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)
print("\n=== Fibonacci: Naive vs Memoized ===")
call_count = 0
result = fib_counted(30)
print(f"fib_naive(30) = {result}, total calls: {call_count:,}")
# Output: fib_naive(30) = 832040, total calls: 2,692,537
result_memo = fib_memo(30)
info = fib_memo.cache_info()
print(f"fib_memo(30) = {result_memo}, cache hits: {info.hits}, misses: {info.misses}")
# Output: fib_memo(30) = 832040, cache hits: 28, misses: 31
# --- Section 3: Graph setup for NetProbe ---
class NetProbe:
    """Social network graph with recursive friend-of-friend discovery."""
    def __init__(self):
        self.graph = {}  # adjacency list: user -> set of friends
        self.query_cache = OrderedDict()  # LRU cache from Lesson 4
        self.cache_max = 128
        self._fof_cache = {}  # manual memo cache for subnetwork results
        self._memo_hits = 0  # track memoization effectiveness

    def add_user(self, user):
        """Add a user node to the graph."""
        if user not in self.graph:
            self.graph[user] = set()

    def add_friendship(self, user1, user2):
        """Add bidirectional edge between two users."""
        self.add_user(user1)
        self.add_user(user2)
        self.graph[user1].add(user2)
        self.graph[user2].add(user1)
        self.query_cache.clear()  # invalidate caches on graph change
        self._fof_cache.clear()

    def get_friends(self, user):
        """Return direct friends of a user."""
        return self.graph.get(user, set())
    # --- Section 4: Mutual friends with LRU cache (from Lesson 4) ---
    def mutual_friends(self, user1, user2):
        """Find mutual friends using cached set intersection."""
        key = tuple(sorted([user1, user2]))
        if key in self.query_cache:
            self.query_cache.move_to_end(key)  # refresh LRU position
            return self.query_cache[key]
        friends1 = self.get_friends(user1)  # O(1) lookup from hash table
        friends2 = self.get_friends(user2)  # (Lesson 2: dict/set lookups)
        result = friends1 & friends2  # set intersection
        self.query_cache[key] = result
        if len(self.query_cache) > self.cache_max:
            self.query_cache.popitem(last=False)  # evict oldest (LRU)
        return result
    # --- Section 5: Recursive friend-of-friend with memoization ---
    def discover_recursive(self, start, max_depth, _depth=0, _visited=None, _path=None):
        """Discover all users reachable within max_depth hops (recursive)."""
        if _visited is None:
            _visited = set()
            self._memo_hits = 0  # reset stats for each top-level call
        if _path is None:
            _path = [start]
        # Base case 1: depth limit reached
        if _depth > max_depth:
            return []
        # Base case 2: cycle detected (Lesson 4 pointer-tracing instinct)
        if start in _visited:
            return []
        _visited.add(start)
        results = []
        if _depth > 0:  # don't include the starting user
            results.append((" -> ".join(_path), _depth))
        # Check memo cache: have we already explored this user at this depth?
        cache_key = (start, max_depth - _depth)
        if cache_key in self._fof_cache:
            self._memo_hits += 1
            cached_neighbors = self._fof_cache[cache_key]
            for neighbor in cached_neighbors:
                if neighbor not in _visited:
                    new_path = _path + [neighbor]
                    results.append((" -> ".join(new_path), _depth + 1))
            return results
        # Recursive case: explore each neighbor
        discovered_from_here = []
        for neighbor in sorted(self.get_friends(start)):  # sorted for consistent output
            if neighbor not in _visited:
                new_path = _path + [neighbor]
                discovered_from_here.append(neighbor)
                sub_results = self.discover_recursive(
                    neighbor, max_depth,
                    _depth + 1, _visited, new_path
                )
                results.extend(sub_results)
        # Store in memo cache for future calls
        self._fof_cache[cache_key] = discovered_from_here
        return results
    # --- Section 6: Iterative version (for large graphs) ---
    def discover_iterative(self, start, max_depth):
        """Same as discover_recursive but uses explicit stack (no depth limit)."""
        visited = set()
        results = []
        stack = [(start, 0, [start])]  # (user, depth, path) - explicit stack
        while stack:
            user, depth, path = stack.pop()
            if depth > max_depth or user in visited:
                continue  # replaces recursive base case return
            visited.add(user)
            if depth > 0:
                results.append((" -> ".join(path), depth))
            # Push neighbors onto stack (reversed for consistent order with DFS)
            for neighbor in sorted(self.get_friends(user), reverse=True):
                if neighbor not in visited:
                    stack.append((neighbor, depth + 1, path + [neighbor]))
        return results
    # --- Section 7: CLI discover command ---
    def cli_discover(self, user, max_depth=3):
        """Format and print discovery results."""
        if user not in self.graph:
            print(f"User '{user}' not found in the network.")
            return
        print(f"\nDiscovering connections from '{user}' (max depth: {max_depth})...")
        results = self.discover_recursive(user, max_depth)
        if not results:
            print(f" No connections found within {max_depth} hops.")
            return
        unique_users = set()
        for path_str, hops in results:
            print(f" {path_str} ({hops} hop{'s' if hops != 1 else ''})")
            last_user = path_str.split(" -> ")[-1]
            unique_users.add(last_user)
        print(f"Found {len(unique_users)} unique users reachable within {max_depth} hops.")
        print(f"Cache stats: {self._memo_hits} subnetwork computations saved by memoization.")
# --- Section 8: Build the network and run demos ---
print("\n=== NetProbe: Friend-of-Friend Discovery ===")
net = NetProbe()
# Build a small social network
edges = [("alice", "bob"), ("alice", "charlie"),
         ("bob", "carol"), ("bob", "dave"),
         ("charlie", "dave"), ("charlie", "eve"),
         ("dave", "frank"), ("eve", "frank"),
         ("frank", "grace")]
for u1, u2 in edges:
    net.add_friendship(u1, u2)
print(f"Network: {len(net.graph)} users, {len(edges)} connections")
# Output: Network: 8 users, 9 connections
# Discover connections from alice within 3 hops
net.cli_discover("alice", max_depth=3)
# Output:
# Discovering connections from 'alice' (max depth: 3)...
#  alice -> bob (1 hop)
#  alice -> bob -> carol (2 hops)
#  alice -> bob -> dave (2 hops)
#  alice -> bob -> dave -> charlie (3 hops)
#  alice -> bob -> dave -> frank (3 hops)
# Found 5 unique users reachable within 3 hops.
# Cache stats: 0 subnetwork computations saved by memoization.
# Note: charlie, alice's direct friend, is reported at 3 hops because DFS
# reached him through dave first - DFS does not guarantee shortest paths.
# --- Section 9: Compare recursive vs iterative ---
print("\n=== Recursive vs Iterative Comparison ===")
net2 = NetProbe()
for u1, u2 in edges:
    net2.add_friendship(u1, u2)
recursive_results = net2.discover_recursive("alice", max_depth=2)
iterative_results = net2.discover_iterative("alice", max_depth=2)
rec_users = {path.split(" -> ")[-1] for path, _ in recursive_results}
iter_users = {path.split(" -> ")[-1] for path, _ in iterative_results}
print(f"Recursive found: {sorted(rec_users)}")
print(f"Iterative found: {sorted(iter_users)}")
print(f"Same reachable users: {rec_users == iter_users}")
# Output: Same reachable users: True
# --- Section 10: Mutual friends (Lesson 4 review) uses hash table (Lesson 2) ---
print("\n=== Mutual Friends (Lesson 4 LRU Cache) ===")
mutual = net.mutual_friends("bob", "charlie")
print(f"Mutual friends of bob and charlie: {sorted(mutual)}")
# Output: Mutual friends of bob and charlie: ['alice', 'dave']
mutual2 = net.mutual_friends("bob", "charlie") # cache hit
print(f"Second call (cached): {sorted(mutual2)}")
# Output: Second call (cached): ['alice', 'dave']
# --- Section 11: Demonstrate lru_cache stats ---
print("\n=== lru_cache Statistics ===")
fib_memo.cache_clear() # reset for clean demo
for n in [10, 20, 30, 40, 50]:
    result = fib_memo(n)
    info = fib_memo.cache_info()
    print(f"fib({n:>2}) = {result:>15,} | hits: {info.hits:>4}, misses: {info.misses:>3}")
# Output:
# fib(10) =              55 | hits:    8, misses:  11
# fib(20) =           6,765 | hits:   19, misses:  21
# fib(30) =         832,040 | hits:   30, misses:  31
# fib(40) =     102,334,155 | hits:   41, misses:  41
# fib(50) =  12,586,269,025 | hits:   52, misses:  51
print("\nAll demos complete.")
Run this script and study the output carefully. The Fibonacci section shows you the dramatic difference between naive and memoized recursion in hard numbers. The NetProbe section demonstrates recursive graph traversal with depth limits, cycle detection, and memoization. The comparison section proves that recursive and iterative DFS discover the same reachable users. And the mutual friends call at the end ties back to Lesson 4's LRU cache, showing how caching at different levels (function-level memoization vs query-level LRU) serves the same principle.
💡 Key Insight: This single script demonstrates five concepts: base cases and recursive cases, memoization with lru_cache, manual memoization with dictionaries, cycle detection with visited sets, and recursive-to-iterative conversion. These are not five separate topics. They are five facets of one idea: controlling recursive computation.
Extending It: What If We Need Shortest Paths?
Our current discover command finds all reachable users within N hops, but it does not guarantee the shortest path to each user. Because DFS explores depth-first, it might find a 3-hop path to Dave before discovering a 2-hop path. If you need shortest paths, you want BFS, not DFS. And BFS is naturally iterative (using a queue from Lesson 3's deque), which is another argument for knowing how to convert between recursive and iterative approaches.
Here is how the pattern scales. Right now, discover uses DFS with depth limiting. If you swap the stack for a deque and use popleft() instead of pop(), you get BFS, which automatically finds shortest paths. If you then want weighted shortest paths (e.g., "closest friends" based on interaction frequency), you need Dijkstra's algorithm, which is exactly what Lesson 10 will cover. Each lesson is building on the last, and the recursive and iterative traversal patterns from today are the foundation for everything that follows.
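To make the swap concrete, here is a minimal standalone sketch. The `discover_bfs` name is hypothetical (it is not part of the NetProbe class), and it assumes the same adjacency-dict shape NetProbe uses. One design note: it marks users visited on enqueue rather than on pop, so each user is queued at most once.

```python
from collections import deque

def discover_bfs(graph, start, max_depth):
    """Sketch: same traversal idea as discover_iterative, but a queue
    (popleft) instead of a stack (pop), so closer users come first."""
    visited = {start}  # mark on enqueue: each user enters the queue once
    results = []
    queue = deque([(start, 0, [start])])
    while queue:
        user, depth, path = queue.popleft()  # FIFO: the key change
        if depth > 0:
            results.append((" -> ".join(path), depth))
        if depth == max_depth:
            continue  # don't expand beyond the hop limit
        for neighbor in sorted(graph.get(user, ())):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1, path + [neighbor]))
    return results

g = {"alice": {"bob", "charlie"}, "bob": {"alice", "dave"},
     "charlie": {"alice", "dave"}, "dave": {"bob", "charlie"}}
for path, hops in discover_bfs(g, "alice", 2):
    print(f"{path} ({hops} hops)")
# Every user is reported via a minimum-hop path - the BFS guarantee
```

Because BFS expands the frontier one hop at a time, the first path that reaches a user is guaranteed to be a shortest one, which is exactly the property DFS lacks.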
Netflix uses exactly this pattern for their recommendation engine. They model users and content as a graph, then explore "within N hops" to find related content. The depth limit controls recommendation specificity: 1 hop gives you "users who watched the same show," 2 hops gives you "users who are similar to users like you," and 3 hops gets increasingly speculative. Memoization is critical because millions of users share overlapping neighborhoods.
📌 Remember: DFS finds all reachable nodes (good for "who can I reach?"). BFS finds shortest paths (good for "what is the closest connection?"). Both can be written recursively or iteratively. Choose based on the question you are answering, not on which style feels more natural.
Common Gotchas
Let me save you the debugging sessions I had to endure. These are the recursion bugs I see most often, ranked by how much time they waste.
Gotcha 1: Mutable default arguments in recursive functions. Python evaluates default arguments once, at function definition time, not at each call. If you write def explore(graph, user, visited=set()), all calls to explore share the same visited set. The second call sees nodes "already visited" from the first call. This is Python's most notorious gotcha, and it bites hardest in recursive functions because they have many calls. Always use None as the default and initialize inside the function body.
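A minimal demonstration of the bug and the fix. The `explore_buggy` and `explore_fixed` names are hypothetical helpers, not NetProbe code:

```python
def explore_buggy(graph, user, visited=set()):  # BUG: one set, shared forever
    visited.add(user)
    for friend in graph.get(user, ()):
        if friend not in visited:
            explore_buggy(graph, friend, visited)
    return visited

def explore_fixed(graph, user, visited=None):  # FIX: None default
    if visited is None:
        visited = set()  # fresh set for each top-level call
    visited.add(user)
    for friend in graph.get(user, ()):
        if friend not in visited:
            explore_fixed(graph, friend, visited)
    return visited

graph = {"a": ["b"], "b": [], "c": []}
explore_buggy(graph, "a")
print(explore_buggy(graph, "c"))  # {'a', 'b', 'c'} - the first call leaked in!
print(explore_fixed(graph, "c"))  # {'c'} - correct
```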
Gotcha 2: Forgetting to return the recursive result. You write merge_sort(left_half) instead of result = merge_sort(left_half). The recursive call happens, produces a value, and that value vanishes into the void because you did not capture it. Python does not warn you. The function returns None, and you spend 30 minutes staring at your code wondering why. I have done this more times than I want to admit.
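Here is the bug in miniature. The function names are hypothetical, and `sorted()` stands in for a real merge step so the sketch stays short:

```python
def sort_halves_buggy(items):
    """BUG: recursive results are computed, then thrown away."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    sort_halves_buggy(items[:mid])   # result vanishes into the void
    sort_halves_buggy(items[mid:])   # result vanishes into the void
    # falls off the end -> implicitly returns None

def sort_halves_fixed(items):
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = sort_halves_fixed(items[:mid])    # capture the result
    right = sort_halves_fixed(items[mid:])   # capture the result
    return sorted(left + right)  # stand-in for a real merge step

print(sort_halves_buggy([3, 1, 2]))  # None
print(sort_halves_fixed([3, 1, 2]))  # [1, 2, 3]
```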
Gotcha 3: Modifying a shared structure without backtracking. If your recursive function modifies a list or set (like appending to a path), you need to undo that modification when the function returns, unless you are passing copies. Mutation without cleanup means the path from one branch "leaks" into another branch. The fix is either to pass path + [neighbor] (creates a new list each time) or to append before the recursive call and pop after it returns (backtracking). The copy approach is simpler. The backtracking approach uses less memory. Choose based on your depth.
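Both styles side by side on a toy graph, as a sketch with hypothetical helper names. `paths_copy` allocates a new list per branch; `paths_backtrack` shares one list and undoes its mutation on the way out:

```python
def paths_copy(graph, user, target, path=None):
    """Copy style: path + [n] creates a fresh list for every branch."""
    path = (path or []) + [user]
    if user == target:
        return [path]
    found = []
    for n in sorted(graph.get(user, ())):
        if n not in path:  # avoid revisiting nodes on this path
            found.extend(paths_copy(graph, n, target, path))
    return found

def paths_backtrack(graph, user, target, path=None, found=None):
    """Backtracking style: one shared list, append before, pop after."""
    if path is None:
        path, found = [], []
    path.append(user)                 # do
    if user == target:
        found.append(list(path))      # snapshot the shared list
    else:
        for n in sorted(graph.get(user, ())):
            if n not in path:
                paths_backtrack(graph, n, target, path, found)
    path.pop()                        # undo - this is the backtrack
    return found

g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(paths_copy(g, "a", "d"))       # [['a', 'b', 'd'], ['a', 'c', 'd']]
print(paths_backtrack(g, "a", "d"))  # same paths, far fewer allocations
```

Forget the `path.pop()` in the second version and the "a -> b" prefix leaks into the "a -> c" branch, which is exactly the contamination described above.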
Gotcha 4: Off-by-one errors in depth counting. Should the starting node be depth 0 or depth 1? If max_depth=3, does the starting node count as one of those 3? Off-by-one bugs in depth counting change whether your function explores 3 hops or 2 hops (or 4), and they are agonizing to debug because the output looks "almost right." Pick a convention (I use: the starting node is depth 0 and max_depth is inclusive, so max_depth=3 means you traverse at most 3 edges from the start) and stick with it everywhere.
| Gotcha | Symptom | Fix |
|---|---|---|
| Mutable default args | Second call gives wrong results | Use None default, initialize inside function |
| Missing return capture | Function returns None unexpectedly | Always assign recursive call to a variable |
| Shared mutation without backtrack | Path from one branch contaminates another | Pass copies, or append/pop (backtrack) |
| Off-by-one in depth | Explores one level too many or too few | Convention: start=0, max_depth is inclusive |
⚠️ Common Pitfall: The mutable default argument bug is not unique to recursion, but recursion amplifies it because the same function is called many times in a single logical operation. If you take one thing from this gotchas section, let it be this: never use a mutable object as a default parameter value.
🤔 Think about it: In Lesson 2, we learned that Python dicts use open addressing for collision handling. If you use a dict as a memoization cache and your cache keys are tuples of strings, what happens if two different inputs produce the same hash? Does memoization break?
View Answer
No, memoization does not break. Hash collisions in Python dicts are handled by open addressing. After hashing to a slot, Python compares the actual key using ==. So even if ("alice", 3) and some other key hash to the same slot, the dict will probe to the next slot and store them separately. Your memoization cache returns the correct cached value because dict lookup checks equality, not just hash values. This is the same collision resolution we studied in Lesson 2. Collisions slow down lookups slightly (from O(1) best case to O(n) worst case with a pathological hash function), but they never cause incorrect results.
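You can verify this yourself with a deliberately pathological `__hash__` (the `BadHash` class below is a contrived demo, not production code):

```python
class BadHash:
    """Every instance hashes to 42 - worst-case collisions on purpose."""
    def __init__(self, name):
        self.name = name
    def __hash__(self):
        return 42  # forces every key into the same initial slot
    def __eq__(self, other):
        return isinstance(other, BadHash) and self.name == other.name

cache = {}
cache[BadHash("alice")] = "result-A"
cache[BadHash("bob")] = "result-B"   # same hash, different key

print(len(cache))               # 2 - both stored despite colliding
print(cache[BadHash("alice")])  # result-A - the == check finds the right entry
```

Lookups degrade toward O(n) with a hash this bad, but every key still maps to its own value.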
💡 Key Takeaway: Most recursion bugs are not conceptual, they are mechanical. Default arguments, missing returns, shared mutation, and off-by-one errors. A checklist of these four things will catch 90% of recursion bugs before they reach production.
🔨 Project Update
Here is the cumulative NetProbe project, incorporating everything from Lessons 1 through 5. This lesson added three major features: recursive friend-of-friend discovery with depth limits and cycle detection, manual memoization to avoid recomputing shared subnetworks, and the cli_discover command. The code below is the full project. Copy, paste, and run it.
What was added in this lesson (marked with # NEW in Lesson 5 comments):
- discover_recursive(): Recursive traversal with depth limit, visited set, and memoization
- discover_iterative(): Iterative equivalent using explicit stack
- cli_discover(): CLI command that formats and displays connection chains
- _fof_cache and _memo_hits: Manual memoization infrastructure
What came from previous lessons:
- Lesson 1: Dynamic array awareness (we use list.append for O(1) accumulation, not insert(0, ...))
- Lesson 2: Hash table foundation (self.graph is a dict of sets, O(1) lookups)
- Lesson 3: Stack/queue patterns (explicit stack in discover_iterative, deque in LRU cache)
- Lesson 4: LRU cache for mutual friends (self.query_cache using OrderedDict), cache invalidation on graph mutation
Run the project you've built so far by executing the complete code block in Section 8 above. You should see:
- The Python recursion limit printed (1000 by default)
- Fibonacci comparison showing naive (2.6M calls) vs memoized (31 misses) for fib(30)
- NetProbe discovering connections from alice within 3 hops
- Confirmation that recursive and iterative versions find the same users
- Mutual friends cache hit demonstration
- lru_cache statistics showing cumulative hits and misses
Summary Table
| Concept | What It Does | When to Use | Watch Out For |
|---|---|---|---|
| Base case | Stops recursion, returns concrete value | Every recursive function needs at least one | Missing base case = infinite recursion |
| Recursive case | Calls function with smaller input | When problem naturally decomposes into subproblems | Must make progress toward base case |
| sys.getrecursionlimit() | Shows Python's stack frame limit (default 1000) | Debugging RecursionError | Raising the limit is a band-aid, not a fix |
| @lru_cache | Automatic memoization decorator | Pure functions with hashable args | Holds references to self, memory leak risk |
| Manual dict cache | Dictionary-based memoization | Unhashable args or custom invalidation needed | Remember to invalidate on state changes |
| Visited set | Prevents infinite cycles in graph recursion | Any graph traversal (graphs have cycles) | Shared vs copied set changes semantics |
| Depth limit | Bounds maximum recursion depth | Graph/tree traversal with unknown depth | Off-by-one: is start depth 0 or 1? |
| Recursive to iterative | Replace call stack with explicit stack/queue | When depth exceeds Python's limit | Push order affects traversal order |
Next lesson, we shift from traversal to transformation. Lesson 6 covers sorting algorithms, from bubble sort to Python's built-in Timsort. The recursive patterns you learned today, especially divide and conquer, are the backbone of merge sort and quicksort. The memoization mindset will help you understand why some sorting algorithms avoid redundant comparisons. And the "convert recursion to iteration" skill will let you appreciate how Timsort achieves the best of both worlds: the theoretical elegance of merge sort with the practical speed of insertion sort, all without recursive overhead.
Difficulty Fork
Too Easy? (review and preview)
Key takeaways: Recursion is mechanical: base case + recursive case + progress guarantee. Memoization eliminates redundant computation by caching results. Graph recursion needs both depth limits and cycle detection. Any recursion can be mechanically converted to iteration with an explicit stack.
Preview of Lesson 6: We will implement merge sort (divide and conquer recursion in action), quicksort (recursion with partition), and then look under the hood of Python's sorted() to see why Timsort is a hybrid that beats both. You will benchmark them against each other on real data and learn why "just use sorted()" is almost always the right answer, but understanding why it is right makes you a better engineer.
Just Right (reinforce)
Different analogy for memoization: Think of memoization like a textbook's index. If someone asks you "what page is binary search on?" you could flip through every page of the book (naive recursion). Or you could check the index at the back (memoized lookup). Building the index takes time upfront, but every subsequent lookup is instant. The index is your cache. The only question is: how often do people ask about the same topic? If every question is unique, the index does not help. If the same topics come up repeatedly (overlapping subproblems), the index is invaluable.
Extra practice: Take the NetProbe discover_recursive function and modify it to return not just the path strings, but also the count of edges traversed. Then add a --shortest flag to cli_discover that switches from DFS to BFS (using a deque from Lesson 3) and only reports the shortest path to each user. Compare the output of DFS vs BFS on the same graph.
Want a Challenge?
Interview problem: Memoized graph component counting. Given an undirected graph represented as an adjacency list, write a function that counts the number of connected components. Then, support incremental updates: when an edge is added, update the count without recomputing from scratch. This is the Union-Find problem, and it appears in interviews at Google, Meta, and Amazon. The naive recursive approach (DFS from each unvisited node) works but recomputes everything on each edge addition. Can you design a cache invalidation strategy that only recomputes affected components?
Production problem: Recursive dependency resolution. Package managers like pip resolve dependency trees recursively. Write a function that takes a dict of {package: [dependencies]} and returns an installation order (topological sort). Handle circular dependencies by detecting cycles (your visited set technique from today). Then add version constraints and write a memoized resolver that caches compatible version sets for each package. This is a simplified version of what pip, npm, and cargo actually do, and getting it wrong causes the "dependency hell" that every developer has experienced.