The Geometry of Optimal Substructure
Dynamic programming is not a trick. It is a discipline: when a problem's optimal answer can be assembled from optimal answers to smaller versions of the same problem, the naive exponential recursion becomes a polynomial-time algorithm — provided you can name the subproblem. This module teaches you to name it.
Module overview
State · Transition · Base
Every DP solution names a state (what subproblem?), a transition (how does this state's answer depend on smaller states?), and base cases (what's the smallest solvable state?). If you can write these three, the code writes itself.
Memoise or Tabulate
Memoised DP is recursion with a cache. Tabulated DP is a bottom-up loop filling an array. Same algorithm; different frame.
Space Rolling
If the transition depends only on the last row or two, the full 2D table is overkill. Drop to two 1D arrays and save a factor of n in memory.
Why DP is the capstone
Everything before this module — arrays, linked lists, strings, hash maps, trees — gives you structures to store data in. Dynamic programming gives you a way to store answers. The insight that answers to small subproblems can be remembered and reused is small in statement and enormous in consequence. It is the difference between computing Fibonacci(50) in 89 operations versus 20 billion. The difference between aligning two DNA sequences in minutes versus longer than the age of the universe.
Most students find DP difficult because they approach it through a list of archetypal problems — knapsack, longest common subsequence, edit distance — and try to memorise the code. The code is unmemorable. What is worth learning is the three-part decomposition: state, transition, base case. Once you can name those three things for a new problem, writing the solution is routine.
Start with a recurrence, not with code
Here is the discipline. Before writing any code, write a mathematical recurrence. Define a function f(params) whose output is the answer to a subproblem. State the base case: the value of f for the smallest inputs. State the recursive case: the value of f in terms of calls to f on smaller inputs. If the recurrence is correct, the code is mechanical.
Example: the longest increasing subsequence of a[0..n). Define f(i) = length of the longest increasing subsequence ending at index i. Base case: f(0) = 1 — the subsequence consisting of just a[0]. Transition: f(i) = 1 + max(f(j) for j < i if a[j] < a[i]), or 1 if no such j exists. Answer: max(f(i)) over all i. The code:
def lis(a):
n = len(a)
dp = [1] * n
for i in range(n):
for j in range(i):
if a[j] < a[i]:
dp[i] = max(dp[i], dp[j] + 1)
return max(dp)
O(n²) time, O(n) memory. There is a faster O(n log n) algorithm that maintains the smallest possible tail of an increasing subsequence of each length — but the recurrence above is the correct place to start, and the optimisation is a second-pass concern.
Memoisation versus tabulation
These are two implementation styles of the same idea. Memoisation writes the recurrence recursively and caches its return values; tabulation fills an array bottom-up. They compute the same answers, with the same asymptotic time and space. Pick the one that makes the transitions clearer.
# Memoised — top-down
from functools import cache
def fib(n):
@cache
def f(k):
if k < 2: return k
return f(k - 1) + f(k - 2)
return f(n)
# Tabulated — bottom-up
def fib_tab(n):
if n < 2: return n
dp = [0] * (n + 1)
dp[1] = 1
for i in range(2, n + 1):
dp[i] = dp[i-1] + dp[i-2]
return dp[n]
Memoisation wins when the state space is sparse — you only pay for states you actually reach. Tabulation wins when the state space is dense and you want to avoid the recursion-stack overhead. For interview work, start memoised; once correct, convert to tabulated only if asked.
State design is the whole game
The hardest part of DP is not the code — it is identifying what the state is. Consider the "edit distance" problem: given two strings, find the minimum number of insertions, deletions, and substitutions to turn one into the other. The state is (i, j) = edit distance between the first i characters of string A and the first j characters of string B. That single choice — two indices into the prefixes — is the insight; the transition is then almost forced:
def edit_distance(a, b):
n, m = len(a), len(b)
dp = [[0]*(m+1) for _ in range(n+1)]
for i in range(n+1): dp[i][0] = i # delete i chars
for j in range(m+1): dp[0][j] = j # insert j chars
for i in range(1, n+1):
for j in range(1, m+1):
if a[i-1] == b[j-1]:
dp[i][j] = dp[i-1][j-1]
else:
dp[i][j] = 1 + min(dp[i-1][j], # delete
dp[i][j-1], # insert
dp[i-1][j-1]) # substitute
return dp[n][m]
O(n·m) time and space. You can drop to O(m) space by keeping only the previous row, because the transition depends only on dp[i-1][*] and dp[i][j-1]. That rolling optimisation is the second-most-common DP technique after memoisation.
0/1 knapsack — the canonical shape
Given items with weights and values, and a capacity W, choose a subset that maximises total value without exceeding capacity. State: f(i, w) = max value using the first i items with capacity w. Transition: f(i, w) = max(f(i-1, w), value[i] + f(i-1, w - weight[i])) — either skip item i, or take it. Base: f(0, w) = 0. Time O(n·W), space O(n·W), reducible to O(W) with rolling rows. The pattern generalises to coin change, subset sum, partition equal subset sum, and most "choose items subject to a budget" problems.
Complexity, reasoned
The running time of a DP is (number of states) × (work per state). For LIS, n states × O(n) transitions = O(n²). For edit distance, n·m states × O(1) transitions = O(n·m). For knapsack, n·W states × O(1) transitions = O(n·W). Space is the number of states, reducible if the transition uses only a recent slice. That is the whole accounting — there is no further magic.
One subtlety: DP is only polynomial if the state space is polynomial in the input size. Knapsack's O(n·W) is called "pseudo-polynomial" because W is a number given in the input, not a length; if W is specified as a 64-bit integer, n·W is not polynomial in the input bit-length. This is why knapsack is NP-hard despite having a DP — the DP is not polynomial, just tractable for small W.
When to reach for DP
Use DP when the problem has (a) an optimisation or counting goal, (b) a decomposition into smaller versions of itself, and (c) overlapping subproblems — the same smaller version is reached by multiple paths. If the subproblems are all distinct, recursion without memoisation is already optimal (that's divide-and-conquer, not DP). If there is no obvious recursive decomposition, the problem is probably not a DP candidate — try graph algorithms or greedy instead.
Where beginners go wrong
- Writing code before the recurrence. Every misplaced index and off-by-one is the shadow of a hazy recurrence. Write the math first.
- Leaking state into the cache.
@cachekeys on function arguments. If your function uses a closure over a mutable variable, the cache will return stale answers. - Confusing the answer with the last DP value. Sometimes the answer is
dp[n]; sometimes it'smax(dp); sometimes it's a reconstruction of the path. Know which before the tests run. - Optimising space prematurely. Write the full 2D table first. Verify correctness. Then — and only then — consider rolling rows.
What to read next
Continue with Module 07 · Graphs, which introduces structures where DP appears as shortest-path algorithms (Dijkstra, Bellman-Ford) and as DAG-DP. The concept companion to this module is Sliding Window in Depth, whose "expand and contract" rhythm is itself a form of DP with constant space.
Where each idea gets its own chapter
Each deep-dive expands a single idea from this module into a standalone piece with worked examples, complexity proofs, and the template that shows up in practice.
State Design Templates
The hardest part of dynamic programming isn't writing the loop — it's identifying the state. Across the canon, six state shapes cover almost every problem you'll see: 1D linear, 2D…
Knapsack Variants
0/1 vs unbounded vs bounded vs fractional — four flavours of knapsack with surprisingly different solutions. The 0/1 version is NP-hard but pseudo-polynomial DP. The unbounded vers…
Memoisation vs Tabulation
Memoisation writes the recurrence as a recursive function with a cache; tabulation fills an array bottom-up. Same algorithm, same complexity, different ergonomics. This piece walks…
Problems that exercise this module
Problems marked with a link have a full walkthrough — approach, code, complexity, and the "where beginners go wrong" section. Stubs are on the queue; they will light up as the walkthroughs land.