Skip to content
Module 06 · Capstone

The Geometry of Optimal Substructure

Dynamic programming is not a trick. It is a discipline: when a problem's optimal answer can be assembled from optimal answers to smaller versions of the same problem, the naive exponential recursion becomes a polynomial-time algorithm — provided you can name the subproblem. This module teaches you to name it.

AdvancedPrerequisites · treesReading time · 28 min
3 concepts2 walkthroughs4 stubs

Module overview

Three-Part Decomposition

State · Transition · Base

Every DP solution names a state (what subproblem?), a transition (how does this state's answer depend on smaller states?), and base cases (what's the smallest solvable state?). If you can write these three, the code writes itself.

Two Styles

Memoise or Tabulate

Memoised DP is recursion with a cache. Tabulated DP is a bottom-up loop filling an array. Same algorithm; different frame.

Optimisation

Space Rolling

If the transition depends only on the last row or two, the full 2D table is overkill. Drop to two 1D arrays and save a factor of n in memory.

Why DP is the capstone

Everything before this module — arrays, linked lists, strings, hash maps, trees — gives you structures to store data in. Dynamic programming gives you a way to store answers. The insight that answers to small subproblems can be remembered and reused is small in statement and enormous in consequence. It is the difference between computing Fibonacci(50) in 89 operations versus 20 billion. The difference between aligning two DNA sequences in minutes versus longer than the age of the universe.

Most students find DP difficult because they approach it through a list of archetypal problems — knapsack, longest common subsequence, edit distance — and try to memorise the code. The code is unmemorable. What is worth learning is the three-part decomposition: state, transition, base case. Once you can name those three things for a new problem, writing the solution is routine.

Start with a recurrence, not with code

Here is the discipline. Before writing any code, write a mathematical recurrence. Define a function f(params) whose output is the answer to a subproblem. State the base case: the value of f for the smallest inputs. State the recursive case: the value of f in terms of calls to f on smaller inputs. If the recurrence is correct, the code is mechanical.

Example: the longest increasing subsequence of a[0..n). Define f(i) = length of the longest increasing subsequence ending at index i. Base case: f(0) = 1 — the subsequence consisting of just a[0]. Transition: f(i) = 1 + max(f(j) for j < i if a[j] < a[i]), or 1 if no such j exists. Answer: max(f(i)) over all i. The code:

def lis(a):
    n = len(a)
    dp = [1] * n
    for i in range(n):
        for j in range(i):
            if a[j] < a[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp)

O(n²) time, O(n) memory. There is a faster O(n log n) algorithm that maintains the smallest possible tail of an increasing subsequence of each length — but the recurrence above is the correct place to start, and the optimisation is a second-pass concern.

Memoisation versus tabulation

These are two implementation styles of the same idea. Memoisation writes the recurrence recursively and caches its return values; tabulation fills an array bottom-up. They compute the same answers, with the same asymptotic time and space. Pick the one that makes the transitions clearer.

# Memoised — top-down
from functools import cache

def fib(n):
    @cache
    def f(k):
        if k < 2: return k
        return f(k - 1) + f(k - 2)
    return f(n)

# Tabulated — bottom-up
def fib_tab(n):
    if n < 2: return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]

Memoisation wins when the state space is sparse — you only pay for states you actually reach. Tabulation wins when the state space is dense and you want to avoid the recursion-stack overhead. For interview work, start memoised; once correct, convert to tabulated only if asked.

State design is the whole game

The hardest part of DP is not the code — it is identifying what the state is. Consider the "edit distance" problem: given two strings, find the minimum number of insertions, deletions, and substitutions to turn one into the other. The state is (i, j) = edit distance between the first i characters of string A and the first j characters of string B. That single choice — two indices into the prefixes — is the insight; the transition is then almost forced:

def edit_distance(a, b):
    n, m = len(a), len(b)
    dp = [[0]*(m+1) for _ in range(n+1)]
    for i in range(n+1): dp[i][0] = i    # delete i chars
    for j in range(m+1): dp[0][j] = j    # insert j chars
    for i in range(1, n+1):
        for j in range(1, m+1):
            if a[i-1] == b[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = 1 + min(dp[i-1][j],    # delete
                                   dp[i][j-1],    # insert
                                   dp[i-1][j-1])  # substitute
    return dp[n][m]

O(n·m) time and space. You can drop to O(m) space by keeping only the previous row, because the transition depends only on dp[i-1][*] and dp[i][j-1]. That rolling optimisation is the second-most-common DP technique after memoisation.

0/1 knapsack — the canonical shape

Given items with weights and values, and a capacity W, choose a subset that maximises total value without exceeding capacity. State: f(i, w) = max value using the first i items with capacity w. Transition: f(i, w) = max(f(i-1, w), value[i] + f(i-1, w - weight[i])) — either skip item i, or take it. Base: f(0, w) = 0. Time O(n·W), space O(n·W), reducible to O(W) with rolling rows. The pattern generalises to coin change, subset sum, partition equal subset sum, and most "choose items subject to a budget" problems.

Complexity, reasoned

The running time of a DP is (number of states) × (work per state). For LIS, n states × O(n) transitions = O(n²). For edit distance, n·m states × O(1) transitions = O(n·m). For knapsack, n·W states × O(1) transitions = O(n·W). Space is the number of states, reducible if the transition uses only a recent slice. That is the whole accounting — there is no further magic.

One subtlety: DP is only polynomial if the state space is polynomial in the input size. Knapsack's O(n·W) is called "pseudo-polynomial" because W is a number given in the input, not a length; if W is specified as a 64-bit integer, n·W is not polynomial in the input bit-length. This is why knapsack is NP-hard despite having a DP — the DP is not polynomial, just tractable for small W.

When to reach for DP

Use DP when the problem has (a) an optimisation or counting goal, (b) a decomposition into smaller versions of itself, and (c) overlapping subproblems — the same smaller version is reached by multiple paths. If the subproblems are all distinct, recursion without memoisation is already optimal (that's divide-and-conquer, not DP). If there is no obvious recursive decomposition, the problem is probably not a DP candidate — try graph algorithms or greedy instead.

Where beginners go wrong

  • Writing code before the recurrence. Every misplaced index and off-by-one is the shadow of a hazy recurrence. Write the math first.
  • Leaking state into the cache. @cache keys on function arguments. If your function uses a closure over a mutable variable, the cache will return stale answers.
  • Confusing the answer with the last DP value. Sometimes the answer is dp[n]; sometimes it's max(dp); sometimes it's a reconstruction of the path. Know which before the tests run.
  • Optimising space prematurely. Write the full 2D table first. Verify correctness. Then — and only then — consider rolling rows.

What to read next

Continue with Module 07 · Graphs, which introduces structures where DP appears as shortest-path algorithms (Dijkstra, Bellman-Ford) and as DAG-DP. The concept companion to this module is Sliding Window in Depth, whose "expand and contract" rhythm is itself a form of DP with constant space.

Concept deep-dives

Where each idea gets its own chapter

Each deep-dive expands a single idea from this module into a standalone piece with worked examples, complexity proofs, and the template that shows up in practice.

Practice · Curated problem set

Problems that exercise this module

Problems marked with a link have a full walkthrough — approach, code, complexity, and the "where beginners go wrong" section. Stubs are on the queue; they will light up as the walkthroughs land.

01Climbing Stairs1D · fibonacci-shapeeasy
02House Robber1D · take-or-skipeasy
03Longest Common Subsequence2D-string · dpmedium04Coin Changeunbounded · minmedium
05Partition Equal Subset Sumsubset-summedium
06Edit Distance2D · substringhard