Data Structures & Algorithms

A comprehensive FAANG interview preparation guide covering every essential data structure, algorithm, and problem-solving pattern you need to master.

13 Modules • 27 Diagrams • 50+ Code Examples • Complete Interview Prep

Module 01

Complexity Analysis

The mathematical foundation for evaluating algorithm efficiency — Big-O, recurrences, amortized analysis, and space complexity.

Big-O, Omega & Theta Notation

Asymptotic notation describes how an algorithm's resource usage scales as input size approaches infinity. The three primary notations each serve a distinct purpose:

Big-O (Upper Bound)

f(n) = O(g(n)) means there exist constants c > 0 and n₀ such that f(n) ≤ c · g(n) for all n ≥ n₀.

Big-O gives the worst-case upper bound. When we say an algorithm is O(n²), we mean that in the worst case, the running time will not grow faster than some constant times n². For example, f(n) = 3n² + 5n ≤ 4n² for all n ≥ 5, so f(n) = O(n²) with c = 4 and n₀ = 5.

Big-Omega (Lower Bound)

f(n) = Ω(g(n)) means there exist constants c > 0 and n₀ such that f(n) ≥ c · g(n) for all n ≥ n₀.

Omega provides a lower bound — it is often used to describe best-case behavior or to state problem lower bounds. Comparison-based sorting, for instance, has a lower bound of Ω(n log n).

Big-Theta (Tight Bound)

f(n) = Θ(g(n)) means f(n) = O(g(n)) AND f(n) = Ω(g(n)). This gives us an exact asymptotic characterization.

Example: Merge sort is Θ(n log n) in all cases — it always does roughly the same amount of work regardless of input order.

Little-o and Little-omega

  • f(n) = o(g(n)) — f grows strictly slower than g (not tight). Example: n = o(n²)
  • f(n) = ω(g(n)) — f grows strictly faster than g. Example: n² = ω(n)
💡 Interview Tip: When interviewers say "What's the time complexity?", they almost always mean Big-O (worst-case upper bound). But be prepared to discuss best-case and average-case when asked to compare algorithms like quicksort vs merge sort.

Growth Rate Comparison

Understanding relative growth rates is critical for choosing appropriate algorithms. Here is the standard hierarchy from fastest to slowest growing:

| Complexity | Name | n=10 | n=100 | n=1000 | n=10⁶ | Example |
|---|---|---|---|---|---|---|
| O(1) | Constant | 1 | 1 | 1 | 1 | Hash table lookup |
| O(log n) | Logarithmic | 3 | 7 | 10 | 20 | Binary search |
| O(n) | Linear | 10 | 100 | 1000 | 10⁶ | Linear scan |
| O(n log n) | Linearithmic | 33 | 664 | 9966 | 2×10⁷ | Merge sort |
| O(n²) | Quadratic | 100 | 10⁴ | 10⁶ | 10¹² | Bubble sort |
| O(2ⁿ) | Exponential | 1024 | 10³⁰ | --- | --- | All subsets |
| O(n!) | Factorial | 3.6M | --- | --- | --- | All permutations |
Figure 1.1 — Growth rate curves (operations vs. input size n) for common time complexities from O(1) to O(2ⁿ)
Gotcha: O(log n) is NOT always binary search. Many algorithms are O(log n): balanced BST operations, exponentiation by squaring, Euclidean GCD algorithm, and any process that halves the problem size each step. Don't automatically assume binary search when you see logarithmic complexity.

Loop Analysis Patterns

Most interview-level complexity analysis reduces to counting loop iterations. Master these patterns:

Single Loop

for i in range(n):       # O(n) — linear
    process(i)

for i in range(0, n, 2):  # O(n/2) = O(n) — still linear
    process(i)

i = 1
while i < n:              # O(log n) — i doubles each step
    process(i)
    i *= 2

Nested Independent Loops

for i in range(n):          # Outer: n iterations
    for j in range(m):      # Inner: m iterations
        process(i, j)       # Total: O(n * m)

Nested Dependent Loops

for i in range(n):
    for j in range(i):      # j depends on i
        process(i, j)
# Total: 0 + 1 + 2 + ... + (n-1) = n(n-1)/2 = O(n²)

Log-Linear Pattern

for i in range(n):          # n iterations
    j = 1
    while j < n:             # log(n) iterations
        process(i, j)
        j *= 2              # Total: O(n log n)

Tricky Harmonic Series

for i in range(1, n+1):
    for j in range(0, n, i):  # n/1 + n/2 + n/3 + ... + n/n
        process(i, j)
# Total: n * H(n) = n * (ln n + O(1)) = O(n log n)

Recurrence Relations & Master Theorem

Recurrence relations express the runtime of recursive algorithms. The Master Theorem provides a shortcut for solving many common forms.

Standard Form: T(n) = aT(n/b) + f(n)

  • a = number of subproblems per recursive call
  • b = factor by which input is divided
  • f(n) = cost of dividing + combining

The Three Cases

Let c_crit = log_b(a) (the critical exponent):

| Case | Condition | Result | Intuition |
|---|---|---|---|
| Case 1 | f(n) = O(n^(c_crit − ε)) for some ε > 0 | T(n) = Θ(n^c_crit) | Leaves dominate — recursion does most work |
| Case 2 | f(n) = Θ(n^c_crit · log^k n) | T(n) = Θ(n^c_crit · log^(k+1) n) | Equal work at each level |
| Case 3 | f(n) = Ω(n^(c_crit + ε)), plus the regularity condition a·f(n/b) ≤ c·f(n) for some c < 1 | T(n) = Θ(f(n)) | Root dominates — combine cost prevails |

Common Recurrence Examples

| Algorithm | Recurrence | a, b, f(n) | Case | Solution |
|---|---|---|---|---|
| Binary Search | T(n) = T(n/2) + O(1) | 1, 2, O(1) | Case 2 (k=0) | Θ(log n) |
| Merge Sort | T(n) = 2T(n/2) + O(n) | 2, 2, O(n) | Case 2 (k=0) | Θ(n log n) |
| Strassen | T(n) = 7T(n/2) + O(n²) | 7, 2, O(n²) | Case 1 | Θ(n^2.807) |
| Karatsuba | T(n) = 3T(n/2) + O(n) | 3, 2, O(n) | Case 1 | Θ(n^1.585) |
| Linear select | T(n) = T(n/5) + T(7n/10) + O(n) | --- | Akra-Bazzi | Θ(n) |

Amortized Analysis

Amortized analysis determines the average cost per operation over a worst-case sequence. It differs from average-case analysis because it makes no probabilistic assumptions — it guarantees the average even in the worst case.

Three Methods

1. Aggregate Method

Compute the total cost of n operations, then divide by n.

Example — Dynamic Array (ArrayList/vector): When the array is full, we double its capacity. For n insertions, most insertions cost O(1), but the capacity doublings cost 1 + 2 + 4 + 8 + ... + n = 2n − 1 in total. Total = n + (2n − 1) < 3n, so the amortized cost is 3n/n = O(1) per insertion.
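
To see where the 3n total comes from in code, here is a minimal doubling-array sketch (the DynamicArray class is illustrative, not a standard library type):

class DynamicArray:
    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None]

    def append(self, x):
        if self.size == self.capacity:       # full: double and copy — O(size)
            self.capacity *= 2
            new_data = [None] * self.capacity
            for i in range(self.size):
                new_data[i] = self.data[i]
            self.data = new_data
        self.data[self.size] = x             # common case — O(1)
        self.size += 1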

2. Accounting (Banker's) Method

Assign each operation an "amortized cost" that may differ from its actual cost. Overcharged operations build "credit" that pays for expensive future operations.

Example: Charge each array insert $3: $1 for the insert itself, $1 saved to copy this element later, $1 saved to copy an existing element. When doubling occurs, saved credits exactly cover the copying cost.

3. Potential Method

Define a potential function Φ that maps data structure states to non-negative reals. Amortized cost = actual cost + ΔΦ.

Example: For a dynamic array, let Φ = 2 × size − capacity. Normal insert: actual = 1, ΔΦ = 2, amortized = 3. Doubling insert (size = capacity = n): actual = n + 1, ΔΦ = 2 − n, amortized = (n + 1) + (2 − n) = 3.

Space Complexity

Space complexity measures the total memory an algorithm requires as a function of input size.

Types of Space

  • Auxiliary space: Extra space used beyond the input — this is what interviewers usually ask about
  • Total space: Auxiliary space + input space
  • Stack space: Memory used by recursive call frames — each frame stores local variables, return address, and parameters

In-Place Algorithms

An algorithm is in-place if it uses O(1) auxiliary space (or O(log n) for recursive stack). Examples:

  • In-place: Quicksort (O(log n) stack), heapsort, insertion sort, two-pointer techniques
  • Not in-place: Merge sort (O(n) auxiliary array), counting sort (O(k) counting array)

Recursive Space Analysis

# Factorial — O(n) stack space (n recursive frames)
def factorial(n):
    if n <= 1: return 1
    return n * factorial(n - 1)

# Binary search — O(log n) stack space
def binary_search(arr, target, lo, hi):
    if lo > hi: return -1
    mid = (lo + hi) // 2
    if arr[mid] == target: return mid
    elif arr[mid] < target: return binary_search(arr, target, mid+1, hi)
    else: return binary_search(arr, target, lo, mid-1)

# Fibonacci naive — O(n) stack (depth of recursion tree)
# NOTE: O(2^n) time but only O(n) space (tree depth, not breadth)
def fib(n):
    if n <= 1: return n
    return fib(n-1) + fib(n-2)
Module 02

Arrays & Strings

The most fundamental data structures — master contiguous memory, two-pointer techniques, sliding windows, prefix sums, and string matching algorithms.

Array Internals

An array stores elements in contiguous memory locations. Given base address B and element size S, the address of element at index i is: addr(i) = B + i × S. This formula enables O(1) random access.

Key Properties

  • Fixed size (static arrays) vs dynamic resizing (ArrayList, vector, Python list)
  • Cache-friendly: Sequential access patterns exploit CPU cache lines (typically 64 bytes), making array traversal extremely fast in practice
  • Insert/delete at arbitrary position: O(n) due to shifting
  • Append (dynamic array): O(1) amortized
Figure 2.1 — Array elements stored in contiguous memory (base address 0x1000, 4-byte elements) with 64-byte cache line boundaries

Two Pointers

Both pointers move monotonically, so together they make at most O(n) visits; no auxiliary storage is needed beyond a few index variables.

The two-pointer technique uses two indices to traverse an array, reducing O(n²) brute force to O(n). Two main variants:

Opposite Direction (Converging Pointers)

Start pointers at both ends and move them inward. Classic use case: finding pairs in a sorted array.

def two_sum_sorted(arr, target):
    """Find pair in sorted array summing to target. O(n) time, O(1) space."""
    left, right = 0, len(arr) - 1
    while left < right:
        curr_sum = arr[left] + arr[right]
        if curr_sum == target:
            return [left, right]
        elif curr_sum < target:
            left += 1       # Need larger sum
        else:
            right -= 1      # Need smaller sum
    return [-1, -1]

def is_palindrome(s):
    """Check palindrome. O(n) time, O(1) space."""
    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            return False
        left += 1
        right -= 1
    return True

Same Direction (Fast-Slow / Read-Write)

Both pointers move in the same direction. Used for in-place array modification and partitioning.

def remove_duplicates(arr):
    """Remove duplicates from sorted array in-place. O(n) time, O(1) space."""
    if not arr: return 0
    write = 1                        # write pointer
    for read in range(1, len(arr)):   # read pointer
        if arr[read] != arr[read - 1]:
            arr[write] = arr[read]
            write += 1
    return write

def dutch_national_flag(arr):
    """Sort array of 0s, 1s, 2s. O(n) time, O(1) space. (3-pointer variant)"""
    lo, mid, hi = 0, 0, len(arr) - 1
    while mid <= hi:
        if arr[mid] == 0:
            arr[lo], arr[mid] = arr[mid], arr[lo]
            lo += 1; mid += 1
        elif arr[mid] == 1:
            mid += 1
        else:
            arr[mid], arr[hi] = arr[hi], arr[mid]
            hi -= 1

Sliding Window

Each index enters and leaves the window at most once, so a full pass is O(n); hash-map state for an at-most-K-distinct window adds O(k) space, bounded by the alphabet size.

Sliding window maintains a contiguous subarray/substring and slides it across the input. Reduces O(n×k) brute force to O(n).

Fixed-Size Window

def max_sum_subarray(arr, k):
    """Maximum sum of subarray of size k. O(n) time."""
    window_sum = sum(arr[:k])
    max_sum = window_sum
    for i in range(k, len(arr)):
        window_sum += arr[i] - arr[i - k]  # slide: add right, remove left
        max_sum = max(max_sum, window_sum)
    return max_sum

Variable-Size Window

def min_subarray_len(target, arr):
    """Smallest subarray with sum >= target. O(n) time."""
    left = 0
    curr_sum = 0
    min_len = float('inf')
    for right in range(len(arr)):
        curr_sum += arr[right]             # expand window
        while curr_sum >= target:          # shrink while window still valid
            min_len = min(min_len, right - left + 1)
            curr_sum -= arr[left]
            left += 1
    return min_len if min_len != float('inf') else 0

def longest_unique_substring(s):
    """Longest substring without repeating chars. O(n) time."""
    seen = {}
    left = 0
    max_len = 0
    for right, ch in enumerate(s):
        if ch in seen and seen[ch] >= left:
            left = seen[ch] + 1           # jump left past duplicate
        seen[ch] = right
        max_len = max(max_len, right - left + 1)
    return max_len
Figure 2.2 — Sliding window state machine: expand right, check validity, shrink left, record the answer

Prefix Sums & Difference Arrays

1D Prefix Sum

Precompute cumulative sums to answer any range sum query in O(1):

# Build prefix sum: O(n)
prefix = [0] * (len(arr) + 1)
for i in range(len(arr)):
    prefix[i+1] = prefix[i] + arr[i]

# Query sum of arr[l..r] inclusive: O(1)
range_sum = prefix[r+1] - prefix[l]

Difference Array

The inverse of prefix sums. Allows O(1) range updates and O(n) final reconstruction:

# Apply +val to arr[l..r] inclusive
diff[l] += val
diff[r+1] -= val   # if r+1 within bounds

# Reconstruct with prefix sum
for i in range(1, len(diff)):
    diff[i] += diff[i-1]

2D Prefix Sum

For matrix range sum queries. Uses inclusion-exclusion principle:

# Build 2D prefix sum
for i in range(1, rows+1):
    for j in range(1, cols+1):
        prefix[i][j] = matrix[i-1][j-1] + prefix[i-1][j] + prefix[i][j-1] - prefix[i-1][j-1]

# Query sum of submatrix (r1,c1) to (r2,c2)
sub_sum = prefix[r2+1][c2+1] - prefix[r1][c2+1] - prefix[r2+1][c1] + prefix[r1][c1]

String Matching: KMP & Rabin-Karp

KMP never backtracks in the text; the failure-function preprocessing is O(m). Rabin-Karp averages O(n+m) with a rolling hash but falls to O(nm) on adversarial hash collisions.

KMP (Knuth-Morris-Pratt) — O(n + m)

KMP avoids redundant comparisons by precomputing a failure function (also called the LPS array — longest proper prefix which is also a suffix).

def compute_lps(pattern):
    """Build failure/LPS array. O(m) time."""
    m = len(pattern)
    lps = [0] * m
    length = 0
    i = 1
    while i < m:
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length > 0:
            length = lps[length - 1]  # fall back
        else:
            lps[i] = 0
            i += 1
    return lps

def kmp_search(text, pattern):
    """KMP pattern search. O(n+m) time."""
    lps = compute_lps(pattern)
    i = j = 0
    results = []
    while i < len(text):
        if text[i] == pattern[j]:
            i += 1; j += 1
        if j == len(pattern):
            results.append(i - j)
            j = lps[j - 1]
        elif i < len(text) and text[i] != pattern[j]:
            if j > 0:
                j = lps[j - 1]
            else:
                i += 1
    return results

Rabin-Karp — O(n + m) average

Uses a rolling hash to compare pattern hash with window hashes. Worst case O(nm) due to hash collisions.

def rabin_karp(text, pattern):
    """Rabin-Karp with rolling hash. O(n+m) average."""
    n, m = len(text), len(pattern)
    base, mod = 256, 10**9 + 7
    p_hash = 0
    t_hash = 0
    h = pow(base, m - 1, mod)
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        if p_hash == t_hash:
            if text[i:i+m] == pattern:  # verify to avoid collision
                return i
        if i < n - m:
            t_hash = ((t_hash - ord(text[i]) * h) * base + ord(text[i+m])) % mod
    return -1
Gotcha: String immutability causing O(n²) concatenation. In Python and Java, strings are immutable. Repeated concatenation with += creates a new string each time, turning an O(n) loop into O(n²). Use ''.join(list) in Python or StringBuilder in Java instead.
Module 03

Linked Lists

Pointer-based data structures — master node manipulation, cycle detection, reversal algorithms, and understand their trade-offs against arrays.

Types & Node Structure

Singly Linked List

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next     # pointer to next node

Doubly Linked List

class DListNode:
    def __init__(self, val=0, prev=None, next=None):
        self.val = val
        self.prev = prev     # pointer to previous node
        self.next = next     # pointer to next node

Circular Linked List

The last node's next pointer points back to the head, forming a cycle. Useful for round-robin scheduling, circular buffers.
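
As a quick illustration, traversal of a circular list wraps around instead of hitting None — a tiny sketch (the round_robin helper is hypothetical), reusing the ListNode class above:

def round_robin(head, k):
    """Yield k values, wrapping past the 'tail' back to the head."""
    node = head
    for _ in range(k):
        yield node.val
        node = node.next    # in a circular list this never becomes None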

| Operation | Singly | Doubly | Array |
|---|---|---|---|
| Access by index | O(n) | O(n) | O(1) |
| Insert at head | O(1) | O(1) | O(n) |
| Insert at tail | O(n) / O(1)* | O(1) | O(1) amortized |
| Delete given node | O(n) | O(1) | O(n) |
| Search | O(n) | O(n) | O(n) |
| Space per element | val + 1 ptr | val + 2 ptrs | val only |

*O(1) if tail pointer is maintained

Figure 3.1 — Singly linked list (head → 5 → 12 → 8 → 3 → null) with head pointer and null-terminated tail

Floyd's Cycle Detection (Tortoise and Hare)

Slow and fast pointers collide within at most one cycle traversal; no hashmap needed, so memory is two pointers total.

Detect if a linked list contains a cycle using O(1) space. Use two pointers moving at different speeds.

Algorithm

def has_cycle(head):
    """Detect cycle. O(n) time, O(1) space."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next           # 1 step
        fast = fast.next.next      # 2 steps
        if slow == fast:
            return True            # they meet inside cycle
    return False

def find_cycle_start(head):
    """Find where cycle begins. O(n) time, O(1) space."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            # Reset one pointer to head
            slow = head
            while slow != fast:
                slow = slow.next
                fast = fast.next   # both move 1 step now
            return slow            # cycle start
    return None

Why It Works (Proof Sketch)

Let the distance from head to cycle start be F, cycle length be C, and the meeting point be a steps into the cycle.

  • When they meet: slow traveled F + a, fast traveled F + a + kC (some laps)
  • Since fast moves twice as far: 2(F + a) = F + a + kC, so F + a = kC
  • Therefore F = kC - a, meaning if we start one pointer at head and one at meeting point (both at speed 1), they meet at the cycle start after F steps
Figure 3.2 — Floyd's algorithm on a list with F = 2 and C = 5: the slow (1-step) and fast (2-step) pointers meet inside the cycle

Reversal Techniques

Iterative Reversal (3-pointer technique)

def reverse_list(head):
    """Reverse singly linked list. O(n) time, O(1) space."""
    prev = None
    curr = head
    while curr:
        nxt = curr.next      # save next
        curr.next = prev     # reverse pointer
        prev = curr          # advance prev
        curr = nxt           # advance curr
    return prev              # new head

Recursive Reversal

def reverse_recursive(head):
    """Reverse recursively. O(n) time, O(n) stack space."""
    if not head or not head.next:
        return head
    new_head = reverse_recursive(head.next)
    head.next.next = head   # make next node point back
    head.next = None        # break original forward link
    return new_head

Reverse in K-Groups

def reverse_k_group(head, k):
    """Reverse nodes in groups of k. O(n) time, O(1) space."""
    # Check if k nodes remain
    count = 0
    node = head
    while node and count < k:
        node = node.next
        count += 1
    if count < k:
        return head    # fewer than k nodes, don't reverse

    # Reverse k nodes
    prev = None
    curr = head
    for _ in range(k):
        nxt = curr.next
        curr.next = prev
        prev = curr
        curr = nxt

    # head is now tail of reversed group
    head.next = reverse_k_group(curr, k)  # recurse on remaining
    return prev    # prev is new head of reversed group
Gotcha: Losing head reference during reversal. A common mistake is forgetting to save curr.next before overwriting it. Always use a temporary variable. Also, the original head becomes the new tail after reversal — don't return the wrong pointer.
Floyd's tortoise & hare on a list with cycle start at node 3
In-place reversal — 3-pointer pattern (prev, curr, next)
Module 04

Stacks & Queues

LIFO and FIFO abstractions — master monotonic stacks, queue variants, and their deep connection to recursion and BFS/DFS.

Stack Fundamentals

A stack is a Last-In, First-Out (LIFO) data structure. Think of a stack of plates — you can only add or remove from the top.

Core Operations (all O(1))

  • push(x): Add element to top
  • pop(): Remove and return top element
  • peek() / top(): View top element without removing
  • isEmpty(): Check if stack is empty

Classic Stack Problems

def valid_parentheses(s):
    """Check if brackets are balanced. O(n) time, O(n) space."""
    stack = []
    mapping = {')': '(', '}': '{', ']': '['}
    for ch in s:
        if ch in mapping:
            if not stack or stack[-1] != mapping[ch]:
                return False
            stack.pop()
        else:
            stack.append(ch)
    return len(stack) == 0

def eval_rpn(tokens):
    """Evaluate Reverse Polish Notation. O(n) time."""
    stack = []
    ops = {'+': lambda a,b: a+b, '-': lambda a,b: a-b,
           '*': lambda a,b: a*b, '/': lambda a,b: int(a/b)}
    for t in tokens:
        if t in ops:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[t](a, b))
        else:
            stack.append(int(t))
    return stack[0]

Monotonic Stack

Every index is pushed and popped at most once across the whole pass; the stack can hold up to n indices in the worst case.

A monotonic stack maintains elements in sorted order (either increasing or decreasing). It processes "next greater element" and "next smaller element" problems in O(n).

Next Greater Element

def next_greater_element(arr):
    """Find next greater element for each position. O(n) time."""
    n = len(arr)
    result = [-1] * n
    stack = []  # stores indices, maintains decreasing values
    for i in range(n):
        while stack and arr[stack[-1]] < arr[i]:
            idx = stack.pop()
            result[idx] = arr[i]   # arr[i] is next greater for arr[idx]
        stack.append(i)
    return result

Largest Rectangle in Histogram

def largest_rectangle(heights):
    """Classic monotonic stack problem. O(n) time."""
    stack = []  # stores indices, maintains increasing heights
    max_area = 0
    heights.append(0)  # sentinel to flush remaining bars
    for i, h in enumerate(heights):
        while stack and heights[stack[-1]] > h:
            height = heights[stack.pop()]
            width = i if not stack else i - stack[-1] - 1
            max_area = max(max_area, height * width)
        stack.append(i)
    return max_area
Figure 4.1 — Monotonic (increasing) stack processing [2, 1, 5, 6, 2, 3]: each element is pushed and popped at most once, so the whole pass is O(n)

Queue Variants

Standard Queue (FIFO)

First-In, First-Out. Primary operations: enqueue(x), dequeue(), peek(). Used for BFS, task scheduling, and buffering.

Double-Ended Queue (Deque)

Supports O(1) insertion and removal at both ends. Python's collections.deque is implemented as a doubly-linked list of fixed-size blocks.

from collections import deque
d = deque()
d.append(x)       # add to right — O(1)
d.appendleft(x)   # add to left  — O(1)
d.pop()            # remove from right — O(1)
d.popleft()        # remove from left  — O(1)

Sliding Window Maximum (Monotonic Deque)

def max_sliding_window(nums, k):
    """Max in each window of size k. O(n) time using deque."""
    dq = deque()  # stores indices, values in decreasing order
    result = []
    for i, num in enumerate(nums):
        # Remove elements outside window
        while dq and dq[0] <= i - k:
            dq.popleft()
        # Remove smaller elements (they'll never be the max)
        while dq and nums[dq[-1]] < num:
            dq.pop()
        dq.append(i)
        if i >= k - 1:
            result.append(nums[dq[0]])
    return result

Circular Buffer

Fixed-size array implementing a queue. Uses head and tail pointers that wrap around using modular arithmetic: next = (index + 1) % capacity. Used in OS kernel buffers and embedded systems.
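
A minimal sketch of the wrap-around arithmetic (the CircularBuffer class is illustrative, not a standard library type):

class CircularBuffer:
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0                      # index of the oldest element
        self.size = 0

    def enqueue(self, x):
        if self.size == len(self.buf):
            raise OverflowError("buffer full")
        tail = (self.head + self.size) % len(self.buf)   # wrap with modulo
        self.buf[tail] = x
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("buffer empty")
        x = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)      # wrap with modulo
        self.size -= 1
        return x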

Priority Queue (Bridge to Heaps)

A queue where elements are dequeued in priority order, not insertion order. Implemented using a heap (covered in Module 7). Key operations: insert O(log n), extract-min/max O(log n), peek O(1).

Gotcha: Every recursive solution can be converted to iterative with an explicit stack. Recursion implicitly uses the call stack. If you hit stack overflow or need to control memory, convert to iteration by maintaining your own stack. DFS with a stack is a classic example.
Module 05

Hash Tables

The workhorse of O(1) lookups — hash function design, collision resolution strategies, load factors, and probabilistic extensions like Bloom filters.

Hash Function Design

A hash function maps keys of arbitrary size to fixed-size integers (bucket indices). A good hash function must satisfy:

  • Deterministic: Same key always produces the same hash
  • Uniform distribution: Keys are spread evenly across buckets to minimize collisions
  • Avalanche effect: Small changes in input produce drastic changes in output
  • Efficient: O(1) to compute (for fixed-size keys)

Common Hash Functions

# Division method: h(k) = k mod m
# Choose m as a prime not close to a power of 2
def hash_div(key, m):
    return key % m

# Multiplication method: h(k) = floor(m * (k*A mod 1))
# Knuth suggests A = (sqrt(5)-1)/2 ≈ 0.6180339887
def hash_mult(key, m):
    A = 0.6180339887
    return int(m * ((key * A) % 1))

# String hashing (polynomial rolling hash)
def hash_string(s, m):
    h = 0
    base = 31
    for ch in s:
        h = (h * base + ord(ch)) % m
    return h
Figure 5.1 — Hash table mapping: keys pass through a hash function to determine bucket indices

Collision Resolution

When two keys hash to the same bucket, we have a collision. Two primary strategies:

1. Separate Chaining

Each bucket contains a linked list (or tree) of all entries that hash to that bucket.

  • Insert: Hash to bucket, prepend to chain — O(1)
  • Search: Hash to bucket, traverse chain — O(chain length)
  • Worst case: All keys hash to same bucket — O(n) per operation
  • Load factor α = n/m (entries/buckets). Average chain length = α
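
A minimal chaining sketch (the ChainedHashMap class is illustrative; real implementations add resizing and better bucket structures):

class ChainedHashMap:
    def __init__(self, m=8):
        self.buckets = [[] for _ in range(m)]   # one chain per bucket

    def put(self, key, value):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)         # overwrite existing key
                return
        chain.append((key, value))              # new key — O(1)

    def get(self, key):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for k, v in chain:                      # O(chain length)
            if k == key:
                return v
        return None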

2. Open Addressing

All entries stored in the table itself. On collision, probe for the next empty slot.

| Probing Strategy | Probe Sequence | Pros | Cons |
|---|---|---|---|
| Linear probing | h(k) + i | Cache-friendly, simple | Primary clustering |
| Quadratic probing | h(k) + i² | Reduces clustering | Secondary clustering, may miss slots |
| Double hashing | h₁(k) + i·h₂(k) | Best distribution | Two hash functions needed |
Figure 5.2 — Collision resolution: when A and D collide at one bucket, separate chaining stores both in that bucket's list, while linear probing places D in the next free slot

Load Factor & Dynamic Resizing

The load factor α = n/m (elements / buckets) determines performance. As α increases, collisions increase and performance degrades.

| Implementation | Default Load Factor | Resize Trigger | Growth Factor |
|---|---|---|---|
| Java HashMap | 0.75 | α > 0.75 | 2x |
| Python dict | 0.667 | α > 2/3 | ~2x (next power of 2) |
| C++ unordered_map | 1.0 | α > 1.0 | ~2x |
| Go map | 6.5 (bucket load) | avg bucket load > 6.5 | 2x |

Resizing process: Allocate a new array (typically 2x size), rehash every existing key into the new array. This single resize is O(n), but over n insertions, the amortized cost per insertion remains O(1).
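
Continuing the ChainedHashMap sketch from above, resizing might look like this (illustrative; the trigger check is left to the caller):

    def _resize(self):
        """Double the bucket count and rehash every entry — O(n)."""
        old = self.buckets
        self.buckets = [[] for _ in range(2 * len(old))]
        for chain in old:
            for key, value in chain:
                self.buckets[hash(key) % len(self.buckets)].append((key, value))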

Bloom Filters

A Bloom filter is a probabilistic data structure for set membership testing. It can tell you:

  • "Definitely NOT in the set" — guaranteed correct (no false negatives)
  • "Probably in the set" — may be wrong (false positives possible)

How It Works

Uses a bit array of m bits and k independent hash functions. To insert element x: set bits at positions h₁(x), h₂(x), ..., hₖ(x) to 1. To query x: check if ALL k bit positions are 1.

False Positive Rate

After inserting n elements: FPR ≈ (1 - e^(-kn/m))^k

Optimal number of hash functions: k = (m/n) · ln(2)
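
A compact sketch that derives its k indices from two base hashes via h₁ + i·h₂ (a standard double-hashing construction); the BloomFilter class itself is illustrative:

import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], 'big')
        h2 = int.from_bytes(digest[8:16], 'big') | 1   # odd, so strides vary
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1            # set all k bits

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))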

Figure 5.3 — Bloom filter: k = 3 hash functions set bits in a shared bit array; a membership query checks that all three bits are set
Gotcha: HashMap is O(1) average but O(n) worst case. When all keys hash to the same bucket, operations degrade to O(n). Java 8+ mitigates this by converting chains to balanced trees (Red-Black) when a bucket reaches 8 entries, giving O(log n) worst case per bucket. Also note: hash-based data structures have no ordering guarantees — use TreeMap/TreeSet if you need sorted iteration.
Module 06

Trees

The most versatile family of data structures — binary trees, BSTs, self-balancing trees, B-Trees, tries, and advanced query structures like segment trees and BITs.

Binary Tree Traversals

Four fundamental ways to visit every node in a binary tree:

Depth-First Traversals

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def inorder(root):
    """Left → Root → Right (gives sorted order for BST)"""
    if not root: return []
    return inorder(root.left) + [root.val] + inorder(root.right)

def preorder(root):
    """Root → Left → Right (serialize tree structure)"""
    if not root: return []
    return [root.val] + preorder(root.left) + preorder(root.right)

def postorder(root):
    """Left → Right → Root (delete tree, evaluate expressions)"""
    if not root: return []
    return postorder(root.left) + postorder(root.right) + [root.val]

Breadth-First (Level-Order)

from collections import deque

def level_order(root):
    """Level by level traversal using queue."""
    if not root: return []
    result = []
    queue = deque([root])
    while queue:
        level = []
        for _ in range(len(queue)):
            node = queue.popleft()
            level.append(node.val)
            if node.left: queue.append(node.left)
            if node.right: queue.append(node.right)
        result.append(level)
    return result
Figure 6.1 — Binary tree (root 4, children 2 and 6, leaves 1, 3, 5, 7) with its four traversals: inorder [1,2,3,4,5,6,7], preorder [4,2,1,3,6,5,7], postorder [1,3,2,5,7,6,4], level-order [[4],[2,6],[1,3,5,7]]

Binary Search Trees (BST)

Balanced trees (AVL/RB) keep height logarithmic, so search/insert/delete each touch O(log n) nodes; a degenerate skewed BST collapses to a linked list giving O(n).

A BST maintains the BST invariant: for every node, all values in the left subtree are smaller, and all values in the right subtree are larger.

Core Operations

def bst_search(root, target):
    """Search BST. O(h) time where h = tree height."""
    if not root or root.val == target:
        return root
    if target < root.val:
        return bst_search(root.left, target)
    return bst_search(root.right, target)

def bst_insert(root, val):
    """Insert into BST. O(h) time."""
    if not root:
        return TreeNode(val)
    if val < root.val:
        root.left = bst_insert(root.left, val)
    elif val > root.val:
        root.right = bst_insert(root.right, val)
    return root

def bst_delete(root, val):
    """Delete from BST. O(h) time. Three cases."""
    if not root: return None
    if val < root.val:
        root.left = bst_delete(root.left, val)
    elif val > root.val:
        root.right = bst_delete(root.right, val)
    else:
        # Case 1: No children (leaf)
        if not root.left and not root.right:
            return None
        # Case 2: One child
        if not root.left: return root.right
        if not root.right: return root.left
        # Case 3: Two children — replace with inorder successor
        successor = root.right
        while successor.left:
            successor = successor.left
        root.val = successor.val
        root.right = bst_delete(root.right, successor.val)
    return root

Inorder Successor & Predecessor

  • Inorder successor: If right subtree exists, it's the leftmost node in right subtree. Otherwise, it's the nearest ancestor for which the node is in the left subtree.
  • Inorder predecessor: If left subtree exists, it's the rightmost node in left subtree.
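
A sketch covering both successor cases for a BST without parent pointers:

def inorder_successor(root, node):
    """Successor of node in a BST. O(h) time, O(1) space."""
    if node.right:                      # case 1: leftmost of right subtree
        curr = node.right
        while curr.left:
            curr = curr.left
        return curr
    succ = None                         # case 2: walk down from the root,
    curr = root                         # remembering the last left turn
    while curr:
        if node.val < curr.val:
            succ = curr
            curr = curr.left
        elif node.val > curr.val:
            curr = curr.right
        else:
            break
    return succ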

AVL Trees

The balance invariant keeps height ≤ 1.44 log n, so every operation traces a single root-to-leaf path; insertion needs at most one single or double rotation in total, while deletion may rotate once per level.

An AVL tree is a self-balancing BST where the balance factor (height difference between left and right subtrees) of every node is -1, 0, or 1.

Four Rotation Types

| Imbalance | Pattern | Fix | Description |
|---|---|---|---|
| Left-Left (LL) | BF > 1, left child BF ≥ 0 | Right rotate | Single right rotation at imbalanced node |
| Right-Right (RR) | BF < -1, right child BF ≤ 0 | Left rotate | Single left rotation at imbalanced node |
| Left-Right (LR) | BF > 1, left child BF < 0 | Left rotate child, then right rotate | Double rotation |
| Right-Left (RL) | BF < -1, right child BF > 0 | Right rotate child, then left rotate | Double rotation |
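
The right rotation for the LL case in code, matching Figure 6.2 (height bookkeeping omitted for brevity):

def right_rotate(y):
    """LL fix: left child x becomes the new subtree root."""
    x = y.left
    t2 = x.right        # subtree between x and y
    x.right = y
    y.left = t2
    return x            # caller re-links x in place of y
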
Figure 6.2 — AVL right rotation to fix an LL imbalance: node 20 becomes the new root, 30 becomes its right child

Red-Black Trees

A Red-Black tree is a self-balancing BST with less strict balancing than AVL. Each node stores a color bit (red or black), and the tree satisfies five properties:

  1. Every node is either red or black
  2. The root is black
  3. Every leaf (NIL sentinel) is black
  4. Red property: A red node cannot have a red child (no two consecutive reds)
  5. Black-height property: Every path from a node to its descendant NIL nodes has the same number of black nodes

These properties guarantee that the longest path is at most 2x the shortest path, giving O(log n) height.

Insertion Cases

After BST insert (new node is red), fix violations by case analysis on the uncle node's color:

  • Case 1: Uncle is red → recolor parent, uncle, and grandparent
  • Case 2: Uncle is black, node is inner child → rotate to make it case 3
  • Case 3: Uncle is black, node is outer child → rotate grandparent and recolor

B-Trees & B+ Trees

B-Tree (Order m)

A self-balancing multi-way search tree designed for disk-based storage systems:

  • Each node can have up to m children and m-1 keys
  • All leaves are at the same depth (perfectly balanced)
  • Internal nodes (except root) have at least ⌈m/2⌉ children
  • Keys within a node are sorted
  • Insert: If a node overflows (m keys), split into two nodes and push the median up

B+ Tree

Variant of B-Tree where all data records live in leaf nodes and leaves are connected via a linked list for efficient range queries.

Figure 6.3 — B+ tree (order 3) with linked leaf nodes enabling efficient range scans, as used by MySQL InnoDB and PostgreSQL B-tree indexes

Tries (Prefix Trees)

Lookup follows L character edges, independent of dictionary size; space scales with the total number of nodes — at most one per stored character — each holding up to alphabet-size child pointers.

A trie stores strings character by character, with each edge representing a character. Enables O(L) insert/search/prefix-search where L is the string length — independent of the number of strings stored.

class TrieNode:
    def __init__(self):
        self.children = {}       # char → TrieNode
        self.is_end = False      # marks end of a word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def search(self, word):
        node = self._find(word)
        return node is not None and node.is_end

    def starts_with(self, prefix):
        return self._find(prefix) is not None

    def _find(self, prefix):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return None
            node = node.children[ch]
        return node
Figure 6.4 — Trie storing "cat", "car", "card", "dog", "dot" with shared prefixes; an asterisk marks the end of a word

Segment Trees & Binary Indexed Trees (BIT/Fenwick)

A range query decomposes into at most 2·log n canonical segments; Fenwick trees use the same log-depth traversal of implicit binary indices.

Segment Tree

Supports range queries and point/range updates in O(log n). Each node represents a range of the original array.

class SegmentTree:
    def __init__(self, arr):
        self.n = len(arr)
        self.tree = [0] * (4 * self.n)
        self._build(arr, 1, 0, self.n - 1)

    def _build(self, arr, node, start, end):
        if start == end:
            self.tree[node] = arr[start]
            return
        mid = (start + end) // 2
        self._build(arr, 2*node, start, mid)
        self._build(arr, 2*node+1, mid+1, end)
        self.tree[node] = self.tree[2*node] + self.tree[2*node+1]

    def query(self, node, start, end, l, r):
        """Range sum query [l, r]. O(log n)."""
        if r < start or end < l: return 0
        if l <= start and end <= r: return self.tree[node]
        mid = (start + end) // 2
        return self.query(2*node, start, mid, l, r) + \
               self.query(2*node+1, mid+1, end, l, r)

    def update(self, node, start, end, idx, val):
        """Point update. O(log n)."""
        if start == end:
            self.tree[node] = val
            return
        mid = (start + end) // 2
        if idx <= mid:
            self.update(2*node, start, mid, idx, val)
        else:
            self.update(2*node+1, mid+1, end, idx, val)
        self.tree[node] = self.tree[2*node] + self.tree[2*node+1]

Binary Indexed Tree (Fenwick Tree)

Simpler alternative to segment tree for prefix sums with point updates. Uses clever bit manipulation (lowbit = x & -x) to navigate the tree.

class BIT:
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)

    def update(self, i, delta):
        """Add delta to index i. O(log n)."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)   # add lowest set bit

    def prefix_sum(self, i):
        """Sum of arr[1..i]. O(log n)."""
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)   # remove lowest set bit
        return s

    def range_sum(self, l, r):
        return self.prefix_sum(r) - self.prefix_sum(l - 1)
Figure 6.5 — Segment tree for [1, 3, 5, 7, 9, 11]: each node stores the sum of its range, with 36 at the root covering [0-5]
| Feature | Segment Tree | BIT (Fenwick) |
|---|---|---|
| Range query | O(log n) | O(log n) |
| Point update | O(log n) | O(log n) |
| Range update | O(log n) with lazy propagation | O(log n) with range trick |
| Space | O(4n) | O(n) |
| Code complexity | Higher | Lower |
| Flexibility | Any associative operation | Mainly prefix sums |
Gotcha: BST inorder traversal always gives sorted output. This is a direct consequence of the BST invariant. If you perform an inorder traversal and the output is not sorted, the tree is not a valid BST. This property is the basis for the "Validate BST" interview question.
Module 07

Heaps

Priority-based data structures — binary heap internals, array representation, build-heap optimization, and classic applications like Top-K and median finding.

Binary Heap Fundamentals

A binary heap is a complete binary tree satisfying the heap property:

  • Max-Heap: Every parent ≥ its children (root is maximum)
  • Min-Heap: Every parent ≤ its children (root is minimum)
  • Complete: All levels filled except possibly the last, which is filled left-to-right

Array Representation

A complete binary tree maps perfectly to an array with zero-based indexing:

  • Parent of i: (i - 1) // 2
  • Left child of i: 2*i + 1
  • Right child of i: 2*i + 2
Figure 7.1 — Max-heap [90, 70, 50, 40, 60, 20, 30] shown as both a tree and its array representation

Heap Operations

Sift-up and sift-down walk a single root-to-leaf path of length log n; peek is a single array index read.

Insert (Bubble Up / Sift Up)

Add element at the end, then bubble it up by swapping with parent while heap property is violated. O(log n).

Extract Min/Max (Bubble Down / Sift Down)

Replace root with last element, then bubble it down by swapping with the smaller (min-heap) or larger (max-heap) child. O(log n).
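
A list-based min-heap sketch of both operations:

def sift_up(heap, i):
    """Bubble heap[i] up toward the root."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] >= heap[parent]:
            break
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

def sift_down(heap, i):
    """Bubble heap[i] down toward the leaves."""
    n = len(heap)
    while True:
        smallest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest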

import heapq

# Python heapq is a MIN-HEAP
heap = []
heapq.heappush(heap, 5)     # insert: O(log n)
heapq.heappush(heap, 3)
heapq.heappush(heap, 7)
min_val = heapq.heappop(heap) # extract min: O(log n) → returns 3
peek = heap[0]               # peek min: O(1) → returns 5

# For MAX-HEAP, negate values
max_heap = []
heapq.heappush(max_heap, -5)
heapq.heappush(max_heap, -3)
max_val = -heapq.heappop(max_heap)  # → 5

# Build heap from list: O(n) — NOT O(n log n)!
arr = [4, 1, 7, 3, 8, 5]
heapq.heapify(arr)  # in-place O(n) transformation

Build Heap: O(n) Analysis

The naive analysis says n inserts × O(log n) each = O(n log n). But a tighter analysis, sifting down from the bottom up, shows O(n):

  • n/2 nodes at leaves — sift-down cost 0
  • n/4 nodes at height 1 — sift-down cost 1
  • n/8 nodes at height 2 — sift-down cost 2
  • Total: ∑ n/(2^(h+1)) × h = (n/2) × ∑ h/2^h = (n/2) × 2 = n = O(n)
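
Using the sift_down sketch from earlier, build-heap is just a bottom-up pass over the internal nodes:

def build_heap(arr):
    """Heapify in O(n): sift down from the last internal node to the root."""
    for i in range(len(arr) // 2 - 1, -1, -1):
        sift_down(arr, i)
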
Figure 7.2 — Build-heap cost by level: the n/2 leaves cost 0, the n/4 nodes at height 1 cost 1, and only the root costs O(log n); most nodes sit near the bottom, so the total is O(n)

Applications: Top-K & Median Finding

Top-K Largest Elements

Use a min-heap of size K. The root is always the K-th largest element seen so far.

def top_k_largest(nums, k):
    """Find k largest elements. O(n log k) time, O(k) space."""
    heap = nums[:k]
    heapq.heapify(heap)         # min-heap of size k
    for num in nums[k:]:
        if num > heap[0]:       # larger than smallest in heap
            heapq.heapreplace(heap, num)  # pop min, push new
    return heap

Running Median (Two-Heap Technique)

Maintain a max-heap for the lower half and a min-heap for the upper half. Median is always accessible from the heap tops.

class MedianFinder:
    def __init__(self):
        self.lo = []  # max-heap (negate values) for lower half
        self.hi = []  # min-heap for upper half

    def add_num(self, num):
        heapq.heappush(self.lo, -num)
        # Ensure lo's max <= hi's min
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Balance sizes: lo can have at most 1 more than hi
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def find_median(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2
Gotcha: Build-heap is O(n), NOT O(n log n). This is one of the most commonly misremembered complexities. The key insight is that most nodes are near the bottom of the tree, where sift-down does very little work. Always use heapify() instead of n individual inserts when building a heap from scratch.
Module 08

Graphs

The most general data structure — representations, traversals, shortest paths, topological ordering, union-find, and minimum spanning trees.

Graph Representations

| Representation | Space | Add Edge | Check Edge | Iterate Neighbors | Best For |
|---|---|---|---|---|---|
| Adjacency Matrix | O(V²) | O(1) | O(1) | O(V) | Dense graphs, quick edge lookups |
| Adjacency List | O(V+E) | O(1) | O(degree) | O(degree) | Sparse graphs (most common) |
| Edge List | O(E) | O(1) | O(E) | O(E) | Kruskal's MST, simple representation |
Figure 8.1 — The same undirected graph (edges 0-1, 0-2, 0-3, 1-3, 2-3) shown as an adjacency matrix, adjacency list, and edge list

BFS vs DFS

Each vertex and edge is explored at most once; the visited set and queue/stack together hold at most V elements at a time.

from collections import deque

def bfs(graph, start):
    """Breadth-First Search. O(V+E) time, O(V) space."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

def dfs(graph, start):
    """Depth-First Search (iterative). O(V+E) time, O(V) space."""
    visited = set()
    stack = [start]
    order = []
    while stack:
        node = stack.pop()
        if node in visited: continue
        visited.add(node)
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                stack.append(neighbor)
    return order
Figure 8.2 — BFS explores neighbors level by level using a queue; DFS goes deep before backtracking using a stack

When to Use Which

| Use BFS When | Use DFS When |
|---|---|
| Finding shortest path (unweighted) | Detecting cycles |
| Level-order traversal needed | Topological sorting |
| Finding connected components | Solving mazes / puzzles |
| Shortest transformation sequence | Finding strongly connected components |
| Multi-source shortest path | Path existence check |

Shortest Path Algorithms

With a binary heap each vertex is popped once (V log V) and each edge relaxation can push a new entry (E log V); an adjacency list gives a linear scan of each vertex's neighbors.

Dijkstra's Algorithm — O((V+E) log V)

Greedy approach using a priority queue. Works only with non-negative edge weights.

import heapq

def dijkstra(graph, src):
    """Dijkstra's shortest path. graph[u] = [(v, weight), ...]"""
    dist = {src: 0}
    pq = [(0, src)]  # (distance, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue  # skip outdated entry
        for v, w in graph[u]:
            new_dist = d + w
            if new_dist < dist.get(v, float('inf')):
                dist[v] = new_dist
                heapq.heappush(pq, (new_dist, v))
    return dist

Bellman-Ford — O(V · E)

Handles negative edge weights. Relaxes all edges V-1 times. Detects negative cycles with one more pass.

def bellman_ford(n, edges, src):
    """edges = [(u, v, weight), ...]. Returns dist[] or None if negative cycle."""
    dist = [float('inf')] * n
    dist[src] = 0
    for _ in range(n - 1):       # V-1 relaxation passes
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Check for negative cycles
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None   # negative cycle detected!
    return dist

Floyd-Warshall — O(V³)

All-pairs shortest paths using dynamic programming. Works with negative weights (no negative cycles).

def floyd_warshall(dist):
    """dist[i][j] = weight of edge i→j (inf if no edge). Modifies in-place."""
    n = len(dist)
    for k in range(n):         # intermediate vertex
        for i in range(n):     # source
            for j in range(n): # destination
                dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
    return dist
Figure 8.3 — Edge relaxation, the fundamental operation in all shortest path algorithms: dist[v] = min(dist[v], dist[u] + w(u, v)); here dist[v] improves from 10 to 5 + 3 = 8 via u

Topological Sort

A linear ordering of vertices in a DAG (Directed Acyclic Graph) such that for every edge u→v, u comes before v. Used for task scheduling, course prerequisites, build systems.

Kahn's Algorithm (BFS-based)

def topological_sort_kahn(graph, n):
    """Kahn's BFS-based topo sort. O(V+E)."""
    in_degree = [0] * n
    for u in range(n):
        for v in graph[u]:
            in_degree[v] += 1
    queue = deque([v for v in range(n) if in_degree[v] == 0])
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)
    return order if len(order) == n else []  # empty = cycle exists

DFS-based Topological Sort

def topological_sort_dfs(graph, n):
    """DFS-based topo sort using post-order reversal. O(V+E)."""
    visited = [False] * n
    stack = []
    def dfs(u):
        visited[u] = True
        for v in graph[u]:
            if not visited[v]:
                dfs(v)
        stack.append(u)  # add after all descendants processed
    for u in range(n):
        if not visited[u]:
            dfs(u)
    return stack[::-1]  # reverse post-order

Union-Find (Disjoint Set Union)

With both path compression and union-by-rank, the amortized cost per operation is the inverse-Ackermann function α(n) ≤ 4 for any practical n.

Tracks a collection of disjoint sets with near-constant time operations. Essential for Kruskal's MST, connected components, and cycle detection.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.count = n  # number of components

    def find(self, x):
        """Find root with path compression. Amortized O(α(n))."""
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        """Union by rank. Amortized O(α(n))."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return False  # already connected
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        self.count -= 1
        return True

    def connected(self, x, y):
        return self.find(x) == self.find(y)

Minimum Spanning Trees

A spanning tree connects all vertices with minimum total edge weight. Two classic algorithms:

| Feature | Kruskal's | Prim's |
|---|---|---|
| Approach | Edge-centric (sort all edges) | Vertex-centric (grow from source) |
| Data Structure | Union-Find | Priority Queue |
| Time | O(E log E) | O((V+E) log V) |
| Better for | Sparse graphs (E ≈ V) | Dense graphs (E ≈ V²) |

def kruskal(n, edges):
    """Kruskal's MST. edges = [(weight, u, v), ...]. O(E log E)."""
    edges.sort()
    uf = UnionFind(n)
    mst = []
    total = 0
    for w, u, v in edges:
        if uf.union(u, v):
            mst.append((u, v, w))
            total += w
            if len(mst) == n - 1: break
    return total, mst
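
For comparison, a lazy-deletion sketch of Prim's, using the same adjacency-list convention as dijkstra above and assuming the graph is connected:

import heapq

def prim(graph, src=0):
    """Prim's MST. graph[u] = [(v, weight), ...]. O((V+E) log V)."""
    visited = {src}
    pq = [(w, src, v) for v, w in graph[src]]
    heapq.heapify(pq)
    total, mst = 0, []
    while pq and len(visited) < len(graph):
        w, u, v = heapq.heappop(pq)
        if v in visited:
            continue                 # stale entry — lazy deletion
        visited.add(v)
        total += w
        mst.append((u, v, w))
        for nxt, nw in graph[v]:
            if nxt not in visited:
                heapq.heappush(pq, (nw, v, nxt))
    return total, mst
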
Gotcha: Dijkstra fails with negative edge weights. Dijkstra is a greedy algorithm that assumes once a vertex's shortest distance is finalized, it won't improve. Negative edges can violate this. Use Bellman-Ford instead for graphs with negative edges, or Johnson's algorithm for all-pairs with negative edges.
Module 09

Dynamic Programming

Break complex problems into overlapping subproblems with optimal substructure — memoization, tabulation, classic 1D/2D patterns, knapsack, interval DP, and space optimization.

Core Concepts

Dynamic programming (DP) is an optimization technique that solves problems by combining solutions to overlapping subproblems. Two properties must hold for DP to apply:

1. Overlapping Subproblems

The problem can be broken down into subproblems that are reused multiple times. Without overlap, you just have divide-and-conquer (like merge sort), not DP.

Example: Computing fib(5) recursively calls fib(3) twice and fib(2) three times. These repeated computations are overlapping subproblems.

2. Optimal Substructure

An optimal solution to the problem can be constructed from optimal solutions to its subproblems. In other words, you can solve a problem by combining the best answers to smaller versions of itself.

Example: The shortest path from A to C through B is the shortest path from A to B plus the shortest path from B to C. This is optimal substructure.

| Property | Meaning | Without It |
|---|---|---|
| Overlapping Subproblems | Same subproblem is solved multiple times | Divide-and-conquer (no caching needed) |
| Optimal Substructure | Optimal solution built from optimal sub-solutions | Greedy or brute force may be needed |

Memoization vs Tabulation

There are two ways to implement DP: top-down (memoization) and bottom-up (tabulation).

| Feature | Memoization (Top-Down) | Tabulation (Bottom-Up) |
|---|---|---|
| Direction | Starts from the original problem, recurses down | Starts from base cases, builds up |
| Implementation | Recursive + hash map/array cache | Iterative + DP table |
| Subproblems solved | Only those actually needed (lazy) | All subproblems in the table (eager) |
| Stack overhead | O(n) call stack (risk of stack overflow) | No recursion, O(1) stack |
| Cache performance | Poor — jumps around in memory | Excellent — sequential memory access |
| Ease of coding | Often more intuitive (just add cache to recursion) | Requires understanding fill order |
| Space optimization | Harder to reduce space | Easy to use rolling array |

# Top-Down (Memoization)
def fib_memo(n, memo={}):
    if n <= 1: return n
    if n not in memo:
        memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
    return memo[n]

# Bottom-Up (Tabulation)
def fib_tab(n):
    if n <= 1: return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]

# Space-Optimized (Rolling Variables)
def fib_opt(n):
    if n <= 1: return n
    prev2, prev1 = 0, 1
    for _ in range(2, n + 1):
        prev2, prev1 = prev1, prev2 + prev1
    return prev1
Figure 9.1 — Fibonacci recursion tree for fib(5): without memoization fib(3) is computed twice and fib(2) three times; memoization reduces O(2ⁿ) calls to O(n)

1D DP Patterns

The simplest DP problems involve a single dimension — usually an index into an array or a running total. The key is defining dp[i] clearly.

Climbing Stairs

You can climb 1 or 2 steps at a time. How many distinct ways to reach step n?

State: dp[i] = number of ways to reach step i. Transition: dp[i] = dp[i-1] + dp[i-2].

def climb_stairs(n):
    """O(n) time, O(1) space with rolling variables."""
    if n <= 2: return n
    a, b = 1, 2
    for _ in range(3, n + 1):
        a, b = b, a + b
    return b

Coin Change (Minimum Coins)

Given coins of different denominations, find the fewest coins to make a target amount.

State: dp[i] = min coins to make amount i. Transition: dp[i] = min(dp[i - c] + 1) for each coin c.

def coin_change(coins, amount):
    """O(amount * len(coins)) time, O(amount) space."""
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    for i in range(1, amount + 1):
        for c in coins:
            if c <= i and dp[i - c] + 1 < dp[i]:
                dp[i] = dp[i - c] + 1
    return dp[amount] if dp[amount] != float('inf') else -1

Longest Increasing Subsequence (LIS)

State: dp[i] = length of LIS ending at index i. Transition: dp[i] = max(dp[j] + 1) for all j < i where nums[j] < nums[i].

def lis(nums):
    """O(n^2) DP solution."""
    n = len(nums)
    dp = [1] * n
    for i in range(1, n):
        for j in range(i):
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp)

import bisect
def lis_optimized(nums):
    """O(n log n) using patience sorting / binary search."""
    tails = []  # tails[i] = smallest tail of all increasing subseqs of length i+1
    for x in nums:
        pos = bisect.bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return len(tails)

House Robber

Cannot rob two adjacent houses. Maximize total loot.

State: dp[i] = max money from houses 0..i. Transition: dp[i] = max(dp[i-1], dp[i-2] + nums[i]).

def rob(nums):
    """O(n) time, O(1) space."""
    prev2 = prev1 = 0
    for num in nums:
        prev2, prev1 = prev1, max(prev1, prev2 + num)
    return prev1

2D DP Patterns

When the state depends on two parameters (two indices, two strings, grid coordinates), we use a 2D DP table.

Longest Common Subsequence (LCS)

State: dp[i][j] = length of LCS of s1[0..i-1] and s2[0..j-1].

Transition:

  • If s1[i-1] == s2[j-1]: dp[i][j] = dp[i-1][j-1] + 1 (match, extend)
  • Else: dp[i][j] = max(dp[i-1][j], dp[i][j-1]) (skip one character from either string)
def lcs(s1, s2):
    """O(m*n) time and space."""
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]
Figure 9.2 — LCS DP table for "ABCBDAB" and "BDCAB". Green cells show diagonal matches used in backtracking.

Edit Distance (Levenshtein Distance)

Minimum insertions, deletions, and substitutions to transform one string into another.

State: dp[i][j] = edit distance between s1[0..i-1] and s2[0..j-1].

def edit_distance(s1, s2):
    """O(m*n) time and space."""
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1): dp[i][0] = i
    for j in range(n + 1): dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(
                    dp[i - 1][j],     # delete
                    dp[i][j - 1],     # insert
                    dp[i - 1][j - 1]  # replace
                )
    return dp[m][n]

Unique Paths (Grid Traversal)

Count paths from top-left to bottom-right of an m x n grid, moving only right or down.

State: dp[i][j] = number of paths to reach cell (i, j). Transition: dp[i][j] = dp[i-1][j] + dp[i][j-1].

def unique_paths(m, n):
    """O(m*n) time, O(n) space with 1D rolling."""
    dp = [1] * n
    for i in range(1, m):
        for j in range(1, n):
            dp[j] += dp[j - 1]
    return dp[n - 1]

Knapsack Family

Every (item, capacity) pair contributes a constant-work cell; traversing capacities right-to-left for 0/1 knapsack lets us reuse a single 1-D row.

The knapsack problem is a fundamental DP archetype with several variants, each requiring a different approach.

Variant | Item Reuse? | Approach | Time | Space
0/1 Knapsack | No (each item once) | DP with 2D or 1D rolling | O(n · W) | O(W)
Unbounded Knapsack | Yes (unlimited copies) | DP — iterate capacities forward | O(n · W) | O(W)
Fractional Knapsack | Yes (fractions allowed) | Greedy — sort by value/weight ratio | O(n log n) | O(1)

0/1 Knapsack

Given n items with weights and values, maximize total value without exceeding capacity W. Each item can be taken at most once.

State: dp[i][w] = max value using items 0..i-1 with capacity w.

Transition: dp[i][w] = max(dp[i-1][w], dp[i-1][w - wt[i]] + val[i]) if wt[i] ≤ w.

def knapsack_01(weights, values, W):
    """0/1 Knapsack. O(n*W) time, O(W) space with 1D rolling."""
    n = len(weights)
    dp = [0] * (W + 1)
    for i in range(n):
        # Traverse backwards to avoid using same item twice
        for w in range(W, weights[i] - 1, -1):
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    return dp[W]

Unbounded Knapsack

Same as 0/1 but each item can be used unlimited times. The key difference: iterate capacities in increasing order (forward), so a capacity can build on an item already taken at a smaller capacity.

def knapsack_unbounded(weights, values, W):
    """Unbounded Knapsack. O(n*W) time, O(W) space."""
    dp = [0] * (W + 1)
    for w in range(1, W + 1):
        for i in range(len(weights)):
            if weights[i] <= w:
                dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    return dp[W]

Fractional Knapsack (Greedy)

Items can be broken into fractions. Greedy works because the problem has the greedy choice property: always take the item with the highest value-to-weight ratio.

def knapsack_fractional(weights, values, W):
    """Fractional Knapsack (greedy). O(n log n) time."""
    items = sorted(zip(values, weights), key=lambda x: x[0] / x[1], reverse=True)
    total = 0
    for v, w in items:
        if W >= w:
            total += v
            W -= w
        else:
            total += v * (W / w)  # take fraction
            break
    return total
💡
Key Insight — 0/1 vs Unbounded Loop Order: In 0/1 knapsack, iterate weights backwards to ensure each item is used at most once. In unbounded knapsack, iterate forwards to allow reuse. This single difference determines the variant.

Interval & String DP

Interval DP

Problems where the state is defined by a contiguous interval [i, j]. The subproblems are smaller intervals, and we try all possible split points.

Classic example: Matrix Chain Multiplication — find the optimal way to parenthesize a chain of matrices to minimize scalar multiplications.

def matrix_chain(dims):
    """Matrix chain multiplication. dims[i] = rows of matrix i, dims[n] = cols of last.
    O(n^3) time, O(n^2) space."""
    n = len(dims) - 1
    dp = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):        # chain length
        for i in range(n - length + 1):    # start
            j = i + length - 1              # end
            dp[i][j] = float('inf')
            for k in range(i, j):           # split point
                cost = dp[i][k] + dp[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                dp[i][j] = min(dp[i][j], cost)
    return dp[0][n - 1]

Palindrome Partitioning

Another interval DP classic — minimum cuts to partition a string into palindromes.

def min_palindrome_cuts(s):
    """Minimum cuts to partition into palindromes. O(n^2)."""
    n = len(s)
    # is_pal[i][j] = True if s[i..j] is palindrome
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            if s[i] == s[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True

    dp = list(range(n))  # dp[i] = min cuts for s[0..i]
    for i in range(1, n):
        if is_pal[0][i]:
            dp[i] = 0
        else:
            for j in range(i):
                if is_pal[j + 1][i]:
                    dp[i] = min(dp[i], dp[j] + 1)
    return dp[n - 1]

State Design Principles

  • Define the state clearly: What does dp[i] (or dp[i][j]) represent? Write it in plain English before coding.
  • Identify the transition: How does dp[i] relate to smaller subproblems? What choices do you make at each step?
  • Set base cases: What are the trivial subproblems you can solve directly?
  • Determine fill order: For tabulation, ensure all dependencies are computed before the current cell.
  • Optimize space: If dp[i] only depends on dp[i-1] (or a constant number of previous rows), use a rolling array to reduce space from O(n) to O(1) or from O(n×m) to O(m).

Space Optimization: Rolling Array

When each row of the DP table only depends on the previous row, keep only two rows (or even one with careful iteration order) to reduce space.

# LCS with O(min(m,n)) space using rolling array
def lcs_space_optimized(s1, s2):
    if len(s1) < len(s2): s1, s2 = s2, s1  # s2 is shorter
    m, n = len(s1), len(s2)
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        curr = [0] * (n + 1)
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[n]
Gotcha: Not every recursive problem is DP. DP requires overlapping subproblems. If every subproblem is unique (like in merge sort or binary search), caching provides no benefit — it's just divide-and-conquer. Ask yourself: "Will I solve the same subproblem more than once?" If yes, it's DP. If no, it's divide-and-conquer.
LIS table on [10,9,2,5,3,7,101,18] — dp[i] = LIS ending at i
0/1 Knapsack — items [(w=1,v=1),(w=3,v=4),(w=4,v=5),(w=5,v=7)], capacity 7
Edit distance "horse" to "ros" — answer 3
Module 10

Sorting & Searching

Master every comparison and non-comparison sort, binary search patterns, and selection algorithms — the bedrock of algorithmic problem solving.

Comparison Sorts

The decision-tree argument proves any comparison sort needs Ω(n log n) comparisons; quicksort is in-place with an O(log n) recursion stack, while mergesort needs an O(n) auxiliary buffer.

Comparison-based sorts determine order by comparing pairs of elements. The theoretical lower bound for any comparison sort is Ω(n log n) in the worst case (proved via decision tree argument).

Algorithm | Best | Average | Worst | Space | Stable? | In-Place?
Bubble Sort | O(n) | O(n²) | O(n²) | O(1) | Yes | Yes
Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | No | Yes
Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Yes | Yes
Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | No
Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No | Yes
Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No | Yes
Tim Sort | O(n) | O(n log n) | O(n log n) | O(n) | Yes | No

Merge Sort

Divide-and-conquer: split array in half, recursively sort each half, merge. Guaranteed O(n log n) but needs O(n) extra space.

def merge_sort(arr):
    """O(n log n) time, O(n) space. Stable."""
    if len(arr) <= 1: return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= for stability
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result
Figure 10.1 — Merge sort divide-and-conquer tree: split until singletons, then merge upward

Quick Sort

Choose a pivot, partition the array into elements < pivot and ≥ pivot, then recursively sort the partitions. In-place, O(n log n) average, but O(n²) worst case (e.g., already-sorted input with a naive first/last-element pivot).

import random

def quicksort(arr, lo=0, hi=None):
    """In-place quicksort with random pivot. O(n log n) average."""
    if hi is None: hi = len(arr) - 1
    if lo >= hi: return
    pivot_idx = partition(arr, lo, hi)
    quicksort(arr, lo, pivot_idx - 1)
    quicksort(arr, pivot_idx + 1, hi)

def partition(arr, lo, hi):
    # Random pivot to avoid O(n^2) on sorted input
    rand = random.randint(lo, hi)
    arr[rand], arr[hi] = arr[hi], arr[rand]
    pivot = arr[hi]
    i = lo
    for j in range(lo, hi):
        if arr[j] < pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[hi] = arr[hi], arr[i]
    return i

Tim Sort

Hybrid of merge sort and insertion sort, used by Python (sorted(), list.sort()) and Java (Arrays.sort() for objects). Key ideas:

  • Divide input into small "runs" (naturally sorted subsequences or forced to min length ~32)
  • Sort each run with insertion sort (fast for small/nearly-sorted data)
  • Merge runs using a modified merge sort with galloping mode for highly unbalanced merges
  • O(n) best case (already sorted), O(n log n) worst case, always stable
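
The run-plus-merge structure is easy to sketch. Below is a toy version, assuming fixed-length runs of 32 (CPython's minimum run size) instead of detected natural runs, and no galloping; it reuses the merge() helper from the Merge Sort section above.

# Toy Tim-sort: insertion-sort fixed runs, then merge runs pairwise.
# Real Timsort detects natural runs and adds galloping; this is the skeleton.
def insertion_sort_range(a, lo, hi):
    for i in range(lo + 1, hi):
        x, j = a[i], i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x

def tim_sort_toy(a, run=32):
    n = len(a)
    for lo in range(0, n, run):                       # 1. sort each run
        insertion_sort_range(a, lo, min(lo + run, n))
    width = run
    while width < n:                                  # 2. merge runs pairwise
        for lo in range(0, n, 2 * width):
            mid, hi = min(lo + width, n), min(lo + 2 * width, n)
            a[lo:hi] = merge(a[lo:mid], a[mid:hi])    # merge() from Merge Sort above
        width *= 2
    return a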

Non-Comparison Sorts

These algorithms break the Ω(n log n) barrier by not comparing elements. Instead, they exploit the structure of the data (integers, bounded range).

Algorithm | Time | Space | Stable? | When to Use
Counting Sort | O(n + k) | O(n + k) | Yes | Integers with small range k
Radix Sort | O(d · (n + k)) | O(n + k) | Yes | Fixed-length integers/strings, d digits
Bucket Sort | O(n + k) average | O(n + k) | Yes | Uniformly distributed floating-point numbers

Counting Sort

Count occurrences of each value, then reconstruct the sorted array. Only works for non-negative integers in a bounded range.

def counting_sort(arr, max_val):
    """O(n + k) time and space, where k = max_val + 1."""
    count = [0] * (max_val + 1)
    for x in arr:
        count[x] += 1
    # Build prefix sums for stable placement
    for i in range(1, len(count)):
        count[i] += count[i - 1]
    output = [0] * len(arr)
    for x in reversed(arr):  # reversed for stability
        count[x] -= 1
        output[count[x]] = x
    return output

Radix Sort

Sort digit by digit from least significant to most significant, using a stable sub-sort (usually counting sort) at each digit position.

def radix_sort(arr):
    """LSD Radix Sort. O(d * (n + 10)) where d = number of digits."""
    if not arr: return arr
    max_val = max(arr)
    exp = 1
    while max_val // exp > 0:
        arr = counting_sort_by_digit(arr, exp)
        exp *= 10
    return arr

def counting_sort_by_digit(arr, exp):
    count = [0] * 10
    for x in arr:
        count[(x // exp) % 10] += 1
    for i in range(1, 10):
        count[i] += count[i - 1]
    output = [0] * len(arr)
    for x in reversed(arr):
        digit = (x // exp) % 10
        count[digit] -= 1
        output[count[digit]] = x
    return output

Binary Search Variants

Every iteration halves the active range; lower_bound / upper_bound share the same halving loop with a different comparison predicate.

Binary search is far more than just "find an element in a sorted array." Mastering its variants is essential for interviews — these patterns appear in dozens of problems.

Classic Binary Search

def binary_search(arr, target):
    """Standard binary search. O(log n)."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2  # avoid overflow
        if arr[mid] == target: return mid
        elif arr[mid] < target: lo = mid + 1
        else: hi = mid - 1
    return -1

Leftmost (Lower Bound)

Find the first position where arr[pos] ≥ target. Equivalent to C++ lower_bound and Python bisect_left.

def bisect_left(arr, target):
    """Find leftmost index where arr[i] >= target. O(log n)."""
    lo, hi = 0, len(arr)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] < target: lo = mid + 1
        else: hi = mid
    return lo

Rightmost (Upper Bound)

Find the first position where arr[pos] > target. Equivalent to C++ upper_bound and Python bisect_right.

def bisect_right(arr, target):
    """Find leftmost index where arr[i] > target. O(log n)."""
    lo, hi = 0, len(arr)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] <= target: lo = mid + 1
        else: hi = mid
    return lo

Rotated Sorted Array

Array was sorted then rotated. One half is always sorted — determine which, then decide which half to search.

def search_rotated(arr, target):
    """Search in rotated sorted array. O(log n)."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] == target: return mid
        if arr[lo] <= arr[mid]:       # left half is sorted
            if arr[lo] <= target < arr[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:                          # right half is sorted
            if arr[mid] < target <= arr[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1

Peak Finding

Find any peak element (greater than its neighbors) in O(log n).

def find_peak(arr):
    """Find a peak element index. O(log n)."""
    lo, hi = 0, len(arr) - 1
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] < arr[mid + 1]:
            lo = mid + 1   # peak is to the right
        else:
            hi = mid         # peak is at mid or left
    return lo
Figure 10.2 — Binary search decision tree for a 7-element sorted array

Quick-Select

Find the k-th smallest element without fully sorting. Uses the same partition logic as quicksort, but only recurses into one half — giving O(n) average time.

import random

def quickselect(arr, k):
    """Find k-th smallest element (0-indexed). O(n) average, O(n^2) worst."""
    def select(lo, hi, k):
        if lo == hi: return arr[lo]
        # Random pivot
        pivot_idx = random.randint(lo, hi)
        arr[pivot_idx], arr[hi] = arr[hi], arr[pivot_idx]
        pivot = arr[hi]
        i = lo
        for j in range(lo, hi):
            if arr[j] < pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[hi] = arr[hi], arr[i]

        if k == i: return arr[i]
        elif k < i: return select(lo, i - 1, k)
        else: return select(i + 1, hi, k)

    return select(0, len(arr) - 1, k)

Median of Medians (O(n) Worst Case)

To guarantee O(n) worst case, choose the pivot using the median-of-medians algorithm:

  • Divide array into groups of 5
  • Find the median of each group (O(1) per group, O(n) total across the n/5 groups)
  • Recursively find the median of these medians — use that as the pivot
  • This guarantees the pivot eliminates at least 30% of elements per step
  • Recurrence: T(n) = T(n/5) + T(7n/10) + O(n) = O(n)

In practice, random pivot is preferred because the constant factor of median-of-medians is large. But median-of-medians is important theoretically — it proves selection can be done in linear time.
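
A direct transcription of the recipe above into Python. This is a readability-first sketch: it allocates new lists at each level rather than partitioning in place, trading memory for clarity.

# Median-of-medians select: k-th smallest (0-indexed), guaranteed O(n).
def mom_select(arr, k):
    if len(arr) <= 5:
        return sorted(arr)[k]
    # 1. median of each group of 5; 2. recurse on the medians for the pivot
    medians = [sorted(arr[i:i + 5])[len(arr[i:i + 5]) // 2]
               for i in range(0, len(arr), 5)]
    pivot = mom_select(medians, len(medians) // 2)
    lo = [x for x in arr if x < pivot]
    eq = [x for x in arr if x == pivot]
    hi = [x for x in arr if x > pivot]
    if k < len(lo):            return mom_select(lo, k)
    if k < len(lo) + len(eq):  return pivot
    return mom_select(hi, k - len(lo) - len(eq))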

Gotcha: Quicksort degrades to O(n²) on sorted/nearly-sorted input. The naive pivot choice (first or last element) creates maximally unbalanced partitions. Fix this with: (1) random pivot — simple and effective, (2) median-of-three — take median of first, middle, and last elements, or (3) introsort — switch to heapsort when recursion depth exceeds 2 log n (used by C++ std::sort).
Quicksort — Lomuto partition of [3,7,1,8,2,5,4,6] (pivot = 6)
Mergesort: merge [1,4,5,8] with [2,3,6,7]
lower_bound for 2 in [1,2,2,2,3,4] — returns 1 (first 2)
Sort comparison — best/avg/worst, stable, in-place
Module 11

Advanced Topics

Backtracking, greedy algorithms, bit manipulation, and mathematical foundations — the techniques that round out your interview toolkit.

Backtracking

A branching factor of b explored to depth d gives a worst-case search tree of O(b^d) nodes; only the current path is held on the call stack, so memory is linear in depth.

Backtracking systematically explores all potential solutions by building candidates incrementally and abandoning ("pruning") a candidate as soon as it's determined that it cannot lead to a valid solution.

General Framework

# Template — is_solution, expand, is_valid, make_choice, undo_choice
# are problem-specific hooks, not real functions.
def backtrack(candidate, state):
    if is_solution(candidate):
        output(candidate)
        return
    for next_choice in expand(candidate):
        if is_valid(next_choice):       # PRUNE invalid branches
            make_choice(next_choice)
            backtrack(candidate, state)
            undo_choice(next_choice)      # BACKTRACK

N-Queens Problem

Place n queens on an n×n board such that no two queens attack each other (same row, column, or diagonal).

def solve_n_queens(n):
    """Find all valid N-Queens arrangements."""
    solutions = []
    cols = set()
    diag1 = set()  # row - col (anti-diagonal)
    diag2 = set()  # row + col (diagonal)

    def place(row, queens):
        if row == n:
            solutions.append(queens[:])
            return
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue   # PRUNE: under attack
            cols.add(col)
            diag1.add(row - col)
            diag2.add(row + col)
            place(row + 1, queens + [col])
            cols.remove(col)
            diag1.remove(row - col)
            diag2.remove(row + col)

    place(0, [])
    return solutions
Figure 11.1 — 4-Queens backtracking tree with pruned (X) and valid branches. Two solutions found: [1,3,0,2] and [2,0,3,1].

Permutations & Combinations

def permutations(nums):
    """Generate all permutations. O(n! * n)."""
    result = []
    def backtrack(path, remaining):
        if not remaining:
            result.append(path[:])
            return
        for i in range(len(remaining)):
            path.append(remaining[i])
            backtrack(path, remaining[:i] + remaining[i+1:])
            path.pop()
    backtrack([], nums)
    return result

def combinations(nums, k):
    """Generate all C(n,k) combinations. O(C(n,k) * k)."""
    result = []
    def backtrack(start, path):
        if len(path) == k:
            result.append(path[:])
            return
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1, path)
            path.pop()
    backtrack(0, [])
    return result

def subsets(nums):
    """Generate all 2^n subsets (power set)."""
    result = []
    def backtrack(start, path):
        result.append(path[:])
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1, path)
            path.pop()
    backtrack(0, [])
    return result

Greedy Algorithms

Greedy algorithms make the locally optimal choice at each step, hoping to find the global optimum. They work when the problem has the greedy choice property and optimal substructure.

Feature | Greedy | Dynamic Programming
Approach | Make best local choice, never reconsider | Consider all subproblems, pick the best
Correctness | Must prove (exchange argument / matroid) | Always correct if recurrence is right
Efficiency | Usually faster (often O(n log n) or O(n)) | Can be slower due to filling DP table
When it works | Activity selection, MST, Huffman, Dijkstra | 0/1 knapsack, LCS, edit distance
When it fails | 0/1 knapsack, LCS, coin change (general) | Rarely fails (but may be too slow)

Activity Selection / Interval Scheduling

Select the maximum number of non-overlapping intervals. Greedy strategy: always pick the activity that finishes earliest.

def activity_selection(intervals):
    """Max non-overlapping intervals. O(n log n)."""
    if not intervals: return []
    intervals.sort(key=lambda x: x[1])  # sort by end time
    selected = [intervals[0]]
    for start, end in intervals[1:]:
        if start >= selected[-1][1]:    # no overlap
            selected.append((start, end))
    return selected

Huffman Coding

Build an optimal prefix-free binary code for data compression. Greedy: always merge the two nodes with the lowest frequencies.

import heapq

def huffman(freq):
    """Build Huffman tree. freq = {char: count}. O(n log n)."""
    heap = [[count, [char, ""]] for char, count in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = '0' + pair[1]
        for pair in hi[1:]:
            pair[1] = '1' + pair[1]
        merged = [lo[0] + hi[0]] + lo[1:] + hi[1:]
        heapq.heappush(heap, merged)
    return {char: code for char, code in heap[0][1:]}

Bit Manipulation

Bit operations are the fastest operations a CPU can perform — each maps to a single hardware instruction. Mastering these techniques unlocks elegant solutions to many interview problems.

Fundamental Operations

  • Set bit i: x | (1 << i) — turn bit i ON
  • Clear bit i: x & ~(1 << i) — turn bit i OFF
  • Toggle bit i: x ^ (1 << i) — flip bit i
  • Check bit i: (x >> i) & 1 — is bit i set?
  • Clear lowest set bit: x & (x - 1) — turns off the rightmost 1
  • Isolate lowest set bit: x & (-x) — extracts the rightmost 1
  • Check power of 2: x & (x - 1) == 0 — true if exactly one bit set
  • Count set bits: bin(x).count('1') — number of 1-bits (popcount)

Classic XOR Tricks

XOR has unique properties: a ^ a = 0, a ^ 0 = a, and XOR is commutative and associative.

def find_single_number(nums):
    """Every element appears twice except one. Find it. O(n) time, O(1) space."""
    result = 0
    for num in nums:
        result ^= num  # pairs cancel out (a ^ a = 0)
    return result

def swap_without_temp(a, b):
    """Swap two values using XOR — no temporary variable needed."""
    a ^= b   # a = a ^ b
    b ^= a   # b = (a ^ b) ^ b = a
    a ^= b   # a = (a ^ b) ^ a = b
    return a, b

def missing_number(nums, n):
    """Find missing number in [0..n]. XOR all values with all indices."""
    xor = n
    for i, num in enumerate(nums):
        xor ^= i ^ num
    return xor

Counting Set Bits (Brian Kernighan's Algorithm)

def count_set_bits(n):
    """Count number of 1-bits. O(number of set bits)."""
    count = 0
    while n:
        n &= n - 1   # clear the lowest set bit
        count += 1
    return count

def is_power_of_two(n):
    """Check if n is a power of 2. O(1)."""
    return n > 0 and (n & (n - 1)) == 0

Bitmask DP — Subset Enumeration

Use an integer as a bitmask to represent subsets. Iterating over all 2^n masks of a size-n set takes O(2^n); checking each element's membership inside the loop makes it O(n · 2^n).

# Iterate over all subsets of {0, 1, ..., n-1}
for mask in range(1 << n):
    # Check which elements are in this subset
    for i in range(n):
        if mask & (1 << i):
            # element i is in the subset
            pass

# Iterate over all submasks of a given mask
submask = mask
while submask > 0:
    # process submask
    submask = (submask - 1) & mask

Math for Interviews

GCD — Euclidean Algorithm

def gcd(a, b):
    """Greatest Common Divisor. O(log(min(a,b)))."""
    while b:
        a, b = b, a % b
    return a

def lcm(a, b):
    """Least Common Multiple."""
    return a * b // gcd(a, b)

Modular Arithmetic

Essential for problems with large numbers where answers must be returned modulo 10^9+7:

  • (a + b) % m = ((a % m) + (b % m)) % m
  • (a * b) % m = ((a % m) * (b % m)) % m
  • (a - b) % m = ((a % m) - (b % m) + m) % m (add m to avoid negatives)
  • Modular inverse: a^(-1) mod m = a^(m-2) mod m (when m is prime, by Fermat's little theorem)
MOD = 10**9 + 7

def mod_pow(base, exp, mod):
    """Fast modular exponentiation. O(log exp)."""
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:
            result = result * base % mod
        exp >>= 1
        base = base * base % mod
    return result

def mod_inverse(a, mod):
    """Modular inverse using Fermat's little theorem (mod must be prime)."""
    return mod_pow(a, mod - 2, mod)

Sieve of Eratosthenes

def sieve(n):
    """Find all primes up to n. O(n log log n)."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n**0.5) + 1):
        if is_prime[i]:
            for j in range(i*i, n + 1, i):
                is_prime[j] = False
    return [i for i in range(2, n + 1) if is_prime[i]]

Combinatorics — nCr with DP

def nCr(n, r, mod=10**9+7):
    """Compute C(n,r) mod p using Pascal's triangle. O(n*r)."""
    if r > n: return 0
    if r == 0 or r == n: return 1
    dp = [[0] * (r + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = 1
        for j in range(1, min(i, r) + 1):
            dp[i][j] = (dp[i-1][j-1] + dp[i-1][j]) % mod
    return dp[n][r]

# Faster with factorials and modular inverse (O(n) precompute)
def nCr_fast(n, r, mod=10**9+7):
    if r > n: return 0
    fact = [1] * (n + 1)
    for i in range(2, n + 1): fact[i] = fact[i-1] * i % mod
    inv_fact = [1] * (n + 1)
    inv_fact[n] = mod_pow(fact[n], mod - 2, mod)
    for i in range(n - 1, -1, -1): inv_fact[i] = inv_fact[i+1] * (i+1) % mod
    return fact[n] * inv_fact[r] % mod * inv_fact[n-r] % mod
Gotcha: Greedy doesn't always give optimal solution. The classic counterexample is the coin change problem with denominations like {1, 3, 4}. Greedy for amount 6 would pick 4+1+1 (3 coins), but the optimal is 3+3 (2 coins). To prove greedy correctness, use the exchange argument: show that swapping any non-greedy choice for the greedy choice doesn't worsen the solution.
N-Queens 5x5 — one of 10 solutions, placed row-by-row with backtracking
Bit idioms cheat card

Fenwick Tree (Binary Indexed Tree)

Each prefix query walks idx -= idx & -idx, clearing one set bit per step; each update walks idx += idx & -idx, advancing past one carry per step. Either loop therefore runs at most ⌈log₂ n⌉ times. No recursion, no pointers, single array.

The Fenwick tree (BIT) is what you reach for when you need point update + prefix aggregate on an array and a segment tree feels heavy. The structure is implicit: index i owns the range (i - lowbit(i), i], where lowbit(i) = i & -i. There is no tree object — just a 1-indexed array and two four-line loops.

Two classic extensions interviewers probe: the 2D BIT (same trick nested, O(log² n) per op), and the range-update + range-query variant built from two BITs B1, B2 that maintain invariants sum[1..i] = i·query(B1,i) - query(B2,i). That two-BIT trick converts "add v on [l,r] and query sum on [l,r]" into four O(log n) calls — often the difference between a segment-tree with lazy and a one-screen solution.

When NOT to use: non-invertible operations (min/max can't be range-updated cleanly because there is no inverse). For those, fall back to a segment tree.

# Python — point update, prefix sum
class BIT:
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)           # 1-indexed
    def update(self, i, delta):            # add delta at index i
        while i <= self.n:
            self.t[i] += delta
            i += i & -i                    # climb to parent in implicit tree
    def query(self, i):                    # sum of a[1..i]
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i                    # jump to predecessor range
        return s
    def range_sum(self, l, r):             # inclusive [l,r]
        return self.query(r) - self.query(l - 1)

# Range update + range query via two BITs
class RURQ:
    def __init__(self, n):
        self.B1, self.B2 = BIT(n), BIT(n)
    def range_add(self, l, r, v):
        self.B1.update(l, v);      self.B1.update(r + 1, -v)
        self.B2.update(l, v * (l - 1))
        self.B2.update(r + 1, -v * r)
    def prefix(self, i):
        return i * self.B1.query(i) - self.B2.query(i)
    def range_sum(self, l, r):
        return self.prefix(r) - self.prefix(l - 1)
// Java — minimal BIT
class FenwickTree {
    int[] t; int n;
    FenwickTree(int n) { this.n = n; t = new int[n + 1]; }
    void update(int i, int v) { for (; i <= n; i += i & -i) t[i] += v; }
    int  query (int i) { int s = 0; for (; i > 0; i -= i & -i) s += t[i]; return s; }
}
Fenwick prefix-sum walk — update(5, +3) then query(7)

Sqrt Decomposition

Array split into ⌈n/B⌉ blocks of size B=⌈√n⌉. A range query touches at most two partial blocks (B work) and n/B full blocks (√n work); the balance B=√n minimizes B + n/B = 2√n.

Sqrt decomposition is the poor cousin of segment trees: often faster in cache and dramatically simpler to write under pressure. Store the array plus a per-block aggregate (sum, min, max, xor…). Updates touch one cell and adjust its block aggregate (O(1) for invertible ops like sum, as below; O(B) to recompute for min/max). Range queries peel the partial prefix/suffix blocks cell-by-cell and sweep the full interior blocks via their precomputed aggregate.

Versus a segment tree: segment tree is O(log n) per op but fatter constants and more code. Use sqrt when (a) n ≤ 10⁵ and constants matter, (b) the operation is awkward to merge (e.g. "mode of range"), or (c) the problem is offline — then you unlock Mo's algorithm, which reorders q queries by (block-of-L, R) and reuses an incremental data structure to achieve O((n+q)√n).

Typical interview signal: "given q up to 10⁵ range queries on a 10⁵ array, count distinct / count mode / count elements equal to x" — if online needs are absent, Mo's algorithm is the canonical answer.

import math

class SqrtSum:
    def __init__(self, a):
        self.a = a[:]
        self.n = len(a)
        self.B = max(1, int(math.isqrt(self.n)))
        self.block = [0] * ((self.n + self.B - 1) // self.B)
        for i, v in enumerate(a):
            self.block[i // self.B] += v

    def update(self, i, v):                    # a[i] = v
        self.block[i // self.B] += v - self.a[i]
        self.a[i] = v

    def range_sum(self, l, r):                 # inclusive [l,r]
        s = 0
        bl, br = l // self.B, r // self.B
        if bl == br:
            return sum(self.a[l:r+1])
        s += sum(self.a[l:(bl+1)*self.B])         # partial head
        for b in range(bl + 1, br):
            s += self.block[b]                     # full middle
        s += sum(self.a[br*self.B:r+1])           # partial tail
        return s
Sqrt decomposition — n=9, B=3

Persistent Data Structures

Path-copying on a balanced tree clones only the O(log n) nodes on the root-to-leaf path of the mutated index — every other node is shared across versions.

A persistent data structure lets you keep every historical version after updates, so queries can address any snapshot. Two canonical implementations: fat-node (store a list of (version, value) at each node — O(log m) to read a version) and path-copying (on update, clone the path; old root survives unchanged). Path-copying is the interview-friendly one.

The killer FAANG application is the persistent segment tree: build one tree per prefix of the array, where each tree is created from the previous by a point update in O(log n) time and O(log n) new nodes. A range query becomes "compare the root for prefix r against the root for prefix l-1" inside the tree, giving k-th smallest in range [l,r] in O(log n) after O(n log n) build — the standard "kth smallest in range" problem.

Persistent arrays (e.g. Clojure / Scala vectors) use the same idea with branch factor 32: a 32-way trie where updates copy at most ⌈log₃₂ n⌉ ≤ 7 nodes for any n ≤ 2³⁵. That is the "effectively-constant" copy cost behind immutable collections in functional languages.
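
A toy path-copying array in that style, assuming a fixed trie depth chosen up front (depth 2 covers 32² = 1024 indices); real persistent vectors also grow the root on demand and keep a tail buffer.

# Toy persistent array: path copying on a 32-way trie (5 bits per level).
# set_() copies only the nodes on one root-to-leaf path; everything else
# is shared, so every old version stays readable.
BITS, BF = 5, 32

def get(root, depth, i):
    node = root
    for level in range(depth - 1, -1, -1):
        node = node[(i >> (BITS * level)) & (BF - 1)]
    return node

def set_(root, depth, i, val):
    if depth == 0:
        return val                                # leaves store the values
    copy = list(root) if root else [None] * BF    # clone this node only
    slot = (i >> (BITS * (depth - 1))) & (BF - 1)
    copy[slot] = set_(copy[slot], depth - 1, i, val)
    return copy

v0 = set_(None, 2, 7, "a")            # depth 2 covers 32^2 = 1024 indices
v1 = set_(v0, 2, 7, "b")              # copies 2 nodes, shares the rest
print(get(v0, 2, 7), get(v1, 2, 7))   # a b — both versions intact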

# Persistent segment tree for kth-smallest in a range
# Each version is an extra O(log n) nodes; count of a value in [l,r]
# is diff of cumulative counts between version r and version l-1.
class Node:
    __slots__ = ("lo", "hi", "cnt", "l", "r")

def build(lo, hi):
    n = Node(); n.lo, n.hi, n.cnt, n.l, n.r = lo, hi, 0, None, None
    if lo < hi:
        m = (lo + hi) // 2
        n.l = build(lo, m); n.r = build(m + 1, hi)
    return n

def insert(prev, x):
    # Return a new root that shares everything with prev except the path to x.
    n = Node(); n.lo, n.hi = prev.lo, prev.hi
    if prev.lo == prev.hi:
        n.cnt = prev.cnt + 1; n.l = n.r = None; return n
    m = (prev.lo + prev.hi) // 2
    if x <= m:
        n.l, n.r = insert(prev.l, x), prev.r      # share right subtree
    else:
        n.l, n.r = prev.l, insert(prev.r, x)      # share left subtree
    n.cnt = n.l.cnt + n.r.cnt
    return n

def kth(u, v, k):                                  # kth smallest in (l-1, r]
    if u.lo == u.hi: return u.lo
    left = v.l.cnt - u.l.cnt
    if k <= left: return kth(u.l, v.l, k)
    return kth(u.r, v.r, k - left)
Path-copying — one update clones the spine, shares the rest

Suffix Array + LCP (Kasai)

Doubling: sort suffixes by their first 2ᵏ characters using pairs of previous-round ranks; log n rounds, each O(n log n) sort — O(n log² n). Radix-sort inside each round drops it to O(n log n). Kasai then walks suffixes in string order, amortizing LCP shrink to O(n).

A suffix array SA[0..n-1] is the lexicographic ordering of all suffixes of a string. Paired with the LCP array (lcp[i] = longest-common-prefix of SA[i] and SA[i-1]), it unlocks most string problems a suffix tree does, but with half the constants and a clean array layout.

Kasai's algorithm is the cleanest way to build LCP in O(n): walk the string in original position order, maintain a counter h that never increases by more than 1 per step, and use the inverse permutation rank[] to know where each suffix landed in SA.

Classic queries: longest repeated substring = max(lcp); longest common substring of s and t = max(lcp[i]) where SA[i] and SA[i-1] originate from different strings (concatenate s + "#" + t + "$"); distinct substrings = n(n+1)/2 − sum(lcp); pattern search = binary search on SA in O(m log n).

# Python — O(n log^2 n) suffix array (doubling + sort), O(n) Kasai LCP.
def suffix_array(s):
    n = len(s)
    sa = list(range(n))
    rank = [ord(c) for c in s]
    tmp = [0] * n
    k = 1
    while True:
        key = lambda i: (rank[i], rank[i + k] if i + k < n else -1)
        sa.sort(key=key)
        tmp[sa[0]] = 0
        for j in range(1, n):
            tmp[sa[j]] = tmp[sa[j-1]] + (1 if key(sa[j]) != key(sa[j-1]) else 0)
        rank = tmp[:]
        if rank[sa[-1]] == n - 1:           # all ranks unique
            break
        k *= 2
    return sa

def kasai(s, sa):
    n = len(s)
    rank = [0] * n
    for i, p in enumerate(sa): rank[p] = i
    h, lcp = 0, [0] * n
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i+h] == s[j+h]:
                h += 1
            lcp[rank[i]] = h
            if h: h -= 1                     # key amortization step
        else:
            h = 0
    return lcp
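
A quick sanity check of the formula queries above, using "banana" as an illustrative input:

# Usage — formula queries from above on "banana"
s = "banana"
sa = suffix_array(s)                 # [5, 3, 1, 0, 4, 2]
lcp = kasai(s, sa)                   # [0, 1, 3, 0, 0, 2]
n = len(s)
i = max(range(n), key=lambda j: lcp[j])
print(s[sa[i]:sa[i] + lcp[i]])       # "ana" — longest repeated substring
print(n * (n + 1) // 2 - sum(lcp))   # 15 distinct substrings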
Suffix array of "banana$"

Convex Hull Trick (CHT) / Li Chao Tree

DP recurrences of shape dp[i] = min over j < i of (m_j · x_i + b_j) reduce to evaluating a family of lines at a query point — the answer is the lower envelope of those lines. A monotonic deque or Li Chao tree maintains it in O(1) or O(log n) per op respectively.

CHT is the DP-optimization trick for quadratic recurrences where the transition is linear in some precomputed quantity. If lines are added in order of slope and queries arrive in order of x (both monotonic), the classic deque-based CHT runs in amortized O(1) per op. If either ordering breaks, use the Li Chao tree: a segment tree over the x-range that stores at most one "best" line per node and rebalances on insert, giving O(log V) per op where V is the x-coordinate range.

For related recurrences like "split an array into k parts minimizing the sum of part costs", the tools are Divide & Conquer optimization and Knuth's optimization, related tricks that require a different monotonicity condition (the quadrangle inequality).

# Monotonic-deque CHT — add lines in decreasing slope, query in increasing x.
def bad(l1, l2, l3):
    # l2 is redundant iff intersection(l1,l3) is left of intersection(l1,l2).
    return (l3[1] - l1[1]) * (l1[0] - l2[0]) \
        <= (l2[1] - l1[1]) * (l1[0] - l3[0])

from collections import deque

class CHT:                                            # min lower envelope
    def __init__(self): self.d = deque()              # deque of (m, b)
    def add(self, m, b):                              # slopes in decreasing order
        while len(self.d) >= 2 and bad(self.d[-2], self.d[-1], (m, b)):
            self.d.pop()
        self.d.append((m, b))
    def query(self, x):                               # x must be non-decreasing
        while len(self.d) >= 2 and \
              self.d[0][0]*x + self.d[0][1] >= self.d[1][0]*x + self.d[1][1]:
            self.d.popleft()                          # O(1); list.pop(0) is O(n)
        m, b = self.d[0]
        return m * x + b
Lower envelope of 4 lines — only L1, L3, L4 survive

Heavy-Light Decomposition (HLD)

Any root-to-leaf path crosses at most O(log n) "light" edges — each light edge halves the subtree size. Each chain lives in a contiguous segment-tree range, so a path query visits O(log n) chains × O(log n) per segment-tree op.

HLD converts tree path queries into a logarithmic number of array-range queries. For each node, pick the child whose subtree is largest — that edge is heavy, everything else is light. Heavy edges form disjoint chains; lay the chains back-to-back in an array and back them with a segment tree or BIT. Answering path(u, v) becomes: jump u and v upward one chain at a time toward their LCA, querying each chain's segment-tree range.

Canonical problem: "given a weighted tree, support point-update on a node's weight and query sum/max on the path between two nodes". That's HLD in its purest form — O(log² n) per operation.

Implementation sketch: two DFS passes. First computes sizes and chooses the heavy child; second assigns DFS-in indices and chain-head pointers. Then path-query bounces upward by u = parent[chainHead[u]].

# HLD — path sum with point updates on nodes. Skeleton shown.
def hld(n, adj, root=0):
    parent = [-1]*n; depth = [0]*n; size = [1]*n; heavy = [-1]*n
    # iterative DFS for size + heavy child
    order, stack = [], [root]
    while stack:
        u = stack.pop(); order.append(u)
        for v in adj[u]:
            if v != parent[u]:
                parent[v] = u; depth[v] = depth[u] + 1; stack.append(v)
    for u in reversed(order):
        for v in adj[u]:
            if v != parent[u]:
                size[u] += size[v]
                if heavy[u] == -1 or size[v] > size[heavy[u]]:
                    heavy[u] = v

    pos = [0]*n; head = [0]*n; t = 0
    def decompose(u, h):
        nonlocal t
        head[u], pos[u] = h, t; t += 1
        if heavy[u] != -1: decompose(heavy[u], h)
        for v in adj[u]:
            if v != parent[u] and v != heavy[u]:
                decompose(v, v)
    decompose(root, root)
    return parent, depth, head, pos                    # plug into segment tree over pos
HLD — two heavy chains, rest are singletons

DSU on Tree (Small-to-Large Merging)

Each node is re-added to the "big" map O(log n) times — every re-add doubles the size of the set containing it, so there are at most log n promotions per node.

"DSU on tree" (also called small-to-large, or Sack) answers subtree aggregate queries without Heavy-Light or persistent structures. For every node: solve light children first and clear their data; then solve the heavy child and keep its data; then walk every subtree of every light child and re-add into the heavy child's data structure; record the answer for the current node.

The canonical interview problem: "given a rooted tree where each node has a color, answer for every node: how many distinct colors appear in its subtree?" DSU-on-tree solves it in O(n log n) with a single hash map and a counter of "distinct" — trivial compared to HLD.

# Python sketch — count distinct colors per subtree.
def solve(n, adj, color, root=0):
    size = [1] * n; big = [-1] * n
    def dfs_size(u, p):
        for v in adj[u]:
            if v != p:
                dfs_size(v, u); size[u] += size[v]
                if big[u] == -1 or size[v] > size[big[u]]: big[u] = v
    dfs_size(root, -1)

    cnt = {}; distinct = [0]; ans = [0] * n
    def add(u, p, delta):
        c = color[u]
        if delta == 1:
            if cnt.get(c, 0) == 0: distinct[0] += 1
            cnt[c] = cnt.get(c, 0) + 1
        else:
            cnt[c] -= 1
            if cnt[c] == 0: distinct[0] -= 1
        for v in adj[u]:
            if v != p: add(v, u, delta)

    def dfs(u, p, keep):
        for v in adj[u]:
            if v != p and v != big[u]: dfs(v, u, False)   # light first, then clear
        if big[u] != -1: dfs(big[u], u, True)             # heavy, keep data
        # add self + light subtrees on top of heavy data
        c = color[u]
        if cnt.get(c, 0) == 0: distinct[0] += 1
        cnt[c] = cnt.get(c, 0) + 1
        for v in adj[u]:
            if v != p and v != big[u]: add(v, u, 1)
        ans[u] = distinct[0]
        if not keep: add(u, p, -1)                        # discard
    dfs(root, -1, False)
    return ans
Small-to-large — heavy subtree kept, light subtrees re-added

Mo's Algorithm

Sort queries by (⌊l/B⌋, r). Within one block of left endpoints the right pointer only increases, so it moves O(n) total per block, i.e. O(n²/B) over all n/B blocks; the left pointer moves O(B) per query, O(qB) total. Picking B = √n balances both at O((n+q)√n).

Mo's is the canonical answer to "given an immutable array and q offline range queries, compute an aggregate that supports O(1) add / remove of one element". Examples: count distinct values, count elements equal to x, or count pairs (i, j) with a[i] == a[j] in [l, r] (a HackerEarth classic).

The algorithm: maintain pointers (L, R) and a running aggregate. Sort queries, then for each query expand/contract pointers one step at a time, calling add(val) or remove(val). Each call is O(1) against your frequency map, and the total number of calls is bounded by the sorting trick.

from math import isqrt

def mo(a, queries):
    n, q = len(a), len(queries)
    B = max(1, isqrt(n))
    order = sorted(range(q), key=lambda i: (queries[i][0] // B, queries[i][1]))
    cnt = [0] * (max(a) + 1); distinct = 0
    L, R = 0, -1
    ans = [0] * q

    def add(x):
        nonlocal distinct
        if cnt[x] == 0: distinct += 1
        cnt[x] += 1
    def rem(x):
        nonlocal distinct
        cnt[x] -= 1
        if cnt[x] == 0: distinct -= 1

    for idx in order:
        l, r = queries[idx]
        while R < r: R += 1; add(a[R])
        while L > l: L -= 1; add(a[L])
        while R > r: rem(a[R]); R -= 1
        while L < l: rem(a[L]); L += 1
        ans[idx] = distinct
    return ans
Mo's ordering — queries sorted by (block-of-L, R)

Treap & Implicit Treap

Random priorities make the expected tree depth O(log n). Split/merge preserve BST-key order and heap-priority order simultaneously; since priority is random, height is the same as a randomly-built BST — expected O(log n).

A treap is a BST by key and a heap by a random priority assigned at insert. Priorities make the tree effectively balanced in expectation, without AVL's or red-black's rotation case-work. The primitives are split(t, k) (split into "≤ k" and "> k") and merge(l, r) (l's keys all ≤ r's keys); every other op is phrased on top.

The real interview weapon is the implicit treap: index nodes by subtree size instead of a key, so the tree represents a sequence and split/merge slice and glue subranges in O(log n). This gives O(log n) "rotate subrange [l,r]", "insert at index k", "reverse [l,r]" (with lazy reverse flag) — operations that arrays and linked lists can't do together.

# Implicit treap — sequence with O(log n) split / merge / reverse-range
import random

class Node:
    __slots__ = ("val", "pri", "sz", "l", "r", "rev")
    def __init__(self, v):
        self.val, self.pri, self.sz = v, random.random(), 1
        self.l = self.r = None; self.rev = False

def sz(t): return t.sz if t else 0
def upd(t):
    if t: t.sz = 1 + sz(t.l) + sz(t.r)
def push(t):
    if t and t.rev:
        t.l, t.r = t.r, t.l
        if t.l: t.l.rev ^= True
        if t.r: t.r.rev ^= True
        t.rev = False

def split(t, k):                       # first k go left, rest go right
    if not t: return (None, None)
    push(t)
    if sz(t.l) < k:
        a, b = split(t.r, k - sz(t.l) - 1); t.r = a; upd(t); return (t, b)
    else:
        a, b = split(t.l, k); t.l = b; upd(t); return (a, t)

def merge(a, b):
    if not a or not b: return a or b
    push(a); push(b)
    if a.pri > b.pri:
        a.r = merge(a.r, b); upd(a); return a
    else:
        b.l = merge(a, b.l); upd(b); return b

def reverse_range(root, l, r):
    a, tmp = split(root, l)
    b, c = split(tmp, r - l + 1)
    if b: b.rev ^= True
    return merge(merge(a, b), c)
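
Usage: build the sequence [0..7] by repeated merge, reverse indices 2..4, then read it back in order (the traversal must push() pending reverse flags on the way down). This is a demo, not part of the structure itself.

root = None
for v in range(8):                    # sequence 0,1,2,...,7
    root = merge(root, Node(v))
root = reverse_range(root, 2, 4)

def inorder(t, out):
    if not t: return out
    push(t)                           # apply any pending reverse flag
    inorder(t.l, out); out.append(t.val); inorder(t.r, out)
    return out

print(inorder(root, []))              # [0, 1, 4, 3, 2, 5, 6, 7]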
Implicit treap — reverse([2,4]) = split, flip rev, merge
Module 12

System Design Data Structures

Production-grade data structures that power real systems — LRU cache, consistent hashing, skip lists, LSM trees, and Bloom filters at scale.

LRU Cache

An LRU (Least Recently Used) cache evicts the least recently accessed item when capacity is full. The classic implementation uses a HashMap + Doubly Linked List for O(1) get and put.

Design

  • HashMap: key → pointer to DLL node. Provides O(1) lookup.
  • Doubly Linked List: Maintains access order. Most recently used at head, least recently used at tail.
  • get(key): Look up in map, move node to head, return value. O(1).
  • put(key, value): If exists, update and move to head. If new and at capacity, evict tail. Insert at head. O(1).
class Node:
    def __init__(self, key=0, val=0):
        self.key, self.val = key, val
        self.prev = self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.cap = capacity
        self.cache = {}             # key -> Node
        self.head = Node()          # dummy head (MRU side)
        self.tail = Node()          # dummy tail (LRU side)
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _add_to_head(self, node):
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.cache: return -1
        node = self.cache[key]
        self._remove(node)
        self._add_to_head(node)     # mark as recently used
        return node.val

    def put(self, key, value):
        if key in self.cache:
            self._remove(self.cache[key])
        node = Node(key, value)
        self.cache[key] = node
        self._add_to_head(node)
        if len(self.cache) > self.cap:
            lru = self.tail.prev     # evict LRU
            self._remove(lru)
            del self.cache[lru.key]
Figure 12.1 — LRU Cache structure: HashMap maps keys to DLL nodes. Head = MRU, Tail = LRU (eviction target).

Consistent Hashing

Consistent hashing distributes data across servers such that adding or removing a server only remaps ~K/N keys (K = total keys, N = total servers), compared to traditional hashing which remaps nearly all keys.

How It Works

  • Hash ring: Both servers and keys are hashed onto a circular space [0, 2^32)
  • Key assignment: Each key is assigned to the first server encountered clockwise on the ring
  • Adding a server: Only keys between the new server and its predecessor are remapped
  • Virtual nodes: Each physical server is assigned multiple positions on the ring (e.g., 100-200 vnodes) to ensure even distribution
Figure 12.2 — Consistent hashing ring with virtual nodes. Keys are assigned to the next server clockwise. Virtual nodes ensure even distribution.
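
A minimal ring sketch following the bullets above. The md5 hash, 100 vnodes per server, and the bisect-over-sorted-list ring are illustrative choices, not a standard.

# Toy consistent-hash ring with virtual nodes
import bisect, hashlib

class Ring:
    def __init__(self, servers, vnodes=100):
        pairs = []
        for s in servers:
            for v in range(vnodes):                  # place each vnode on the ring
                pairs.append((self._h(f"{s}#{v}"), s))
        pairs.sort()
        self.hashes = [h for h, _ in pairs]
        self.owners = [s for _, s in pairs]

    def _h(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (1 << 32)

    def lookup(self, key):                           # first server clockwise
        i = bisect.bisect(self.hashes, self._h(key)) % len(self.hashes)
        return self.owners[i]

ring = Ring(["A", "B", "C"])
print(ring.lookup("user:42"))    # stable assignment until membership changes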

Skip Lists

A skip list is a probabilistic alternative to balanced BSTs. It's a multi-level linked list where each level is a "fast lane" that skips over multiple elements.

Properties

  • Level 0 (bottom): contains all elements in sorted order
  • Each higher level contains a random subset of the level below (each element is promoted with probability p = 0.5)
  • Search: start at top-left, go right until overshooting, then drop down one level. Repeat.
  • Expected height: O(log n). Search, insert, delete: all O(log n) expected.
  • Used by Redis (sorted sets), LevelDB, and MemSQL
Figure 12.3 — Skip list with 4 levels. Higher levels act as express lanes, enabling O(log n) search.
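
A minimal search + insert sketch with promotion probability p = 0.5; MAX_LEVEL = 16 is an illustrative cap, and delete would reuse the same update[] bookkeeping.

import random

MAX_LEVEL = 16                          # enough for ~2^16 elements

class SLNode:
    def __init__(self, val, level):
        self.val = val
        self.next = [None] * level      # next[i] = successor on level i

class SkipList:
    def __init__(self):
        self.head = SLNode(float('-inf'), MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        lvl = 1
        while lvl < MAX_LEVEL and random.random() < 0.5:
            lvl += 1                    # promote with probability 1/2
        return lvl

    def search(self, target):
        node = self.head
        for i in range(self.level - 1, -1, -1):       # top level down
            while node.next[i] and node.next[i].val < target:
                node = node.next[i]                   # ride the express lane
        node = node.next[0]
        return node is not None and node.val == target

    def insert(self, val):
        update = [self.head] * MAX_LEVEL              # rightmost node per level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.next[i] and node.next[i].val < val:
                node = node.next[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = SLNode(val, lvl)
        for i in range(lvl):                          # splice into each level
            new.next[i] = update[i].next[i]
            update[i].next[i] = new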

LSM Trees

Log-Structured Merge Trees are the backbone of modern write-optimized databases. They convert random writes into sequential I/O, achieving write throughput orders of magnitude better than B-Trees.

Write Path

  • Step 1: Write to a Write-Ahead Log (WAL) for durability (append-only, sequential)
  • Step 2: Insert into an in-memory sorted structure called a memtable (usually a red-black tree or skip list)
  • Step 3: When memtable reaches a size threshold, flush it to disk as a sorted, immutable SSTable (Sorted String Table)

Read Path

  • Step 1: Check the memtable (fastest — in memory)
  • Step 2: Check the most recent SSTable, then older SSTables
  • Step 3: Use Bloom filters on each SSTable to skip files that definitely don't contain the key
  • Step 4: Use sparse index / block index within each SSTable for efficient lookup
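
A toy rendering of the two paths: a dict memtable, immutable sorted SSTables, and newest-first reads. No WAL, Bloom filters, or compaction; memtable_limit is an illustrative knob.

import bisect

class ToyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.limit = memtable_limit
        self.sstables = []                          # newest first: (keys, vals)

    def put(self, k, v):
        self.memtable[k] = v                        # a real store appends to the WAL first
        if len(self.memtable) >= self.limit:        # flush: memtable -> immutable SSTable
            items = sorted(self.memtable.items())
            self.sstables.insert(0, ([k for k, _ in items], [v for _, v in items]))
            self.memtable = {}

    def get(self, k):
        if k in self.memtable:                      # 1. memtable
            return self.memtable[k]
        for keys, vals in self.sstables:            # 2. SSTables, newest first
            i = bisect.bisect_left(keys, k)
            if i < len(keys) and keys[i] == k:
                return vals[i]
        return None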

Compaction Strategies

Strategy | How It Works | Write Amp | Read Amp | Space Amp | Used By
Size-Tiered (STCS) | Merge similarly-sized SSTables | Lower | Higher | Higher | Cassandra (default)
Leveled (LCS) | Each level is 10x previous; no overlap within level | Higher | Lower | Lower | LevelDB, RocksDB
Figure 12.4 — LSM tree architecture: writes go to WAL + memtable, then flush to SSTables. Reads check memtable first, then disk levels.

Bloom Filters at Scale

A Bloom filter is a space-efficient probabilistic data structure that answers "Is x in the set?" with:

  • Definitely not in the set (no false negatives — 100% certain)
  • Probably in the set (false positives possible)

How It Works

A bit array of m bits, with k hash functions. To add an element, hash it k times and set those k bit positions. To query, check all k positions — if any is 0, the element is definitely absent.

Optimal Parameters

  • False positive rate: p = (1 - e^(-kn/m))^k
  • Optimal hash count: k = (m/n) · ln(2)
  • Bits per element for 1% FP rate: ~9.6 bits (with k=7 hash functions)
  • Bits per element for 0.1% FP rate: ~14.4 bits (with k=10 hash functions)

Applications at Scale

System | Use Case | Why Bloom Filter
LevelDB / RocksDB | Skip SSTables during reads | Avoid expensive disk I/O for absent keys
Cassandra / HBase | Skip partitions without target key | Reduce read amplification
Chrome / Firefox | Safe Browsing malicious URL check | Check millions of URLs with minimal memory
Medium / Akamai | Recommendation deduplication | Don't recommend articles already seen
Bitcoin / Ethereum | SPV clients filter relevant transactions | Light clients don't store full blockchain
Gotcha: LRU Cache is the most common data structure design question. Know it cold — HashMap + Doubly Linked List, O(1) everything. Practice implementing it from scratch in under 15 minutes. Variants include LFU cache (frequency-based eviction) and TTL-based caches. The follow-up is often "make it thread-safe" (use a read-write lock or concurrent hash map + lock striping).
Data-structure chooser for system-design interviews

Bloom Filter & Cuckoo Filter

With m bits, n items, and k hashes chosen as k = (m/n)·ln 2, the false-positive probability is ≈ (1 − e^(−kn/m))^k. At 1% FPR this is about 9.6 bits/element — independent of item size.

A Bloom filter answers "is x possibly in the set?" with zero false negatives and a tunable false-positive rate. It is the canonical "cache in front of a cache": Bigtable and Cassandra front every SSTable with one so negative reads skip disk; CDN edge nodes use them to decide whether to origin-fetch.

Variants interviewers poke at: Counting Bloom filter replaces each bit with a small counter (4 bits is typical) so you can delete — at 4× the memory. Cuckoo filter stores short fingerprints in a cuckoo hash table: supports delete without counters, lower space at the same FPR for >0.1% rates, and cache-friendlier lookups (2 buckets instead of k hashes).

# Minimal Bloom filter
class Bloom:
    def __init__(self, m, k):
        self.bits = bytearray(m // 8 + 1)   # m bits, packed into bytes
        self.m, self.k = m, k
    def _idx(self, x, i):
        # i-th hash: seed Python's hash with the round index i
        return (hash((i, x)) & 0x7FFFFFFFFFFFFFFF) % self.m
    def add(self, x):
        for i in range(self.k):
            b = self._idx(x, i)
            self.bits[b >> 3] |= 1 << (b & 7)           # set bit b
    def __contains__(self, x):
        # membership needs all k bits set; any 0 means definitely absent
        return all((self.bits[(b := self._idx(x, i)) >> 3] >> (b & 7)) & 1
                   for i in range(self.k))
Bloom filter — insert "cat" with k=3 hashes
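
For the counting variant mentioned above, a sketch that swaps the bit array for an array of counters and reuses _idx from the Bloom class (a full Python int per slot here; a real implementation packs 4-bit counters):

class CountingBloom(Bloom):
    def __init__(self, m, k):
        self.counts = [0] * m              # counter per slot instead of one bit
        self.m, self.k = m, k
    def add(self, x):
        for i in range(self.k):
            self.counts[self._idx(x, i)] += 1
    def remove(self, x):                   # only safe for elements actually added
        if x in self:
            for i in range(self.k):
                self.counts[self._idx(x, i)] -= 1
    def __contains__(self, x):
        return all(self.counts[self._idx(x, i)] > 0 for i in range(self.k))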

HyperLogLog

Hash each element: the top p bits pick one of m = 2^p buckets, and each bucket records the longest run of leading zeros seen in the remaining bits. A bucket receiving n/m items expects a maximum run length near log₂(n/m). The harmonic mean of 2^(max run) across buckets, scaled by a bias-correction constant, estimates n with relative error ≈ 1.04/√m.

HyperLogLog (HLL) estimates the cardinality (number of distinct elements) of a multiset with absurdly small memory. For 1 billion uniques and <1% error, 12 KB is enough. Redis PFCOUNT, BigQuery's APPROX_COUNT_DISTINCT, and Presto's approx_distinct are all HLL under the hood.

The real magic for distributed systems is the merge operator: two HLLs of the same shape merge element-wise by max across buckets, giving the cardinality of the union. This means per-shard or per-minute HLLs can be merged without revisiting raw events — turning cardinality into a commutative monoid.

class HLL:
    def __init__(self, p=14):                  # m = 2^p buckets
        self.p = p; self.m = 1 << p
        self.buckets = [0] * self.m
    def add(self, x):
        h = hash(x) & 0xFFFFFFFFFFFFFFFF       # 64-bit hash
        idx = h >> (64 - self.p)               # top p bits pick the bucket
        # remaining bits, plus a sentinel so the zero-run is bounded
        w = (h << self.p) & 0xFFFFFFFFFFFFFFFF | (1 << (self.p - 1))
        rank = 65 - w.bit_length()             # leading zeros + 1
        self.buckets[idx] = max(self.buckets[idx], rank)
    def estimate(self):                        # small-range correction omitted
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant from the paper
        z = sum(2.0 ** -b for b in self.buckets)
        return alpha * self.m * self.m / z     # harmonic-mean estimator
    def merge(self, other):                    # union cardinality
        self.buckets = [max(a, b) for a, b in zip(self.buckets, other.buckets)]
HLL merge — element-wise max across buckets
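
A quick usage sketch of the merge property (the two "shards" and user IDs are hypothetical):

us, eu = HLL(p=14), HLL(p=14)
for uid in range(50_000):
    us.add(("user", uid))                  # shard 1: users 0..49,999
for uid in range(30_000, 90_000):
    eu.add(("user", uid))                  # shard 2: users 30,000..89,999
us.merge(eu)                               # element-wise max: now the union
print(round(us.estimate()))                # ~90,000 distinct users, within ~1%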

Count-Min Sketch

Width w = ⌈e/ε⌉, depth d = ⌈ln(1/δ)⌉. Each item increments one counter per row; estimate(x) = min over rows. With these sizes, err ≤ ε·(total count) with probability 1 − δ.

Count-Min sketch (CMS) estimates the frequency of an item in a stream in sublinear memory. Insert increments d counters (one per independent hash); query returns the minimum of those d counters. The min step is what kills most of the false-positive overcount from hash collisions.

The classic application is heavy-hitters / top-k over a stream: pair a CMS with a min-heap of the k largest (item, est(item)) seen so far. Every insert updates the sketch, then checks whether the new estimate beats the heap minimum. This runs in O(log k + d) per stream event with O(wd + k) memory.

import math, random

class CountMin:
    def __init__(self, eps=0.001, delta=1e-5):
        self.w = max(2, math.ceil(math.e / eps))         # width: err <= eps * N
        self.d = max(1, math.ceil(math.log(1 / delta)))  # depth: holds w.p. 1 - delta
        self.table = [[0] * self.w for _ in range(self.d)]
        self.seeds = [random.getrandbits(32) for _ in range(self.d)]
    def _h(self, i, x):
        # row i's hash, seeded independently per row
        return (hash((self.seeds[i], x)) & 0x7FFFFFFF) % self.w
    def add(self, x, c=1):
        for i in range(self.d):                          # one counter per row
            self.table[i][self._h(i, x)] += c
    def estimate(self, x):
        # min across rows discards most of the collision overcount
        return min(self.table[i][self._h(i, x)] for i in range(self.d))
Count-Min — d=4 rows, w=8 columns; query takes min across rows
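
A sketch of that heavy-hitters loop on top of the CountMin class above (the O(k) rebuild on refresh is a simplification; a production version keeps heap handles for O(log k) updates, and items are assumed mutually comparable):

import heapq

def heavy_hitters(stream, k, cms):
    heap, members = [], set()              # (estimate, item) pairs, size <= k
    for x in stream:
        cms.add(x)
        est = cms.estimate(x)
        if x in members:                   # refresh x's stale estimate
            heap = [(est, it) if it == x else (c, it) for c, it in heap]
            heapq.heapify(heap)
        elif len(heap) < k:
            heapq.heappush(heap, (est, x))
            members.add(x)
        elif est > heap[0][0]:             # beats current minimum: evict it
            _, out = heapq.heapreplace(heap, (est, x))
            members.discard(out)
            members.add(x)
    return sorted(heap, reverse=True)      # best-first (estimate, item) list
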
Module 13

Cheat Sheet & Quick Reference

Module summaries, rapid-fire Q&A, master complexity table, and pattern recognition guide — everything you need for last-minute review.

Module Summaries

1. Complexity Analysis

  • Big-O (upper), Omega (lower), Theta (tight bound)
  • Growth: O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(2^n) < O(n!)
  • Master Theorem: T(n) = aT(n/b) + O(n^d)
  • Amortized: aggregate cost over sequence of operations
  • Space complexity includes auxiliary + input space

2. Arrays & Strings

  • Two pointers: opposite ends or same direction
  • Sliding window: fixed or variable size, track window state
  • Prefix sums: range sum queries in O(1) after O(n) precompute
  • KMP: pattern matching in O(n+m) using failure function
  • Rabin-Karp: rolling hash for O(n) average string matching

3. Linked Lists

  • Singly, doubly, circular — know tradeoffs
  • Floyd's cycle detection: slow + fast pointer
  • Reverse iteratively (3 pointers) or recursively
  • Merge two sorted lists: O(n+m) time, O(1) space
  • Dummy head node simplifies edge cases

4. Stacks & Queues

  • Stack: LIFO — undo, DFS, parentheses matching
  • Monotonic stack: next greater/smaller element in O(n)
  • Queue: FIFO — BFS, task scheduling
  • Deque: double-ended queue, sliding window max
  • Priority queue = heap (not a queue)

5. Hash Tables

  • Average O(1) lookup, insert, delete
  • Chaining vs open addressing (linear/quadratic probing)
  • Load factor α < 0.75 — resize at threshold
  • Bloom filter: probabilistic set membership, no false negatives
  • Good hash: uniform distribution, avalanche effect

6. Trees

  • Traversals: inorder (BST sorted), preorder (serialize), postorder (delete)
  • BST: O(log n) avg, O(n) worst (degenerate)
  • AVL: strict balance (|height diff| ≤ 1), O(log n) guaranteed
  • Red-Black: relaxed balance, faster inserts than AVL
  • B-Tree: multi-way, disk-optimized. Trie: prefix matching O(L)

7. Heaps

  • Min-heap: parent ≤ children. Max-heap: parent ≥ children
  • Insert: O(log n). Extract: O(log n). Peek: O(1)
  • Build heap: O(n) NOT O(n log n) — sift-down analysis
  • Top-K: min-heap of size K. Running median: two heaps
  • Array-based (1-indexed): parent = i//2, children = 2i, 2i+1

8. Graphs

  • Adjacency list (sparse, O(V+E)) vs matrix (dense, O(V²))
  • BFS: shortest path (unweighted), level-order. DFS: cycle detection, topo sort
  • Dijkstra: O((V+E) log V), no negative edges
  • Bellman-Ford: O(VE), handles negative edges
  • Union-Find: near O(1) with path compression + rank

9. Dynamic Programming

  • Needs: overlapping subproblems + optimal substructure
  • Memoization (top-down) vs tabulation (bottom-up)
  • 1D: Fibonacci, coin change, LIS, house robber
  • 2D: LCS, edit distance, grid paths, knapsack
  • Space optimization: rolling array when dp[i] depends on dp[i-1]

10. Sorting & Searching

  • Merge sort: O(n log n) stable, O(n) space
  • Quick sort: O(n log n) avg, O(n²) worst, in-place
  • Counting/Radix: O(n) for bounded integers
  • Binary search: lo ≤ hi (exact), lo < hi (boundary); see the templates after this list
  • Quick-select: O(n) avg kth element
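
Those two binary-search loop shapes are where off-by-one bugs live. A minimal pair of templates, assuming a sorted Python list a:

def find_exact(a, target):           # lo <= hi: looking for an exact match
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1                        # absent

def lower_bound(a, target):          # lo < hi: leftmost i with a[i] >= target
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo                        # may be len(a) if every element < target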

11. Advanced Topics

  • Backtracking: build + prune + undo. N-Queens, permutations
  • Greedy: local optimal → global optimal (prove correctness!)
  • XOR: a^a=0, a^0=a — find single number
  • Bit tricks: x&(x-1) clears lowest set bit
  • GCD: Euclidean O(log n). Sieve: O(n log log n)

12. System Design DS

  • LRU Cache: HashMap + DLL, O(1) get/put
  • Consistent hashing: ring + virtual nodes, K/N remap
  • Skip list: probabilistic O(log n), used by Redis
  • LSM tree: sequential writes, memtable → SSTable
  • Bloom filter: ~10 bits/element for 1% false positive rate

Top 50 DSA Rapid-Fire Q&A

# | Question | Answer
1 | Time complexity of binary search? | O(log n)
2 | Best case of bubble sort? | O(n) — already sorted, with early exit flag
3 | Is merge sort stable? | Yes
4 | Worst case of quick sort? | O(n²) — sorted input with naive pivot
5 | Time to build a heap? | O(n) — not O(n log n)
6 | Hash table average lookup time? | O(1)
7 | Hash table worst-case lookup? | O(n) — all keys collide into one bucket
8 | BST worst-case search? | O(n) — degenerate (linked list shape)
9 | AVL tree search time? | O(log n) — always balanced
10 | Difference between BFS and DFS? | BFS: level-order (queue). DFS: depth-first (stack/recursion)
11 | Which gives shortest path in unweighted graph? | BFS
12 | Dijkstra's time complexity? | O((V+E) log V) with binary heap
13 | Can Dijkstra handle negative edges? | No — use Bellman-Ford
14 | Bellman-Ford time complexity? | O(V · E)
15 | What is topological sort used for? | Ordering tasks with dependencies (DAG only)
16 | Time complexity of Union-Find with path compression + rank? | O(α(n)) per operation ≈ O(1)
17 | What are the two properties needed for DP? | Overlapping subproblems + optimal substructure
18 | Memoization vs tabulation? | Memo: top-down + cache. Tab: bottom-up + table fill
19 | Time complexity of LCS? | O(m · n)
20 | 0/1 Knapsack time complexity? | O(n · W) — pseudo-polynomial
21 | What makes 0/1 knapsack different from unbounded? | 0/1: each item once. Unbounded: unlimited copies
22 | Lower bound for comparison-based sorting? | Ω(n log n)
23 | When can you sort in O(n)? | Bounded integer range — use counting/radix sort
24 | What data structure does Dijkstra use? | Priority queue (min-heap)
25 | Stack applications? | Undo, DFS, balanced parentheses, function call stack
26 | What is a monotonic stack? | Stack maintaining increasing/decreasing order — next greater element in O(n)
27 | Trie search time? | O(L) where L = length of word
28 | B-Tree vs BST? | B-Tree: multi-way, disk-optimized, O(log_B n) I/O
29 | How does Floyd's cycle detection work? | Slow pointer (1 step) + fast pointer (2 steps); they meet inside the cycle
30 | Reverse a linked list time/space? | O(n) time, O(1) space (iterative)
31 | What is a Bloom filter? | Probabilistic set: no false negatives, possible false positives
32 | LRU Cache data structures? | HashMap + Doubly Linked List = O(1) get/put
33 | What is consistent hashing? | Hash ring distributing keys to servers; adding a server remaps only K/N keys
34 | Skip list expected search time? | O(log n)
35 | What is an LSM tree optimized for? | Write-heavy workloads (sequential I/O)
36 | Backtracking vs brute force? | Backtracking prunes invalid branches early; brute force tries everything
37 | N-Queens time complexity? | O(n!) worst case; heavy pruning makes it far faster in practice
38 | When does greedy fail? | When local optimum doesn't lead to global optimum (e.g., 0/1 knapsack)
39 | XOR of a number with itself? | 0 (a ^ a = 0)
40 | Check if n is power of 2? | n > 0 and (n & (n-1)) == 0
41 | Time complexity of Sieve of Eratosthenes? | O(n log log n)
42 | What is the Master Theorem for? | Solving recurrences of form T(n) = aT(n/b) + O(n^d)
43 | Amortized cost of dynamic array append? | O(1) — occasional O(n) resize spread across n operations
44 | Difference between AVL and Red-Black tree? | AVL: stricter balance, faster lookups. RB: faster inserts/deletes
45 | What is a segment tree used for? | Range queries (sum, min, max) with point updates in O(log n)
46 | Quick-select average time? | O(n) — find kth element without full sorting
47 | Tim Sort combines which two algorithms? | Merge sort + insertion sort
48 | Counting sort space complexity? | O(n + k) where k = range of values
49 | How many subsets does a set of size n have? | 2^n
50 | Edit distance recurrence? | dp[i][j] = min(dp[i-1][j]+1, dp[i][j-1]+1, dp[i-1][j-1] + (s1[i]!=s2[j]))

Master Complexity Table

Data Structure | Access | Search | Insert | Delete | Space
Array | O(1) | O(n) | O(n) | O(n) | O(n)
Sorted Array | O(1) | O(log n) | O(n) | O(n) | O(n)
Linked List | O(n) | O(n) | O(1)* | O(1)* | O(n)
Stack | O(n) | O(n) | O(1) | O(1) | O(n)
Queue | O(n) | O(n) | O(1) | O(1) | O(n)
Hash Table | N/A | O(1) avg / O(n) worst | O(1) avg / O(n) worst | O(1) avg / O(n) worst | O(n)
BST (average) | O(log n) | O(log n) | O(log n) | O(log n) | O(n)
BST (worst) | O(n) | O(n) | O(n) | O(n) | O(n)
AVL Tree | O(log n) | O(log n) | O(log n) | O(log n) | O(n)
Red-Black Tree | O(log n) | O(log n) | O(log n) | O(log n) | O(n)
B-Tree | O(log n) | O(log n) | O(log n) | O(log n) | O(n)
Binary Heap | O(n) | O(n) | O(log n) | O(log n) | O(n)
Trie | N/A | O(L) | O(L) | O(L) | O(N · L · Σ)
Skip List | O(log n) | O(log n) | O(log n) | O(log n) | O(n)

* Linked list insert/delete are O(1) only when you have a reference to the node; finding the node first is O(n). Entries are average-case unless a row says otherwise. L = key length, Σ = alphabet size, N = number of keys.

Pattern Recognition Guide

When you see a certain problem pattern, reach for the corresponding technique. This table maps common problem signals to the right approach.

When You See... | Think... | Why
Sorted array | Binary search | O(log n), eliminate half each step
Top K / Kth element | Heap (priority queue) | Min-heap of size K for top-K largest
Frequency count / duplicates | Hash map | O(1) lookup for counting
String matching / pattern | KMP / Rabin-Karp / Trie | O(n+m) instead of O(n·m)
Tree traversal | DFS (recursion or stack) | Natural recursive structure
Shortest path (unweighted) | BFS | Explores in order of distance
Shortest path (weighted) | Dijkstra / Bellman-Ford | Dijkstra if no negative edges, BF otherwise
Connected components | Union-Find / DFS | Group nodes efficiently
Cycle detection | DFS with coloring / Floyd's | Graph coloring or slow/fast pointers
Optimization over subproblems | Dynamic programming | Overlapping subproblems + optimal substructure
All permutations / combinations | Backtracking | Systematic exploration with pruning
Contiguous subarray sum/product | Sliding window / prefix sum | O(n) with running state
Parentheses / brackets / nesting | Stack | LIFO matches nested structure
Next greater / smaller element | Monotonic stack | O(n) with maintained order
Interval scheduling / merging | Sort by end/start + greedy | Greedy choice property holds
Matrix / grid traversal | BFS / DFS / DP | Treat as graph; DP for counting paths
Range queries with updates | Segment tree / BIT | O(log n) query and update
Prefix / autocomplete | Trie | O(L) prefix lookup
Median in stream | Two heaps (max + min) | O(log n) insert, O(1) median
Minimum spanning tree | Kruskal (Union-Find) / Prim (heap) | Kruskal for sparse, Prim for dense