Data Structures & Algorithms

A comprehensive FAANG interview preparation guide covering every essential data structure, algorithm, and problem-solving pattern you need to master.

13 Modules • 27 Diagrams • 50+ Code Examples • Complete Interview Prep

Module 01

Complexity Analysis

The mathematical foundation for evaluating algorithm efficiency — Big-O, recurrences, amortized analysis, and space complexity.

Big-O, Omega & Theta Notation

Asymptotic notation describes how an algorithm's resource usage scales as input size approaches infinity. The three primary notations each serve a distinct purpose:

Big-O (Upper Bound)

f(n) = O(g(n)) means there exist constants c > 0 and n₀ such that f(n) ≤ c · g(n) for all n ≥ n₀.

Big-O gives the worst-case upper bound. When we say an algorithm is O(n²), we mean that in the worst case, the running time will not grow faster than some constant times n². For example, f(n) = 3n² + 5n ≤ 4n² for all n ≥ 5, so f(n) = O(n²) with c = 4 and n₀ = 5.

Big-Omega (Lower Bound)

f(n) = Ω(g(n)) means there exist constants c > 0 and n₀ such that f(n) ≥ c · g(n) for all n ≥ n₀.

Omega provides a lower bound — it is often used to describe best-case behavior or to state problem lower bounds. Comparison-based sorting, for instance, has a lower bound of Ω(n log n).

Big-Theta (Tight Bound)

f(n) = Θ(g(n)) means f(n) = O(g(n)) AND f(n) = Ω(g(n)). This gives us an exact asymptotic characterization.

Example: Merge sort is Θ(n log n) in all cases — it always does roughly the same amount of work regardless of input order.

Little-o and Little-omega

  • f(n) = o(g(n)) — f grows strictly slower than g (not tight). Example: n = o(n²)
  • f(n) = ω(g(n)) — f grows strictly faster than g. Example: n² = ω(n)
💡 Interview Tip: When interviewers say "What's the time complexity?", they almost always mean Big-O (worst-case upper bound). But be prepared to discuss best-case and average-case when asked to compare algorithms like quicksort vs merge sort.

Growth Rate Comparison

Understanding relative growth rates is critical for choosing appropriate algorithms. Here is the standard hierarchy from fastest to slowest growing:

| Complexity | Name | n=10 | n=100 | n=1000 | n=10⁶ | Example |
|---|---|---|---|---|---|---|
| O(1) | Constant | 1 | 1 | 1 | 1 | Hash table lookup |
| O(log n) | Logarithmic | 3 | 7 | 10 | 20 | Binary search |
| O(n) | Linear | 10 | 100 | 1000 | 10⁶ | Linear scan |
| O(n log n) | Linearithmic | 33 | 664 | 9966 | 2×10⁷ | Merge sort |
| O(n²) | Quadratic | 100 | 10⁴ | 10⁶ | 10¹² | Bubble sort |
| O(2ⁿ) | Exponential | 1024 | 10³⁰ | --- | --- | All subsets |
| O(n!) | Factorial | 3.6M | --- | --- | --- | All permutations |
Figure 1.1 — Growth rate curves (operations vs. input size n) for common time complexities from O(1) to O(2ⁿ)
Gotcha: O(log n) is NOT always binary search. Many algorithms are O(log n): balanced BST operations, exponentiation by squaring, Euclidean GCD algorithm, and any process that halves the problem size each step. Don't automatically assume binary search when you see logarithmic complexity.

Loop Analysis Patterns

Most interview-level complexity analysis reduces to counting loop iterations. Master these patterns:

Single Loop

for i in range(n):       # O(n) — linear
    process(i)

for i in range(0, n, 2):  # O(n/2) = O(n) — still linear
    process(i)

i = 1
while i < n:              # O(log n) — i doubles each step
    process(i)
    i *= 2

Nested Independent Loops

for i in range(n):          # Outer: n iterations
    for j in range(m):      # Inner: m iterations
        process(i, j)       # Total: O(n * m)

Nested Dependent Loops

for i in range(n):
    for j in range(i):      # j depends on i
        process(i, j)
# Total: 0 + 1 + 2 + ... + (n-1) = n(n-1)/2 = O(n²)

Log-Linear Pattern

for i in range(n):          # n iterations
    j = 1
    while j < n:             # log(n) iterations
        process(i, j)
        j *= 2              # Total: O(n log n)

Tricky Harmonic Series

for i in range(1, n+1):
    for j in range(0, n, i):  # n/1 + n/2 + n/3 + ... + n/n
        process(i, j)
# Total: n * H(n) = n * (ln n + O(1)) = O(n log n)

Recurrence Relations & Master Theorem

Recurrence relations express the runtime of recursive algorithms. The Master Theorem provides a shortcut for solving many common forms.

Standard Form: T(n) = aT(n/b) + f(n)

  • a = number of subproblems per recursive call
  • b = factor by which input is divided
  • f(n) = cost of dividing + combining

The Three Cases

Let c_crit = log_b(a) (the critical exponent):

| Case | Condition | Result | Intuition |
|---|---|---|---|
| Case 1 | f(n) = O(n^(c_crit − ε)) for some ε > 0 | T(n) = Θ(n^c_crit) | Leaves dominate — recursion does most work |
| Case 2 | f(n) = Θ(n^c_crit · log^k n) | T(n) = Θ(n^c_crit · log^(k+1) n) | Equal work at each level |
| Case 3 | f(n) = Ω(n^(c_crit + ε)), plus the regularity condition a·f(n/b) ≤ c·f(n) for some c < 1 | T(n) = Θ(f(n)) | Root dominates — combine cost prevails |

Common Recurrence Examples

| Algorithm | Recurrence | a, b, f(n) | Case | Solution |
|---|---|---|---|---|
| Binary Search | T(n) = T(n/2) + O(1) | 1, 2, O(1) | Case 2 (k=0) | Θ(log n) |
| Merge Sort | T(n) = 2T(n/2) + O(n) | 2, 2, O(n) | Case 2 (k=0) | Θ(n log n) |
| Strassen | T(n) = 7T(n/2) + O(n²) | 7, 2, O(n²) | Case 1 | Θ(n^2.807) |
| Karatsuba | T(n) = 3T(n/2) + O(n) | 3, 2, O(n) | Case 1 | Θ(n^1.585) |
| Linear select | T(n) = T(n/5) + T(7n/10) + O(n) | --- | Akra-Bazzi | Θ(n) |

Amortized Analysis

Amortized analysis determines the average cost per operation over a worst-case sequence. It differs from average-case analysis because it makes no probabilistic assumptions — it guarantees the average even in the worst case.

Three Methods

1. Aggregate Method

Compute the total cost of n operations, then divide by n.

Example — Dynamic Array (ArrayList/vector): When the array is full, we double its capacity. For n insertions, most insertions cost O(1), but the capacity doublings cost 1 + 2 + 4 + 8 + ... + n = 2n − 1 in total. Total = n + (2n − 1) < 3n, so the amortized cost is 3n/n = O(1) per insertion.
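
To see where the 3n total comes from in code, here is a minimal doubling-array sketch (the DynamicArray class is illustrative, not a standard library type):

class DynamicArray:
    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None]

    def append(self, x):
        if self.size == self.capacity:       # full: double and copy — O(size)
            self.capacity *= 2
            new_data = [None] * self.capacity
            for i in range(self.size):
                new_data[i] = self.data[i]
            self.data = new_data
        self.data[self.size] = x             # common case — O(1)
        self.size += 1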

2. Accounting (Banker's) Method

Assign each operation an "amortized cost" that may differ from its actual cost. Overcharged operations build "credit" that pays for expensive future operations.

Example: Charge each array insert $3: $1 for the insert itself, $1 saved to copy this element later, $1 saved to copy an existing element. When doubling occurs, saved credits exactly cover the copying cost.

3. Potential Method

Define a potential function Φ that maps data structure states to non-negative reals. Amortized cost = actual cost + ΔΦ.

Example: For a dynamic array, let Φ = 2 × size − capacity. Normal insert: actual = 1, ΔΦ = 2, amortized = 3. Doubling insert (size = capacity = n): actual = n + 1, ΔΦ = 2 − n, amortized = (n + 1) + (2 − n) = 3.

Space Complexity

Space complexity measures the total memory an algorithm requires as a function of input size.

Types of Space

  • Auxiliary space: Extra space used beyond the input — this is what interviewers usually ask about
  • Total space: Auxiliary space + input space
  • Stack space: Memory used by recursive call frames — each frame stores local variables, return address, and parameters

In-Place Algorithms

An algorithm is in-place if it uses O(1) auxiliary space (or O(log n) for recursive stack). Examples:

  • In-place: Quicksort (O(log n) stack), heapsort, insertion sort, two-pointer techniques
  • Not in-place: Merge sort (O(n) auxiliary array), counting sort (O(k) counting array)

Recursive Space Analysis

# Factorial — O(n) stack space (n recursive frames)
def factorial(n):
    if n <= 1: return 1
    return n * factorial(n - 1)

# Binary search — O(log n) stack space
def binary_search(arr, target, lo, hi):
    if lo > hi: return -1
    mid = (lo + hi) // 2
    if arr[mid] == target: return mid
    elif arr[mid] < target: return binary_search(arr, target, mid+1, hi)
    else: return binary_search(arr, target, lo, mid-1)

# Fibonacci naive — O(n) stack (depth of recursion tree)
# NOTE: O(2^n) time but only O(n) space (tree depth, not breadth)
def fib(n):
    if n <= 1: return n
    return fib(n-1) + fib(n-2)
Module 02

Arrays & Strings

The most fundamental data structures — master contiguous memory, two-pointer techniques, sliding windows, prefix sums, and string matching algorithms.

Array Internals

An array stores elements in contiguous memory locations. Given base address B and element size S, the address of element at index i is: addr(i) = B + i × S. This formula enables O(1) random access.

Key Properties

  • Fixed size (static arrays) vs dynamic resizing (ArrayList, vector, Python list)
  • Cache-friendly: Sequential access patterns exploit CPU cache lines (typically 64 bytes), making array traversal extremely fast in practice
  • Insert/delete at arbitrary position: O(n) due to shifting
  • Append (dynamic array): O(1) amortized
Figure 2.1 — Array elements stored in contiguous memory (base address 0x1000, 4-byte elements) with 64-byte cache line boundaries

Two Pointers

Both pointers move monotonically, so together they make at most O(n) visits; no auxiliary storage is needed beyond a few index variables.

The two-pointer technique uses two indices to traverse an array, reducing O(n²) brute force to O(n). Two main variants:

Opposite Direction (Converging Pointers)

Start pointers at both ends and move them inward. Classic use case: finding pairs in a sorted array.

def two_sum_sorted(arr, target):
    """Find pair in sorted array summing to target. O(n) time, O(1) space."""
    left, right = 0, len(arr) - 1
    while left < right:
        curr_sum = arr[left] + arr[right]
        if curr_sum == target:
            return [left, right]
        elif curr_sum < target:
            left += 1       # Need larger sum
        else:
            right -= 1      # Need smaller sum
    return [-1, -1]

def is_palindrome(s):
    """Check palindrome. O(n) time, O(1) space."""
    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            return False
        left += 1
        right -= 1
    return True

Same Direction (Fast-Slow / Read-Write)

Both pointers move in the same direction. Used for in-place array modification and partitioning.

def remove_duplicates(arr):
    """Remove duplicates from sorted array in-place. O(n) time, O(1) space."""
    if not arr: return 0
    write = 1                        # write pointer
    for read in range(1, len(arr)):   # read pointer
        if arr[read] != arr[read - 1]:
            arr[write] = arr[read]
            write += 1
    return write

def dutch_national_flag(arr):
    """Sort array of 0s, 1s, 2s. O(n) time, O(1) space. (3-pointer variant)"""
    lo, mid, hi = 0, 0, len(arr) - 1
    while mid <= hi:
        if arr[mid] == 0:
            arr[lo], arr[mid] = arr[mid], arr[lo]
            lo += 1; mid += 1
        elif arr[mid] == 1:
            mid += 1
        else:
            arr[mid], arr[hi] = arr[hi], arr[mid]
            hi -= 1

Sliding Window

Each index enters and leaves the window at most once, so a full pass is O(n); hash-map state for an at-most-K-distinct window adds O(k) space, bounded by the alphabet size.

Sliding window maintains a contiguous subarray/substring and slides it across the input. Reduces O(n×k) brute force to O(n).

Fixed-Size Window

def max_sum_subarray(arr, k):
    """Maximum sum of subarray of size k. O(n) time."""
    window_sum = sum(arr[:k])
    max_sum = window_sum
    for i in range(k, len(arr)):
        window_sum += arr[i] - arr[i - k]  # slide: add right, remove left
        max_sum = max(max_sum, window_sum)
    return max_sum

Variable-Size Window

def min_subarray_len(target, arr):
    """Smallest subarray with sum >= target. O(n) time."""
    left = 0
    curr_sum = 0
    min_len = float('inf')
    for right in range(len(arr)):
        curr_sum += arr[right]             # expand window
        while curr_sum >= target:          # shrink while window still valid
            min_len = min(min_len, right - left + 1)
            curr_sum -= arr[left]
            left += 1
    return min_len if min_len != float('inf') else 0

def longest_unique_substring(s):
    """Longest substring without repeating chars. O(n) time."""
    seen = {}
    left = 0
    max_len = 0
    for right, ch in enumerate(s):
        if ch in seen and seen[ch] >= left:
            left = seen[ch] + 1           # jump left past duplicate
        seen[ch] = right
        max_len = max(max_len, right - left + 1)
    return max_len
Figure 2.2 — Sliding window state machine: expand right, check validity, shrink left, record the answer

Prefix Sums & Difference Arrays

1D Prefix Sum

Precompute cumulative sums to answer any range sum query in O(1):

# Build prefix sum: O(n)
prefix = [0] * (len(arr) + 1)
for i in range(len(arr)):
    prefix[i+1] = prefix[i] + arr[i]

# Query sum of arr[l..r] inclusive: O(1)
range_sum = prefix[r+1] - prefix[l]

Difference Array

The inverse of prefix sums. Allows O(1) range updates and O(n) final reconstruction:

# Apply +val to arr[l..r] inclusive
diff[l] += val
diff[r+1] -= val   # if r+1 within bounds

# Reconstruct with prefix sum
for i in range(1, len(diff)):
    diff[i] += diff[i-1]

2D Prefix Sum

For matrix range sum queries. Uses inclusion-exclusion principle:

# Build 2D prefix sum
for i in range(1, rows+1):
    for j in range(1, cols+1):
        prefix[i][j] = matrix[i-1][j-1] + prefix[i-1][j] + prefix[i][j-1] - prefix[i-1][j-1]

# Query sum of submatrix (r1,c1) to (r2,c2)
sub_sum = prefix[r2+1][c2+1] - prefix[r1][c2+1] - prefix[r2+1][c1] + prefix[r1][c1]

String Matching: KMP & Rabin-Karp

KMP never backtracks in the text; the failure-function preprocessing is O(m). Rabin-Karp averages O(n+m) with a rolling hash but falls to O(nm) on adversarial hash collisions.

KMP (Knuth-Morris-Pratt) — O(n + m)

KMP avoids redundant comparisons by precomputing a failure function (also called the LPS array — longest proper prefix which is also a suffix).

def compute_lps(pattern):
    """Build failure/LPS array. O(m) time."""
    m = len(pattern)
    lps = [0] * m
    length = 0
    i = 1
    while i < m:
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length > 0:
            length = lps[length - 1]  # fall back
        else:
            lps[i] = 0
            i += 1
    return lps

def kmp_search(text, pattern):
    """KMP pattern search. O(n+m) time."""
    lps = compute_lps(pattern)
    i = j = 0
    results = []
    while i < len(text):
        if text[i] == pattern[j]:
            i += 1; j += 1
        if j == len(pattern):
            results.append(i - j)
            j = lps[j - 1]
        elif i < len(text) and text[i] != pattern[j]:
            if j > 0:
                j = lps[j - 1]
            else:
                i += 1
    return results

Rabin-Karp — O(n + m) average

Uses a rolling hash to compare pattern hash with window hashes. Worst case O(nm) due to hash collisions.

def rabin_karp(text, pattern):
    """Rabin-Karp with rolling hash. O(n+m) average."""
    n, m = len(text), len(pattern)
    base, mod = 256, 10**9 + 7
    p_hash = 0
    t_hash = 0
    h = pow(base, m - 1, mod)
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        if p_hash == t_hash:
            if text[i:i+m] == pattern:  # verify to avoid collision
                return i
        if i < n - m:
            t_hash = ((t_hash - ord(text[i]) * h) * base + ord(text[i+m])) % mod
    return -1
Gotcha: String immutability causing O(n²) concatenation. In Python and Java, strings are immutable. Repeated concatenation with += creates a new string each time, turning an O(n) loop into O(n²). Use ''.join(list) in Python or StringBuilder in Java instead.
Module 03

Linked Lists

Pointer-based data structures — master node manipulation, cycle detection, reversal algorithms, and understand their trade-offs against arrays.

Types & Node Structure

Singly Linked List

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next     # pointer to next node

Doubly Linked List

class DListNode:
    def __init__(self, val=0, prev=None, next=None):
        self.val = val
        self.prev = prev     # pointer to previous node
        self.next = next     # pointer to next node

Circular Linked List

The last node's next pointer points back to the head, forming a cycle. Useful for round-robin scheduling, circular buffers.
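
As a quick illustration, traversal of a circular list wraps around instead of hitting None — a tiny sketch (the round_robin helper is hypothetical), reusing the ListNode class above:

def round_robin(head, k):
    """Yield k values, wrapping past the 'tail' back to the head."""
    node = head
    for _ in range(k):
        yield node.val
        node = node.next    # in a circular list this never becomes None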

| Operation | Singly | Doubly | Array |
|---|---|---|---|
| Access by index | O(n) | O(n) | O(1) |
| Insert at head | O(1) | O(1) | O(n) |
| Insert at tail | O(n) / O(1)* | O(1) | O(1) amortized |
| Delete given node | O(n) | O(1) | O(n) |
| Search | O(n) | O(n) | O(n) |
| Space per element | val + 1 ptr | val + 2 ptrs | val only |

*O(1) if tail pointer is maintained

Figure 3.1 — Singly linked list (head → 5 → 12 → 8 → 3 → null) with head pointer and null-terminated tail

Floyd's Cycle Detection (Tortoise and Hare)

Slow and fast pointers collide within at most one cycle traversal; no hashmap needed, so memory is two pointers total.

Detect if a linked list contains a cycle using O(1) space. Use two pointers moving at different speeds.

Algorithm

def has_cycle(head):
    """Detect cycle. O(n) time, O(1) space."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next           # 1 step
        fast = fast.next.next      # 2 steps
        if slow == fast:
            return True            # they meet inside cycle
    return False

def find_cycle_start(head):
    """Find where cycle begins. O(n) time, O(1) space."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            # Reset one pointer to head
            slow = head
            while slow != fast:
                slow = slow.next
                fast = fast.next   # both move 1 step now
            return slow            # cycle start
    return None

Why It Works (Proof Sketch)

Let the distance from head to cycle start be F, cycle length be C, and the meeting point be a steps into the cycle.

  • When they meet: slow traveled F + a, fast traveled F + a + kC (some laps)
  • Since fast moves twice as far: 2(F + a) = F + a + kC, so F + a = kC
  • Therefore F = kC - a, meaning if we start one pointer at head and one at meeting point (both at speed 1), they meet at the cycle start after F steps
Figure 3.2 — Floyd's algorithm on a list with F = 2 and C = 5: the slow (1-step) and fast (2-step) pointers meet inside the cycle

Reversal Techniques

Iterative Reversal (3-pointer technique)

def reverse_list(head):
    """Reverse singly linked list. O(n) time, O(1) space."""
    prev = None
    curr = head
    while curr:
        nxt = curr.next      # save next
        curr.next = prev     # reverse pointer
        prev = curr          # advance prev
        curr = nxt           # advance curr
    return prev              # new head

Recursive Reversal

def reverse_recursive(head):
    """Reverse recursively. O(n) time, O(n) stack space."""
    if not head or not head.next:
        return head
    new_head = reverse_recursive(head.next)
    head.next.next = head   # make next node point back
    head.next = None        # break original forward link
    return new_head

Reverse in K-Groups

def reverse_k_group(head, k):
    """Reverse nodes in groups of k. O(n) time, O(1) space."""
    # Check if k nodes remain
    count = 0
    node = head
    while node and count < k:
        node = node.next
        count += 1
    if count < k:
        return head    # fewer than k nodes, don't reverse

    # Reverse k nodes
    prev = None
    curr = head
    for _ in range(k):
        nxt = curr.next
        curr.next = prev
        prev = curr
        curr = nxt

    # head is now tail of reversed group
    head.next = reverse_k_group(curr, k)  # recurse on remaining
    return prev    # prev is new head of reversed group
Gotcha: Losing head reference during reversal. A common mistake is forgetting to save curr.next before overwriting it. Always use a temporary variable. Also, the original head becomes the new tail after reversal — don't return the wrong pointer.
Floyd's tortoise & hare on a list with cycle start at node 3
In-place reversal — 3-pointer pattern (prev, curr, next)
Module 04

Stacks & Queues

LIFO and FIFO abstractions — master monotonic stacks, queue variants, and their deep connection to recursion and BFS/DFS.

Stack Fundamentals

A stack is a Last-In, First-Out (LIFO) data structure. Think of a stack of plates — you can only add or remove from the top.

Core Operations (all O(1))

  • push(x): Add element to top
  • pop(): Remove and return top element
  • peek() / top(): View top element without removing
  • isEmpty(): Check if stack is empty

Classic Stack Problems

def valid_parentheses(s):
    """Check if brackets are balanced. O(n) time, O(n) space."""
    stack = []
    mapping = {')': '(', '}': '{', ']': '['}
    for ch in s:
        if ch in mapping:
            if not stack or stack[-1] != mapping[ch]:
                return False
            stack.pop()
        else:
            stack.append(ch)
    return len(stack) == 0

def eval_rpn(tokens):
    """Evaluate Reverse Polish Notation. O(n) time."""
    stack = []
    ops = {'+': lambda a,b: a+b, '-': lambda a,b: a-b,
           '*': lambda a,b: a*b, '/': lambda a,b: int(a/b)}
    for t in tokens:
        if t in ops:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[t](a, b))
        else:
            stack.append(int(t))
    return stack[0]

Monotonic Stack

Every index is pushed and popped at most once across the whole pass; the stack can hold up to n indices in the worst case.

A monotonic stack maintains elements in sorted order (either increasing or decreasing). It processes "next greater element" and "next smaller element" problems in O(n).

Next Greater Element

def next_greater_element(arr):
    """Find next greater element for each position. O(n) time."""
    n = len(arr)
    result = [-1] * n
    stack = []  # stores indices, maintains decreasing values
    for i in range(n):
        while stack and arr[stack[-1]] < arr[i]:
            idx = stack.pop()
            result[idx] = arr[i]   # arr[i] is next greater for arr[idx]
        stack.append(i)
    return result

Largest Rectangle in Histogram

def largest_rectangle(heights):
    """Classic monotonic stack problem. O(n) time."""
    stack = []  # stores indices, maintains increasing heights
    max_area = 0
    heights.append(0)  # sentinel to flush remaining bars
    for i, h in enumerate(heights):
        while stack and heights[stack[-1]] > h:
            height = heights[stack.pop()]
            width = i if not stack else i - stack[-1] - 1
            max_area = max(max_area, height * width)
        stack.append(i)
    return max_area
Figure 4.1 — Monotonic (increasing) stack processing [2, 1, 5, 6, 2, 3]: each element is pushed and popped at most once, so the whole pass is O(n)

Queue Variants

Standard Queue (FIFO)

First-In, First-Out. Primary operations: enqueue(x), dequeue(), peek(). Used for BFS, task scheduling, and buffering.

Double-Ended Queue (Deque)

Supports O(1) insertion and removal at both ends. Python's collections.deque is implemented as a doubly-linked list of fixed-size blocks.

from collections import deque
d = deque()
d.append(x)       # add to right — O(1)
d.appendleft(x)   # add to left  — O(1)
d.pop()            # remove from right — O(1)
d.popleft()        # remove from left  — O(1)

Sliding Window Maximum (Monotonic Deque)

def max_sliding_window(nums, k):
    """Max in each window of size k. O(n) time using deque."""
    dq = deque()  # stores indices, values in decreasing order
    result = []
    for i, num in enumerate(nums):
        # Remove elements outside window
        while dq and dq[0] <= i - k:
            dq.popleft()
        # Remove smaller elements (they'll never be the max)
        while dq and nums[dq[-1]] < num:
            dq.pop()
        dq.append(i)
        if i >= k - 1:
            result.append(nums[dq[0]])
    return result

Circular Buffer

Fixed-size array implementing a queue. Uses head and tail pointers that wrap around using modular arithmetic: next = (index + 1) % capacity. Used in OS kernel buffers and embedded systems.
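
A minimal sketch of the wrap-around arithmetic (the CircularBuffer class is illustrative, not a standard library type):

class CircularBuffer:
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0                      # index of the oldest element
        self.size = 0

    def enqueue(self, x):
        if self.size == len(self.buf):
            raise OverflowError("buffer full")
        tail = (self.head + self.size) % len(self.buf)   # wrap with modulo
        self.buf[tail] = x
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("buffer empty")
        x = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)      # wrap with modulo
        self.size -= 1
        return x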

Priority Queue (Bridge to Heaps)

A queue where elements are dequeued in priority order, not insertion order. Implemented using a heap (covered in Module 7). Key operations: insert O(log n), extract-min/max O(log n), peek O(1).

Gotcha: Every recursive solution can be converted to iterative with an explicit stack. Recursion implicitly uses the call stack. If you hit stack overflow or need to control memory, convert to iteration by maintaining your own stack. DFS with a stack is a classic example.
Module 05

Hash Tables

The workhorse of O(1) lookups — hash function design, collision resolution strategies, load factors, and probabilistic extensions like Bloom filters.

Hash Function Design

A hash function maps keys of arbitrary size to fixed-size integers (bucket indices). A good hash function must satisfy:

  • Deterministic: Same key always produces the same hash
  • Uniform distribution: Keys are spread evenly across buckets to minimize collisions
  • Avalanche effect: Small changes in input produce drastic changes in output
  • Efficient: O(1) to compute (for fixed-size keys)

Common Hash Functions

# Division method: h(k) = k mod m
# Choose m as a prime not close to a power of 2
def hash_div(key, m):
    return key % m

# Multiplication method: h(k) = floor(m * (k*A mod 1))
# Knuth suggests A = (sqrt(5)-1)/2 ≈ 0.6180339887
def hash_mult(key, m):
    A = 0.6180339887
    return int(m * ((key * A) % 1))

# String hashing (polynomial rolling hash)
def hash_string(s, m):
    h = 0
    base = 31
    for ch in s:
        h = (h * base + ord(ch)) % m
    return h
Figure 5.1 — Hash table mapping: keys pass through a hash function to determine bucket indices

Collision Resolution

When two keys hash to the same bucket, we have a collision. Two primary strategies:

1. Separate Chaining

Each bucket contains a linked list (or tree) of all entries that hash to that bucket.

  • Insert: Hash to bucket, prepend to chain — O(1)
  • Search: Hash to bucket, traverse chain — O(chain length)
  • Worst case: All keys hash to same bucket — O(n) per operation
  • Load factor α = n/m (entries/buckets). Average chain length = α
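
A minimal chaining sketch (the ChainedHashMap class is illustrative; real implementations add resizing and better bucket structures):

class ChainedHashMap:
    def __init__(self, m=8):
        self.buckets = [[] for _ in range(m)]   # one chain per bucket

    def put(self, key, value):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)         # overwrite existing key
                return
        chain.append((key, value))              # new key — O(1)

    def get(self, key):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for k, v in chain:                      # O(chain length)
            if k == key:
                return v
        return None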

2. Open Addressing

All entries stored in the table itself. On collision, probe for the next empty slot.

| Probing Strategy | Probe Sequence | Pros | Cons |
|---|---|---|---|
| Linear probing | h(k) + i | Cache-friendly, simple | Primary clustering |
| Quadratic probing | h(k) + i² | Reduces clustering | Secondary clustering, may miss slots |
| Double hashing | h₁(k) + i·h₂(k) | Best distribution | Two hash functions needed |
Figure 5.2 — Collision resolution: when A and D collide at one bucket, separate chaining stores both in that bucket's list, while linear probing places D in the next free slot

Load Factor & Dynamic Resizing

The load factor α = n/m (elements / buckets) determines performance. As α increases, collisions increase and performance degrades.

| Implementation | Default Load Factor | Resize Trigger | Growth Factor |
|---|---|---|---|
| Java HashMap | 0.75 | α > 0.75 | 2x |
| Python dict | 0.667 | α > 2/3 | ~2x (next power of 2) |
| C++ unordered_map | 1.0 | α > 1.0 | ~2x |
| Go map | 6.5 (bucket load) | avg bucket load > 6.5 | 2x |

Resizing process: Allocate a new array (typically 2x size), rehash every existing key into the new array. This single resize is O(n), but over n insertions, the amortized cost per insertion remains O(1).
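
Continuing the ChainedHashMap sketch from above, resizing might look like this (illustrative; the trigger check is left to the caller):

    def _resize(self):
        """Double the bucket count and rehash every entry — O(n)."""
        old = self.buckets
        self.buckets = [[] for _ in range(2 * len(old))]
        for chain in old:
            for key, value in chain:
                self.buckets[hash(key) % len(self.buckets)].append((key, value))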

Bloom Filters

A Bloom filter is a probabilistic data structure for set membership testing. It can tell you:

  • "Definitely NOT in the set" — guaranteed correct (no false negatives)
  • "Probably in the set" — may be wrong (false positives possible)

How It Works

Uses a bit array of m bits and k independent hash functions. To insert element x: set bits at positions h₁(x), h₂(x), ..., hₖ(x) to 1. To query x: check if ALL k bit positions are 1.

False Positive Rate

After inserting n elements: FPR ≈ (1 - e^(-kn/m))^k

Optimal number of hash functions: k = (m/n) · ln(2)
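
A compact sketch that derives its k indices from two base hashes via h₁ + i·h₂ (a standard double-hashing construction); the BloomFilter class itself is illustrative:

import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], 'big')
        h2 = int.from_bytes(digest[8:16], 'big') | 1   # odd, so strides vary
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1            # set all k bits

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))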

Figure 5.3 — Bloom filter: k = 3 hash functions set bits in a shared bit array; a membership query checks that all three bits are set
Gotcha: HashMap is O(1) average but O(n) worst case. When all keys hash to the same bucket, operations degrade to O(n). Java 8+ mitigates this by converting chains to balanced trees (Red-Black) when a bucket reaches 8 entries, giving O(log n) worst case per bucket. Also note: hash-based data structures have no ordering guarantees — use TreeMap/TreeSet if you need sorted iteration.
Module 06

Trees

The most versatile family of data structures — binary trees, BSTs, self-balancing trees, B-Trees, tries, and advanced query structures like segment trees and BITs.

Binary Tree Traversals

Four fundamental ways to visit every node in a binary tree:

Depth-First Traversals

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def inorder(root):
    """Left → Root → Right (gives sorted order for BST)"""
    if not root: return []
    return inorder(root.left) + [root.val] + inorder(root.right)

def preorder(root):
    """Root → Left → Right (serialize tree structure)"""
    if not root: return []
    return [root.val] + preorder(root.left) + preorder(root.right)

def postorder(root):
    """Left → Right → Root (delete tree, evaluate expressions)"""
    if not root: return []
    return postorder(root.left) + postorder(root.right) + [root.val]

Breadth-First (Level-Order)

from collections import deque

def level_order(root):
    """Level by level traversal using queue."""
    if not root: return []
    result = []
    queue = deque([root])
    while queue:
        level = []
        for _ in range(len(queue)):
            node = queue.popleft()
            level.append(node.val)
            if node.left: queue.append(node.left)
            if node.right: queue.append(node.right)
        result.append(level)
    return result
Figure 6.1 — Binary tree (root 4, children 2 and 6, leaves 1, 3, 5, 7) with its four traversals: inorder [1,2,3,4,5,6,7], preorder [4,2,1,3,6,5,7], postorder [1,3,2,5,7,6,4], level-order [[4],[2,6],[1,3,5,7]]

Binary Search Trees (BST)

Balanced trees (AVL/RB) keep height logarithmic, so search/insert/delete each touch O(log n) nodes; a degenerate skewed BST collapses to a linked list giving O(n).

A BST maintains the BST invariant: for every node, all values in the left subtree are smaller, and all values in the right subtree are larger.

Core Operations

def bst_search(root, target):
    """Search BST. O(h) time where h = tree height."""
    if not root or root.val == target:
        return root
    if target < root.val:
        return bst_search(root.left, target)
    return bst_search(root.right, target)

def bst_insert(root, val):
    """Insert into BST. O(h) time."""
    if not root:
        return TreeNode(val)
    if val < root.val:
        root.left = bst_insert(root.left, val)
    elif val > root.val:
        root.right = bst_insert(root.right, val)
    return root

def bst_delete(root, val):
    """Delete from BST. O(h) time. Three cases."""
    if not root: return None
    if val < root.val:
        root.left = bst_delete(root.left, val)
    elif val > root.val:
        root.right = bst_delete(root.right, val)
    else:
        # Case 1: No children (leaf)
        if not root.left and not root.right:
            return None
        # Case 2: One child
        if not root.left: return root.right
        if not root.right: return root.left
        # Case 3: Two children — replace with inorder successor
        successor = root.right
        while successor.left:
            successor = successor.left
        root.val = successor.val
        root.right = bst_delete(root.right, successor.val)
    return root

Inorder Successor & Predecessor

  • Inorder successor: If right subtree exists, it's the leftmost node in right subtree. Otherwise, it's the nearest ancestor for which the node is in the left subtree.
  • Inorder predecessor: If left subtree exists, it's the rightmost node in left subtree.
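
A sketch covering both successor cases for a BST without parent pointers:

def inorder_successor(root, node):
    """Successor of node in a BST. O(h) time, O(1) space."""
    if node.right:                      # case 1: leftmost of right subtree
        curr = node.right
        while curr.left:
            curr = curr.left
        return curr
    succ = None                         # case 2: walk down from the root,
    curr = root                         # remembering the last left turn
    while curr:
        if node.val < curr.val:
            succ = curr
            curr = curr.left
        elif node.val > curr.val:
            curr = curr.right
        else:
            break
    return succ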

AVL Trees

The balance invariant keeps height ≤ 1.44 log n, so every operation traces a single root-to-leaf path; insertion needs at most one single or double rotation in total, while deletion may rotate once per level.

An AVL tree is a self-balancing BST where the balance factor (height difference between left and right subtrees) of every node is -1, 0, or 1.

Four Rotation Types

| Imbalance | Pattern | Fix | Description |
|---|---|---|---|
| Left-Left (LL) | BF > 1, left child BF ≥ 0 | Right rotate | Single right rotation at imbalanced node |
| Right-Right (RR) | BF < -1, right child BF ≤ 0 | Left rotate | Single left rotation at imbalanced node |
| Left-Right (LR) | BF > 1, left child BF < 0 | Left rotate child, then right rotate | Double rotation |
| Right-Left (RL) | BF < -1, right child BF > 0 | Right rotate child, then left rotate | Double rotation |
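
The right rotation for the LL case in code, matching Figure 6.2 (height bookkeeping omitted for brevity):

def right_rotate(y):
    """LL fix: left child x becomes the new subtree root."""
    x = y.left
    t2 = x.right        # subtree between x and y
    x.right = y
    y.left = t2
    return x            # caller re-links x in place of y
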
Figure 6.2 — AVL right rotation to fix an LL imbalance: node 20 becomes the new root, 30 becomes its right child

Red-Black Trees

A Red-Black tree is a self-balancing BST with less strict balancing than AVL. Each node stores a color bit (red or black), and the tree satisfies five properties:

  1. Every node is either red or black
  2. The root is black
  3. Every leaf (NIL sentinel) is black
  4. Red property: A red node cannot have a red child (no two consecutive reds)
  5. Black-height property: Every path from a node to its descendant NIL nodes has the same number of black nodes

These properties guarantee that the longest path is at most 2x the shortest path, giving O(log n) height.

Insertion Cases

After BST insert (new node is red), fix violations by case analysis on the uncle node's color:

  • Case 1: Uncle is red → recolor parent, uncle, and grandparent
  • Case 2: Uncle is black, node is inner child → rotate to make it case 3
  • Case 3: Uncle is black, node is outer child → rotate grandparent and recolor

B-Trees & B+ Trees

B-Tree (Order m)

A self-balancing multi-way search tree designed for disk-based storage systems:

  • Each node can have up to m children and m-1 keys
  • All leaves are at the same depth (perfectly balanced)
  • Internal nodes (except root) have at least ⌈m/2⌉ children
  • Keys within a node are sorted
  • Insert: If a node overflows (m keys), split into two nodes and push the median up

B+ Tree

Variant of B-Tree where all data records live in leaf nodes and leaves are connected via a linked list for efficient range queries.

Figure 6.3 — B+ tree (order 3) with linked leaf nodes enabling efficient range scans, as used by MySQL InnoDB and PostgreSQL B-tree indexes

Tries (Prefix Trees)

Lookup follows L character edges, independent of dictionary size; space scales with the total number of nodes — at most one per stored character — each holding up to alphabet-size child pointers.

A trie stores strings character by character, with each edge representing a character. Enables O(L) insert/search/prefix-search where L is the string length — independent of the number of strings stored.

class TrieNode:
    def __init__(self):
        self.children = {}       # char → TrieNode
        self.is_end = False      # marks end of a word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def search(self, word):
        node = self._find(word)
        return node is not None and node.is_end

    def starts_with(self, prefix):
        return self._find(prefix) is not None

    def _find(self, prefix):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return None
            node = node.children[ch]
        return node
Figure 6.4 — Trie storing "cat", "car", "card", "dog", "dot" with shared prefixes; an asterisk marks the end of a word

Segment Trees & Binary Indexed Trees (BIT/Fenwick)

A range query decomposes into at most 2·log n canonical segments; Fenwick trees use the same log-depth traversal of implicit binary indices.

Segment Tree

Supports range queries and point/range updates in O(log n). Each node represents a range of the original array.

class SegmentTree:
    def __init__(self, arr):
        self.n = len(arr)
        self.tree = [0] * (4 * self.n)
        self._build(arr, 1, 0, self.n - 1)

    def _build(self, arr, node, start, end):
        if start == end:
            self.tree[node] = arr[start]
            return
        mid = (start + end) // 2
        self._build(arr, 2*node, start, mid)
        self._build(arr, 2*node+1, mid+1, end)
        self.tree[node] = self.tree[2*node] + self.tree[2*node+1]

    def query(self, node, start, end, l, r):
        """Range sum query [l, r]. O(log n)."""
        if r < start or end < l: return 0
        if l <= start and end <= r: return self.tree[node]
        mid = (start + end) // 2
        return self.query(2*node, start, mid, l, r) + \
               self.query(2*node+1, mid+1, end, l, r)

    def update(self, node, start, end, idx, val):
        """Point update. O(log n)."""
        if start == end:
            self.tree[node] = val
            return
        mid = (start + end) // 2
        if idx <= mid:
            self.update(2*node, start, mid, idx, val)
        else:
            self.update(2*node+1, mid+1, end, idx, val)
        self.tree[node] = self.tree[2*node] + self.tree[2*node+1]

Binary Indexed Tree (Fenwick Tree)

Simpler alternative to segment tree for prefix sums with point updates. Uses clever bit manipulation (lowbit = x & -x) to navigate the tree.

class BIT:
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)

    def update(self, i, delta):
        """Add delta to index i. O(log n)."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)   # add lowest set bit

    def prefix_sum(self, i):
        """Sum of arr[1..i]. O(log n)."""
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)   # remove lowest set bit
        return s

    def range_sum(self, l, r):
        return self.prefix_sum(r) - self.prefix_sum(l - 1)
Figure 6.5 — Segment tree for [1, 3, 5, 7, 9, 11]: each node stores the sum of its range, with 36 at the root covering [0-5]
| Feature | Segment Tree | BIT (Fenwick) |
|---|---|---|
| Range query | O(log n) | O(log n) |
| Point update | O(log n) | O(log n) |
| Range update | O(log n) with lazy propagation | O(log n) with range trick |
| Space | O(4n) | O(n) |
| Code complexity | Higher | Lower |
| Flexibility | Any associative operation | Mainly prefix sums |
Gotcha: BST inorder traversal always gives sorted output. This is a direct consequence of the BST invariant. If you perform an inorder traversal and the output is not sorted, the tree is not a valid BST. This property is the basis for the "Validate BST" interview question.
Module 07

Heaps

Priority-based data structures — binary heap internals, array representation, build-heap optimization, and classic applications like Top-K and median finding.

Binary Heap Fundamentals

A binary heap is a complete binary tree satisfying the heap property:

  • Max-Heap: Every parent ≥ its children (root is maximum)
  • Min-Heap: Every parent ≤ its children (root is minimum)
  • Complete: All levels filled except possibly the last, which is filled left-to-right

Array Representation

A complete binary tree maps perfectly to an array with zero-based indexing:

  • Parent of i: (i - 1) // 2
  • Left child of i: 2*i + 1
  • Right child of i: 2*i + 2
Figure 7.1 — Max-heap [90, 70, 50, 40, 60, 20, 30] shown as both a tree and its array representation

Heap Operations

Sift-up and sift-down walk a single root-to-leaf path of length log n; peek is a single array index read.

Insert (Bubble Up / Sift Up)

Add element at the end, then bubble it up by swapping with parent while heap property is violated. O(log n).

Extract Min/Max (Bubble Down / Sift Down)

Replace root with last element, then bubble it down by swapping with the smaller (min-heap) or larger (max-heap) child. O(log n).
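
A list-based min-heap sketch of both operations:

def sift_up(heap, i):
    """Bubble heap[i] up toward the root."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] >= heap[parent]:
            break
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

def sift_down(heap, i):
    """Bubble heap[i] down toward the leaves."""
    n = len(heap)
    while True:
        smallest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest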

import heapq

# Python heapq is a MIN-HEAP
heap = []
heapq.heappush(heap, 5)     # insert: O(log n)
heapq.heappush(heap, 3)
heapq.heappush(heap, 7)
min_val = heapq.heappop(heap) # extract min: O(log n) → returns 3
peek = heap[0]               # peek min: O(1) → returns 5

# For MAX-HEAP, negate values
max_heap = []
heapq.heappush(max_heap, -5)
heapq.heappush(max_heap, -3)
max_val = -heapq.heappop(max_heap)  # → 5

# Build heap from list: O(n) — NOT O(n log n)!
arr = [4, 1, 7, 3, 8, 5]
heapq.heapify(arr)  # in-place O(n) transformation

Build Heap: O(n) Analysis

The naive analysis says n inserts × O(log n) each = O(n log n). But a tighter analysis, sifting down from the bottom up, shows O(n):

  • n/2 nodes at leaves — sift-down cost 0
  • n/4 nodes at height 1 — sift-down cost 1
  • n/8 nodes at height 2 — sift-down cost 2
  • Total: ∑ n/(2^(h+1)) × h = (n/2) × ∑ h/2^h = (n/2) × 2 = n = O(n)
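
Using the sift_down sketch from earlier, build-heap is just a bottom-up pass over the internal nodes:

def build_heap(arr):
    """Heapify in O(n): sift down from the last internal node to the root."""
    for i in range(len(arr) // 2 - 1, -1, -1):
        sift_down(arr, i)
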
Figure 7.2 — Build-heap cost by level: the n/2 leaves cost 0, the n/4 nodes at height 1 cost 1, and only the root costs O(log n); most nodes sit near the bottom, so the total is O(n)

Applications: Top-K & Median Finding

Top-K Largest Elements

Use a min-heap of size K. The root is always the K-th largest element seen so far.

def top_k_largest(nums, k):
    """Find k largest elements. O(n log k) time, O(k) space."""
    heap = nums[:k]
    heapq.heapify(heap)         # min-heap of size k
    for num in nums[k:]:
        if num > heap[0]:       # larger than smallest in heap
            heapq.heapreplace(heap, num)  # pop min, push new
    return heap

Running Median (Two-Heap Technique)

Maintain a max-heap for the lower half and a min-heap for the upper half. Median is always accessible from the heap tops.

class MedianFinder:
    def __init__(self):
        self.lo = []  # max-heap (negate values) for lower half
        self.hi = []  # min-heap for upper half

    def add_num(self, num):
        heapq.heappush(self.lo, -num)
        # Ensure lo's max <= hi's min
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Balance sizes: lo can have at most 1 more than hi
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def find_median(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2
Gotcha: Build-heap is O(n), NOT O(n log n). This is one of the most commonly misremembered complexities. The key insight is that most nodes are near the bottom of the tree, where sift-down does very little work. Always use heapify() instead of n individual inserts when building a heap from scratch.
Module 08

Graphs

The most general data structure — representations, traversals, shortest paths, topological ordering, union-find, and minimum spanning trees.

Graph Representations

| Representation | Space | Add Edge | Check Edge | Iterate Neighbors | Best For |
|---|---|---|---|---|---|
| Adjacency Matrix | O(V²) | O(1) | O(1) | O(V) | Dense graphs, quick edge lookups |
| Adjacency List | O(V+E) | O(1) | O(degree) | O(degree) | Sparse graphs (most common) |
| Edge List | O(E) | O(1) | O(E) | O(E) | Kruskal's MST, simple representation |
Figure 8.1 — The same undirected graph (edges 0-1, 0-2, 0-3, 1-3, 2-3) shown as an adjacency matrix, adjacency list, and edge list

BFS vs DFS

Each vertex and edge is explored at most once; the visited set and queue/stack together hold at most V elements at a time.

from collections import deque

def bfs(graph, start):
    """Breadth-First Search. O(V+E) time, O(V) space."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

def dfs(graph, start):
    """Depth-First Search (iterative). O(V+E) time, O(V) space."""
    visited = set()
    stack = [start]
    order = []
    while stack:
        node = stack.pop()
        if node in visited: continue
        visited.add(node)
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                stack.append(neighbor)
    return order
Figure 8.2 — BFS explores neighbors level by level using a queue; DFS goes deep before backtracking using a stack

When to Use Which

| Use BFS When | Use DFS When |
|---|---|
| Finding shortest path (unweighted) | Detecting cycles |
| Level-order traversal needed | Topological sorting |
| Finding connected components | Solving mazes / puzzles |
| Shortest transformation sequence | Finding strongly connected components |
| Multi-source shortest path | Path existence check |

Shortest Path Algorithms

With a binary heap each vertex is popped once (V log V) and each edge relaxation can push a new entry (E log V); an adjacency list gives a linear scan of each vertex's neighbors.

Dijkstra's Algorithm — O((V+E) log V)

Greedy approach using a priority queue. Works only with non-negative edge weights.

import heapq

def dijkstra(graph, src):
    """Dijkstra's shortest path. graph[u] = [(v, weight), ...]"""
    dist = {src: 0}
    pq = [(0, src)]  # (distance, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue  # skip outdated entry
        for v, w in graph[u]:
            new_dist = d + w
            if new_dist < dist.get(v, float('inf')):
                dist[v] = new_dist
                heapq.heappush(pq, (new_dist, v))
    return dist

Bellman-Ford — O(V · E)

Handles negative edge weights. Relaxes all edges V-1 times. Detects negative cycles with one more pass.

def bellman_ford(n, edges, src):
    """edges = [(u, v, weight), ...]. Returns dist[] or None if negative cycle."""
    dist = [float('inf')] * n
    dist[src] = 0
    for _ in range(n - 1):       # V-1 relaxation passes
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Check for negative cycles
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None   # negative cycle detected!
    return dist

Floyd-Warshall — O(V³)

All-pairs shortest paths using dynamic programming. Works with negative weights (no negative cycles).

def floyd_warshall(dist):
    """dist[i][j] = weight of edge i→j (inf if no edge). Modifies in-place."""
    n = len(dist)
    for k in range(n):         # intermediate vertex
        for i in range(n):     # source
            for j in range(n): # destination
                dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
    return dist
Figure 8.3 — Edge relaxation, the fundamental operation in all shortest path algorithms: dist[v] = min(dist[v], dist[u] + w(u, v)); here dist[v] improves from 10 to 5 + 3 = 8 via u

Topological Sort

A linear ordering of vertices in a DAG (Directed Acyclic Graph) such that for every edge u→v, u comes before v. Used for task scheduling, course prerequisites, build systems.

Kahn's Algorithm (BFS-based)

def topological_sort_kahn(graph, n):
    """Kahn's BFS-based topo sort. O(V+E)."""
    in_degree = [0] * n
    for u in range(n):
        for v in graph[u]:
            in_degree[v] += 1
    queue = deque([v for v in range(n) if in_degree[v] == 0])
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)
    return order if len(order) == n else []  # empty = cycle exists

DFS-based Topological Sort

def topological_sort_dfs(graph, n):
    """DFS-based topo sort using post-order reversal. O(V+E)."""
    visited = [False] * n
    stack = []
    def dfs(u):
        visited[u] = True
        for v in graph[u]:
            if not visited[v]:
                dfs(v)
        stack.append(u)  # add after all descendants processed
    for u in range(n):
        if not visited[u]:
            dfs(u)
    return stack[::-1]  # reverse post-order

Union-Find (Disjoint Set Union)

With both path compression and union-by-rank, the amortized cost per operation is the inverse-Ackermann function α(n) ≤ 4 for any practical n.

Tracks a collection of disjoint sets with near-constant time operations. Essential for Kruskal's MST, connected components, and cycle detection.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.count = n  # number of components

    def find(self, x):
        """Find root with path compression. Amortized O(α(n))."""
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        """Union by rank. Amortized O(α(n))."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return False  # already connected
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        self.count -= 1
        return True

    def connected(self, x, y):
        return self.find(x) == self.find(y)

Minimum Spanning Trees

A spanning tree connects all vertices with minimum total edge weight. Two classic algorithms:

| Feature | Kruskal's | Prim's |
|---|---|---|
| Approach | Edge-centric (sort all edges) | Vertex-centric (grow from source) |
| Data Structure | Union-Find | Priority Queue |
| Time | O(E log E) | O((V+E) log V) |
| Better for | Sparse graphs (E ≈ V) | Dense graphs (E ≈ V²) |

def kruskal(n, edges):
    """Kruskal's MST. edges = [(weight, u, v), ...]. O(E log E)."""
    edges.sort()
    uf = UnionFind(n)
    mst = []
    total = 0
    for w, u, v in edges:
        if uf.union(u, v):
            mst.append((u, v, w))
            total += w
            if len(mst) == n - 1: break
    return total, mst
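
For comparison, a lazy-deletion sketch of Prim's, using the same adjacency-list convention as dijkstra above and assuming the graph is connected:

import heapq

def prim(graph, src=0):
    """Prim's MST. graph[u] = [(v, weight), ...]. O((V+E) log V)."""
    visited = {src}
    pq = [(w, src, v) for v, w in graph[src]]
    heapq.heapify(pq)
    total, mst = 0, []
    while pq and len(visited) < len(graph):
        w, u, v = heapq.heappop(pq)
        if v in visited:
            continue                 # stale entry — lazy deletion
        visited.add(v)
        total += w
        mst.append((u, v, w))
        for nxt, nw in graph[v]:
            if nxt not in visited:
                heapq.heappush(pq, (nw, v, nxt))
    return total, mst
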
Gotcha: Dijkstra fails with negative edge weights. Dijkstra is a greedy algorithm that assumes once a vertex's shortest distance is finalized, it won't improve. Negative edges can violate this. Use Bellman-Ford instead for graphs with negative edges, or Johnson's algorithm for all-pairs with negative edges.
Module 09

Dynamic Programming

Break complex problems into overlapping subproblems with optimal substructure — memoization, tabulation, classic 1D/2D patterns, knapsack, interval DP, and space optimization.

Core Concepts

Dynamic programming (DP) is an optimization technique that solves problems by combining solutions to overlapping subproblems. Two properties must hold for DP to apply:

1. Overlapping Subproblems

The problem can be broken down into subproblems that are reused multiple times. Without overlap, you just have divide-and-conquer (like merge sort), not DP.

Example: Computing fib(5) recursively calls fib(3) twice and fib(2) three times. These repeated computations are overlapping subproblems.

2. Optimal Substructure

An optimal solution to the problem can be constructed from optimal solutions to its subproblems. In other words, you can solve a problem by combining the best answers to smaller versions of itself.

Example: The shortest path from A to C through B is the shortest path from A to B plus the shortest path from B to C. This is optimal substructure.

| Property | Meaning | Without It |
|---|---|---|
| Overlapping Subproblems | Same subproblem is solved multiple times | Divide-and-conquer (no caching needed) |
| Optimal Substructure | Optimal solution built from optimal sub-solutions | Greedy or brute force may be needed |

Memoization vs Tabulation

There are two ways to implement DP: top-down (memoization) and bottom-up (tabulation).

| Feature | Memoization (Top-Down) | Tabulation (Bottom-Up) |
|---|---|---|
| Direction | Starts from the original problem, recurses down | Starts from base cases, builds up |
| Implementation | Recursive + hash map/array cache | Iterative + DP table |
| Subproblems solved | Only those actually needed (lazy) | All subproblems in the table (eager) |
| Stack overhead | O(n) call stack (risk of stack overflow) | No recursion, O(1) stack |
| Cache performance | Poor — jumps around in memory | Excellent — sequential memory access |
| Ease of coding | Often more intuitive (just add cache to recursion) | Requires understanding fill order |
| Space optimization | Harder to reduce space | Easy to use rolling array |

# Top-Down (Memoization)
def fib_memo(n, memo={}):
    if n <= 1: return n
    if n not in memo:
        memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
    return memo[n]

# Bottom-Up (Tabulation)
def fib_tab(n):
    if n <= 1: return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]

# Space-Optimized (Rolling Variables)
def fib_opt(n):
    if n <= 1: return n
    prev2, prev1 = 0, 1
    for _ in range(2, n + 1):
        prev2, prev1 = prev1, prev2 + prev1
    return prev1
Figure 9.1 — Fibonacci recursion tree for fib(5): without memoization fib(3) is computed twice and fib(2) three times; memoization reduces O(2ⁿ) calls to O(n)

1D DP Patterns

The simplest DP problems involve a single dimension — usually an index into an array or a running total. The key is defining dp[i] clearly.

Climbing Stairs

You can climb 1 or 2 steps at a time. How many distinct ways to reach step n?

State: dp[i] = number of ways to reach step i. Transition: dp[i] = dp[i-1] + dp[i-2].

def climb_stairs(n):
    """O(n) time, O(1) space with rolling variables."""
    if n <= 2: return n
    a, b = 1, 2
    for _ in range(3, n + 1):
        a, b = b, a + b
    return b

Coin Change (Minimum Coins)

Given coins of different denominations, find the fewest coins to make a target amount.

State: dp[i] = min coins to make amount i. Transition: dp[i] = min(dp[i - c] + 1) for each coin c.

def coin_change(coins, amount):
    """O(amount * len(coins)) time, O(amount) space."""
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    for i in range(1, amount + 1):
        for c in coins:
            if c <= i and dp[i - c] + 1 < dp[i]:
                dp[i] = dp[i - c] + 1
    return dp[amount] if dp[amount] != float('inf') else -1

Longest Increasing Subsequence (LIS)

State: dp[i] = length of LIS ending at index i. Transition: dp[i] = max(dp[j] + 1) for all j < i where nums[j] < nums[i].

def lis(nums):
    """O(n^2) DP solution."""
    n = len(nums)
    dp = [1] * n
    for i in range(1, n):
        for j in range(i):
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp)

import bisect
def lis_optimized(nums):
    """O(n log n) using patience sorting / binary search."""
    tails = []  # tails[i] = smallest tail of all increasing subseqs of length i+1
    for x in nums:
        pos = bisect.bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return len(tails)

House Robber

Cannot rob two adjacent houses. Maximize total loot.

State: dp[i] = max money from houses 0..i. Transition: dp[i] = max(dp[i-1], dp[i-2] + nums[i]).

def rob(nums):
    """O(n) time, O(1) space."""
    prev2 = prev1 = 0
    for num in nums:
        prev2, prev1 = prev1, max(prev1, prev2 + num)
    return prev1

2D DP Patterns

When the state depends on two parameters (two indices, two strings, grid coordinates), we use a 2D DP table.

Longest Common Subsequence (LCS)

State: dp[i][j] = length of LCS of s1[0..i-1] and s2[0..j-1].

Transition:

  • If s1[i-1] == s2[j-1]: dp[i][j] = dp[i-1][j-1] + 1 (match, extend)
  • Else: dp[i][j] = max(dp[i-1][j], dp[i][j-1]) (skip one character from either string)
def lcs(s1, s2):
    """O(m*n) time and space."""
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]
Figure 9.2 — LCS DP table for "ABCBDAB" and "BDCAB". Green cells show diagonal matches used in backtracking.

Edit Distance (Levenshtein Distance)

Minimum insertions, deletions, and substitutions to transform one string into another.

State: dp[i][j] = edit distance between s1[0..i-1] and s2[0..j-1].

def edit_distance(s1, s2):
    """O(m*n) time and space."""
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1): dp[i][0] = i
    for j in range(n + 1): dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(
                    dp[i - 1][j],     # delete
                    dp[i][j - 1],     # insert
                    dp[i - 1][j - 1]  # replace
                )
    return dp[m][n]

Unique Paths (Grid Traversal)

Count paths from top-left to bottom-right of an m x n grid, moving only right or down.

State: dp[i][j] = number of paths to reach cell (i, j). Transition: dp[i][j] = dp[i-1][j] + dp[i][j-1].

def unique_paths(m, n):
    """O(m*n) time, O(n) space with 1D rolling."""
    dp = [1] * n
    for i in range(1, m):
        for j in range(1, n):
            dp[j] += dp[j - 1]
    return dp[n - 1]

Knapsack Family

Every (item, capacity) pair contributes a constant-work cell; traversing capacities right-to-left for 0/1 knapsack lets us reuse a single 1-D row.

The knapsack problem is a fundamental DP archetype with several variants, each requiring a different approach.

Variant | Item Reuse? | Approach | Time | Space
0/1 Knapsack | No (each item once) | DP with 2D or 1D rolling | O(n · W) | O(W)
Unbounded Knapsack | Yes (unlimited copies) | DP — iterate capacities forward | O(n · W) | O(W)
Fractional Knapsack | Yes (fractions allowed) | Greedy — sort by value/weight ratio | O(n log n) | O(1)

0/1 Knapsack

Given n items with weights and values, maximize total value without exceeding capacity W. Each item can be taken at most once.

State: dp[i][w] = max value using items 0..i-1 with capacity w.

Transition: dp[i][w] = max(dp[i-1][w], dp[i-1][w - wt[i]] + val[i]) if wt[i] ≤ w.

def knapsack_01(weights, values, W):
    """0/1 Knapsack. O(n*W) time, O(W) space with 1D rolling."""
    n = len(weights)
    dp = [0] * (W + 1)
    for i in range(n):
        # Traverse backwards to avoid using same item twice
        for w in range(W, weights[i] - 1, -1):
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    return dp[W]

Unbounded Knapsack

Same as 0/1 but each item can be used unlimited times. The key difference: iterate capacities in increasing order (forward), so a capacity can build on an item already taken at a smaller capacity.

def knapsack_unbounded(weights, values, W):
    """Unbounded Knapsack. O(n*W) time, O(W) space."""
    dp = [0] * (W + 1)
    for w in range(1, W + 1):
        for i in range(len(weights)):
            if weights[i] <= w:
                dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    return dp[W]

Fractional Knapsack (Greedy)

Items can be broken into fractions. Greedy works because the problem has the greedy choice property: always take the item with the highest value-to-weight ratio.

def knapsack_fractional(weights, values, W):
    """Fractional Knapsack (greedy). O(n log n) time."""
    items = sorted(zip(values, weights), key=lambda x: x[0] / x[1], reverse=True)
    total = 0
    for v, w in items:
        if W >= w:
            total += v
            W -= w
        else:
            total += v * (W / w)  # take fraction
            break
    return total
💡
Key Insight — 0/1 vs Unbounded Loop Order: In 0/1 knapsack, iterate weights backwards to ensure each item is used at most once. In unbounded knapsack, iterate forwards to allow reuse. This single difference determines the variant.

Interval & String DP

Interval DP

Problems where the state is defined by a contiguous interval [i, j]. The subproblems are smaller intervals, and we try all possible split points.

Classic example: Matrix Chain Multiplication — find the optimal way to parenthesize a chain of matrices to minimize scalar multiplications.

def matrix_chain(dims):
    """Matrix chain multiplication. dims[i] = rows of matrix i, dims[n] = cols of last.
    O(n^3) time, O(n^2) space."""
    n = len(dims) - 1
    dp = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):        # chain length
        for i in range(n - length + 1):    # start
            j = i + length - 1              # end
            dp[i][j] = float('inf')
            for k in range(i, j):           # split point
                cost = dp[i][k] + dp[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                dp[i][j] = min(dp[i][j], cost)
    return dp[0][n - 1]

Palindrome Partitioning

Another interval DP classic — minimum cuts to partition a string into palindromes.

def min_palindrome_cuts(s):
    """Minimum cuts to partition into palindromes. O(n^2)."""
    n = len(s)
    # is_pal[i][j] = True if s[i..j] is palindrome
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            if s[i] == s[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True

    dp = list(range(n))  # dp[i] = min cuts for s[0..i]
    for i in range(1, n):
        if is_pal[0][i]:
            dp[i] = 0
        else:
            for j in range(i):
                if is_pal[j + 1][i]:
                    dp[i] = min(dp[i], dp[j] + 1)
    return dp[n - 1]

State Design Principles

  • Define the state clearly: What does dp[i] (or dp[i][j]) represent? Write it in plain English before coding.
  • Identify the transition: How does dp[i] relate to smaller subproblems? What choices do you make at each step?
  • Set base cases: What are the trivial subproblems you can solve directly?
  • Determine fill order: For tabulation, ensure all dependencies are computed before the current cell.
  • Optimize space: If dp[i] only depends on dp[i-1] (or a constant number of previous rows), use a rolling array to reduce space from O(n) to O(1) or from O(n×m) to O(m).

Space Optimization: Rolling Array

When each row of the DP table only depends on the previous row, keep only two rows (or even one with careful iteration order) to reduce space.

# LCS with O(min(m,n)) space using rolling array
def lcs_space_optimized(s1, s2):
    if len(s1) < len(s2): s1, s2 = s2, s1  # s2 is shorter
    m, n = len(s1), len(s2)
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        curr = [0] * (n + 1)
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[n]
Gotcha: Not every recursive problem is DP. DP requires overlapping subproblems. If every subproblem is unique (like in merge sort or binary search), caching provides no benefit — it's just divide-and-conquer. Ask yourself: "Will I solve the same subproblem more than once?" If yes, it's DP. If no, it's divide-and-conquer.
LIS table on [10,9,2,5,3,7,101,18] — dp[i] = LIS ending at i
0/1 Knapsack — items [(w=1,v=1),(w=3,v=4),(w=4,v=5),(w=5,v=7)], capacity 7
Edit distance "horse" to "ros" — answer 3
Module 10

Sorting & Searching

Master every comparison and non-comparison sort, binary search patterns, and selection algorithms — the bedrock of algorithmic problem solving.

Comparison Sorts

The decision-tree argument proves any comparison sort needs Ω(n log n) comparisons; quicksort is in-place with an O(log n) recursion stack, while mergesort needs an O(n) auxiliary buffer.

Comparison-based sorts determine order by comparing pairs of elements. The theoretical lower bound for any comparison sort is Ω(n log n) in the worst case (proved via decision tree argument).

Algorithm | Best | Average | Worst | Space | Stable? | In-Place?
Bubble Sort | O(n) | O(n²) | O(n²) | O(1) | Yes | Yes
Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | No | Yes
Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Yes | Yes
Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | No
Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No | Yes
Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No | Yes
Tim Sort | O(n) | O(n log n) | O(n log n) | O(n) | Yes | No

Merge Sort

Divide-and-conquer: split array in half, recursively sort each half, merge. Guaranteed O(n log n) but needs O(n) extra space.

def merge_sort(arr):
    """O(n log n) time, O(n) space. Stable."""
    if len(arr) <= 1: return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= for stability
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result
Figure 10.1 — Merge sort divide-and-conquer tree: split until singletons, then merge upward

Quick Sort

Choose a pivot, partition the array into elements < pivot and ≥ pivot, then recursively sort the partitions. In-place, O(n log n) average, but O(n²) worst case (e.g., already-sorted input with a naive first/last-element pivot).

import random

def quicksort(arr, lo=0, hi=None):
    """In-place quicksort with random pivot. O(n log n) average."""
    if hi is None: hi = len(arr) - 1
    if lo >= hi: return
    pivot_idx = partition(arr, lo, hi)
    quicksort(arr, lo, pivot_idx - 1)
    quicksort(arr, pivot_idx + 1, hi)

def partition(arr, lo, hi):
    # Random pivot to avoid O(n^2) on sorted input
    rand = random.randint(lo, hi)
    arr[rand], arr[hi] = arr[hi], arr[rand]
    pivot = arr[hi]
    i = lo
    for j in range(lo, hi):
        if arr[j] < pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[hi] = arr[hi], arr[i]
    return i

Tim Sort

Hybrid of merge sort and insertion sort, used by Python (sorted(), list.sort()) and Java (Arrays.sort() for objects). Key ideas:

  • Divide input into small "runs" (naturally sorted subsequences or forced to min length ~32)
  • Sort each run with insertion sort (fast for small/nearly-sorted data)
  • Merge runs using a modified merge sort with galloping mode for highly unbalanced merges
  • O(n) best case (already sorted), O(n log n) worst case, always stable
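
The run-plus-merge structure is easy to sketch. Below is a toy version, assuming fixed-length runs of 32 (CPython's minimum run size) instead of detected natural runs, and no galloping; it reuses the merge() helper from the Merge Sort section above.

# Toy Tim-sort: insertion-sort fixed runs, then merge runs pairwise.
# Real Timsort detects natural runs and adds galloping; this is the skeleton.
def insertion_sort_range(a, lo, hi):
    for i in range(lo + 1, hi):
        x, j = a[i], i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x

def tim_sort_toy(a, run=32):
    n = len(a)
    for lo in range(0, n, run):                       # 1. sort each run
        insertion_sort_range(a, lo, min(lo + run, n))
    width = run
    while width < n:                                  # 2. merge runs pairwise
        for lo in range(0, n, 2 * width):
            mid, hi = min(lo + width, n), min(lo + 2 * width, n)
            a[lo:hi] = merge(a[lo:mid], a[mid:hi])    # merge() from Merge Sort above
        width *= 2
    return a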

Non-Comparison Sorts

These algorithms break the Ω(n log n) barrier by not comparing elements. Instead, they exploit the structure of the data (integers, bounded range).

Algorithm | Time | Space | Stable? | When to Use
Counting Sort | O(n + k) | O(n + k) | Yes | Integers with small range k
Radix Sort | O(d · (n + k)) | O(n + k) | Yes | Fixed-length integers/strings, d digits
Bucket Sort | O(n + k) average | O(n + k) | Yes | Uniformly distributed floating-point numbers

Counting Sort

Count occurrences of each value, then reconstruct the sorted array. Only works for non-negative integers in a bounded range.

def counting_sort(arr, max_val):
    """O(n + k) time and space, where k = max_val + 1."""
    count = [0] * (max_val + 1)
    for x in arr:
        count[x] += 1
    # Build prefix sums for stable placement
    for i in range(1, len(count)):
        count[i] += count[i - 1]
    output = [0] * len(arr)
    for x in reversed(arr):  # reversed for stability
        count[x] -= 1
        output[count[x]] = x
    return output

Radix Sort

Sort digit by digit from least significant to most significant, using a stable sub-sort (usually counting sort) at each digit position.

def radix_sort(arr):
    """LSD Radix Sort. O(d * (n + 10)) where d = number of digits."""
    if not arr: return arr
    max_val = max(arr)
    exp = 1
    while max_val // exp > 0:
        arr = counting_sort_by_digit(arr, exp)
        exp *= 10
    return arr

def counting_sort_by_digit(arr, exp):
    count = [0] * 10
    for x in arr:
        count[(x // exp) % 10] += 1
    for i in range(1, 10):
        count[i] += count[i - 1]
    output = [0] * len(arr)
    for x in reversed(arr):
        digit = (x // exp) % 10
        count[digit] -= 1
        output[count[digit]] = x
    return output

Binary Search Variants

Every iteration halves the active range; lower_bound / upper_bound share the same halving loop with a different comparison predicate.

Binary search is far more than just "find an element in a sorted array." Mastering its variants is essential for interviews — these patterns appear in dozens of problems.

Classic Binary Search

def binary_search(arr, target):
    """Standard binary search. O(log n)."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2  # avoid overflow
        if arr[mid] == target: return mid
        elif arr[mid] < target: lo = mid + 1
        else: hi = mid - 1
    return -1

Leftmost (Lower Bound)

Find the first position where arr[pos] ≥ target. Equivalent to C++ lower_bound and Python bisect_left.

def bisect_left(arr, target):
    """Find leftmost index where arr[i] >= target. O(log n)."""
    lo, hi = 0, len(arr)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] < target: lo = mid + 1
        else: hi = mid
    return lo

Rightmost (Upper Bound)

Find the first position where arr[pos] > target. Equivalent to C++ upper_bound and Python bisect_right.

def bisect_right(arr, target):
    """Find leftmost index where arr[i] > target. O(log n)."""
    lo, hi = 0, len(arr)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] <= target: lo = mid + 1
        else: hi = mid
    return lo

Rotated Sorted Array

Array was sorted then rotated. One half is always sorted — determine which, then decide which half to search.

def search_rotated(arr, target):
    """Search in rotated sorted array. O(log n)."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] == target: return mid
        if arr[lo] <= arr[mid]:       # left half is sorted
            if arr[lo] <= target < arr[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:                          # right half is sorted
            if arr[mid] < target <= arr[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1

Peak Finding

Find any peak element (greater than its neighbors) in O(log n).

def find_peak(arr):
    """Find a peak element index. O(log n)."""
    lo, hi = 0, len(arr) - 1
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] < arr[mid + 1]:
            lo = mid + 1   # peak is to the right
        else:
            hi = mid         # peak is at mid or left
    return lo
Figure 10.2 — Binary search decision tree for a 7-element sorted array

Quick-Select

Find the k-th smallest element without fully sorting. Uses the same partition logic as quicksort, but only recurses into one half — giving O(n) average time.

import random

def quickselect(arr, k):
    """Find k-th smallest element (0-indexed). O(n) average, O(n^2) worst."""
    def select(lo, hi, k):
        if lo == hi: return arr[lo]
        # Random pivot
        pivot_idx = random.randint(lo, hi)
        arr[pivot_idx], arr[hi] = arr[hi], arr[pivot_idx]
        pivot = arr[hi]
        i = lo
        for j in range(lo, hi):
            if arr[j] < pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[hi] = arr[hi], arr[i]

        if k == i: return arr[i]
        elif k < i: return select(lo, i - 1, k)
        else: return select(i + 1, hi, k)

    return select(0, len(arr) - 1, k)

Median of Medians (O(n) Worst Case)

To guarantee O(n) worst case, choose the pivot using the median-of-medians algorithm:

  • Divide array into groups of 5
  • Find the median of each group (O(1) per group, O(n) total across the n/5 groups)
  • Recursively find the median of these medians — use that as the pivot
  • This guarantees the pivot eliminates at least 30% of elements per step
  • Recurrence: T(n) = T(n/5) + T(7n/10) + O(n) = O(n)

In practice, random pivot is preferred because the constant factor of median-of-medians is large. But median-of-medians is important theoretically — it proves selection can be done in linear time.
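
A direct transcription of the recipe above into Python. This is a readability-first sketch: it allocates new lists at each level rather than partitioning in place, trading memory for clarity.

# Median-of-medians select: k-th smallest (0-indexed), guaranteed O(n).
def mom_select(arr, k):
    if len(arr) <= 5:
        return sorted(arr)[k]
    # 1. median of each group of 5; 2. recurse on the medians for the pivot
    medians = [sorted(arr[i:i + 5])[len(arr[i:i + 5]) // 2]
               for i in range(0, len(arr), 5)]
    pivot = mom_select(medians, len(medians) // 2)
    lo = [x for x in arr if x < pivot]
    eq = [x for x in arr if x == pivot]
    hi = [x for x in arr if x > pivot]
    if k < len(lo):            return mom_select(lo, k)
    if k < len(lo) + len(eq):  return pivot
    return mom_select(hi, k - len(lo) - len(eq))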

Gotcha: Quicksort degrades to O(n²) on sorted/nearly-sorted input. The naive pivot choice (first or last element) creates maximally unbalanced partitions. Fix this with: (1) random pivot — simple and effective, (2) median-of-three — take median of first, middle, and last elements, or (3) introsort — switch to heapsort when recursion depth exceeds 2 log n (used by C++ std::sort).
Quicksort — Lomuto partition of [3,7,1,8,2,5,4,6] (pivot = 6)
Mergesort: merge [1,4,5,8] with [2,3,6,7]
lower_bound for 2 in [1,2,2,2,3,4] — returns 1 (first 2)
Sort comparison — best/avg/worst, stable, in-place
Module 11

Advanced Topics

Backtracking, greedy algorithms, bit manipulation, and mathematical foundations — the techniques that round out your interview toolkit.

Backtracking

A branching factor of b explored to depth d gives a worst-case search tree of O(b^d) nodes; only the current path is held on the call stack, so memory is linear in depth.

Backtracking systematically explores all potential solutions by building candidates incrementally and abandoning ("pruning") a candidate as soon as it's determined that it cannot lead to a valid solution.

General Framework

# Template — is_solution, expand, is_valid, make_choice, undo_choice
# are problem-specific hooks, not real functions.
def backtrack(candidate, state):
    if is_solution(candidate):
        output(candidate)
        return
    for next_choice in expand(candidate):
        if is_valid(next_choice):       # PRUNE invalid branches
            make_choice(next_choice)
            backtrack(candidate, state)
            undo_choice(next_choice)      # BACKTRACK

N-Queens Problem

Place n queens on an n×n board such that no two queens attack each other (same row, column, or diagonal).

def solve_n_queens(n):
    """Find all valid N-Queens arrangements."""
    solutions = []
    cols = set()
    diag1 = set()  # row - col (anti-diagonal)
    diag2 = set()  # row + col (diagonal)

    def place(row, queens):
        if row == n:
            solutions.append(queens[:])
            return
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue   # PRUNE: under attack
            cols.add(col)
            diag1.add(row - col)
            diag2.add(row + col)
            place(row + 1, queens + [col])
            cols.remove(col)
            diag1.remove(row - col)
            diag2.remove(row + col)

    place(0, [])
    return solutions
Figure 11.1 — 4-Queens backtracking tree with pruned (X) and valid branches. Two solutions found: [1,3,0,2] and [2,0,3,1].

Permutations & Combinations

def permutations(nums):
    """Generate all permutations. O(n! * n)."""
    result = []
    def backtrack(path, remaining):
        if not remaining:
            result.append(path[:])
            return
        for i in range(len(remaining)):
            path.append(remaining[i])
            backtrack(path, remaining[:i] + remaining[i+1:])
            path.pop()
    backtrack([], nums)
    return result

def combinations(nums, k):
    """Generate all C(n,k) combinations. O(C(n,k) * k)."""
    result = []
    def backtrack(start, path):
        if len(path) == k:
            result.append(path[:])
            return
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1, path)
            path.pop()
    backtrack(0, [])
    return result

def subsets(nums):
    """Generate all 2^n subsets (power set)."""
    result = []
    def backtrack(start, path):
        result.append(path[:])
        for i in range(start, len(nums)):
            path.append(nums[i])
            backtrack(i + 1, path)
            path.pop()
    backtrack(0, [])
    return result

Greedy Algorithms

Greedy algorithms make the locally optimal choice at each step, hoping to find the global optimum. They work when the problem has the greedy choice property and optimal substructure.

Feature | Greedy | Dynamic Programming
Approach | Make best local choice, never reconsider | Consider all subproblems, pick the best
Correctness | Must prove (exchange argument / matroid) | Always correct if recurrence is right
Efficiency | Usually faster (often O(n log n) or O(n)) | Can be slower due to filling DP table
When it works | Activity selection, MST, Huffman, Dijkstra | 0/1 knapsack, LCS, edit distance
When it fails | 0/1 knapsack, LCS, coin change (general) | Rarely fails (but may be too slow)

Activity Selection / Interval Scheduling

Select the maximum number of non-overlapping intervals. Greedy strategy: always pick the activity that finishes earliest.

def activity_selection(intervals):
    """Max non-overlapping intervals. O(n log n)."""
    if not intervals: return []
    intervals.sort(key=lambda x: x[1])  # sort by end time
    selected = [intervals[0]]
    for start, end in intervals[1:]:
        if start >= selected[-1][1]:    # no overlap
            selected.append((start, end))
    return selected

Huffman Coding

Build an optimal prefix-free binary code for data compression. Greedy: always merge the two nodes with the lowest frequencies.

import heapq

def huffman(freq):
    """Build Huffman tree. freq = {char: count}. O(n log n)."""
    heap = [[count, [char, ""]] for char, count in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = '0' + pair[1]
        for pair in hi[1:]:
            pair[1] = '1' + pair[1]
        merged = [lo[0] + hi[0]] + lo[1:] + hi[1:]
        heapq.heappush(heap, merged)
    return {char: code for char, code in heap[0][1:]}

Bit Manipulation

Bit operations are the fastest operations a CPU can perform — each maps to a single hardware instruction. Mastering these techniques unlocks elegant solutions to many interview problems.

Fundamental Operations

  • Set bit i: x | (1 << i) — turn bit i ON
  • Clear bit i: x & ~(1 << i) — turn bit i OFF
  • Toggle bit i: x ^ (1 << i) — flip bit i
  • Check bit i: (x >> i) & 1 — is bit i set?
  • Clear lowest set bit: x & (x - 1) — turns off the rightmost 1
  • Isolate lowest set bit: x & (-x) — extracts the rightmost 1
  • Check power of 2: x & (x - 1) == 0 — true if exactly one bit set
  • Count set bits: bin(x).count('1') — number of 1-bits (popcount)

Classic XOR Tricks

XOR has unique properties: a ^ a = 0, a ^ 0 = a, and XOR is commutative and associative.

def find_single_number(nums):
    """Every element appears twice except one. Find it. O(n) time, O(1) space."""
    result = 0
    for num in nums:
        result ^= num  # pairs cancel out (a ^ a = 0)
    return result

def swap_without_temp(a, b):
    """Swap two values using XOR — no temporary variable needed."""
    a ^= b   # a = a ^ b
    b ^= a   # b = (a ^ b) ^ b = a
    a ^= b   # a = (a ^ b) ^ a = b
    return a, b

def missing_number(nums, n):
    """Find missing number in [0..n]. XOR all values with all indices."""
    xor = n
    for i, num in enumerate(nums):
        xor ^= i ^ num
    return xor

Counting Set Bits (Brian Kernighan's Algorithm)

def count_set_bits(n):
    """Count number of 1-bits. O(number of set bits)."""
    count = 0
    while n:
        n &= n - 1   # clear the lowest set bit
        count += 1
    return count

def is_power_of_two(n):
    """Check if n is a power of 2. O(1)."""
    return n > 0 and (n & (n - 1)) == 0

Bitmask DP — Subset Enumeration

Use an integer as a bitmask to represent subsets. Iterating over all 2^n masks of a size-n set takes O(2^n); checking each element's membership inside the loop makes it O(n · 2^n).

# Iterate over all subsets of {0, 1, ..., n-1}
for mask in range(1 << n):
    # Check which elements are in this subset
    for i in range(n):
        if mask & (1 << i):
            # element i is in the subset
            pass

# Iterate over all submasks of a given mask
submask = mask
while submask > 0:
    # process submask
    submask = (submask - 1) & mask

Math for Interviews

GCD — Euclidean Algorithm

def gcd(a, b):
    """Greatest Common Divisor. O(log(min(a,b)))."""
    while b:
        a, b = b, a % b
    return a

def lcm(a, b):
    """Least Common Multiple."""
    return a * b // gcd(a, b)

Modular Arithmetic

Essential for problems with large numbers where answers must be returned modulo 10^9+7:

  • (a + b) % m = ((a % m) + (b % m)) % m
  • (a * b) % m = ((a % m) * (b % m)) % m
  • (a - b) % m = ((a % m) - (b % m) + m) % m (add m to avoid negatives)
  • Modular inverse: a^(-1) mod m = a^(m-2) mod m (when m is prime, by Fermat's little theorem)
MOD = 10**9 + 7

def mod_pow(base, exp, mod):
    """Fast modular exponentiation. O(log exp)."""
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:
            result = result * base % mod
        exp >>= 1
        base = base * base % mod
    return result

def mod_inverse(a, mod):
    """Modular inverse using Fermat's little theorem (mod must be prime)."""
    return mod_pow(a, mod - 2, mod)

Sieve of Eratosthenes

def sieve(n):
    """Find all primes up to n. O(n log log n)."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n**0.5) + 1):
        if is_prime[i]:
            for j in range(i*i, n + 1, i):
                is_prime[j] = False
    return [i for i in range(2, n + 1) if is_prime[i]]

Combinatorics — nCr with DP

def nCr(n, r, mod=10**9+7):
    """Compute C(n,r) mod p using Pascal's triangle. O(n*r)."""
    if r > n: return 0
    if r == 0 or r == n: return 1
    dp = [[0] * (r + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = 1
        for j in range(1, min(i, r) + 1):
            dp[i][j] = (dp[i-1][j-1] + dp[i-1][j]) % mod
    return dp[n][r]

# Faster with factorials and modular inverse (O(n) precompute)
def nCr_fast(n, r, mod=10**9+7):
    if r > n: return 0
    fact = [1] * (n + 1)
    for i in range(2, n + 1): fact[i] = fact[i-1] * i % mod
    inv_fact = [1] * (n + 1)
    inv_fact[n] = mod_pow(fact[n], mod - 2, mod)
    for i in range(n - 1, -1, -1): inv_fact[i] = inv_fact[i+1] * (i+1) % mod
    return fact[n] * inv_fact[r] % mod * inv_fact[n-r] % mod
Gotcha: Greedy doesn't always give optimal solution. The classic counterexample is the coin change problem with denominations like {1, 3, 4}. Greedy for amount 6 would pick 4+1+1 (3 coins), but the optimal is 3+3 (2 coins). To prove greedy correctness, use the exchange argument: show that swapping any non-greedy choice for the greedy choice doesn't worsen the solution.
N-Queens 5x5 — one of 10 solutions, placed row-by-row with backtracking
Bit idioms cheat card

Fenwick Tree (Binary Indexed Tree)

Each prefix query walks idx -= idx & -idx, clearing one set bit per step; each update walks idx += idx & -idx, advancing past one carry per step. Either loop therefore runs at most ⌈log₂ n⌉ times. No recursion, no pointers, single array.

The Fenwick tree (BIT) is what you reach for when you need point update + prefix aggregate on an array and a segment tree feels heavy. The structure is implicit: index i owns the range (i - lowbit(i), i], where lowbit(i) = i & -i. There is no tree object — just a 1-indexed array and two four-line loops.

Two classic extensions interviewers probe: the 2D BIT (same trick nested, O(log² n) per op), and the range-update + range-query variant built from two BITs B1, B2 that maintain invariants sum[1..i] = i·query(B1,i) - query(B2,i). That two-BIT trick converts "add v on [l,r] and query sum on [l,r]" into four O(log n) calls — often the difference between a segment-tree with lazy and a one-screen solution.

When NOT to use: non-invertible operations (min/max can't be range-updated cleanly because there is no inverse). For those, fall back to a segment tree.

# Python — point update, prefix sum
class BIT:
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)           # 1-indexed
    def update(self, i, delta):            # add delta at index i
        while i <= self.n:
            self.t[i] += delta
            i += i & -i                    # climb to parent in implicit tree
    def query(self, i):                    # sum of a[1..i]
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i                    # jump to predecessor range
        return s
    def range_sum(self, l, r):             # inclusive [l,r]
        return self.query(r) - self.query(l - 1)

# Range update + range query via two BITs
class RURQ:
    def __init__(self, n):
        self.B1, self.B2 = BIT(n), BIT(n)
    def range_add(self, l, r, v):
        self.B1.update(l, v);      self.B1.update(r + 1, -v)
        self.B2.update(l, v * (l - 1))
        self.B2.update(r + 1, -v * r)
    def prefix(self, i):
        return i * self.B1.query(i) - self.B2.query(i)
    def range_sum(self, l, r):
        return self.prefix(r) - self.prefix(l - 1)
// Java — minimal BIT
class FenwickTree {
    int[] t; int n;
    FenwickTree(int n) { this.n = n; t = new int[n + 1]; }
    void update(int i, int v) { for (; i <= n; i += i & -i) t[i] += v; }
    int  query (int i) { int s = 0; for (; i > 0; i -= i & -i) s += t[i]; return s; }
}
Fenwick prefix-sum walk — update(5, +3) then query(7)

Sqrt Decomposition

Array split into ⌈n/B⌉ blocks of size B=⌈√n⌉. A range query touches at most two partial blocks (B work) and n/B full blocks (√n work); the balance B=√n minimizes B + n/B = 2√n.

Sqrt decomposition is the poor cousin of segment trees: often faster in cache and dramatically simpler to write under pressure. Store the array plus a per-block aggregate (sum, min, max, xor…). Updates touch one cell and adjust its block aggregate (O(1) for invertible ops like sum, as below; O(B) to recompute for min/max). Range queries peel the partial prefix/suffix blocks cell-by-cell and sweep the full interior blocks via their precomputed aggregate.

Versus a segment tree: segment tree is O(log n) per op but fatter constants and more code. Use sqrt when (a) n ≤ 10⁵ and constants matter, (b) the operation is awkward to merge (e.g. "mode of range"), or (c) the problem is offline — then you unlock Mo's algorithm, which reorders q queries by (block-of-L, R) and reuses an incremental data structure to achieve O((n+q)√n).

Typical interview signal: "given q up to 10⁵ range queries on a 10⁵ array, count distinct / count mode / count elements equal to x" — if online needs are absent, Mo's algorithm is the canonical answer.

import math

class SqrtSum:
    def __init__(self, a):
        self.a = a[:]
        self.n = len(a)
        self.B = max(1, int(math.isqrt(self.n)))
        self.block = [0] * ((self.n + self.B - 1) // self.B)
        for i, v in enumerate(a):
            self.block[i // self.B] += v

    def update(self, i, v):                    # a[i] = v
        self.block[i // self.B] += v - self.a[i]
        self.a[i] = v

    def range_sum(self, l, r):                 # inclusive [l,r]
        s = 0
        bl, br = l // self.B, r // self.B
        if bl == br:
            return sum(self.a[l:r+1])
        s += sum(self.a[l:(bl+1)*self.B])         # partial head
        for b in range(bl + 1, br):
            s += self.block[b]                     # full middle
        s += sum(self.a[br*self.B:r+1])           # partial tail
        return s
Sqrt decomposition — n=9, B=3

Persistent Data Structures

Path-copying on a balanced tree clones only the O(log n) nodes on the root-to-leaf path of the mutated index — every other node is shared across versions.

A persistent data structure lets you keep every historical version after updates, so queries can address any snapshot. Two canonical implementations: fat-node (store a list of (version, value) at each node — O(log m) to read a version) and path-copying (on update, clone the path; old root survives unchanged). Path-copying is the interview-friendly one.

The killer FAANG application is the persistent segment tree: build one tree per prefix of the array, where each tree is created from the previous by a point update in O(log n) time and O(log n) new nodes. A range query becomes "compare the root for prefix r against the root for prefix l-1" inside the tree, giving k-th smallest in range [l,r] in O(log n) after O(n log n) build — the standard "kth smallest in range" problem.

Persistent arrays (e.g. Clojure / Scala vectors) use the same idea with branch factor 32: a 32-way trie where updates copy at most ⌈log₃₂ n⌉ ≤ 7 nodes for any n ≤ 2³⁵. That is the "effectively-constant" copy cost behind immutable collections in functional languages.
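
A toy path-copying array in that style, assuming a fixed trie depth chosen up front (depth 2 covers 32² = 1024 indices); real persistent vectors also grow the root on demand and keep a tail buffer.

# Toy persistent array: path copying on a 32-way trie (5 bits per level).
# set_() copies only the nodes on one root-to-leaf path; everything else
# is shared, so every old version stays readable.
BITS, BF = 5, 32

def get(root, depth, i):
    node = root
    for level in range(depth - 1, -1, -1):
        node = node[(i >> (BITS * level)) & (BF - 1)]
    return node

def set_(root, depth, i, val):
    if depth == 0:
        return val                                # leaves store the values
    copy = list(root) if root else [None] * BF    # clone this node only
    slot = (i >> (BITS * (depth - 1))) & (BF - 1)
    copy[slot] = set_(copy[slot], depth - 1, i, val)
    return copy

v0 = set_(None, 2, 7, "a")            # depth 2 covers 32^2 = 1024 indices
v1 = set_(v0, 2, 7, "b")              # copies 2 nodes, shares the rest
print(get(v0, 2, 7), get(v1, 2, 7))   # a b — both versions intact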

# Persistent segment tree for kth-smallest in a range
# Each version is an extra O(log n) nodes; count of a value in [l,r]
# is diff of cumulative counts between version r and version l-1.
class Node:
    __slots__ = ("lo", "hi", "cnt", "l", "r")

def build(lo, hi):
    n = Node(); n.lo, n.hi, n.cnt, n.l, n.r = lo, hi, 0, None, None
    if lo < hi:
        m = (lo + hi) // 2
        n.l = build(lo, m); n.r = build(m + 1, hi)
    return n

def insert(prev, x):
    # Return a new root that shares everything with prev except the path to x.
    n = Node(); n.lo, n.hi = prev.lo, prev.hi
    if prev.lo == prev.hi:
        n.cnt = prev.cnt + 1; n.l = n.r = None; return n
    m = (prev.lo + prev.hi) // 2
    if x <= m:
        n.l, n.r = insert(prev.l, x), prev.r      # share right subtree
    else:
        n.l, n.r = prev.l, insert(prev.r, x)      # share left subtree
    n.cnt = n.l.cnt + n.r.cnt
    return n

def kth(u, v, k):                                  # kth smallest in (l-1, r]
    if u.lo == u.hi: return u.lo
    left = v.l.cnt - u.l.cnt
    if k <= left: return kth(u.l, v.l, k)
    return kth(u.r, v.r, k - left)
Path-copying — one update clones the spine, shares the rest

Suffix Array + LCP (Kasai)

Doubling: sort suffixes by their first 2ᵏ characters using pairs of previous-round ranks; log n rounds, each O(n log n) sort — O(n log² n). Radix-sort inside each round drops it to O(n log n). Kasai then walks suffixes in string order, amortizing LCP shrink to O(n).

A suffix array SA[0..n-1] is the lexicographic ordering of all suffixes of a string. Paired with the LCP array (lcp[i] = longest-common-prefix of SA[i] and SA[i-1]), it unlocks most string problems a suffix tree does, but with half the constants and a clean array layout.

Kasai's algorithm is the cleanest way to build LCP in O(n): walk the string in original position order, maintain a counter h that never increases by more than 1 per step, and use the inverse permutation rank[] to know where each suffix landed in SA.

Classic queries: longest repeated substring = max(lcp); longest common substring of s and t = max(lcp[i]) where SA[i] and SA[i-1] originate from different strings (concatenate s + "#" + t + "$"); distinct substrings = n(n+1)/2 − sum(lcp); pattern search = binary search on SA in O(m log n).

# Python — O(n log^2 n) suffix array (doubling + sort), O(n) Kasai LCP.
def suffix_array(s):
    n = len(s)
    sa = list(range(n))
    rank = [ord(c) for c in s]
    tmp = [0] * n
    k = 1
    while True:
        key = lambda i: (rank[i], rank[i + k] if i + k < n else -1)
        sa.sort(key=key)
        tmp[sa[0]] = 0
        for j in range(1, n):
            tmp[sa[j]] = tmp[sa[j-1]] + (1 if key(sa[j]) != key(sa[j-1]) else 0)
        rank = tmp[:]
        if rank[sa[-1]] == n - 1:           # all ranks unique
            break
        k *= 2
    return sa

def kasai(s, sa):
    n = len(s)
    rank = [0] * n
    for i, p in enumerate(sa): rank[p] = i
    h, lcp = 0, [0] * n
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i+h] == s[j+h]:
                h += 1
            lcp[rank[i]] = h
            if h: h -= 1                     # key amortization step
        else:
            h = 0
    return lcp
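
A quick sanity check of the formula queries above, using "banana" as an illustrative input:

# Usage — formula queries from above on "banana"
s = "banana"
sa = suffix_array(s)                 # [5, 3, 1, 0, 4, 2]
lcp = kasai(s, sa)                   # [0, 1, 3, 0, 0, 2]
n = len(s)
i = max(range(n), key=lambda j: lcp[j])
print(s[sa[i]:sa[i] + lcp[i]])       # "ana" — longest repeated substring
print(n * (n + 1) // 2 - sum(lcp))   # 15 distinct substrings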
Suffix array of "banana$"

Convex Hull Trick (CHT) / Li Chao Tree

DP recurrences of shape dp[i] = min over j < i of (m_j · x_i + b_j) reduce to evaluating a family of lines at a query point — the answer is the lower envelope of those lines. A monotonic deque or Li Chao tree maintains it in O(1) or O(log n) per op respectively.

CHT is the DP-optimization trick for quadratic recurrences where the transition is linear in some precomputed quantity. If lines are added in order of slope and queries arrive in order of x (both monotonic), the classic deque-based CHT runs in amortized O(1) per op. If either ordering breaks, use the Li Chao tree: a segment tree over the x-range that stores at most one "best" line per node and rebalances on insert, giving O(log V) per op where V is the x-coordinate range.

For related recurrences like "split an array into k parts minimizing the sum of part costs", the tools are Divide & Conquer optimization and Knuth's optimization, related tricks that require a different monotonicity condition (the quadrangle inequality).

# Monotonic-deque CHT — add lines in decreasing slope, query in increasing x.
def bad(l1, l2, l3):
    # l2 is redundant iff intersection(l1,l3) is left of intersection(l1,l2).
    return (l3[1] - l1[1]) * (l1[0] - l2[0]) \
        <= (l2[1] - l1[1]) * (l1[0] - l3[0])

from collections import deque

class CHT:                                            # min lower envelope
    def __init__(self): self.d = deque()              # deque of (m, b)
    def add(self, m, b):                              # slopes in decreasing order
        while len(self.d) >= 2 and bad(self.d[-2], self.d[-1], (m, b)):
            self.d.pop()
        self.d.append((m, b))
    def query(self, x):                               # x must be non-decreasing
        while len(self.d) >= 2 and \
              self.d[0][0]*x + self.d[0][1] >= self.d[1][0]*x + self.d[1][1]:
            self.d.popleft()                          # O(1); list.pop(0) is O(n)
        m, b = self.d[0]
        return m * x + b
Lower envelope of 4 lines — only L1, L3, L4 survive

Heavy-Light Decomposition (HLD)

Any root-to-leaf path crosses at most O(log n) "light" edges — each light edge halves the subtree size. Each chain lives in a contiguous segment-tree range, so a path query visits O(log n) chains × O(log n) per segment-tree op.

HLD converts tree path queries into a logarithmic number of array-range queries. For each node, pick the child whose subtree is largest — that edge is heavy, everything else is light. Heavy edges form disjoint chains; lay the chains back-to-back in an array and back them with a segment tree or BIT. Answering path(u, v) becomes: jump u and v upward one chain at a time toward their LCA, querying each chain's segment-tree range.

Canonical problem: "given a weighted tree, support point-update on a node's weight and query sum/max on the path between two nodes". That's HLD in its purest form — O(log² n) per operation.

Implementation sketch: two DFS passes. First computes sizes and chooses the heavy child; second assigns DFS-in indices and chain-head pointers. Then path-query bounces upward by u = parent[chainHead[u]].

# HLD — path sum with point updates on nodes. Skeleton shown.
def hld(n, adj, root=0):
    parent = [-1]*n; depth = [0]*n; size = [1]*n; heavy = [-1]*n
    # iterative DFS for size + heavy child
    order, stack = [], [root]
    while stack:
        u = stack.pop(); order.append(u)
        for v in adj[u]:
            if v != parent[u]:
                parent[v] = u; depth[v] = depth[u] + 1; stack.append(v)
    for u in reversed(order):
        for v in adj[u]:
            if v != parent[u]:
                size[u] += size[v]
                if heavy[u] == -1 or size[v] > size[heavy[u]]:
                    heavy[u] = v

    pos = [0]*n; head = [0]*n; t = 0
    def decompose(u, h):
        nonlocal t
        head[u], pos[u] = h, t; t += 1
        if heavy[u] != -1: decompose(heavy[u], h)
        for v in adj[u]:
            if v != parent[u] and v != heavy[u]:
                decompose(v, v)
    decompose(root, root)
    return parent, depth, head, pos                    # plug into segment tree over pos
HLD — two heavy chains, rest are singletons

DSU on Tree (Small-to-Large Merging)

Each node is re-added to the "big" map O(log n) times — every re-add doubles the size of the set containing it, so there are at most log n promotions per node.

"DSU on tree" (also called small-to-large, or Sack) answers subtree aggregate queries without Heavy-Light or persistent structures. For every node: solve light children first and clear their data; then solve the heavy child and keep its data; then walk every subtree of every light child and re-add into the heavy child's data structure; record the answer for the current node.

The canonical interview problem: "given a rooted tree where each node has a color, answer for every node: how many distinct colors appear in its subtree?" DSU-on-tree solves it in O(n log n) with a single hash map and a counter of "distinct" — trivial compared to HLD.

# Python sketch — count distinct colors per subtree.
def solve(n, adj, color, root=0):
    size = [1] * n; big = [-1] * n
    def dfs_size(u, p):
        for v in adj[u]:
            if v != p:
                dfs_size(v, u); size[u] += size[v]
                if big[u] == -1 or size[v] > size[big[u]]: big[u] = v
    dfs_size(root, -1)

    cnt = {}; distinct = [0]; ans = [0] * n
    def add(u, p, delta):
        c = color[u]
        if delta == 1:
            if cnt.get(c, 0) == 0: distinct[0] += 1
            cnt[c] = cnt.get(c, 0) + 1
        else:
            cnt[c] -= 1
            if cnt[c] == 0: distinct[0] -= 1
        for v in adj[u]:
            if v != p: add(v, u, delta)

    def dfs(u, p, keep):
        for v in adj[u]:
            if v != p and v != big[u]: dfs(v, u, False)   # light first, then clear
        if big[u] != -1: dfs(big[u], u, True)             # heavy, keep data
        # add self + light subtrees on top of heavy data
        c = color[u]
        if cnt.get(c, 0) == 0: distinct[0] += 1
        cnt[c] = cnt.get(c, 0) + 1
        for v in adj[u]:
            if v != p and v != big[u]: add(v, u, 1)
        ans[u] = distinct[0]
        if not keep: add(u, p, -1)                        # discard
    dfs(root, -1, False)
    return ans
Small-to-large — heavy subtree kept, light subtrees re-added

Mo's Algorithm

Sort queries by (⌊l/B⌋, r). Within one block of left endpoints the right pointer only increases, so it moves O(n) total per block, i.e. O(n²/B) over all n/B blocks; the left pointer moves O(B) per query, O(qB) total. Picking B = √n balances both at O((n+q)√n).

Mo's is the canonical answer to "given an immutable array and q offline range queries, compute an aggregate that supports O(1) add / remove of one element". Examples: count distinct values, count elements equal to x, or count pairs (i, j) with a[i] == a[j] in [l, r] (a HackerEarth classic).

The algorithm: maintain pointers (L, R) and a running aggregate. Sort queries, then for each query expand/contract pointers one step at a time, calling add(val) or remove(val). Each call is O(1) against your frequency map, and the total number of calls is bounded by the sorting trick.

from math import isqrt

def mo(a, queries):
    n, q = len(a), len(queries)
    B = max(1, isqrt(n))
    order = sorted(range(q), key=lambda i: (queries[i][0] // B, queries[i][1]))
    cnt = [0] * (max(a) + 1); distinct = 0
    L, R = 0, -1
    ans = [0] * q

    def add(x):
        nonlocal distinct
        if cnt[x] == 0: distinct += 1
        cnt[x] += 1
    def rem(x):
        nonlocal distinct
        cnt[x] -= 1
        if cnt[x] == 0: distinct -= 1

    for idx in order:
        l, r = queries[idx]
        while R < r: R += 1; add(a[R])
        while L > l: L -= 1; add(a[L])
        while R > r: rem(a[R]); R -= 1
        while L < l: rem(a[L]); L += 1
        ans[idx] = distinct
    return ans
Mo's ordering — queries sorted by (block-of-L, R)

Treap & Implicit Treap

Random priorities make the expected tree depth O(log n). Split/merge preserve BST-key order and heap-priority order simultaneously; since priority is random, height is the same as a randomly-built BST — expected O(log n).

A treap is a BST by key and a heap by a random priority assigned at insert. Priorities make the tree effectively balanced in expectation, without AVL's or red-black's rotation case-work. The primitives are split(t, k) (split into "≤ k" and "> k") and merge(l, r) (l's keys all ≤ r's keys); every other op is phrased on top.

The real interview weapon is the implicit treap: index nodes by subtree size instead of a key, so the tree represents a sequence and split/merge slice and glue subranges in O(log n). This gives O(log n) "rotate subrange [l,r]", "insert at index k", "reverse [l,r]" (with lazy reverse flag) — operations that arrays and linked lists can't do together.

# Implicit treap — sequence with O(log n) split / merge / reverse-range
import random

class Node:
    __slots__ = ("val", "pri", "sz", "l", "r", "rev")
    def __init__(self, v):
        self.val, self.pri, self.sz = v, random.random(), 1
        self.l = self.r = None; self.rev = False

def sz(t): return t.sz if t else 0
def upd(t):
    if t: t.sz = 1 + sz(t.l) + sz(t.r)
def push(t):
    if t and t.rev:
        t.l, t.r = t.r, t.l
        if t.l: t.l.rev ^= True
        if t.r: t.r.rev ^= True
        t.rev = False

def split(t, k):                       # first k go left, rest go right
    if not t: return (None, None)
    push(t)
    if sz(t.l) < k:
        a, b = split(t.r, k - sz(t.l) - 1); t.r = a; upd(t); return (t, b)
    else:
        a, b = split(t.l, k); t.l = b; upd(t); return (a, t)

def merge(a, b):
    if not a or not b: return a or b
    push(a); push(b)
    if a.pri > b.pri:
        a.r = merge(a.r, b); upd(a); return a
    else:
        b.l = merge(a, b.l); upd(b); return b

def reverse_range(root, l, r):
    a, tmp = split(root, l)
    b, c = split(tmp, r - l + 1)
    if b: b.rev ^= True
    return merge(merge(a, b), c)
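
Usage: build the sequence [0..7] by repeated merge, reverse indices 2..4, then read it back in order (the traversal must push() pending reverse flags on the way down). This is a demo, not part of the structure itself.

root = None
for v in range(8):                    # sequence 0,1,2,...,7
    root = merge(root, Node(v))
root = reverse_range(root, 2, 4)

def inorder(t, out):
    if not t: return out
    push(t)                           # apply any pending reverse flag
    inorder(t.l, out); out.append(t.val); inorder(t.r, out)
    return out

print(inorder(root, []))              # [0, 1, 4, 3, 2, 5, 6, 7]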
Implicit treap — reverse([2,4]) = split, flip rev, merge
Module 12

System Design Data Structures

Production-grade data structures that power real systems — LRU cache, consistent hashing, skip lists, LSM trees, and Bloom filters at scale.

LRU Cache

An LRU (Least Recently Used) cache evicts the least recently accessed item when capacity is full. The classic implementation uses a HashMap + Doubly Linked List for O(1) get and put.

Design

  • HashMap: key → pointer to DLL node. Provides O(1) lookup.
  • Doubly Linked List: Maintains access order. Most recently used at head, least recently used at tail.
  • get(key): Look up in map, move node to head, return value. O(1).
  • put(key, value): If exists, update and move to head. If new and at capacity, evict tail. Insert at head. O(1).
class Node:
    def __init__(self, key=0, val=0):
        self.key, self.val = key, val
        self.prev = self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.cap = capacity
        self.cache = {}             # key -> Node
        self.head = Node()          # dummy head (MRU side)
        self.tail = Node()          # dummy tail (LRU side)
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _add_to_head(self, node):
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.cache: return -1
        node = self.cache[key]
        self._remove(node)
        self._add_to_head(node)     # mark as recently used
        return node.val

    def put(self, key, value):
        if key in self.cache:
            self._remove(self.cache[key])
        node = Node(key, value)
        self.cache[key] = node
        self._add_to_head(node)
        if len(self.cache) > self.cap:
            lru = self.tail.prev     # evict LRU
            self._remove(lru)
            del self.cache[lru.key]
Figure 12.1 — LRU Cache structure: HashMap maps keys to DLL nodes. Head = MRU, Tail = LRU (eviction target).

Consistent Hashing

Consistent hashing distributes data across servers such that adding or removing a server only remaps ~K/N keys (K = total keys, N = total servers), compared to traditional hashing which remaps nearly all keys.

How It Works

  • Hash ring: Both servers and keys are hashed onto a circular space [0, 2^32)
  • Key assignment: Each key is assigned to the first server encountered clockwise on the ring
  • Adding a server: Only keys between the new server and its predecessor are remapped
  • Virtual nodes: Each physical server is assigned multiple positions on the ring (e.g., 100-200 vnodes) to ensure even distribution
Figure 12.2 — Consistent hashing ring with virtual nodes. Keys are assigned to the next server clockwise. Virtual nodes ensure even distribution.
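
A minimal ring sketch following the bullets above. The md5 hash, 100 vnodes per server, and the bisect-over-sorted-list ring are illustrative choices, not a standard.

# Toy consistent-hash ring with virtual nodes
import bisect, hashlib

class Ring:
    def __init__(self, servers, vnodes=100):
        pairs = []
        for s in servers:
            for v in range(vnodes):                  # place each vnode on the ring
                pairs.append((self._h(f"{s}#{v}"), s))
        pairs.sort()
        self.hashes = [h for h, _ in pairs]
        self.owners = [s for _, s in pairs]

    def _h(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (1 << 32)

    def lookup(self, key):                           # first server clockwise
        i = bisect.bisect(self.hashes, self._h(key)) % len(self.hashes)
        return self.owners[i]

ring = Ring(["A", "B", "C"])
print(ring.lookup("user:42"))    # stable assignment until membership changes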

Skip Lists

A skip list is a probabilistic alternative to balanced BSTs. It's a multi-level linked list where each level is a "fast lane" that skips over multiple elements.

Properties

  • Level 0 (bottom): contains all elements in sorted order
  • Each higher level contains a random subset of the level below (each element is promoted with probability p = 0.5)
  • Search: start at top-left, go right until overshooting, then drop down one level. Repeat.
  • Expected height: O(log n). Search, insert, delete: all O(log n) expected.
  • Used by Redis (sorted sets), LevelDB, and MemSQL
Figure 12.3 — Skip list with 4 levels. Higher levels act as express lanes, enabling O(log n) search.
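
A minimal search + insert sketch with promotion probability p = 0.5; MAX_LEVEL = 16 is an illustrative cap, and delete would reuse the same update[] bookkeeping.

import random

MAX_LEVEL = 16                          # enough for ~2^16 elements

class SLNode:
    def __init__(self, val, level):
        self.val = val
        self.next = [None] * level      # next[i] = successor on level i

class SkipList:
    def __init__(self):
        self.head = SLNode(float('-inf'), MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        lvl = 1
        while lvl < MAX_LEVEL and random.random() < 0.5:
            lvl += 1                    # promote with probability 1/2
        return lvl

    def search(self, target):
        node = self.head
        for i in range(self.level - 1, -1, -1):       # top level down
            while node.next[i] and node.next[i].val < target:
                node = node.next[i]                   # ride the express lane
        node = node.next[0]
        return node is not None and node.val == target

    def insert(self, val):
        update = [self.head] * MAX_LEVEL              # rightmost node per level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.next[i] and node.next[i].val < val:
                node = node.next[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = SLNode(val, lvl)
        for i in range(lvl):                          # splice into each level
            new.next[i] = update[i].next[i]
            update[i].next[i] = new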

LSM Trees

Log-Structured Merge Trees are the backbone of modern write-optimized databases. They convert random writes into sequential I/O, achieving write throughput orders of magnitude better than B-Trees.

Write Path

  • Step 1: Write to a Write-Ahead Log (WAL) for durability (append-only, sequential)
  • Step 2: Insert into an in-memory sorted structure called a memtable (usually a red-black tree or skip list)
  • Step 3: When memtable reaches a size threshold, flush it to disk as a sorted, immutable SSTable (Sorted String Table)

Read Path

  • Step 1: Check the memtable (fastest — in memory)
  • Step 2: Check the most recent SSTable, then older SSTables
  • Step 3: Use Bloom filters on each SSTable to skip files that definitely don't contain the key
  • Step 4: Use sparse index / block index within each SSTable for efficient lookup
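
A toy rendering of the two paths: a dict memtable, immutable sorted SSTables, and newest-first reads. No WAL, Bloom filters, or compaction; memtable_limit is an illustrative knob.

import bisect

class ToyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.limit = memtable_limit
        self.sstables = []                          # newest first: (keys, vals)

    def put(self, k, v):
        self.memtable[k] = v                        # a real store appends to the WAL first
        if len(self.memtable) >= self.limit:        # flush: memtable -> immutable SSTable
            items = sorted(self.memtable.items())
            self.sstables.insert(0, ([k for k, _ in items], [v for _, v in items]))
            self.memtable = {}

    def get(self, k):
        if k in self.memtable:                      # 1. memtable
            return self.memtable[k]
        for keys, vals in self.sstables:            # 2. SSTables, newest first
            i = bisect.bisect_left(keys, k)
            if i < len(keys) and keys[i] == k:
                return vals[i]
        return None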

Compaction Strategies

Strategy | How It Works | Write Amp | Read Amp | Space Amp | Used By
Size-Tiered (STCS) | Merge similarly-sized SSTables | Lower | Higher | Higher | Cassandra (default)
Leveled (LCS) | Each level is 10x previous; no overlap within level | Higher | Lower | Lower | LevelDB, RocksDB
Figure 12.4 — LSM tree architecture: writes go to WAL + memtable, then flush to SSTables. Reads check memtable first, then disk levels.

Bloom Filters at Scale

A Bloom filter is a space-efficient probabilistic data structure that answers "Is x in the set?" with:

  • Definitely not in the set (no false negatives — 100% certain)
  • Probably in the set (false positives possible)

How It Works

A bit array of m bits, with k hash functions. To add an element, hash it k times and set those k bit positions. To query, check all k positions — if any is 0, the element is definitely absent.

Optimal Parameters

  • False positive rate: p = (1 - e^(-kn/m))^k
  • Optimal hash count: k = (m/n) · ln(2)
  • Bits per element for 1% FP rate: ~9.6 bits (with k=7 hash functions)
  • Bits per element for 0.1% FP rate: ~14.4 bits (with k=10 hash functions)

Applications at Scale

System | Use Case | Why Bloom Filter
LevelDB / RocksDB | Skip SSTables during reads | Avoid expensive disk I/O for absent keys
Cassandra / HBase | Skip partitions without target key | Reduce read amplification
Chrome / Firefox | Safe Browsing malicious URL check | Check millions of URLs with minimal memory
Medium / Akamai | Recommendation deduplication | Don't recommend articles already seen
Bitcoin / Ethereum | SPV clients filter relevant transactions | Light clients don't store full blockchain
Gotcha: LRU Cache is the most common data structure design question. Know it cold — HashMap + Doubly Linked List, O(1) everything. Practice implementing it from scratch in under 15 minutes. Variants include LFU cache (frequency-based eviction) and TTL-based caches. The follow-up is often "make it thread-safe" (use a read-write lock or concurrent hash map + lock striping).
Data-structure chooser for system-design interviews

Bloom Filter & Cuckoo Filter

With m bits, n items, and k hashes chosen as k = (m/n)·ln 2, the false-positive probability is ≈ (1 − e^(−kn/m))^k. At 1% FPR this is about 9.6 bits/element — independent of item size.

A Bloom filter answers "is x possibly in the set?" with zero false negatives and a tunable false-positive rate. It is the canonical "cache in front of a cache": Bigtable and Cassandra front every SSTable with one so negative reads skip disk; CDN edge nodes use them to decide whether to origin-fetch.

Variants interviewers poke at: Counting Bloom filter replaces each bit with a small counter (4 bits is typical) so you can delete — at 4× the memory. Cuckoo filter stores short fingerprints in a cuckoo hash table: supports delete without counters, lower space at the same FPR for >0.1% rates, and cache-friendlier lookups (2 buckets instead of k hashes).

# Minimal Bloom filter
class Bloom:
    def __init__(self, m, k):
        self.bits = bytearray(m // 8 + 1)   # m bits, packed into bytes
        self.m, self.k = m, k
    def _idx(self, x, i):
        # i-th hash: seed Python's hash with the round index i
        return (hash((i, x)) & 0x7FFFFFFFFFFFFFFF) % self.m
    def add(self, x):
        for i in range(self.k):
            b = self._idx(x, i)
            self.bits[b >> 3] |= 1 << (b & 7)           # set bit b
    def __contains__(self, x):
        # membership needs all k bits set; any 0 means definitely absent
        return all((self.bits[(b := self._idx(x, i)) >> 3] >> (b & 7)) & 1
                   for i in range(self.k))
Bloom filter — insert "cat" with k=3 hashes
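
For the counting variant mentioned above, a sketch that swaps the bit array for an array of counters and reuses _idx from the Bloom class (a full Python int per slot here; a real implementation packs 4-bit counters):

class CountingBloom(Bloom):
    def __init__(self, m, k):
        self.counts = [0] * m              # counter per slot instead of one bit
        self.m, self.k = m, k
    def add(self, x):
        for i in range(self.k):
            self.counts[self._idx(x, i)] += 1
    def remove(self, x):                   # only safe for elements actually added
        if x in self:
            for i in range(self.k):
                self.counts[self._idx(x, i)] -= 1
    def __contains__(self, x):
        return all(self.counts[self._idx(x, i)] > 0 for i in range(self.k))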

HyperLogLog

Hash each element: the top p bits pick one of m = 2^p buckets, and each bucket records the longest run of leading zeros seen in the remaining bits. A bucket receiving n/m items expects a maximum run length near log₂(n/m). The harmonic mean of 2^(max run) across buckets, scaled by a bias-correction constant, estimates n with relative error ≈ 1.04/√m.

HyperLogLog (HLL) estimates the cardinality (number of distinct elements) of a multiset with absurdly small memory. For 1 billion uniques and <1% error, 12 KB is enough. Redis PFCOUNT, BigQuery's APPROX_COUNT_DISTINCT, and Presto's approx_distinct are all HLL under the hood.

The real magic for distributed systems is the merge operator: two HLLs of the same shape merge element-wise by max across buckets, giving the cardinality of the union. This means per-shard or per-minute HLLs can be merged without revisiting raw events — turning cardinality into a commutative monoid.

class HLL:
    def __init__(self, p=14):                  # m = 2^p buckets
        self.p = p; self.m = 1 << p
        self.buckets = [0] * self.m
    def add(self, x):
        h = hash(x) & 0xFFFFFFFFFFFFFFFF       # 64-bit hash
        idx = h >> (64 - self.p)               # top p bits pick the bucket
        # remaining bits, plus a sentinel so the zero-run is bounded
        w = (h << self.p) & 0xFFFFFFFFFFFFFFFF | (1 << (self.p - 1))
        rank = 65 - w.bit_length()             # leading zeros + 1
        self.buckets[idx] = max(self.buckets[idx], rank)
    def estimate(self):                        # small-range correction omitted
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant from the paper
        z = sum(2.0 ** -b for b in self.buckets)
        return alpha * self.m * self.m / z     # harmonic-mean estimator
    def merge(self, other):                    # union cardinality
        self.buckets = [max(a, b) for a, b in zip(self.buckets, other.buckets)]
HLL merge — element-wise max across buckets
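
A quick usage sketch of the merge property (the two "shards" and user IDs are hypothetical):

us, eu = HLL(p=14), HLL(p=14)
for uid in range(50_000):
    us.add(("user", uid))                  # shard 1: users 0..49,999
for uid in range(30_000, 90_000):
    eu.add(("user", uid))                  # shard 2: users 30,000..89,999
us.merge(eu)                               # element-wise max: now the union
print(round(us.estimate()))                # ~90,000 distinct users, within ~1%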

Count-Min Sketch

Width w = ⌈e/ε⌉, depth d = ⌈ln(1/δ)⌉. Each item increments one counter per row; estimate(x) = min over rows. With these sizes, err ≤ ε·(total count) with probability 1 − δ.

Count-Min sketch (CMS) estimates the frequency of an item in a stream in sublinear memory. Insert increments d counters (one per independent hash); query returns the minimum of those d counters. The min step is what kills most of the false-positive overcount from hash collisions.

The classic application is heavy-hitters / top-k over a stream: pair a CMS with a min-heap of the k largest (item, est(item)) seen so far. Every insert updates the sketch, then checks whether the new estimate beats the heap minimum. This runs in O(log k + d) per stream event with O(wd + k) memory.

import math, random

class CountMin:
    def __init__(self, eps=0.001, delta=1e-5):
        self.w = max(2, math.ceil(math.e / eps))         # width: err <= eps * N
        self.d = max(1, math.ceil(math.log(1 / delta)))  # depth: holds w.p. 1 - delta
        self.table = [[0] * self.w for _ in range(self.d)]
        self.seeds = [random.getrandbits(32) for _ in range(self.d)]
    def _h(self, i, x):
        # row i's hash, seeded independently per row
        return (hash((self.seeds[i], x)) & 0x7FFFFFFF) % self.w
    def add(self, x, c=1):
        for i in range(self.d):                          # one counter per row
            self.table[i][self._h(i, x)] += c
    def estimate(self, x):
        # min across rows discards most of the collision overcount
        return min(self.table[i][self._h(i, x)] for i in range(self.d))
Count-Min — d=4 rows, w=8 columns; query takes min across rows
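
A sketch of that heavy-hitters loop on top of the CountMin class above (the O(k) rebuild on refresh is a simplification; a production version keeps heap handles for O(log k) updates, and items are assumed mutually comparable):

import heapq

def heavy_hitters(stream, k, cms):
    heap, members = [], set()              # (estimate, item) pairs, size <= k
    for x in stream:
        cms.add(x)
        est = cms.estimate(x)
        if x in members:                   # refresh x's stale estimate
            heap = [(est, it) if it == x else (c, it) for c, it in heap]
            heapq.heapify(heap)
        elif len(heap) < k:
            heapq.heappush(heap, (est, x))
            members.add(x)
        elif est > heap[0][0]:             # beats current minimum: evict it
            _, out = heapq.heapreplace(heap, (est, x))
            members.discard(out)
            members.add(x)
    return sorted(heap, reverse=True)      # best-first (estimate, item) list
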
Module 13

Cheat Sheet & Quick Reference

Module summaries, rapid-fire Q&A, master complexity table, and pattern recognition guide — everything you need for last-minute review.

Module Summaries

1. Complexity Analysis

  • Big-O (upper), Omega (lower), Theta (tight bound)
  • Growth: O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(2^n) < O(n!)
  • Master Theorem: T(n) = aT(n/b) + O(n^d)
  • Amortized: aggregate cost over sequence of operations
  • Space complexity includes auxiliary + input space

2. Arrays & Strings

  • Two pointers: opposite ends or same direction
  • Sliding window: fixed or variable size, track window state
  • Prefix sums: range sum queries in O(1) after O(n) precompute
  • KMP: pattern matching in O(n+m) using failure function
  • Rabin-Karp: rolling hash for O(n) average string matching

3. Linked Lists

  • Singly, doubly, circular — know tradeoffs
  • Floyd's cycle detection: slow + fast pointer
  • Reverse iteratively (3 pointers) or recursively
  • Merge two sorted lists: O(n+m) time, O(1) space
  • Dummy head node simplifies edge cases

4. Stacks & Queues

  • Stack: LIFO — undo, DFS, parentheses matching
  • Monotonic stack: next greater/smaller element in O(n)
  • Queue: FIFO — BFS, task scheduling
  • Deque: double-ended queue, sliding window max
  • Priority queue = heap (not a queue)

5. Hash Tables

  • Average O(1) lookup, insert, delete
  • Chaining vs open addressing (linear/quadratic probing)
  • Load factor α < 0.75 — resize at threshold
  • Bloom filter: probabilistic set membership, no false negatives
  • Good hash: uniform distribution, avalanche effect

6. Trees

  • Traversals: inorder (BST sorted), preorder (serialize), postorder (delete)
  • BST: O(log n) avg, O(n) worst (degenerate)
  • AVL: strict balance (|height diff| ≤ 1), O(log n) guaranteed
  • Red-Black: relaxed balance, faster inserts than AVL
  • B-Tree: multi-way, disk-optimized. Trie: prefix matching O(L)

7. Heaps

  • Min-heap: parent ≤ children. Max-heap: parent ≥ children
  • Insert: O(log n). Extract: O(log n). Peek: O(1)
  • Build heap: O(n) NOT O(n log n) — sift-down analysis
  • Top-K: min-heap of size K. Running median: two heaps
  • Array-based (1-indexed): parent = i//2, children = 2i, 2i+1

8. Graphs

  • Adjacency list (sparse, O(V+E)) vs matrix (dense, O(V²))
  • BFS: shortest path (unweighted), level-order. DFS: cycle detection, topo sort
  • Dijkstra: O((V+E) log V), no negative edges
  • Bellman-Ford: O(VE), handles negative edges
  • Union-Find: near O(1) with path compression + rank

9. Dynamic Programming

  • Needs: overlapping subproblems + optimal substructure
  • Memoization (top-down) vs tabulation (bottom-up)
  • 1D: Fibonacci, coin change, LIS, house robber
  • 2D: LCS, edit distance, grid paths, knapsack
  • Space optimization: rolling array when dp[i] depends on dp[i-1]

10. Sorting & Searching

  • Merge sort: O(n log n) stable, O(n) space
  • Quick sort: O(n log n) avg, O(n²) worst, in-place
  • Counting/Radix: O(n) for bounded integers
  • Binary search: lo ≤ hi (exact), lo < hi (boundary); see the templates after this list
  • Quick-select: O(n) avg kth element
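
Those two binary-search loop shapes are where off-by-one bugs live. A minimal pair of templates, assuming a sorted Python list a:

def find_exact(a, target):           # lo <= hi: looking for an exact match
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1                        # absent

def lower_bound(a, target):          # lo < hi: leftmost i with a[i] >= target
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo                        # may be len(a) if every element < target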

11. Advanced Topics

  • Backtracking: build + prune + undo. N-Queens, permutations
  • Greedy: local optimal → global optimal (prove correctness!)
  • XOR: a^a=0, a^0=a — find single number
  • Bit tricks: x&(x-1) clears lowest set bit
  • GCD: Euclidean O(log n). Sieve: O(n log log n)

12. System Design DS

  • LRU Cache: HashMap + DLL, O(1) get/put
  • Consistent hashing: ring + virtual nodes, K/N remap
  • Skip list: probabilistic O(log n), used by Redis
  • LSM tree: sequential writes, memtable → SSTable
  • Bloom filter: ~10 bits/element for 1% false positive rate

Top 50 DSA Rapid-Fire Q&A

# | Question | Answer
1 | Time complexity of binary search? | O(log n)
2 | Best case of bubble sort? | O(n) — already sorted, with early exit flag
3 | Is merge sort stable? | Yes
4 | Worst case of quick sort? | O(n²) — sorted input with naive pivot
5 | Time to build a heap? | O(n) — not O(n log n)
6 | Hash table average lookup time? | O(1)
7 | Hash table worst-case lookup? | O(n) — all keys collide into one bucket
8 | BST worst-case search? | O(n) — degenerate (linked list shape)
9 | AVL tree search time? | O(log n) — always balanced
10 | Difference between BFS and DFS? | BFS: level-order (queue). DFS: depth-first (stack/recursion)
11 | Which gives shortest path in unweighted graph? | BFS
12 | Dijkstra's time complexity? | O((V+E) log V) with binary heap
13 | Can Dijkstra handle negative edges? | No — use Bellman-Ford
14 | Bellman-Ford time complexity? | O(V · E)
15 | What is topological sort used for? | Ordering tasks with dependencies (DAG only)
16 | Time complexity of Union-Find with path compression + rank? | O(α(n)) per operation ≈ O(1)
17 | What are the two properties needed for DP? | Overlapping subproblems + optimal substructure
18 | Memoization vs tabulation? | Memo: top-down + cache. Tab: bottom-up + table fill
19 | Time complexity of LCS? | O(m · n)
20 | 0/1 Knapsack time complexity? | O(n · W) — pseudo-polynomial
21 | What makes 0/1 knapsack different from unbounded? | 0/1: each item once. Unbounded: unlimited copies
22 | Lower bound for comparison-based sorting? | Ω(n log n)
23 | When can you sort in O(n)? | Bounded integer range — use counting/radix sort
24 | What data structure does Dijkstra use? | Priority queue (min-heap)
25 | Stack applications? | Undo, DFS, balanced parentheses, function call stack
26 | What is a monotonic stack? | Stack maintaining increasing/decreasing order — next greater element in O(n)
27 | Trie search time? | O(L) where L = length of word
28 | B-Tree vs BST? | B-Tree: multi-way, disk-optimized, O(log_B n) I/O
29 | How does Floyd's cycle detection work? | Slow pointer (1 step) + fast pointer (2 steps); they meet inside the cycle
30 | Reverse a linked list time/space? | O(n) time, O(1) space (iterative)
31 | What is a Bloom filter? | Probabilistic set: no false negatives, possible false positives
32 | LRU Cache data structures? | HashMap + Doubly Linked List = O(1) get/put
33 | What is consistent hashing? | Hash ring distributing keys to servers; adding a server remaps only K/N keys
34 | Skip list expected search time? | O(log n)
35 | What is an LSM tree optimized for? | Write-heavy workloads (sequential I/O)
36 | Backtracking vs brute force? | Backtracking prunes invalid branches early; brute force tries everything
37 | N-Queens time complexity? | O(n!) worst case; heavy pruning makes it far faster in practice
38 | When does greedy fail? | When local optimum doesn't lead to global optimum (e.g., 0/1 knapsack)
39 | XOR of a number with itself? | 0 (a ^ a = 0)
40 | Check if n is power of 2? | n > 0 and (n & (n-1)) == 0
41 | Time complexity of Sieve of Eratosthenes? | O(n log log n)
42 | What is the Master Theorem for? | Solving recurrences of form T(n) = aT(n/b) + O(n^d)
43 | Amortized cost of dynamic array append? | O(1) — occasional O(n) resize spread across n operations
44 | Difference between AVL and Red-Black tree? | AVL: stricter balance, faster lookups. RB: faster inserts/deletes
45 | What is a segment tree used for? | Range queries (sum, min, max) with point updates in O(log n)
46 | Quick-select average time? | O(n) — find kth element without full sorting
47 | Tim Sort combines which two algorithms? | Merge sort + insertion sort
48 | Counting sort space complexity? | O(n + k) where k = range of values
49 | How many subsets does a set of size n have? | 2^n
50 | Edit distance recurrence? | dp[i][j] = min(dp[i-1][j]+1, dp[i][j-1]+1, dp[i-1][j-1] + (s1[i]!=s2[j]))

Master Complexity Table

Data Structure | Access | Search | Insert | Delete | Space
Array | O(1) | O(n) | O(n) | O(n) | O(n)
Sorted Array | O(1) | O(log n) | O(n) | O(n) | O(n)
Linked List | O(n) | O(n) | O(1)* | O(1)* | O(n)
Stack | O(n) | O(n) | O(1) | O(1) | O(n)
Queue | O(n) | O(n) | O(1) | O(1) | O(n)
Hash Table | N/A | O(1) avg / O(n) worst | O(1) avg / O(n) worst | O(1) avg / O(n) worst | O(n)
BST (average) | O(log n) | O(log n) | O(log n) | O(log n) | O(n)
BST (worst) | O(n) | O(n) | O(n) | O(n) | O(n)
AVL Tree | O(log n) | O(log n) | O(log n) | O(log n) | O(n)
Red-Black Tree | O(log n) | O(log n) | O(log n) | O(log n) | O(n)
B-Tree | O(log n) | O(log n) | O(log n) | O(log n) | O(n)
Binary Heap | O(n) | O(n) | O(log n) | O(log n) | O(n)
Trie | N/A | O(L) | O(L) | O(L) | O(N · L · Σ)
Skip List | O(log n) | O(log n) | O(log n) | O(log n) | O(n)

* Linked list insert/delete are O(1) only when you have a reference to the node; finding the node first is O(n). Entries are average-case unless a row says otherwise. L = key length, Σ = alphabet size, N = number of keys.

Pattern Recognition Guide

When you see a certain problem pattern, reach for the corresponding technique. This table maps common problem signals to the right approach.

When You See... | Think... | Why
Sorted array | Binary search | O(log n), eliminate half each step
Top K / Kth element | Heap (priority queue) | Min-heap of size K for top-K largest
Frequency count / duplicates | Hash map | O(1) lookup for counting
String matching / pattern | KMP / Rabin-Karp / Trie | O(n+m) instead of O(n·m)
Tree traversal | DFS (recursion or stack) | Natural recursive structure
Shortest path (unweighted) | BFS | Explores in order of distance
Shortest path (weighted) | Dijkstra / Bellman-Ford | Dijkstra if no negative edges, BF otherwise
Connected components | Union-Find / DFS | Group nodes efficiently
Cycle detection | DFS with coloring / Floyd's | Graph coloring or slow/fast pointers
Optimization over subproblems | Dynamic programming | Overlapping subproblems + optimal substructure
All permutations / combinations | Backtracking | Systematic exploration with pruning
Contiguous subarray sum/product | Sliding window / prefix sum | O(n) with running state
Parentheses / brackets / nesting | Stack | LIFO matches nested structure
Next greater / smaller element | Monotonic stack | O(n) with maintained order
Interval scheduling / merging | Sort by end/start + greedy | Greedy choice property holds
Matrix / grid traversal | BFS / DFS / DP | Treat as graph; DP for counting paths
Range queries with updates | Segment tree / BIT | O(log n) query and update
Prefix / autocomplete | Trie | O(L) prefix lookup
Median in stream | Two heaps (max + min) | O(log n) insert, O(1) median
Minimum spanning tree | Kruskal (Union-Find) / Prim (heap) | Kruskal for sparse, Prim for dense