Binary Search

Week 3, Tuesday (Video)

January 20, 2026

The Twenty Questions Puzzle

The Game

I’m thinking of a number between 0 and 63.

You can ask yes/no questions.

What’s your guaranteed winning strategy?

How Many Questions?

Start with \(n = 64\) possibilities (0 to 63).

After each question: \(n \to n/2\)

Questions needed: \(\log_2(64) = 6\)
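The halving strategy can be sketched in code. This is a minimal illustration, assuming every question has the form "is your number greater than mid?"; the function name guess_number and its parameters are hypothetical, not part of the lecture.

```python
def guess_number(secret, lo=0, hi=63):
    """Find secret in [lo, hi] using only yes/no questions.

    Each question is "is the number greater than mid?".
    Returns the guess and the number of questions asked.
    """
    questions = 0
    while lo < hi:
        mid = (lo + hi) // 2
        questions += 1
        if secret > mid:   # the answer was "yes"
            lo = mid + 1
        else:              # the answer was "no"
            hi = mid
    return lo, questions


results = [guess_number(s) for s in range(64)]
print(all(guess == s for s, (guess, _) in enumerate(results)))  # True
print(max(q for _, q in results))  # 6
```

Because 64 is a power of two, every question splits the remaining candidates exactly in half, so each secret takes exactly 6 questions.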

Let’s Play!

I’m thinking of a number between 0 and 63.

You can ask yes/no questions.

The Problem

Given a sorted array and a target value, find the target’s index (or determine it’s not there).

Index:  0   1   2   3   4   5   6   7   8   9
Value:  2   5   8  12  16  23  38  56  72  91

Find target = 23.

Linear Search: The Naive Approach

def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1

Worst case: \(\Theta(n)\)

Binary Search: The Clever Approach

def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

Worst case: \(\Theta(\log n)\)
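To see the algorithm at work on the example array, the sketch below records which indices get probed. The helper name binary_search_trace is an assumption added for illustration; the search logic is the same as binary_search above.

```python
def binary_search_trace(arr, target):
    """Same algorithm as binary_search, but also returns the indices probed."""
    lo, hi = 0, len(arr) - 1
    probes = []
    while lo <= hi:
        mid = (lo + hi) // 2
        probes.append(mid)
        if arr[mid] == target:
            return mid, probes
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
print(binary_search_trace(arr, 23))  # (5, [4, 7, 5])
print(binary_search_trace(arr, 24))  # (-1, [4, 7, 5, 6])
```

Finding 23 takes three probes (values 16, 56, 23), where linear search would need six comparisons.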

The 16-Year Bug

“I’ve assigned binary search in courses at Bell Labs and IBM… and I’ve studied their solutions… In the history of this exercise, only about ten percent of professional programmers have gotten the program right.”

— Jon Bentley, Programming Pearls (1986)

The first binary search was published in 1946.

The first published bug-free implementation? 1962.

Proving Binary Search Correct

The Invariant

Invariant: If target is in arr, then target is in arr[lo:hi+1].

Equivalently: We haven’t excluded any index where target could be.

Base Case (Initialization)

Before the first iteration: lo = 0, hi = len(arr) - 1

arr[lo:hi+1] = arr[0:len(arr)] = entire array

If target is in arr, it’s in arr[lo:hi+1]. ✓

Key Fact: mid is Always Valid

Since lo <= hi (loop condition) and mid = (lo + hi) // 2:

\[\texttt{lo} \leq \texttt{mid} \leq \texttt{hi}\]

This means arr[mid] is always a valid access inside the loop.
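The bound follows because lo <= hi gives 2*lo <= lo + hi <= 2*hi, and floor division by 2 preserves both inequalities. As a sanity check rather than a proof, the sketch below tests the claim on every small pair; the function name mid_stays_in_range is hypothetical.

```python
def mid_stays_in_range(limit):
    """Check lo <= mid <= hi for every pair with 0 <= lo <= hi < limit."""
    for lo in range(limit):
        for hi in range(lo, limit):
            mid = (lo + hi) // 2
            if not (lo <= mid <= hi):
                return False
    return True


print(mid_stays_in_range(200))  # True
```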

Inductive Step: Case 1 (Found it)

arr[mid] == target → we return mid.

Since lo <= mid <= hi, we know mid is in the search range.

We found the target and return its index. ✓

Inductive Step: Case 2

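arr[mid] < target

Because arr is sorted, every index i <= mid has arr[i] <= arr[mid] < target, so target cannot be at any of them.

Setting lo = mid + 1 excludes only indices where target can’t be. ✓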

Inductive Step: Case 3

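arr[mid] > target

Because arr is sorted, every index i >= mid has arr[i] >= arr[mid] > target, so target cannot be at any of them.

Setting hi = mid - 1 excludes only indices where target can’t be. ✓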

Termination (Exit via lo > hi)

The loop exits when lo > hi.

At this exit point, arr[lo:hi+1] is empty.

The invariant says: if target is in arr, it’s in this empty range.

But nothing is in an empty range.

Therefore: target is not in arr. Return -1. ∎
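The proof steps can also be checked at runtime. The sketch below is a debugging aid under the assumption that arr is sorted; the name binary_search_checked is hypothetical, and the assertions scan the array, so each one costs linear time and defeats the purpose of binary search outside of testing.

```python
def binary_search_checked(arr, target):
    """binary_search with the loop invariant asserted on every iteration."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        # Invariant: if target is anywhere in arr, it is in arr[lo:hi+1].
        assert target not in arr or target in arr[lo:hi + 1]
        mid = (lo + hi) // 2
        assert lo <= mid <= hi  # arr[mid] is a valid access
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    # Exit via lo > hi: the range is empty, so target cannot be in arr.
    assert arr[lo:hi + 1] == []
    assert target not in arr
    return -1


arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
print(binary_search_checked(arr, 23))  # 5
print(binary_search_checked(arr, 24))  # -1
```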

The Proof in One Slide

Theorem: binary_search(arr, target) returns the index of target if present, else -1.

Proof: By induction on iteration count.

  • Invariant: If target ∈ arr, then target ∈ arr[lo:hi+1]
  • Base case: Initially, full array satisfies invariant ✓
  • Inductive step: Each case either exits correctly or safely shrinks range ✓
  • Termination: At exit, invariant \(\Rightarrow\) correct answer ∎

Why the Bug?

Common Mistakes

# Bug 1: Wrong comparison
while lo < hi:  # Should be lo <= hi
    ...

# Bug 2: Wrong update
lo = mid        # Should be mid + 1
hi = mid        # Should be mid - 1

# Bug 3: Integer overflow (in C/Java)
mid = (lo + hi) / 2  # Can overflow!
# Fix: mid = lo + (hi - lo) / 2
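Bugs 1 and 2 can be reproduced on tiny inputs. The two functions below are deliberately broken sketches; their names, the max_steps guard, and the "stuck" return value are assumptions added so the demo terminates.

```python
def search_bug1(arr, target):
    """Bug 1: 'lo < hi' stops before the last one-element range is checked."""
    lo, hi = 0, len(arr) - 1
    while lo < hi:  # should be lo <= hi
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def search_bug2(arr, target, max_steps=100):
    """Bug 2: 'lo = mid' / 'hi = mid' may never shrink the range."""
    lo, hi = 0, len(arr) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        if steps > max_steps:  # demo-only guard against the infinite loop
            return "stuck"
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid  # should be mid + 1
        else:
            hi = mid  # should be mid - 1
    return -1


print(search_bug1([7], 7))        # -1, although 7 is at index 0
print(search_bug1([2, 5, 8], 8))  # -1, although 8 is at index 2
print(search_bug2([2, 5], 5))     # stuck
```

In the last call, lo = 0 and hi = 1 give mid = 0 on every iteration, so lo = mid never changes anything.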

The Java Bug (2006)

// This was in java.util.Arrays for 9 years!
int mid = (low + high) / 2;  // OVERFLOW BUG

When low + high exceeds Integer.MAX_VALUE (\(2^{31} - 1\)), the sum wraps around to a negative number, so mid becomes a negative index.

Fix:

int mid = low + (high - low) / 2;
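Python integers do not overflow, so the sketch below emulates 32-bit wraparound by hand to show the effect. The helper to_int32 and the chosen values of low and high are assumptions for illustration. Java's / truncates toward zero while Python's // floors; the wrapped sum here is even, so both give the same result.

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit value, as Java int arithmetic would."""
    return (x + 2**31) % 2**32 - 2**31


low, high = 1_500_000_000, 2_000_000_000  # both fit in a 32-bit int

buggy_mid = to_int32(low + high) // 2  # the sum wraps to a negative value
safe_mid = low + (high - low) // 2     # no intermediate exceeds high

print(buggy_mid)  # -397483648
print(safe_mid)   # 1750000000
```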

The Invariant Saves You

When you’re unsure about an edge case, ask:

“Does this maintain the invariant?”

  • lo = mid + 1: Does this exclude only impossible locations? Yes.
  • hi = mid - 1: Does this exclude only impossible locations? Yes.
  • lo <= hi: When does the search space become empty? When lo > hi.

Analysis

Worst-Case Running Time

Each iteration:

  • Constant work (comparisons, arithmetic)
  • Range shrinks to at most half: \((hi - lo + 1) \to\) at most \(\lfloor(hi - lo + 1)/2\rfloor\)

How many halvings until range is empty?

At most \(\lfloor \log_2 n \rfloor + 1\)

Worst case: \(\Theta(\log n)\)

The Scale Comparison

\(n\)            Linear Search (worst-case steps)    Binary Search (worst-case steps)
1,000            1,000                               10
1,000,000        1,000,000                           20
1,000,000,000    1,000,000,000                       30
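The binary search column can be reproduced by counting loop iterations. This sketch assumes a Python range object as a stand-in for a sorted array (it supports len and indexing without storing a billion elements) and searches for a value larger than every element, which always takes the longer branch. The function name count_iterations is hypothetical.

```python
def count_iterations(arr, target):
    """Run binary search and return how many loop iterations it took."""
    lo, hi = 0, len(arr) - 1
    iterations = 0
    while lo <= hi:
        iterations += 1
        mid = (lo + hi) // 2
        if arr[mid] == target:
            break
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return iterations


for n in (1_000, 1_000_000, 1_000_000_000):
    worst = count_iterations(range(n), n)  # n is absent and above every element
    print(n, worst, n.bit_length())
# 1000 10 10
# 1000000 20 20
# 1000000000 30 30
```

For positive n, n.bit_length() equals \(\lfloor \log_2 n \rfloor + 1\), which matches the measured iteration counts.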

Racing Them

The Race

The Price of Speed

The Precondition

Binary search requires the array to be sorted.

What if it’s not sorted?
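A short experiment shows what can go wrong. The unsorted sample data and the is_sorted helper are assumptions for illustration; binary_search is repeated from above so the sketch runs on its own.

```python
def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def is_sorted(arr):
    """Return True if arr is in non-decreasing order (a linear-time check)."""
    return all(arr[i] <= arr[i + 1] for i in range(len(arr) - 1))


unsorted = [38, 2, 91, 5, 23, 8]
print(is_sorted(unsorted))                 # False
print(binary_search(unsorted, 2))          # -1, although 2 is at index 1
print(binary_search(sorted(unsorted), 2))  # 0 (index in the sorted copy)
```

On unsorted input the comparisons no longer justify discarding half the range, so the invariant breaks and a present target can be missed. Checking sortedness costs linear time and sorting costs more, so paying that price makes sense mainly when many searches follow.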

Summary

What We Learned

  1. Binary search finds a target in \(\Theta(\log n)\) time (vs \(\Theta(n)\) for linear)

  2. The invariant: “If target exists, it’s in arr[lo:hi+1]”

  3. The proof: Init, Maintenance (3 cases), Termination

  4. The bugs: Off-by-one errors, infinite loops, integer overflow

  5. The precondition: Array must be sorted

Wednesday

Recursion: When the function calls itself.

  • Thinking recursively
  • Binary search, recursively
  • The fractal coastline puzzle

Questions?

Binary search: \(\Theta(\log n)\)

The invariant makes it correct.

The halving makes it fast.