Binary Search

Week 3, Tuesday (Video)

January 20, 2026

The Twenty Questions Puzzle

The Game

I’m thinking of a number between 0 and 63.

You can ask yes/no questions.

What’s your guaranteed winning strategy?

How Many Questions?

Start with \(n = 64\) possibilities (0 to 63).

After each well-chosen question (one that splits the remaining candidates in half): \(n \to n/2\)

Questions needed: \(\log_2(64) = 6\)
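The halving strategy can be sketched in a few lines. This is only an illustration: the function name `guess_number` and the question form "is it greater than mid?" are assumptions, not part of the game as stated.

```python
def guess_number(secret, lo=0, hi=63):
    """Find secret in [lo, hi] by asking only 'is it greater than mid?'."""
    questions = 0
    while lo < hi:                 # more than one candidate left
        mid = (lo + hi) // 2
        questions += 1
        if secret > mid:           # answer: yes
            lo = mid + 1
        else:                      # answer: no
            hi = mid
    return lo, questions


# Every secret from 0 to 63 is found in exactly 6 questions.
print(all(guess_number(s) == (s, 6) for s in range(64)))  # True
```

Because 64 is a power of two, each question splits the candidates exactly in half, so the count is always 6.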

Let’s Play!

I’m thinking of a number between 0 and 63.

You can ask yes/no questions.

Binary Search

The Problem

Given a sorted array and a target value, find the target’s index (or determine it’s not there).

0	1	2	3	4	5	6	7	8	9
2	5	8	12	16	23	38	56	72	91

Find target = 23.

Linear Search: The Naive Approach

def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1

Worst case: \(\Theta(n)\)

Binary Search: The Clever Approach

def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

Worst case: \(\Theta(\log n)\)

Tracing Binary Search

0	1	2	3	4	5	6	7	8	9
2	5	8	12	16	23	38	56	72	91

Find target = 23:

lo=0, hi=9, mid=4 → arr[4]=16 < 23 → lo=5
lo=5, hi=9, mid=7 → arr[7]=56 > 23 → hi=6
lo=5, hi=6, mid=5 → arr[5]=23 ✓ → return 5
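The trace above can be reproduced by recording (lo, hi, mid) on each iteration. A minimal sketch; `binary_search_trace` is a hypothetical helper that mirrors binary_search and adds only the bookkeeping.

```python
def binary_search_trace(arr, target):
    """Same logic as binary_search, but also returns the (lo, hi, mid) steps."""
    lo, hi = 0, len(arr) - 1
    steps = []
    while lo <= hi:
        mid = (lo + hi) // 2
        steps.append((lo, hi, mid))
        if arr[mid] == target:
            return mid, steps
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
index, steps = binary_search_trace(arr, 23)
print(index)  # 5
print(steps)  # [(0, 9, 4), (5, 9, 7), (5, 6, 5)]
```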

The 16-Year Bug

“I’ve assigned binary search in courses at Bell Labs and IBM… and I’ve studied their solutions… In the history of this exercise, only about ten percent of professional programmers have gotten the program right.”

— Jon Bentley, Programming Pearls (1986)

The first binary search was published in 1946.

The first bug-free implementation? 1962.

Proving Binary Search Correct

The Invariant

Invariant: If target is in arr, then target is in arr[lo:hi+1].

Equivalently: We haven’t excluded any index where target could be.

Base Case (Initialization)

Before the first iteration: lo = 0, hi = len(arr) - 1

arr[lo:hi+1] = arr[0:len(arr)] = entire array

If target is in arr, it’s in arr[lo:hi+1]. ✓

Key Fact: `mid` is Always Valid

Since lo <= hi (loop condition), we have 2·lo <= lo + hi <= 2·hi. Floor-dividing by 2, with mid = (lo + hi) // 2:

\[\texttt{lo} \leq \texttt{mid} \leq \texttt{hi}\]

This means arr[mid] is always a valid access inside the loop.
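As a sanity check (not a proof), the inequality can be tested exhaustively for small ranges. The helper name `mid_in_range` and the bound of 50 are arbitrary choices for illustration.

```python
def mid_in_range(max_n=50):
    """Check lo <= mid <= hi for every pair with 0 <= lo <= hi < max_n."""
    for lo in range(max_n):
        for hi in range(lo, max_n):
            mid = (lo + hi) // 2
            if not (lo <= mid <= hi):
                return False
    return True


print(mid_in_range())  # True
```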

Inductive Step: Case 1 (Found it)

arr[mid] == target → we return mid.

Since lo <= mid <= hi, we know mid is in the search range.

We found the target and return its index. ✓

Inductive Step: Case 2

arr[mid] < target

Because arr is sorted, every index i <= mid has arr[i] <= arr[mid] < target.

Setting lo = mid + 1 excludes only indices where target can’t be. ✓

Inductive Step: Case 3

arr[mid] > target

Because arr is sorted, every index i >= mid has arr[i] >= arr[mid] > target.

Setting hi = mid - 1 excludes only indices where target can’t be. ✓

Termination (Exit via `lo > hi`)

The loop exits when lo > hi. (It must exit eventually: every iteration that doesn’t return shrinks hi - lo by at least 1.)

At this exit point, arr[lo:hi+1] is empty.

The invariant says: if target is in arr, it’s in this empty range.

But nothing is in an empty range.

Therefore: target is not in arr. Return -1. ∎

The Proof in One Slide

Theorem: binary_search(arr, target) returns the index of target if present, else -1.

Proof: By induction on iteration count.

Invariant: If target ∈ arr, then target ∈ arr[lo:hi+1]
Base case: Initially, full array satisfies invariant ✓
Inductive step: Each case either exits correctly or safely shrinks range ✓
Termination: At exit, invariant \(\Rightarrow\) correct answer ∎
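One way to make the proof concrete is to assert the invariant while the code runs. A sketch for teaching only: `binary_search_checked` is a hypothetical name, and the `in` checks cost \(\Theta(n)\) each, so this version gives up the speed of binary search.

```python
def binary_search_checked(arr, target):
    """binary_search with the proof's claims written as assertions."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        # Invariant: if target is anywhere in arr, it is in arr[lo:hi+1].
        assert target not in arr or target in arr[lo:hi + 1]
        mid = (lo + hi) // 2
        assert lo <= mid <= hi          # mid is always a valid index
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    # Exit via lo > hi: the range is empty, so target cannot be in arr.
    assert target not in arr
    return -1


arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
print(binary_search_checked(arr, 23))  # 5
print(binary_search_checked(arr, 24))  # -1
```

If arr is not sorted, one of these assertions can fail, which is exactly where the proof relies on the precondition.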

Why the Bug?

Common Mistakes

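# Bug 1: Wrong comparison
while lo < hi:  # Should be lo <= hi (otherwise the last remaining element is never checked)
    ...

# Bug 2: Wrong update
lo = mid        # Should be mid + 1 (otherwise the range can stop shrinking: infinite loop)
hi = mid        # Should be mid - 1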

# Bug 3: Integer overflow (in C/Java)
mid = (lo + hi) / 2  # Can overflow!
# Fix: mid = lo + (hi - lo) / 2
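Bugs 1 and 2 are easy to reproduce in Python. In this sketch, `search_bug1` and `search_bug2` are hypothetical names, and the `max_iters` guard exists only so the demo stops instead of hanging.

```python
def search_bug1(arr, target):
    """Bug 1: 'lo < hi' stops before checking the last remaining element."""
    lo, hi = 0, len(arr) - 1
    while lo < hi:                      # BUG: should be lo <= hi
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def search_bug2(arr, target, max_iters=100):
    """Bug 2: 'lo = mid' / 'hi = mid' can leave the range unchanged forever."""
    lo, hi = 0, len(arr) - 1
    for _ in range(max_iters):          # guard added for the demo only
        if lo > hi:
            return -1
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid                    # BUG: should be mid + 1
        else:
            hi = mid                    # BUG: should be mid - 1
    raise RuntimeError("range stopped shrinking")


print(search_bug1([5], 5))              # -1, even though 5 is at index 0
try:
    search_bug2([2, 5, 8], 9)
except RuntimeError as err:
    print(err)                          # range stopped shrinking
```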

The Java Bug (2006)

// This was in java.util.Arrays for 9 years!
int mid = (low + high) / 2;  // OVERFLOW BUG

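When low + high > Integer.MAX_VALUE, the sum overflows and wraps around to a negative number, so mid is negative and the array access throws ArrayIndexOutOfBoundsException.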

Fix:

int mid = low + (high - low) / 2;
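Python integers don’t overflow, but the wraparound can be simulated. A sketch under stated assumptions: `to_int32` and `java_div` are hypothetical helpers that imitate Java’s 32-bit int arithmetic, and the index values are made up for illustration.

```python
def to_int32(x):
    """Wrap x into the signed 32-bit range, mimicking Java int overflow."""
    return (x + 2**31) % 2**32 - 2**31


def java_div(a, b):
    """Integer division that truncates toward zero, like Java's / on ints."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


low, high = 1_500_000_000, 2_000_000_000    # each fits in a 32-bit int

buggy_mid = java_div(to_int32(low + high), 2)   # the sum wraps before dividing
safe_mid = low + java_div(high - low, 2)        # no intermediate value exceeds high

print(buggy_mid)  # -397483648
print(safe_mid)   # 1750000000
```

The fix works because high - low is never larger than high, so nothing in the expression can exceed the int range.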

The Invariant Saves You

When you’re unsure about an edge case, ask:

“Does this maintain the invariant?”

lo = mid + 1: Does this exclude only impossible locations? Yes.
hi = mid - 1: Does this exclude only impossible locations? Yes.
lo <= hi: When does the search space become empty? When lo > hi.

Analysis

Worst-Case Running Time

Each iteration:

Constant work (comparisons, arithmetic)
Range shrinks to at most half: \((hi - lo + 1) \to\) at most \(\lfloor(hi - lo + 1)/2\rfloor\)

How many halvings until the range is empty?

\(\lfloor \log_2 n \rfloor + 1\), which is about \(\log_2(n)\)

Worst case: \(\Theta(\log n)\)
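The halving count can be checked empirically. In this sketch, the hypothetical helper `worst_case_iterations` simulates a target larger than every element, which always keeps the bigger half and is therefore the worst case for the loop in binary_search.

```python
def worst_case_iterations(n):
    """Count loop iterations for a length-n array when target exceeds every element."""
    lo, hi = 0, n - 1
    iterations = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        iterations += 1
        lo = mid + 1        # arr[mid] < target every time: keep the right half
    return iterations


for n in (1_000, 1_000_000, 1_000_000_000):
    print(n, worst_case_iterations(n))
# 1000 10
# 1000000 20
# 1000000000 30
```

For n >= 1 the count equals n.bit_length(), which is \(\lfloor \log_2 n \rfloor + 1\).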

The Scale Comparison

\(n\)	Linear Search (worst-case steps)	Binary Search (worst-case steps)
1,000	1,000	10
1,000,000	1,000,000	20
1,000,000,000	1,000,000,000	30

Racing Them

The Race

The Price of Speed

The Precondition

Binary search requires the array to be sorted.

What if it’s not sorted?
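A quick experiment shows what goes wrong. The sketch below reuses the lecture’s binary_search on a shuffled copy of the example array; the particular shuffle is made up for illustration.

```python
def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


unsorted_arr = [38, 2, 91, 5, 23, 72, 8, 56, 12, 16]

# 38 sits at index 0, but the search discards the left half on the first step.
print(binary_search(unsorted_arr, 38))          # -1

# Sorting first restores the precondition.
print(binary_search(sorted(unsorted_arr), 38))  # 6
```

On unsorted input the comparisons at mid say nothing about the rest of the range, so the invariant is no longer maintained and the answer can be wrong.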

When to Use Binary Search

Use binary search when:

Data is sorted (or can be sorted once, searched many times)
Random access is \(O(1)\) (arrays, not linked lists)
You need to find a specific value

Don’t use when:

Data is unsorted and searched only once
Data structure doesn’t support random access
You need to find all occurrences (the version here returns just one matching index, not necessarily the first)
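If duplicates do matter and the data is sorted, one option is the leftmost/rightmost variant of binary search in Python’s standard bisect module. A sketch; `find_all` is a hypothetical helper, not something from the lecture.

```python
from bisect import bisect_left, bisect_right


def find_all(arr, target):
    """Return the range of indices holding target in a sorted list."""
    left = bisect_left(arr, target)     # first index with arr[i] >= target
    right = bisect_right(arr, target)   # first index with arr[i] > target
    return range(left, right)


arr = [2, 5, 5, 5, 8, 12]
print(list(find_all(arr, 5)))  # [1, 2, 3]
print(list(find_all(arr, 7)))  # []
```

Both calls are themselves binary searches, so locating the block of matches still takes \(\Theta(\log n)\) time.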

Summary

What We Learned

Binary search finds a target in \(\Theta(\log n)\) time (vs \(\Theta(n)\) for linear)
The invariant: “If target exists, it’s in arr[lo:hi+1]”
The proof: Init, Maintenance (3 cases), Termination
The bugs: Off-by-one errors, infinite loops, integer overflow
The precondition: Array must be sorted

Wednesday

Recursion: When the function calls itself.

Thinking recursively
Binary search, recursively
The fractal coastline puzzle

Questions?

Binary search: \(\Theta(\log n)\)

The invariant makes it correct.

The halving makes it fast.
