Programming, problem solving, and algorithms

CPSC 203, 2025 W2

February 10, 2026

Announcements

  • Examlet 2 is this week – check PT to verify
  • Project 1 (Pokémon Data Explorer) released — due March 1

Today

🔍 The Great Data Heist

An escape room adventure to master Python dictionaries!

The Setup

Last night, someone broke into the Vancouver Museum of Data Science.

The thief stole the legendary Golden DataFrame — an artifact said to contain the answers to all data questions.

You’ve been called in as data detectives to solve the case.

Your Tool: The Dictionary

Dictionaries are Python’s key-value lookup structure.

Pattern: Dictionary as Record

This looks familiar! It’s just like our Billboard data:

This is similar to named tuples and dataclasses — keys act like attribute names!
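
A minimal sketch of the record pattern. The field names and values are invented for illustration and are not the actual Billboard data from class:

```python
# Hypothetical record: field names and values are made up for illustration.
song = {"title": "Example Song", "artist": "Example Artist", "weeks_on_chart": 12}

# Keys work like attribute names: look up a field by its name.
title = song["title"]
weeks = song["weeks_on_chart"]

# Updating a field and adding a new one use the same bracket syntax.
song["weeks_on_chart"] = weeks + 1
song["peak_position"] = 3

print(title, song["weeks_on_chart"], song["peak_position"])
```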

Dictionary Basics

Watch Out: KeyError!
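
A small sketch of what goes wrong and two common ways around it. The `suspect` record is a made-up example:

```python
# Hypothetical suspect record, invented for illustration.
suspect = {"name": "Alex", "age": 34}

# Asking for a key that does not exist raises KeyError.
try:
    suspect["alibi"]
    error_raised = False
except KeyError:
    error_raised = True

# Safer option 1: check with `in` before looking up.
has_alibi = "alibi" in suspect

# Safer option 2: .get() returns a default value instead of raising.
alibi = suspect.get("alibi", "unknown")

print(error_raised, has_alibi, alibi)
```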

Why Dictionaries?

Lists: Access by position (index)

data[0], data[1], data[2]  # What's at position 0?

Dictionaries: Access by meaningful key

data["name"], data["age"], data["score"]  # Self-documenting!

Dictionaries are also fast — looking up a key takes about the same time whether you have 10 items or 10 million!

The Evidence Room

Before we enter the crime scene, let’s examine the evidence we’ve gathered:

Pattern: Tracking Membership

Use case: Track whether we’ve encountered something before.

defaultdict(bool) returns False for missing keys — no KeyError!
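
One way the membership pattern might look, assuming a made-up visitor log:

```python
from collections import defaultdict

# Hypothetical visitor log, invented for illustration.
visitors = ["Ana", "Ben", "Ana", "Cy"]

seen = defaultdict(bool)
repeat_visitors = []
for name in visitors:
    if seen[name]:            # missing keys read as False, no KeyError
        repeat_visitors.append(name)
    seen[name] = True         # remember this visitor

# Reading a missing key also stores it (with the value False).
was_here = seen["Dee"]

print(repeat_visitors, was_here)
```

Note the side effect on the last lookup: after reading `seen["Dee"]`, the key `"Dee"` exists in the dictionary. Use `"Dee" in seen` when you want to check without adding anything.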

🚪 Room 1: The Intersection

The thief visited both Gallery A and Gallery B before the theft.

Your mission: Find everyone who was in BOTH rooms.

The Naive Approach: Check Every Pair

This works… but what if the lists are LONG?

The Problem with Nested Loops

With 10 people in Gallery A and 12 in Gallery B: 10 × 12 = 120 comparisons

With 10,000 in each: 100,000,000 comparisons!
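
A sketch of the check-every-pair approach with a counter added, so the cost is visible. The visitor names are invented for illustration:

```python
# Hypothetical visitor lists, invented for illustration.
gallery_a = ["Ana", "Ben", "Cy", "Dee"]
gallery_b = ["Eve", "Cy", "Ana"]

in_both = []
comparisons = 0
for a in gallery_a:
    for b in gallery_b:
        comparisons += 1      # one comparison per (a, b) pair
        if a == b:
            in_both.append(a)

print(in_both, comparisons)   # comparisons == len(gallery_a) * len(gallery_b)
```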

Room 1: Solving with a Dictionary

Why Is This Faster?

  • Step 1: Look at each Gallery A person once → 10 operations
  • Step 2: Look at each Gallery B person once → 12 operations
  • Total: 10 + 12 = 22 operations (not 120!)

Dictionary lookup takes about the same time no matter how many keys are stored, so each check counts as a single step!
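
A sketch of the two-step dictionary approach, counting operations to match the steps above. The visitor names are invented for illustration:

```python
from collections import defaultdict

# Hypothetical visitor lists, invented for illustration.
gallery_a = ["Ana", "Ben", "Cy", "Dee"]
gallery_b = ["Eve", "Cy", "Ana"]

operations = 0

# Step 1: look at each Gallery A person once.
seen_in_a = defaultdict(bool)
for person in gallery_a:
    seen_in_a[person] = True
    operations += 1

# Step 2: look at each Gallery B person once.
in_both = []
for person in gallery_b:
    operations += 1
    if seen_in_a[person]:
        in_both.append(person)

print(in_both, operations)    # operations == len(gallery_a) + len(gallery_b)
```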

The Pattern: Set Intersection

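from collections import defaultdict

# Mark every item from the first list as seen.
seen_in_first = defaultdict(bool)
for item in first_list:
    seen_in_first[item] = True

# Keep each item from the second list that was also in the first.
# Missing keys read as False instead of raising KeyError.
in_both = [item for item in second_list if seen_in_first[item]]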

Use when: Finding common elements, detecting duplicates

🔓 Room 1 Complete!

Clue unlocked: The suspects who visited both rooms are…

These are our prime suspects!

Pattern: Counting Frequencies

Use case: Count how many times each item appears.

The KeyError Problem

What if we forget to check?
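
A sketch of manual counting, first with the check and then without it. The `sightings` list is invented for illustration:

```python
# Hypothetical list of sightings, invented for illustration.
sightings = ["Ana", "Ben", "Ana", "Cy", "Ana"]

# Manual counting: check for the key before adding to it.
counts = {}
for name in sightings:
    if name in counts:
        counts[name] += 1
    else:
        counts[name] = 1

# Forgetting the check: += on a missing key raises KeyError.
broken_counts = {}
try:
    for name in sightings:
        broken_counts[name] += 1
    forgot_check_failed = False
except KeyError:
    forgot_check_failed = True

print(counts, forgot_check_failed)
```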

Solution: Use Counter from collections

Counter is a special dictionary that handles counting automatically!
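
A short sketch of the same count using `Counter`. The `sightings` list is invented for illustration:

```python
from collections import Counter

# Hypothetical list of sightings, invented for illustration.
sightings = ["Ana", "Ben", "Ana", "Cy", "Ana"]

counts = Counter(sightings)

# most_common(1) returns a list with the single most frequent (item, count) pair.
top_name, top_count = counts.most_common(1)[0]

# A key that was never counted reads as 0, with no KeyError.
missing = counts["Zed"]

print(counts["Ana"], top_name, top_count, missing)
```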

🚪 Room 2: The Cipher

You found a mysterious note left by the thief:

WKLV LV WKH FRGH: GLDJRQDO

It appears to be a substitution cipher. The thief replaced each letter with another.

Hint: It’s a Caesar cipher — each letter shifted by the same amount.

Frequency Analysis

In English, some letters appear more often than others:

Letter | Frequency
E      | 12.7%
T      | 9.1%
A      | 8.2%
O      | 7.5%
I      | 7.0%

If we count the letters in the ciphertext, the most common one probably stands for E! In a short note this guess can be wrong, so treat it as a starting point.
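
A sketch of frequency analysis on the thief's note using `Counter`. The variable names are illustrative:

```python
from collections import Counter

ciphertext = "WKLV LV WKH FRGH: GLDJRQDO"   # the note from Room 2

# Count letters only, skipping spaces and punctuation.
letter_counts = Counter(ch for ch in ciphertext if ch.isalpha())
top_letter, top_count = letter_counts.most_common(1)[0]

# Guess: the most common ciphertext letter stands for E.
guessed_shift = (ord(top_letter) - ord("E")) % 26

print(top_letter, top_count, guessed_shift)
```

For this note the most common letter is L (3 times), which suggests a shift of 7. The note has only 21 letters, so the E guess is not reliable here and other shifts are worth trying.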

Room 2: Cracking the Code

How Letters Become Numbers

Shifting Letters

The % 26 handles wrap-around: if shifting back takes us before ‘A’, we loop around to ‘Z’.
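
A sketch of turning a letter into a number, shifting it back, and wrapping with `% 26`. The function name `shift_back` is hypothetical, and the code assumes uppercase A to Z only:

```python
def shift_back(letter, shift):
    """Shift one uppercase letter backward by `shift` places, wrapping around."""
    position = ord(letter) - ord("A")         # 'A' -> 0, 'B' -> 1, ..., 'Z' -> 25
    new_position = (position - shift) % 26    # in Python, -1 % 26 is 25
    return chr(new_position + ord("A"))

print(shift_back("W", 3), shift_back("A", 1))
```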

Room 2: Trying Different Shifts
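
A sketch that tries all 26 shifts and stores each result in a dictionary keyed by shift. The function name `decode` is hypothetical, and only uppercase A to Z is shifted:

```python
def decode(text, shift):
    """Shift every uppercase letter back by `shift`; leave other characters alone."""
    result = []
    for ch in text:
        if "A" <= ch <= "Z":
            result.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("A")))
        else:
            result.append(ch)                 # spaces and punctuation stay as-is
    return "".join(result)

ciphertext = "WKLV LV WKH FRGH: GLDJRQDO"

# Map each candidate shift to the text it produces.
candidates = {shift: decode(ciphertext, shift) for shift in range(26)}
for shift, text in candidates.items():
    print(shift, text)
```

Scanning the printed lines for readable English is enough to pick out the right shift.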

🔓 Room 2 Complete!

Clue unlocked: Shift 3 reveals the message!

“THIS IS THE CODE: DIAGONAL” — a hint about the vault!

On Thursday, we’ll learn how to use a mapping dictionary to decode messages more elegantly.

Why Dictionaries Are Fast

With 50,000 items:

  • List search: Check each item one by one → slow!
  • Dictionary lookup: Jump directly to the key → fast!

Dictionary lookups take about the same time whether you have 100 items or 100,000!
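
A rough sketch for timing the two lookups yourself with `timeit`. The data is synthetic and the exact times will vary by machine, so no specific numbers are claimed:

```python
import timeit

n = 50_000
items_list = list(range(n))
items_dict = {i: True for i in range(n)}
target = n - 1    # last item: the list has to check every element to find it

# Run each membership test 100 times and record the total time.
list_time = timeit.timeit(lambda: target in items_list, number=100)
dict_time = timeit.timeit(lambda: target in items_dict, number=100)

print(f"list: {list_time:.5f}s  dict: {dict_time:.5f}s")
```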

Pattern: Finding Complements (Two-Sum)

Use case: Find two numbers that add up to a target.

The Complement Pattern

As we scan, we remember what we’ve seen — and check if the complement exists!
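
A minimal sketch of the complement pattern on a single list. The function name `find_pair` and the sample numbers are invented for illustration:

```python
def find_pair(numbers, target):
    """Return two numbers from `numbers` that add up to `target`, or None."""
    seen = {}
    for number in numbers:
        complement = target - number
        if complement in seen:        # have we already passed the matching number?
            return complement, number
        seen[number] = True           # remember this number for later
    return None

print(find_pair([5, 11, 30, 12, 7], 42))
```

Checking before storing means a number is never paired with itself unless it appears twice in the list.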

🚪 Room 3: The Vault

The vault has a two-dial combination lock.

Each dial shows a number, and they must add up to exactly 42.

The thief left behind two torn pieces of paper with numbers:

Find the two numbers (one from each list) that open the vault!

Room 3: Solving It
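
One possible sketch for the two-list version of the puzzle. The numbers on the two papers are placeholders, not the real clue, and `open_vault` is a hypothetical name:

```python
def open_vault(first_paper, second_paper, target=42):
    """Return (a, b) with a from first_paper, b from second_paper, a + b == target."""
    on_first = {number: True for number in first_paper}   # store the first list
    for number in second_paper:
        complement = target - number
        if complement in on_first:                        # check the complement
            return complement, number
    return None

# Placeholder numbers, invented for illustration.
paper_one = [3, 17, 25, 40]
paper_two = [8, 19, 30, 2]

print(open_vault(paper_one, paper_two))
```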

🔓 Room 3 Complete!

The vault opens, revealing more evidence…

And a note: “The alibi is the key.”

Pattern: Grouping by Category

Use case: Organize items into categories (like pandas groupby!).

Why defaultdict(list)?

Without it, you get a KeyError:

defaultdict(list) automatically creates an empty list for missing keys!
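
A sketch contrasting a plain dictionary with `defaultdict(list)`. The evidence items are invented for illustration:

```python
from collections import defaultdict

# Hypothetical (item, location) pairs, invented for illustration.
evidence = [("glove", "gallery"), ("key", "vault"), ("footprint", "gallery")]

# Plain dict: appending to a missing key raises KeyError.
plain = {}
try:
    plain["gallery"].append("glove")
    plain_failed = False
except KeyError:
    plain_failed = True

# defaultdict(list): a missing key starts out as an empty list.
by_location = defaultdict(list)
for item, location in evidence:
    by_location[location].append(item)

print(plain_failed, dict(by_location))
```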

Grouping with Dictionaries

What if our items are dictionaries, not tuples?

Same pattern — just access the key with suspect["alibi"] instead of unpacking a tuple!
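
A short sketch of grouping a list of dictionaries. The suspects and alibis are invented for illustration:

```python
from collections import defaultdict

# Hypothetical suspect records, invented for illustration.
suspects = [
    {"name": "Ana", "alibi": "cafe"},
    {"name": "Ben", "alibi": "library"},
    {"name": "Cy", "alibi": "cafe"},
]

by_alibi = defaultdict(list)
for suspect in suspects:
    # Group on the "alibi" field and store the "name" field.
    by_alibi[suspect["alibi"]].append(suspect["name"])

print(dict(by_alibi))
```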

🚪 Room 4: The Suspect Board

We have testimony from five suspects about where they were at 8pm:

Group suspects by their alibi to find who was together!

Room 4: Grouping Suspects

Room 4: Finding the Lone Wolf
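
One way to find a suspect whose alibi nobody else shares: group by alibi, then keep the groups of size one. The five testimonies are placeholders, not the real puzzle data:

```python
from collections import defaultdict

# Placeholder testimony, invented for illustration.
testimony = [
    {"name": "Ana", "alibi": "cafe"},
    {"name": "Ben", "alibi": "library"},
    {"name": "Cy", "alibi": "cafe"},
    {"name": "Dee", "alibi": "library"},
    {"name": "Eli", "alibi": "home"},
]

by_alibi = defaultdict(list)
for suspect in testimony:
    by_alibi[suspect["alibi"]].append(suspect["name"])

# A group with a single name means no one else can confirm that alibi.
lone_wolves = [names[0] for names in by_alibi.values() if len(names) == 1]

print(lone_wolves)
```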

🔓 Room 4 Complete!

Clue unlocked: One suspect has an unverified alibi with no witnesses…

Let’s Practice!

Open the Data Heist activity. You’ll solve 4 puzzles with LARGE datasets.

Inspired by Advent of Code puzzles!

PrairieLearn Activity

The Dictionary Patterns

Pattern    | Use Case              | Technique
Record     | Store named fields    | {"name": ..., "age": ...}
Membership | Track what we’ve seen | seen[x] = True
Counting   | Count occurrences     | Counter(items)
Complement | Find pairs            | Store & check complements
Grouping   | Organize by category  | defaultdict(list)

Summary

Dictionaries give you fast lookup by key — whether you have 100 items or 100,000.

The patterns we learned:

  1. Record — Store data with named fields
  2. Membership — Have I seen this before? (defaultdict(bool))
  3. Counting — How many of each? (Counter)
  4. Complement — Find the matching pair
  5. Grouping — Organize by category (defaultdict(list))