CPSC 203, 2025 W2
February 10, 2026
🔍 The Great Data Heist
An escape room adventure to master Python dictionaries!
Last night, someone broke into the Vancouver Museum of Data Science.
The thief stole the legendary Golden DataFrame — an artifact said to contain the answers to all data questions.
You’ve been called in as data detectives to solve the case.

Dictionaries are Python’s key-value lookup structure.
This looks familiar! It’s just like our Billboard data:
This is similar to named tuples and dataclasses — keys act like attribute names!
Lists: Access by position (index)
Dictionaries: Access by meaningful key
Dictionaries are also fast: on average, looking up a key takes about the same time whether you have 10 items or 10 million!
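To make the list-versus-dictionary contrast concrete, here is a minimal sketch. The song record is hypothetical: the field names and values are invented for illustration and are not taken from the course's Billboard data.

```python
# Hypothetical record: field names and values are illustrative only.
song_as_list = ["Example Song", "Example Artist", 1]
song_as_dict = {"title": "Example Song", "artist": "Example Artist", "rank": 1}

# List: you have to remember that position 1 holds the artist.
artist_from_list = song_as_list[1]

# Dictionary: the key says what the value means.
artist_from_dict = song_as_dict["artist"]

# .get() returns a default instead of raising KeyError for a missing key.
weeks = song_as_dict.get("weeks_on_chart", 0)

print(artist_from_list, artist_from_dict, weeks)
```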
Before we enter the crime scene, let’s examine the evidence we’ve gathered:
Use case: Track whether we’ve encountered something before.
defaultdict(bool) returns False for missing keys — no KeyError!
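A small sketch of the membership pattern, assuming a made-up visitor log (the names are hypothetical):

```python
from collections import defaultdict

# Hypothetical visitor log: the names are made up for illustration.
visitors = ["Ada", "Grace", "Ada", "Linus"]

seen = defaultdict(bool)   # missing keys default to bool(), which is False
repeat_visitors = []

for name in visitors:
    if seen[name]:         # False the first time, so no KeyError
        repeat_visitors.append(name)
    seen[name] = True

print(repeat_visitors)     # ['Ada']
```

One thing to watch: reading `seen[name]` for a missing key also stores that key with the value `False`, so check the value rather than relying on `name in seen` afterwards.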
The thief visited both Gallery A and Gallery B before the theft.
Your mission: Find everyone who was in BOTH rooms.
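One straightforward attempt compares every Gallery A visitor with every Gallery B visitor. This is a sketch with hypothetical names, since the real visitor logs are not reproduced here:

```python
# Hypothetical visitor lists: the real gallery logs are not reproduced here.
gallery_a = ["Ada", "Grace", "Linus", "Margaret"]
gallery_b = ["Linus", "Alan", "Ada", "Barbara"]

in_both = []
for person_a in gallery_a:
    for person_b in gallery_b:      # compare every pair of names
        if person_a == person_b:
            in_both.append(person_a)

print(in_both)                      # ['Ada', 'Linus']
```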
This works… but what if the lists are LONG?
With 10 people in Gallery A and 12 in Gallery B: 10 × 12 = 120 comparisons
With 10,000 in each: 100,000,000 comparisons!
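Dictionary lookup takes roughly constant time on average, so it barely matters how many keys are stored! With 10,000 people in each gallery, that is about 20,000 dictionary operations instead of 100,000,000 comparisons.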
Use when: Finding common elements, detecting duplicates
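Here is one way the dictionary version might look: record everyone from one gallery, then do a single lookup per visitor from the other. The names are hypothetical stand-ins for the real logs.

```python
# Hypothetical visitor lists: the real gallery logs are not reproduced here.
gallery_a = ["Ada", "Grace", "Linus", "Margaret"]
gallery_b = ["Linus", "Alan", "Ada", "Barbara"]

# Pass 1: remember everyone who was in Gallery A.
was_in_a = {}
for person in gallery_a:
    was_in_a[person] = True

# Pass 2: one dictionary lookup per Gallery B visitor.
in_both = [person for person in gallery_b if person in was_in_a]

print(in_both)   # ['Linus', 'Ada']
```

Each list is scanned once, so the work grows with the sum of the list lengths rather than their product.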
Clue unlocked: The suspects who visited both rooms are…
These are our prime suspects!
Use case: Count how many times each item appears.
What if we forget to check?
Counter is a special dictionary that handles counting automatically!
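The sketch below puts manual counting next to `Counter`. The evidence tags are invented for illustration.

```python
from collections import Counter

# Hypothetical evidence tags, made up for illustration.
items = ["glove", "key", "glove", "note", "glove"]

# Manual counting: the membership check avoids a KeyError on first sight.
counts = {}
for item in items:
    if item not in counts:
        counts[item] = 0
    counts[item] += 1

# Counter does the same bookkeeping in one call.
auto_counts = Counter(items)

print(counts)                      # {'glove': 3, 'key': 1, 'note': 1}
print(auto_counts.most_common(1))  # [('glove', 3)]
print(auto_counts["fingerprint"])  # 0, missing items count as zero
```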
You found a mysterious note left by the thief:
WKLV LV WKH FRGH: GLDJRQDO
It appears to be a substitution cipher. The thief replaced each letter with another.
Hint: It’s a Caesar cipher — each letter shifted by the same amount.
In English, some letters appear more often than others:
| Letter | Frequency |
|---|---|
| E | 12.7% |
| T | 9.1% |
| A | 8.2% |
| O | 7.5% |
| I | 7.0% |
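If we count the letters in the cipher, the most common one probably stands for a frequent English letter such as E. On a short note the top letter can be misleading, so be ready to test a few candidate shifts!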
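Counting the letters in the thief's note is a direct use of the counting pattern. This sketch uses `Counter` and skips spaces and punctuation. The note is short, so treat the top letter as a first guess, not a sure answer.

```python
from collections import Counter

cipher_text = "WKLV LV WKH FRGH: GLDJRQDO"

# Count letters only, skipping spaces and punctuation.
letter_counts = Counter(ch for ch in cipher_text if ch.isalpha())

# Show the three most frequent cipher letters.
for letter, count in letter_counts.most_common(3):
    print(letter, count)
```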
The % 26 handles wrap-around: when shifting backward to decode takes us before ‘A’, we loop around to ‘Z’.
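A minimal decoding sketch, assuming the note uses only uppercase letters. The function name `caesar_decode` is ours and does not come from the activity.

```python
def caesar_decode(text, shift):
    """Shift each uppercase letter back by `shift` positions."""
    decoded = []
    for ch in text:
        if "A" <= ch <= "Z":
            position = ord(ch) - ord("A")        # A=0 ... Z=25
            shifted = (position - shift) % 26    # wrap around below 0
            decoded.append(chr(shifted + ord("A")))
        else:
            decoded.append(ch)                   # keep spaces and punctuation
    return "".join(decoded)


note = "WKLV LV WKH FRGH: GLDJRQDO"
print(caesar_decode(note, 3))   # THIS IS THE CODE: DIAGONAL
```

In Python, `%` with a positive modulus never returns a negative number, so `(0 - 1) % 26` is 25 and ‘A’ shifted back by one becomes ‘Z’.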
Clue unlocked: Shift 3 reveals the message!
“THIS IS THE CODE: DIAGONAL” — a hint about the vault!
On Thursday, we’ll learn how to use a mapping dictionary to decode messages more elegantly.
With 50,000 items:
Dictionary lookups take about the same time whether you have 100 items or 100,000!
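You can check the claim on your own machine with a rough timing sketch like the one below. It searches for the last of 50,000 values in a list and in a dictionary. The exact times depend on your computer, so none are quoted here.

```python
import timeit

n = 50_000
as_list = list(range(n))
as_dict = {value: True for value in as_list}
target = n - 1   # worst case for the list: the last element

# Repeat each membership test 100 times and report the total time.
list_time = timeit.timeit(lambda: target in as_list, number=100)
dict_time = timeit.timeit(lambda: target in as_dict, number=100)

print(f"list: {list_time:.5f} s, dict: {dict_time:.5f} s")
```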
Use case: Find two numbers that add up to a target.
As we scan, we remember what we’ve seen — and check if the complement exists!
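The scan-and-remember idea can be sketched as follows. The helper name `find_pair` and the sample numbers are hypothetical.

```python
def find_pair(numbers, target):
    """Return two numbers from the list that sum to target, or None."""
    seen = {}
    for number in numbers:
        complement = target - number
        if complement in seen:        # did we already pass its partner?
            return complement, number
        seen[number] = True           # remember this number for later
    return None


print(find_pair([3, 9, 14, 6, 11], 20))   # (14, 6)
```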
The vault has a two-dial combination lock.
Each dial shows a number, and they must add up to exactly 42.
The thief left behind two torn pieces of paper with numbers:
Find the two numbers (one from each list) that open the vault!
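With two separate lists, one approach is to store the first page in a dictionary and scan the second page for complements. The numbers below are hypothetical, not the ones on the torn pages.

```python
# Hypothetical numbers: the torn pages themselves are not reproduced here.
dial_one = [5, 12, 19, 30]
dial_two = [8, 23, 31, 40]
TARGET = 42

# Remember every number on the first page.
on_first_page = {number: True for number in dial_one}

combination = None
for number in dial_two:
    complement = TARGET - number
    if complement in on_first_page:   # the partner is on the other page
        combination = (complement, number)
        break

print(combination)   # (19, 23)
```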
The vault opens, revealing more evidence…
And a note: “The alibi is the key.”
Use case: Organize items into categories (like pandas groupby!).
Without it, you get a KeyError:
defaultdict(list) automatically creates an empty list for missing keys!
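A short sketch of both behaviours, using hypothetical (name, room) pairs:

```python
from collections import defaultdict

# Hypothetical (name, room) pairs for illustration.
sightings = [("Ada", "Gallery A"), ("Linus", "Vault"), ("Grace", "Gallery A")]

# Plain dict: appending to a key that does not exist yet fails.
plain = {}
try:
    plain["Gallery A"].append("Ada")
except KeyError as error:
    print("KeyError:", error)

# defaultdict(list): a missing key starts out as an empty list.
by_room = defaultdict(list)
for name, room in sightings:
    by_room[room].append(name)

print(dict(by_room))   # {'Gallery A': ['Ada', 'Grace'], 'Vault': ['Linus']}
```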
What if our items are dictionaries, not tuples?
Same pattern — just access the key with suspect["alibi"] instead of unpacking a tuple!
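Here is how the grouping might look when each item is a dictionary. The suspects and alibis are invented for illustration and are not the testimony from the case.

```python
from collections import defaultdict

# Hypothetical suspects: names and alibis are invented for illustration.
suspects = [
    {"name": "Ada", "alibi": "cafe"},
    {"name": "Linus", "alibi": "library"},
    {"name": "Grace", "alibi": "cafe"},
]

by_alibi = defaultdict(list)
for suspect in suspects:
    by_alibi[suspect["alibi"]].append(suspect["name"])

# A group of size one means nobody else shares that alibi.
alone = [names[0] for names in by_alibi.values() if len(names) == 1]

print(dict(by_alibi))   # {'cafe': ['Ada', 'Grace'], 'library': ['Linus']}
print(alone)            # ['Linus']
```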
We have testimony from five suspects about where they were at 8pm:
Group suspects by their alibi to find who was together!
Clue unlocked: One suspect has an unverified alibi with no witnesses…
Open the Data Heist activity. You’ll solve 4 puzzles with LARGE datasets.
Inspired by Advent of Code puzzles!
| Pattern | Use Case | Technique |
|---|---|---|
| Record | Store named fields | {"name": ..., "age": ...} |
| Membership | Track what we’ve seen | seen[x] = True |
| Counting | Count occurrences | Counter(items) |
| Complement | Find pairs | Store & check complements |
| Grouping | Organize by category | defaultdict(list) |
Dictionaries give you fast lookup by key — whether you have 100 items or 100,000.
The patterns we learned:
Membership (defaultdict(bool))
Counting (Counter)
Grouping (defaultdict(list))