Programming, problem solving, and algorithms

CPSC 203, 2025 W2

February 5, 2026

Announcements

  • Lab 5 is this week

Today

Asking harder questions:

  • Working with multiple weeks of data
  • Visualizing song trajectories
  • Using .isin() to filter by a list

Today’s challenge…

Given a year of charts,

  • What do the paths of the number ones look like?

  • Do most songs debut at the top? How fast do they fall?

We can quantify these things, and analyze them, but a picture will help us see if there’s anything interesting in the data.

Sketch

Draw a very loose sketch of the paths through the charts of all the songs that hit #1.

  • What are some reasonable axes?

  • Imagine what the path of one song looks like.

  • How many paths are there for all songs?

Key steps:

  1. Assemble the data we want to analyze.
  2. Find the list of Number 1 songs.
  3. For each of those songs, make a list of their ranks over time.
  4. Plot each song’s trajectory!

Step 1: One Week vs. One Year

Tuesday: single snapshot (100 songs, 1 week)

Today: a full year of weekly charts

Step 1: Structure of Multi-Week Data

Each row is one song on one week’s chart:

We can track a song’s journey through the charts!
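A minimal sketch of this layout, assuming the columns are named date, title, and rank as in the later slides. The four rows are made-up toy data, not real chart entries.

```python
import pandas as pd

# Hypothetical toy data: two songs on two weekly charts.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2025-01-04", "2025-01-04", "2025-01-11", "2025-01-11"]
    ),
    "title": ["Song A", "Song B", "Song A", "Song B"],
    "rank": [1, 15, 3, 1],
})

# One song's journey: keep only its rows and put them in date order.
song_a = df[df["title"] == "Song A"].sort_values("date")
print(song_a)
```

Because every row carries its own date, following one song is just a filter on title.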

Step 2: Which Songs Hit #1?

First, filter to rows where rank == 1:
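A small sketch of that filter on made-up data. The name number_ones is an illustrative choice, not necessarily the one used in the activity.

```python
import pandas as pd

# Hypothetical toy data: two songs on two weekly charts.
df = pd.DataFrame({
    "date": ["2025-01-04", "2025-01-04", "2025-01-11", "2025-01-11"],
    "title": ["Song A", "Song B", "Song A", "Song B"],
    "rank": [1, 15, 3, 1],
})

# df["rank"] == 1 builds a True/False Series; indexing with it keeps the True rows.
# Bracket notation is needed here because df.rank is also a DataFrame method.
number_ones = df[df["rank"] == 1]
print(number_ones)
```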

Step 2: Get Unique Titles

Many songs stay at #1 for multiple weeks. We want the unique song titles:
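One way this might look, on made-up data where Song A holds #1 for two weeks. The variable names are illustrative.

```python
import pandas as pd

# Hypothetical toy data: Song A is #1 twice, Song B once.
df = pd.DataFrame({
    "date": ["2025-01-04", "2025-01-04", "2025-01-11",
             "2025-01-11", "2025-01-18", "2025-01-18"],
    "title": ["Song A", "Song B", "Song A", "Song B", "Song A", "Song B"],
    "rank": [1, 15, 1, 4, 3, 1],
})

number_ones = df[df["rank"] == 1]            # three rows, Song A appears twice
top_titles = number_ones["title"].unique()   # each title once, in order of appearance
print(top_titles)
```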

Step 3: The Problem

We have a list of song titles that hit #1.

Now we want all rows for those songs — their complete chart history.

How do we filter for existence in a list?

Step 3: .isin() — Filter by List Membership

.isin(list) returns True for rows where the value appears in the list.
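A short sketch on made-up data, where Song C never reaches #1 and so should be dropped. The names top_titles and top_history are illustrative.

```python
import pandas as pd

# Hypothetical toy data: Song C never reaches #1.
df = pd.DataFrame({
    "date": ["2025-01-04", "2025-01-04", "2025-01-04",
             "2025-01-11", "2025-01-11", "2025-01-11"],
    "title": ["Song A", "Song B", "Song C", "Song A", "Song B", "Song C"],
    "rank": [1, 15, 40, 3, 1, 38],
})

top_titles = df[df["rank"] == 1]["title"].unique()

# True for every row whose title is one of the #1 songs, on any week.
mask = df["title"].isin(top_titles)
top_history = df[mask]
print(top_history)
```

Note that the result keeps the weeks when those songs were not at #1, which is exactly the complete history we want.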

Step 4: Plot One Line per Song

We want to plot rank over time for each #1 song.

  • X-axis: _________
  • Y-axis: _________ (note: _________)
  • One line per song

Step 4: Reshaping for Plotting

Our data looks like this (one row per song per week):

date        title   rank
2025-01-04  Song A  1
2025-01-04  Song B  15
2025-01-11  Song A  3
2025-01-11  Song B  1

But plotting wants one column per song:

date        Song A  Song B
2025-01-04  1       15
2025-01-11  3       1

Step 4A: Breaking Down the Chain

groupby organizes rows into groups — here, one group per (date, title) pair.

Step 4B: Extract the rank

Now we have a Series with a multi-level index (date, title).

.sum() might seem odd — but each group has exactly one row, so sum just extracts that value.
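Steps 4A and 4B can be sketched together on the small table from the reshaping slide. The variable names grouped and ranks are illustrative.

```python
import pandas as pd

# Toy data from the reshaping slide (dates kept as plain strings here).
df = pd.DataFrame({
    "date": ["2025-01-04", "2025-01-04", "2025-01-11", "2025-01-11"],
    "title": ["Song A", "Song B", "Song A", "Song B"],
    "rank": [1, 15, 3, 1],
})

# 4A: one group per (date, title) pair.
grouped = df.groupby(["date", "title"])

# 4B: pick the rank column and aggregate. Each group holds a single row,
# so the sum is just that row's rank.
ranks = grouped["rank"].sum()
print(ranks)
```

The result is a Series whose index has two levels, date on the outside and title on the inside.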

Step 4C: Pivot with unstack()

unstack() takes the inner index level (title) and makes it into columns.

Now each column is a song, each row is a date!
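A sketch of the full chain on the same toy table. The name wide is an illustrative choice.

```python
import pandas as pd

# Toy data from the reshaping slide.
df = pd.DataFrame({
    "date": ["2025-01-04", "2025-01-04", "2025-01-11", "2025-01-11"],
    "title": ["Song A", "Song B", "Song A", "Song B"],
    "rank": [1, 15, 3, 1],
})

# Group, extract rank, then move the inner index level (title) into columns.
wide = df.groupby(["date", "title"])["rank"].sum().unstack()
print(wide)
```

If a song is missing from the chart on some date, its cell in the wide table is NaN, which shows up as a gap in that song's line when plotted.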

Step 4D: The Plot
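One possible version of the plot, assuming matplotlib is installed and using the toy table again. The axis labels, the marker style, and the choice to flip the y-axis are illustrative decisions, not necessarily what the activity notebook does.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, with real datetime values for the x-axis.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2025-01-04", "2025-01-04", "2025-01-11", "2025-01-11"]
    ),
    "title": ["Song A", "Song B", "Song A", "Song B"],
    "rank": [1, 15, 3, 1],
})

wide = df.groupby(["date", "title"])["rank"].sum().unstack()

# DataFrame.plot draws one line per column, so one line per song.
ax = wide.plot(marker="o")
ax.invert_yaxis()  # put rank 1 at the top of the picture
ax.set_xlabel("Chart date")
ax.set_ylabel("Rank")
plt.savefig("trajectories.png")
```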

Let’s Write Code

Open the Billboard (Visualization) activity, and load STUDENT_viz_nb.py.

PrairieLearn Activity

Discussion

  • What patterns do you see in the #1 songs?
  • Do most songs debut at the top?
  • How quickly do they typically fall?
  • Any songs with unusual trajectories?

Part 2: Staying Power

The Question

Does a strong debut mean a song will stick around longer?

Or do slow climbers have more staying power?

Let’s investigate with a scatter plot!

What We Need

To answer this, we need two pieces of info for each song:

  1. Debut position — where did it first appear on the chart?
  2. Total weeks — how long did it stay on the chart?

Finding Debut Position

The isNew column is True when a song first appears:
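A minimal sketch, assuming isNew holds booleans. The rows are made-up data and the name debuts is illustrative.

```python
import pandas as pd

# Hypothetical toy data: both songs debut on the first chart.
df = pd.DataFrame({
    "date": ["2025-01-04", "2025-01-04", "2025-01-11", "2025-01-11"],
    "title": ["Song A", "Song B", "Song A", "Song B"],
    "rank": [1, 15, 3, 1],
    "isNew": [True, True, False, False],
})

# A boolean column can be used directly as a filter.
debuts = df[df["isNew"]]

# The rank on the debut row is the debut position.
print(debuts[["title", "rank"]])
```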

Finding Total Weeks

The weeks column shows how many weeks a song has been on the chart.

We want the maximum for each song:
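This could be sketched with a groupby on title. The data is made up and the name total_weeks is illustrative.

```python
import pandas as pd

# Hypothetical toy data: Song A charts for three weeks, Song B for two.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song A", "Song B", "Song A"],
    "weeks": [1, 1, 2, 2, 3],
})

# weeks counts up each week, so the largest value per song is its total.
total_weeks = df.groupby("title")["weeks"].max()
print(total_weeks)
```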

Combining the Data

Now we need to merge debut position with staying power:
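One way to combine the two pieces, using DataFrame.merge on the title column. The column names debut_rank and total_weeks and the variable staying are hypothetical names chosen for this sketch, and the rows are made-up data.

```python
import pandas as pd

# Hypothetical toy data with all the columns used so far.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song A", "Song B", "Song A"],
    "rank": [1, 15, 3, 1, 8],
    "isNew": [True, True, False, False, False],
    "weeks": [1, 1, 2, 2, 3],
})

# Debut position: the rank on the row where the song is new.
debut = df[df["isNew"]][["title", "rank"]].rename(columns={"rank": "debut_rank"})

# Staying power: the largest weeks value per song, as a regular column.
total = (
    df.groupby("title")["weeks"].max()
      .reset_index()
      .rename(columns={"weeks": "total_weeks"})
)

# One row per song, with both measurements side by side.
staying = debut.merge(total, on="title")
print(staying)
```

The default merge is an inner join, so a song with no isNew row in the data (for example, one that debuted before the first chart we loaded) would be left out of the result.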

The Scatter Plot
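A possible scatter plot, assuming matplotlib is installed and a merged table with one row per song. The column names debut_rank and total_weeks are hypothetical, and the three rows are invented purely to make the sketch runnable.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical merged table: one row per song (values are made up).
staying = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C"],
    "debut_rank": [1, 15, 60],
    "total_weeks": [4, 20, 9],
})

# One dot per song: debut position across, total weeks up.
ax = staying.plot.scatter(x="debut_rank", y="total_weeks")
ax.set_xlabel("Debut position")
ax.set_ylabel("Total weeks on chart")
plt.savefig("staying_power.png")
```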

What Do You See?

  • Is there a correlation between debut position and staying power?
  • Any outliers — songs that debuted low but stayed forever?
  • Or songs that debuted at #1 but disappeared quickly?

Summary

Concept               Code
Filter by condition   df[df['rank'] == 1]
Get unique values     df['title'].unique()
Filter by list        df[df['title'].isin(list)]
Reshape for plotting  .groupby([...])['rank'].sum().unstack()

Resources

https://pymotw.com/2/datetime/

https://www.dataschool.io/best-python-pandas-resources/

https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf

https://queirozf.com/entries/pandas-dataframe-plot-examples-with-matplotlib-pyplot