Programming, problem solving, and algorithms

CPSC 203, 2025 W2

February 3, 2026

Announcements

Today

  • Warm Up: datetime
  • From dictionaries to DataFrames
  • Filtering, aggregating, and more — in one line!

Warm Up

datetime: a Python library that simplifies date computation.

Objects we need

  • date — a calendar day (year-month-day)
  • timedelta — a number of days you can add/subtract

Core calls

  • today = date.today(): today’s date
  • today.weekday(): an integer 0..6 with Mon=0, … , Sun=6
  • today - timedelta(days=k): the date k days earlier

Ref: https://docs.python.org/3/library/datetime.html
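A minimal sketch of the core calls above. It assumes a fixed date (this lecture's date) instead of date.today() so the result is reproducible; the variable names are illustrative only.

```python
from datetime import date, timedelta

# A fixed date keeps the example reproducible; date.today() behaves the same way.
day = date(2026, 2, 3)

weekday_number = day.weekday()                # Mon=0, ..., Sun=6
three_days_earlier = day - timedelta(days=3)  # subtracting a timedelta gives a date

print(weekday_number)      # 1 (a Tuesday)
print(three_days_earlier)  # 2026-01-31
```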

Last Saturday

Let \(w =\) today.weekday() \(\in \{0,\dots,6\}\) and note Saturday \(= 5\).

offset: How many days to step back to reach Saturday.

  • If today is Saturday (\(w=5\)): offset \(=0\)
  • If today is Sunday (\(w=6\)): offset \(=1\)
  • If today is Monday (\(w=0\)): offset \(=2\)

\[ \texttt{offset} = (w - 5)\bmod 7 \]

\[ \texttt{lastSaturday} = \texttt{today} - \texttt{timedelta(days=offset)}. \]

Weekday numbering: Mon = 0, Tue = 1, Wed = 2, Thu = 3, Fri = 4, Sat = 5, Sun = 6

Code
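One way to turn the offset formula into code. The function name last_saturday and the constant SATURDAY are illustrative choices, not names from the course notebook. Python's % operator returns a non-negative result for a positive modulus, so (w - 5) % 7 can be used directly.

```python
from datetime import date, timedelta

SATURDAY = 5  # weekday() numbering: Mon=0, ..., Sun=6


def last_saturday(today):
    """Return the most recent Saturday on or before `today`."""
    offset = (today.weekday() - SATURDAY) % 7  # days to step back, 0..6
    return today - timedelta(days=offset)


print(last_saturday(date.today()))
```

For example, last_saturday(date(2026, 2, 3)) steps back 3 days from a Tuesday and returns date(2026, 1, 31).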

Data Frames

Last week: Lists of Dictionaries

We explored Billboard data using loops:

This works, but it’s verbose. Every question needed a loop!
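For illustration, loops of that kind might look like the sketch below. The three records are made-up sample data in the list-of-dictionaries format, not real chart entries.

```python
# Hypothetical sample records; titles and values are made up.
chartlist = [
    {"title": "Song A", "weeks": 12, "isNew": False},
    {"title": "Song B", "weeks": 1, "isNew": True},
    {"title": "Song C", "weeks": 30, "isNew": False},
]

# Question 1: how many songs are new this week?
new_count = 0
for song in chartlist:
    if song["isNew"]:
        new_count += 1

# Question 2: which songs have been on the chart more than 10 weeks?
long_running = []
for song in chartlist:
    if song["weeks"] > 10:
        long_running.append(song["title"])

print(new_count, long_running)
```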

A Better Way: DataFrames

A DataFrame is a 2D table: rows = records, columns = attributes.

DataFrames have built-in operations for filtering, aggregating, and more.

pandas and data frames

import pandas as pd

Create a DataFrame from a list of dictionaries:

df = pd.DataFrame(chartlist)

Or load from a CSV file:

df = pd.read_csv('billboard.csv')
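A small sketch of the list-of-dictionaries route, assuming a two-record chartlist with made-up values. Each dictionary becomes a row and each key becomes a column.

```python
import pandas as pd

# Hypothetical sample records; the real chartlist comes from the Billboard data.
chartlist = [
    {"title": "Song A", "rank": 1, "weeks": 12},
    {"title": "Song B", "rank": 2, "weeks": 1},
]

df = pd.DataFrame(chartlist)

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names come from the dictionary keys
```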

The Payoff: One-Liners!

Task                  With loops   With DataFrames
Count new songs       5+ lines     df[df['isNew']].shape[0]
Average peak          4 lines      df['peakPos'].mean()
Filter long-running   4 lines      df[df['weeks'] > 10]
Find biggest mover    8+ lines     df.loc[df['gradient'].abs().idxmax()]
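The four one-liners can be tried on a tiny made-up frame. The data is hypothetical, and the gradient column is assumed here to hold a signed position change, since the slides do not define it.

```python
import pandas as pd

# Hypothetical data; 'gradient' is assumed to be a signed change in position.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C"],
    "isNew": [False, True, False],
    "peakPos": [1, 40, 7],
    "weeks": [12, 1, 30],
    "gradient": [2, 0, -9],
})

new_count = df[df["isNew"]].shape[0]                    # count new songs
average_peak = df["peakPos"].mean()                     # average peak position
long_running = df[df["weeks"] > 10]                     # rows with > 10 weeks
biggest_mover = df.loc[df["gradient"].abs().idxmax()]   # largest |gradient|

print(new_count, average_peak, biggest_mover["title"])
```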

.csv files

If our data is in a .csv file:

df = pd.read_csv('billboard.csv')

.csv files have attribute names in row 1, and data beginning in row 2.

title,artist,rank,last_week,...
Die With A Smile,Lady Gaga,1,1,...
APT.,Rose & Bruno Mars,2,2,...
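To see how the header row becomes column names without needing a file on disk, the sketch below reads the same kind of text from a string. It keeps only the four columns named above (the elided columns are left out), and io.StringIO stands in for the real billboard.csv.

```python
import io

import pandas as pd

# Only the four named columns are used; the remaining columns are omitted here.
csv_text = """title,artist,rank,last_week
Die With A Smile,Lady Gaga,1,1
APT.,Rose & Bruno Mars,2,2
"""

# read_csv accepts a file path or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(list(df.columns))  # row 1 of the CSV
print(df.shape)          # two data rows, four columns
```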

Selecting rows

See the pandas cheat sheet section on selecting rows in DataFrames (linked under Resources).

  • df.nlargest(n, col) — top n by column
  • df[df[col] > val] — filter by condition
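A quick sketch of both selections on made-up data. nlargest returns rows sorted from largest to smallest, while a boolean filter keeps the original row order.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C", "Song D"],
    "weeks": [12, 1, 30, 8],
})

top_two = df.nlargest(2, "weeks")  # the 2 rows with the most weeks
over_ten = df[df["weeks"] > 10]    # rows where the condition is True

print(top_two)
print(over_ten)
```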

Adding a column

  • df['new_col'] = expression

    Adds a column to the DataFrame containing the computed value for every row.

  • df[df['new_col'] > val]

    Filter rows by the new column.
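A sketch of both steps, assuming made-up rank and lastPos values. The column name change is an illustrative choice.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C"],
    "rank": [3, 10, 25],
    "lastPos": [5, 8, 30],
})

# The subtraction is applied to every row at once.
df["change"] = df["lastPos"] - df["rank"]

# Keep only songs that moved up (positive change).
risers = df[df["change"] > 0]

print(df)
print(risers)
```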

Guided Demos

Let’s work through some examples together before you try similar problems.

Demo 1: Counting with a condition

Demo: How many Taylor Swift songs?

Goal: Count songs by a specific artist.

Strategy:

  1. Filter rows where artist contains “Taylor”
  2. Count the resulting rows
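One possible version of this strategy uses the string method .str.contains, shown here on placeholder rows. Note that it matches any artist whose name contains "Taylor", not only Taylor Swift.

```python
import pandas as pd

# Placeholder rows; titles are made up.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C"],
    "artist": ["Taylor Swift", "Lady Gaga", "Taylor Swift & Another Artist"],
})

# Step 1: boolean Series, True where the artist string contains "Taylor".
is_taylor = df["artist"].str.contains("Taylor")

# Step 2: count the rows that pass the filter.
taylor_count = df[is_taylor].shape[0]

print(taylor_count)
```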

Demo: How many songs at #1 peak?

Goal: Count songs that peaked at #1.

Strategy:

  1. Filter where peakPos == 1
  2. Count with len() or .shape[0]
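A sketch of the two steps on made-up data, showing that len() and .shape[0] give the same count.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C", "Song D"],
    "peakPos": [1, 4, 1, 17],
})

number_ones = df[df["peakPos"] == 1]  # filter

count_with_len = len(number_ones)         # count, option 1
count_with_shape = number_ones.shape[0]   # count, option 2

print(count_with_len, count_with_shape)
```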

Demo 2: Aggregation (mean, max, min)

Demo: Average weeks on chart

Goal: Find the average weeks on chart.

Strategy:

  • Select a column: df['weeks']
  • Apply aggregation: .mean()
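The same pattern works for the other aggregations in this demo. A sketch on made-up data:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C", "Song D"],
    "weeks": [12, 1, 30, 8],
})

weeks = df["weeks"]  # selecting one column gives a Series

average_weeks = weeks.mean()
most_weeks = weeks.max()
fewest_weeks = weeks.min()

print(average_weeks, most_weeks, fewest_weeks)  # 12.75 30 1
```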

Demo: Average weeks for top 10 only

Goal: Average weeks, but only for top 10 songs.

Strategy:

  1. Filter first: df[df['rank'] <= 10]
  2. Then aggregate the filtered rows: df[df['rank'] <= 10]['weeks'].mean()
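A sketch of filter-then-aggregate on made-up data, with the unfiltered mean computed for comparison.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C", "Song D"],
    "rank": [1, 5, 12, 40],
    "weeks": [20, 10, 3, 1],
})

top_ten = df[df["rank"] <= 10]             # step 1: filter
top_ten_average = top_ten["weeks"].mean()  # step 2: aggregate

overall_average = df["weeks"].mean()       # for comparison

print(top_ten_average, overall_average)    # 15.0 8.5
```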

Demo 3: Finding extremes with idxmax

Demo: Longest-running song

Goal: Find the song with the most weeks on chart.

Strategy:

  • idxmax() returns the row index of the maximum
  • df.loc[idx] retrieves that row
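A sketch of the two calls together, on made-up data. With the default index, idxmax() returns the integer label of the row holding the maximum.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C"],
    "weeks": [12, 1, 30],
})

idx = df["weeks"].idxmax()  # index label of the row with the most weeks
longest = df.loc[idx]       # the full row, as a Series

print(idx, longest["title"])
```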

Demo: Biggest riser this week

Goal: Which song moved up the most?

Problem: New songs have lastPos = 0 (they weren’t on the chart last week).

Strategy:

  1. Filter out new songs first
  2. Compute change: lastPos - rank
  3. Find max with idxmax()
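One way to follow these three steps, on made-up data. This sketch filters with lastPos > 0 and keeps the change values in a separate Series rather than adding a column; both are choices made for this example.

```python
import pandas as pd

# Hypothetical data; Song C is new this week, so lastPos is 0.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C", "Song D"],
    "rank": [3, 10, 23, 15],
    "lastPos": [5, 8, 0, 40],
})

# Step 1: keep only songs that were on the chart last week.
returning = df[df["lastPos"] > 0]

# Step 2: positive change means the song moved up.
change = returning["lastPos"] - returning["rank"]

# Step 3: the index labels are preserved, so idxmax() works with .loc.
biggest_riser = returning.loc[change.idxmax()]

print(biggest_riser["title"], change.max())
```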

Handling special values

The lastPos = 0 problem

New songs weren’t on the chart last week, so lastPos = 0.

If we compute change = lastPos - rank:

  • New song at #23: 0 - 23 = -23
  • Looks like it fell 23 spots!

Solution: Filter new songs out first, either with lastPos > 0 or with the isNew column.
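The sketch below shows the problem and the fix on made-up data: without filtering, the new song looks like the biggest faller, and filtering with the isNew column removes it from consideration.

```python
import pandas as pd

# Hypothetical data; Song B is new, so lastPos is 0.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C"],
    "rank": [3, 23, 15],
    "lastPos": [5, 0, 12],
    "isNew": [False, True, False],
})

# Naive: the new song gets 0 - 23 = -23 and looks like the biggest faller.
naive_change = df["lastPos"] - df["rank"]
naive_faller = df.loc[naive_change.idxmin()]

# Fixed: ~ negates the boolean column, keeping only songs that are not new.
returning = df[~df["isNew"]]
change = returning["lastPos"] - returning["rank"]
true_faller = returning.loc[change.idxmin()]

print(naive_faller["title"], true_faller["title"])
```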

Let’s Write Code

Open CA5.1 Billboard, and load STUDENT_nb.py.

PrairieLearn Activity

Resources

https://pymotw.com/2/datetime/

https://www.dataschool.io/best-python-pandas-resources/

https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf

https://queirozf.com/entries/pandas-dataframe-plot-examples-with-matplotlib-pyplot