CPSC 203, 2025 W2
February 3, 2026
datetime: a Python library that simplifies date computation.
Objects we need
date: a calendar day (year-month-day)
timedelta: a number of days you can add/subtract
Core calls
today = date.today(): today’s date
today.weekday(): an integer 0..6 with Mon=0, …, Sun=6
today - timedelta(days=k): the date k days earlier
Let \(w =\) today.weekday() \(\in \{0,\dots,6\}\) and note Saturday \(= 5\).
offset: How many days to step back to reach Saturday.
\[ \texttt{offset} = (w - 5)\bmod 7 \]
\[ \texttt{lastSaturday} = \texttt{today} - \texttt{timedelta(days=offset)}. \]
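The two formulas above can be checked with a short sketch. The function name `last_saturday` is a hypothetical helper introduced here for illustration; a fixed date is used instead of `date.today()` so the result is reproducible.

```python
from datetime import date, timedelta

def last_saturday(today: date) -> date:
    """Return the most recent Saturday on or before `today`."""
    w = today.weekday()        # Mon=0, ..., Sun=6
    offset = (w - 5) % 7       # days to step back to reach Saturday (5)
    return today - timedelta(days=offset)

# February 3, 2026 is a Tuesday, so w = 1 and offset = (1 - 5) % 7 = 3.
print(last_saturday(date(2026, 2, 3)))   # 2026-01-31
```

Python's `%` always returns a value in 0..6 here, even when `w - 5` is negative. If today is itself a Saturday, the offset is 0 and the function returns today.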
We explored Billboard data using loops:
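As a reminder, a loop-based answer might look like the sketch below. The three records are made up for illustration and the `title` key is an assumption; `isNew` and `peakPos` are the attribute names used elsewhere in these notes.

```python
# Made-up chart entries; only the key names isNew and peakPos come from the notes.
chart = [
    {"title": "Song A", "isNew": True, "peakPos": 12},
    {"title": "Song B", "isNew": False, "peakPos": 1},
    {"title": "Song C", "isNew": True, "peakPos": 40},
]

# Question 1: how many songs are new this week?
new_count = 0
for song in chart:
    if song["isNew"]:
        new_count += 1

# Question 2: what is the average peak position?
total = 0
for song in chart:
    total += song["peakPos"]
avg_peak = total / len(chart)

print(new_count, avg_peak)
```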
This works, but it’s verbose. Every question needed a loop!
A DataFrame is a 2D table: rows = records, columns = attributes.
DataFrames have built-in operations for filtering, aggregating, and more.
import pandas as pd
Create a DataFrame from a list of dictionaries:
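A minimal sketch, using made-up records (the `title` key and all values are assumptions): each dictionary becomes one row, and each key becomes one column.

```python
import pandas as pd

# Made-up records for illustration.
records = [
    {"title": "Song A", "rank": 1, "weeks": 12},
    {"title": "Song B", "rank": 2, "weeks": 3},
]

df = pd.DataFrame(records)
print(df.shape)           # (number of rows, number of columns)
print(list(df.columns))   # column names come from the dictionary keys
```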
Or load from a CSV file:
| Task | With loops | With DataFrames |
|---|---|---|
| Count new songs | 5+ lines | df[df['isNew']].shape[0] |
| Average peak | 4 lines | df['peakPos'].mean() |
| Filter long-running | 4 lines | df[df['weeks'] > 10] |
| Find biggest mover | 8+ lines | df.loc[df['gradient'].abs().idxmax()] |
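The DataFrame expressions in the table can be tried on a small made-up table. The data below is invented, the `title` column is an assumption, and the `gradient` values are arbitrary numbers chosen only to exercise the expression.

```python
import pandas as pd

# Made-up data; column names other than title are the ones used in the table.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C", "Song D"],
    "isNew": [True, False, False, True],
    "peakPos": [5, 1, 20, 40],
    "weeks": [1, 15, 11, 1],
    "gradient": [0, 2, -7, 0],
})

new_count = df[df["isNew"]].shape[0]                    # count new songs
avg_peak = df["peakPos"].mean()                         # average peak
long_running = df[df["weeks"] > 10]                     # filter long-running
biggest_mover = df.loc[df["gradient"].abs().idxmax()]   # largest |gradient|

print(new_count, avg_peak, len(long_running), biggest_mover["title"])
```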
.csv files
If our data is in a .csv file:
.csv files have attribute names in row 1, and data beginning in row 2.
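A minimal sketch of loading such a file with `pd.read_csv`. To keep the example runnable without a file on disk, an in-memory string stands in for the file; the contents and the filename mentioned in the comment are hypothetical.

```python
import io
import pandas as pd

# Stand-in for a file on disk. With a real file, pass its path instead,
# e.g. pd.read_csv("billboard.csv") (hypothetical filename).
csv_text = """title,rank,weeks
Song A,1,12
Song B,2,3
"""

df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns))   # row 1 of the file supplies the column names
print(len(df))            # the remaining rows are data
```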

df.nlargest(n, col): top n by column
df[df[col] > val]: filter by condition
df['new_col'] = expression
Adds a column to the DataFrame containing the computed value for every row.
df[df['new_col'] > val]
Filter rows by the new column.
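A short sketch of these operations on made-up data. The column name `belowPeak` and the `title` column are hypothetical, chosen only for this example.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C"],
    "rank": [3, 10, 25],
    "peakPos": [1, 10, 20],
})

# New column, computed for every row: how far each song sits below its peak.
df["belowPeak"] = df["rank"] - df["peakPos"]

# Filter rows by the new column.
slipped = df[df["belowPeak"] > 0]

# Top 2 rows by the new column, largest first.
top2 = df.nlargest(2, "belowPeak")

print(slipped["title"].tolist(), top2["title"].tolist())
```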
Let’s work through some examples together before you try similar problems.
Goal: Count songs by a specific artist.
Strategy:
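A minimal sketch of one way to do this: keep only the rows whose artist matches, then count them. The `artist` and `title` column names and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Made-up data; the artist column name is an assumption.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C", "Song D"],
    "artist": ["Artist X", "Artist Y", "Artist X", "Artist Z"],
})

by_artist = df[df["artist"] == "Artist X"]   # keep matching rows
count = len(by_artist)                       # same as by_artist.shape[0]
print(count)
```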
Goal: Count songs that peaked at #1.
Strategy:
peakPos == 1
len() or .shape[0]
Goal: Find the average weeks on chart.
Strategy:
df['weeks'].mean()
Goal: Average weeks, but only for top 10 songs.
Strategy:
df[df['rank'] <= 10]['weeks'].mean()
Goal: Find the song with the most weeks on chart.
Strategy:
idxmax() returns the row index of the maximum
df.loc[idx] retrieves that row
Goal: Which song moved up the most?
Problem: New songs have lastPos = 0 (they weren’t on the chart last week).
Strategy:
lastPos - rank
idxmax()
The lastPos = 0 problem
New songs weren’t on the chart last week, so lastPos = 0.
If we compute change = lastPos - rank:
0 - 23 = -23 (a new song at rank 23 looks as if it fell 23 places)
Solution: Filter them out first, or use isNew.
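The worked examples above can be sketched together on one small table. All data is made up and the `title` column is an assumption; the other column names are the ones used in these notes.

```python
import pandas as pd

# Made-up chart; Song C is new this week, so its lastPos is 0.
df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C", "Song D"],
    "rank": [1, 8, 23, 12],
    "lastPos": [2, 15, 0, 10],
    "peakPos": [1, 8, 23, 5],
    "weeks": [20, 6, 1, 30],
    "isNew": [False, False, True, False],
})

# Count songs that peaked at #1.
num_ones = len(df[df["peakPos"] == 1])

# Average weeks on chart, over all songs and over the top 10 only.
avg_weeks = df["weeks"].mean()
avg_weeks_top10 = df[df["rank"] <= 10]["weeks"].mean()

# Song with the most weeks on chart.
longest = df.loc[df["weeks"].idxmax()]

# Position change: positive means the song moved up.
df["change"] = df["lastPos"] - df["rank"]   # Song C gets 0 - 23 = -23

# Drop new songs before asking who moved the most.
returning = df[~df["isNew"]]
biggest_climber = returning.loc[returning["change"].idxmax()]

print(num_ones, avg_weeks, avg_weeks_top10)
print(longest["title"], biggest_climber["title"])
```

Without the filter, the smallest `change` value belongs to the new song, so a question such as "which song dropped the most?" would wrongly pick it.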
Open CA5.1 Billboard, and load STUDENT_nb.py.
https://pymotw.com/2/datetime/
https://www.dataschool.io/best-python-pandas-resources/
https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf
https://queirozf.com/entries/pandas-dataframe-plot-examples-with-matplotlib-pyplot