Discrete Math for Data Science

DSCI 220, 2025 W1

September 29, 2025

Announcements

Examlet this week.

Truth and reconciliation tomorrow – Tue lab will not meet.

Reminder: Joe’s office hour is Fri 11-12, in this room!

We are shooting video this afternoon at 4 pm via Zoom. You are all welcome!

Summations

Trick 0: Warm Up Handshakes

Story. If each of the \(n+1\) of us were to shake hands with everyone else, how many handshakes would there be?

Problem.

\(S(n) = \sum\limits_{k=1}^{n} k\)

 

 

 

We’ll need this, and we’ll also need:

\(\sum\limits_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}\)
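
A quick sanity check, sketched in Python. The handshake count among \(n+1\) people is the number of unordered pairs, which equals \(\sum_{k=1}^{n} k = \frac{n(n+1)}{2}\). The function names below are hypothetical, chosen only for this illustration; the code compares brute-force counts against both closed forms for small \(n\).

```python
from itertools import combinations


def handshakes_brute(n: int) -> int:
    """Count handshakes among n + 1 people by listing every unordered pair."""
    people = range(n + 1)
    return sum(1 for _ in combinations(people, 2))


def sum_first_n(n: int) -> int:
    """Closed form for 1 + 2 + ... + n."""
    return n * (n + 1) // 2


def sum_squares(n: int) -> int:
    """Closed form for 1^2 + 2^2 + ... + n^2."""
    return n * (n + 1) * (2 * n + 1) // 6


# Compare brute force with the closed forms for small n.
for n in range(0, 8):
    assert handshakes_brute(n) == sum(range(1, n + 1)) == sum_first_n(n)
    assert sum(k * k for k in range(1, n + 1)) == sum_squares(n)
```

A check over small \(n\) is not a proof, but it catches most algebra slips before they spread into later tricks.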

Trick 1: Linearity

 

Story. Processing \(n\) data rows:

  • each row has a constant overhead of 7 units,
  • plus 3 units per column, with column count = row index.

Problem.

\(S(n)=\sum\limits_{i=1}^{n} (7 + 3i)\)
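
One way to work this: linearity lets us split the sum into \(7\sum_{i=1}^{n} 1 + 3\sum_{i=1}^{n} i = 7n + \frac{3n(n+1)}{2}\). The Python sketch below (function names are hypothetical) checks that closed form against a direct loop.

```python
def row_cost_brute(n: int) -> int:
    """Add up 7 + 3*i for rows i = 1..n directly."""
    return sum(7 + 3 * i for i in range(1, n + 1))


def row_cost_closed(n: int) -> int:
    """Closed form from linearity: 7n + 3 * n(n+1)/2."""
    # n * (n + 1) is always even, so the integer division is exact.
    return 7 * n + 3 * n * (n + 1) // 2


for n in range(0, 20):
    assert row_cost_brute(n) == row_cost_closed(n)
```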

Trick 2: Telescoping


Story. Model accuracy at epoch \(i\):

  • improvement = \(\ln(i+1) - \ln(i)\).

Problem.
\(S(n)=\sum_{i=1}^{n} \big(\ln(i+1)-\ln(i)\big)\)
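
Writing out the terms shows the cancellation: \(\ln(i+1)\) from one term cancels \(-\ln(i+1)\) from the next, leaving \(\ln(n+1) - \ln(1) = \ln(n+1)\). A small Python sketch (hypothetical function names) compares the term-by-term sum with that result, using a floating-point tolerance.

```python
import math


def improvement_brute(n: int) -> float:
    """Sum ln(i+1) - ln(i) for i = 1..n term by term."""
    return sum(math.log(i + 1) - math.log(i) for i in range(1, n + 1))


def improvement_closed(n: int) -> float:
    """Telescoped result: only ln(n+1) - ln(1) survives."""
    return math.log(n + 1)


for n in range(0, 50):
    # Floating-point sums accumulate rounding error, so compare with a tolerance.
    assert math.isclose(improvement_brute(n), improvement_closed(n), abs_tol=1e-9)
```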

Trick 3: Shifting

 

Story. Each day your batch size doubles (like a viral share), but the annotation time per item grows linearly with the day index. What’s the total work after \(n\) days?

Problem.
\(S(n) =\sum_{k=1}^{n} k\,2^{k}\)
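
One possible route: multiply the sum by 2, shift the index so the powers of 2 line up, and subtract the original. The difference collapses to \(n\,2^{n+1} - \sum_{k=1}^{n} 2^{k} = (n-1)\,2^{n+1} + 2\). The Python sketch below (hypothetical function names) checks that closed form against a direct loop.

```python
def total_work_brute(n: int) -> int:
    """Add up k * 2**k for days k = 1..n directly."""
    return sum(k * 2**k for k in range(1, n + 1))


def total_work_closed(n: int) -> int:
    """Closed form from the shift-and-subtract argument: (n - 1) * 2**(n+1) + 2."""
    return (n - 1) * 2 ** (n + 1) + 2


for n in range(0, 20):
    assert total_work_brute(n) == total_work_closed(n)
```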

Trick 4: Variable Substitution

 

Story. Each item \(j\) has effort that depends both on its index and how many remain.

Problem.
\(S(n) = \sum_{j=1}^{n} \big(7 + 3(n-j+1)\big)\)
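
A natural substitution here is \(k = n-j+1\): as \(j\) runs from 1 to \(n\), \(k\) runs from \(n\) down to 1, so the sum becomes \(\sum_{k=1}^{n} (7+3k) = 7n + \frac{3n(n+1)}{2}\), the same sum as in Trick 1. The Python sketch below (hypothetical function names) confirms that the original and substituted forms agree.

```python
def effort_original(n: int) -> int:
    """Sum 7 + 3*(n - j + 1) over items j = 1..n, as stated."""
    return sum(7 + 3 * (n - j + 1) for j in range(1, n + 1))


def effort_substituted(n: int) -> int:
    """Same sum after substituting k = n - j + 1, so k also covers 1..n."""
    return sum(7 + 3 * k for k in range(1, n + 1))


def effort_closed(n: int) -> int:
    """Closed form: 7n + 3 * n(n+1)/2 (n * (n + 1) is even, so // is exact)."""
    return 7 * n + 3 * n * (n + 1) // 2


for n in range(0, 20):
    assert effort_original(n) == effort_substituted(n) == effort_closed(n)
```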

Trick 5: Reindexing

 

Story. You only log every 3rd data point (rows 3, 6, 9, …).
The cost for row \(i\) is \(2i+1\).

Question. What is the total cost up to row \(n\)?
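
One reading of the question, stated as an assumption: only the logged rows \(3, 6, \dots, 3m\) incur cost, where \(m = \lfloor n/3 \rfloor\). Reindexing with \(i = 3t\) gives \(\sum_{t=1}^{m} (6t+1) = 3m(m+1) + m = m(3m+4)\). The Python sketch below (hypothetical function names) checks this against a loop over every third row.

```python
def logged_cost_brute(n: int) -> int:
    """Add up 2*i + 1 over logged rows i = 3, 6, 9, ... with i <= n."""
    return sum(2 * i + 1 for i in range(3, n + 1, 3))


def logged_cost_closed(n: int) -> int:
    """Reindex with i = 3t, t = 1..m, where m = floor(n / 3)."""
    m = n // 3
    # sum of (6t + 1) for t = 1..m  =  3m(m+1) + m  =  m(3m + 4)
    return m * (3 * m + 4)


for n in range(0, 40):
    assert logged_cost_brute(n) == logged_cost_closed(n)
```

Under this reading the total only changes when \(n\) crosses a multiple of 3, so \(n = 9\), \(10\), and \(11\) all give the same cost.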

Trick 6: Double Sum

Story. Each item \(i\) must be compared with all later items \(j\ge i\). The pairwise cost is \(c_j = 4j+1\) (depends only on the later index).

Problem.
\(T = \displaystyle \sum_{i=1}^{n}\;\sum_{j=i}^{n}\,(4j+1)\)
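
One way to attack this is to swap the order of summation: a fixed \(j\) appears once for each \(i \le j\), that is \(j\) times, so \(T = \sum_{j=1}^{n} j(4j+1) = 4\sum_{j=1}^{n} j^2 + \sum_{j=1}^{n} j = \frac{n(n+1)(8n+7)}{6}\). The Python sketch below (hypothetical function names) compares the nested loop, the swapped single sum, and the closed form.

```python
def pair_cost_nested(n: int) -> int:
    """Evaluate the double sum exactly as written: i = 1..n, j = i..n."""
    return sum(4 * j + 1 for i in range(1, n + 1) for j in range(i, n + 1))


def pair_cost_swapped(n: int) -> int:
    """After swapping the order, each j is counted j times (once per i <= j)."""
    return sum(j * (4 * j + 1) for j in range(1, n + 1))


def pair_cost_closed(n: int) -> int:
    """Closed form n(n+1)(8n+7)/6, from the sum-of-squares and sum-of-integers formulas."""
    return n * (n + 1) * (8 * n + 7) // 6


for n in range(0, 20):
    assert pair_cost_nested(n) == pair_cost_swapped(n) == pair_cost_closed(n)
```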