DSCI 220, 2025 W1
October 19, 2025
Let \(U\) be all rows of a dataframe df.
A_mask = P_A(row).
Key Ideas
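Membership of a row in \(A\) corresponds to a mask value of True, the cardinality \(|A|\) to A_mask.sum(), and the subset test \(A\subseteq B\) to ((A & ~B).sum()==0).
A Boolean mask over rows is an indicator vector \(\in\{0,1\}^{|U|}\).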
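To make the indicator-vector view concrete, here is a minimal sketch. The three-row dataframe is hypothetical and invented only for illustration; the column name `is_iced` follows the predicates used in these notes.

```python
import pandas as pd

# Hypothetical three-row dataframe, invented for illustration only.
df = pd.DataFrame({
    "drink": ["Latte", "Cold Brew", "Iced Tea"],
    "is_iced": [False, True, True],
})

A_mask = df["is_iced"]            # Boolean mask: one True/False per row of U
indicator = A_mask.astype(int)    # the same mask written as a 0/1 indicator vector
size_A = int(A_mask.sum())        # |A| is the number of True entries
A_df = df[A_mask]                 # the rows of U that belong to A

print(indicator.tolist())         # [0, 1, 1]
print(size_A)                     # 2
print(A_df["drink"].tolist())     # ['Cold Brew', 'Iced Tea']
```

Summing a Boolean mask counts its True entries, which is why `.sum()` plays the role of cardinality.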
Applying the predicate to each row gives the mask, A_mask = P_A(row); then A_df = df[A_mask] keeps exactly the rows in \(A\).
We’ll use these café predicates:
\(A=\) “Iced drinks”
\(A=\{x\in U \mid x.\mathrm{is\_iced}=\text{True}\}\)
A = df['is_iced']
\(B=\) “Non-dairy milk (Oat or Almond)”
\(B=\{x\in U \mid x.\mathrm{milk}\in\{\text{'Oat','Almond'}\}\}\)
B = df['milk'].str.strip().isin(['Oat','Almond'])
\(C=\) “High caffeine (≥150 mg)”
\(C=\{x\in U \mid x.\mathrm{caffeine\_mg}\ge 150\}\)
C = df['caffeine_mg'] >= 150
\(D=\) “Low calorie (≤150 cal)”
\(D=\{x\in U \mid x.\mathrm{calories} \le 150\}\)
D = df['calories'] <= 150
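The four predicates can be tried on a small table. The sketch below assumes a hypothetical six-row café dataframe (the drinks and all their values are made up); only the column names and the mask expressions come from the definitions above. One milk entry has a trailing space on purpose, to show why `.str.strip()` is applied before `isin`.

```python
import pandas as pd

# Hypothetical café data, invented for illustration only.
df = pd.DataFrame({
    "drink": ["Latte", "Cold Brew", "Iced Mocha", "Espresso", "Iced Tea", "Chai"],
    "is_iced": [False, True, True, False, True, False],
    "milk": ["Whole", "None", "Oat ", "None", "None", "Almond"],  # note "Oat " with a space
    "caffeine_mg": [120, 200, 175, 65, 40, 95],
    "calories": [190, 5, 290, 3, 90, 160],
})

A = df["is_iced"]                                    # iced drinks
B = df["milk"].str.strip().isin(["Oat", "Almond"])   # non-dairy milk
C = df["caffeine_mg"] >= 150                         # high caffeine
D = df["calories"] <= 150                            # low calorie

# Without strip(), the entry "Oat " would not match "Oat".
B_unstripped = df["milk"].isin(["Oat", "Almond"])

for name, mask in [("A", A), ("B", B), ("C", C), ("D", D)]:
    print(name, int(mask.sum()), df.loc[mask, "drink"].tolist())
print("B without strip:", int(B_unstripped.sum()))
```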
Let \(A,B\subseteq U\).
| Operation | Math (set) | Predicate (logic) | Dataframe (mask) |
|---|---|---|---|
| Union | \(A\cup B\) | \(\{x: P_A(x)\lor P_B(x)\}\) | (A \| B) |
| Intersection | \(A\cap B\) | \(\{x: P_A(x)\land P_B(x)\}\) | (A & B) |
| Difference | \(A-B\) | \(\{x: P_A(x)\land \neg P_B(x)\}\) | (A & ~B) |
| Complement | \(A^c\) | \(\{x: \neg P_A(x)\}\) | ~A |
| Symm. diff. | \(A\triangle B\) | \(\{x:(P_A\lor P_B)\land \neg(P_A\land P_B)\}\) | A ^ B |
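Precedence note: combine masks with &, |, and ~ (not and, or, not), and parenthesize each piece: (A & B) | (~C). Because & and | bind more tightly than comparisons, a comparison written inline needs its own parentheses, e.g. (df['calories'] <= 150) & A.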
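The rows of the table can be checked one by one. This is a sketch on a hypothetical six-row café dataframe (all values invented); the mask expressions are the ones from the table.

```python
import pandas as pd

# Hypothetical café data, invented for illustration only.
df = pd.DataFrame({
    "drink": ["Latte", "Cold Brew", "Iced Mocha", "Espresso", "Iced Tea", "Chai"],
    "is_iced": [False, True, True, False, True, False],
    "milk": ["Whole", "None", "Oat ", "None", "None", "Almond"],
    "calories": [190, 5, 290, 3, 90, 160],
})
A = df["is_iced"]
B = df["milk"].str.strip().isin(["Oat", "Almond"])

ops = {
    "union": A | B,
    "intersection": A & B,
    "difference": A & ~B,
    "complement": ~A,
    "symm. diff.": A ^ B,
}
counts = {name: int(mask.sum()) for name, mask in ops.items()}
print(counts)

# The predicate form of the symmetric difference selects the same rows as A ^ B.
sym_matches = bool((ops["symm. diff."] == ((A | B) & ~(A & B))).all())
print(sym_matches)

# A comparison used inline gets its own parentheses before combining with &.
low_cal_iced = (df["calories"] <= 150) & A
print(df.loc[low_cal_iced, "drink"].tolist())
```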
Let
\(A=\{x\mid x.\mathrm{is\_iced}\}\) and
\(B=\{x\mid x.\mathrm{milk}\in\{\text{Oat, Almond}\}\}\).
(df['is_iced']) | (df['milk'].str.strip().isin(['Oat','Almond']))
Experiment using the code below, and then complete the observation:
For any sets \(A\) and \(B\): ___________ = ___________
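One possible way to run such an experiment is sketched below, assuming a hypothetical six-row café dataframe (values invented). It builds several mask expressions and reports which pairs select exactly the same rows. The list of candidate expressions is only a starting point, and agreement on one dataset suggests an identity without proving it for all sets.

```python
from itertools import combinations

import pandas as pd

# Hypothetical café data, invented for illustration only.
df = pd.DataFrame({
    "drink": ["Latte", "Cold Brew", "Iced Mocha", "Espresso", "Iced Tea", "Chai"],
    "is_iced": [False, True, True, False, True, False],
    "milk": ["Whole", "None", "Oat ", "None", "None", "Almond"],
})
A = df["is_iced"]
B = df["milk"].str.strip().isin(["Oat", "Almond"])

# Candidate expressions to compare; add your own.
candidates = {
    "A | B": A | B,
    "B | A": B | A,
    "A & B": A & B,
    "~(A | B)": ~(A | B),
    "~A & ~B": ~A & ~B,
    "~(A & B)": ~(A & B),
    "~A | ~B": ~A | ~B,
}

# Record every pair of expressions whose masks agree on all rows.
matches = []
for (name1, mask1), (name2, mask2) in combinations(candidates.items(), 2):
    if bool((mask1 == mask2).all()):
        matches.append((name1, name2))
        print(f"{name1}  selects the same rows as  {name2}")
```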
Experiment using the code below, and then complete the observation:
For any sets \(A\) and \(B\): \(|A\cup B| =\) _____________
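A sketch of an experiment for this observation, again on a hypothetical café dataframe with invented values: it prints the four cardinalities side by side so the relationship between them can be read off. Changing the data and rerunning is a quick way to test a guess.

```python
import pandas as pd

# Hypothetical café data, invented for illustration only.
df = pd.DataFrame({
    "drink": ["Latte", "Cold Brew", "Iced Mocha", "Espresso", "Iced Tea", "Chai"],
    "is_iced": [False, True, True, False, True, False],
    "milk": ["Whole", "None", "Oat ", "None", "None", "Almond"],
})
A = df["is_iced"]
B = df["milk"].str.strip().isin(["Oat", "Almond"])

n_A = int(A.sum())            # |A|
n_B = int(B.sum())            # |B|
n_both = int((A & B).sum())   # |A ∩ B|
n_union = int((A | B).sum())  # |A ∪ B|

print("|A| =", n_A)
print("|B| =", n_B)
print("|A ∩ B| =", n_both)
print("|A ∪ B| =", n_union)
```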
Two sets are equal if they contain exactly the same elements. Think of a way to test set equality using set operations.
Are sets \(X\) and \(Y\) equal?
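One possible test, among several, uses the symmetric difference: two masks describe the same set exactly when no row is in one but not the other. The helper name `sets_equal` and the sample masks below are hypothetical, chosen only for illustration.

```python
import pandas as pd

def sets_equal(X: pd.Series, Y: pd.Series) -> bool:
    """True when the symmetric difference of the two masks is empty."""
    return int((X ^ Y).sum()) == 0

# Hypothetical café data, invented for illustration only.
df = pd.DataFrame({
    "is_iced": [False, True, True, False, True, False],
    "milk": ["Whole", "None", "Oat ", "None", "None", "Almond"],
})
A = df["is_iced"]
B = df["milk"].str.strip().isin(["Oat", "Almond"])

print(sets_equal(A & ~B, A ^ (A & B)))  # two ways of writing A - B
print(sets_equal(A, B))
```

An equivalent check asks for both inclusions, that is, `(X & ~Y).sum() == 0` and `(Y & ~X).sum() == 0`.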
\(A\subseteq B \iff \forall x,\ (P_A(x)\Rightarrow P_B(x)) \iff A-B=\emptyset\), which as a mask test is ((A & ~B).sum()==0)
Are any of our sets \(A\), \(B\), \(C\), or \(D\) subsets of one another?
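The subset test can be applied to every ordered pair of masks. The sketch below assumes a hypothetical six-row café dataframe, so whichever inclusions it reports are a property of this made-up data, not of the real dataset; the helper name `is_subset` is also an assumption.

```python
from itertools import permutations

import pandas as pd

def is_subset(X: pd.Series, Y: pd.Series) -> bool:
    """True when no row is in X but outside Y, i.e. X - Y is empty."""
    return int((X & ~Y).sum()) == 0

# Hypothetical café data, invented for illustration only.
df = pd.DataFrame({
    "is_iced": [False, True, True, False, True, False],
    "milk": ["Whole", "None", "Oat ", "None", "None", "Almond"],
    "caffeine_mg": [120, 200, 175, 65, 40, 95],
    "calories": [190, 5, 290, 3, 90, 160],
})
masks = {
    "A": df["is_iced"],
    "B": df["milk"].str.strip().isin(["Oat", "Almond"]),
    "C": df["caffeine_mg"] >= 150,
    "D": df["calories"] <= 150,
}

# Check every ordered pair (X, Y) with X != Y.
found = [(x, y) for x, y in permutations(masks, 2) if is_subset(masks[x], masks[y])]
for x, y in found:
    print(f"{x} is a subset of {y}")
```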
Fill in the Venn Diagram using the values for \(|U|\), \(|A|\), \(|B|\), \(|A\cup B|\). Verify your counts are correct by finding \(|A-B|\), \(|A\cap B|\), \(|B-A|\), and \(|U-(A\cup B)|\), via code.
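A sketch of the verification step, assuming a hypothetical six-row café dataframe with invented values: it counts the four disjoint regions of the two-set Venn diagram and checks that they add up to \(|U|\).

```python
import pandas as pd

# Hypothetical café data, invented for illustration only.
df = pd.DataFrame({
    "drink": ["Latte", "Cold Brew", "Iced Mocha", "Espresso", "Iced Tea", "Chai"],
    "is_iced": [False, True, True, False, True, False],
    "milk": ["Whole", "None", "Oat ", "None", "None", "Almond"],
})
A = df["is_iced"]
B = df["milk"].str.strip().isin(["Oat", "Almond"])

regions = {
    "A - B": int((A & ~B).sum()),         # left crescent
    "A ∩ B": int((A & B).sum()),          # overlap
    "B - A": int((B & ~A).sum()),         # right crescent
    "U - (A ∪ B)": int((~(A | B)).sum()), # outside both circles
}
print(regions)

# The four regions partition U, so their counts must sum to |U|.
print(sum(regions.values()) == len(df))
```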
Give upper and lower bounds on \(|A\cap B|\) for any \(A,B\subseteq U\).
__________________ \(\leq |A\cap B|\leq\) __________________
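Before committing to a formula, the bounds can be explored by brute force. The sketch below uses plain Python sets rather than dataframe masks; the universe of 5 elements and the helper name `intersection_range` are assumptions made for illustration. For each pair of sizes it tries every possible \(A\) and \(B\) and records the smallest and largest intersection seen. This is evidence for a pattern, not a proof.

```python
from itertools import combinations

U = range(5)  # a small hypothetical universe with |U| = 5

def intersection_range(a: int, b: int) -> tuple[int, int]:
    """Smallest and largest |A ∩ B| over all A, B ⊆ U with |A| = a and |B| = b."""
    sizes = {
        len(set(X) & set(Y))
        for X in combinations(U, a)
        for Y in combinations(U, b)
    }
    return min(sizes), max(sizes)

print("|A| |B|  min  max")
for a in range(len(U) + 1):
    for b in range(a, len(U) + 1):
        low, high = intersection_range(a, b)
        print(f"{a:3d} {b:3d} {low:4d} {high:4d}")
```

Comparing each row's minimum and maximum with \(|A|\), \(|B|\), and \(|U|\) should suggest what belongs in the two blanks.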