Discrete Math for Data Science

DSCI 220, 2025 W1

November 4, 2025

Announcements

Regular Expressions and DFAs

Closure Tricks

What’s closure?

  • Complement: make the DFA total (add a SINK state if needed), then swap accepting and non-accepting states.
  • Union / Intersection:
    States are pairs (p,q): run both machines in parallel, so on input a the pair (p,q) moves to (δ1(p,a), δ2(q,a)).
    • Union accepts if p ∈ F1 or q ∈ F2
    • Intersection accepts if p ∈ F1 and q ∈ F2
  • Difference: \(L_1 \setminus L_2 = L_1 \cap L_2^c\)
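
The closure constructions above can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary layout, the helper names (`run`, `complement`, `product_dfa`) and the two example machines (`EVEN_ZEROS`, `ENDS_IN_ONE`) are hypothetical and not from the lecture, and both DFAs are assumed to be total over the same alphabet.

```python
from itertools import product

def run(dfa, s):
    """Return True if the (total) DFA accepts the string s."""
    state = dfa["start"]
    for ch in s:
        state = dfa["delta"][(state, ch)]
    return state in dfa["accept"]

def complement(dfa):
    """Swap accepting and non-accepting states; the DFA must be total."""
    return {**dfa, "accept": dfa["states"] - dfa["accept"]}

def product_dfa(d1, d2, mode):
    """Pair construction; mode is 'union' (or) or 'intersection' (and)."""
    states = set(product(d1["states"], d2["states"]))
    delta = {((p, q), a): (d1["delta"][(p, a)], d2["delta"][(q, a)])
             for (p, q) in states for a in d1["alphabet"]}
    if mode == "union":
        accept = {(p, q) for (p, q) in states
                  if p in d1["accept"] or q in d2["accept"]}
    else:
        accept = {(p, q) for (p, q) in states
                  if p in d1["accept"] and q in d2["accept"]}
    return {"states": states, "alphabet": d1["alphabet"], "delta": delta,
            "start": (d1["start"], d2["start"]), "accept": accept}

# Hypothetical example machines over the alphabet {0, 1}.
EVEN_ZEROS = {
    "states": {"E", "O"}, "alphabet": {"0", "1"}, "start": "E", "accept": {"E"},
    "delta": {("E", "0"): "O", ("E", "1"): "E", ("O", "0"): "E", ("O", "1"): "O"},
}
ENDS_IN_ONE = {
    "states": {"N", "Y"}, "alphabet": {"0", "1"}, "start": "N", "accept": {"Y"},
    "delta": {("N", "0"): "N", ("N", "1"): "Y", ("Y", "0"): "N", ("Y", "1"): "Y"},
}

either = product_dfa(EVEN_ZEROS, ENDS_IN_ONE, "union")
both = product_dfa(EVEN_ZEROS, ENDS_IN_ONE, "intersection")
# Difference = intersection with the complement of the second machine.
only_first = product_dfa(EVEN_ZEROS, complement(ENDS_IN_ONE), "intersection")
```

The same `product_dfa` call can be used to check a hand-drawn product machine: build both component DFAs, take the union, and compare its verdicts with your sketch on a few strings.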

Exercise

Build a DFA for the union of:

  • L1: contains 11 (use the machine we built)
  • L2: ends with 00 (three states: T, T0, T00)

Hint: sketch product states (S_i, T_j) and mark accept pairs for union (or).

Non-deterministic FA

NFAs

DFA: exactly one transition for each input symbol in each state.
NFA: may have multiple choices on a symbol and \(\varepsilon\)-moves (move without consuming input).

Acceptance: a string is accepted if there exists a path that reads all characters and lands in an accept state.
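
One way to make the "there exists a path" rule concrete is to track the set of all states the NFA could be in after each symbol. The sketch below assumes a hypothetical encoding (a transition dictionary whose values are sets, with the empty string `""` standing for an \(\varepsilon\)-move); the two example machines are made up for illustration.

```python
def eps_closure(states, delta):
    """All states reachable from `states` using only ε-moves (label '')."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def nfa_accepts(nfa, s):
    """Accept if at least one path reads all of s and ends in an accept state."""
    current = eps_closure({nfa["start"]}, nfa["delta"])
    for ch in s:
        moved = set()
        for q in current:
            moved |= nfa["delta"].get((q, ch), set())
        current = eps_closure(moved, nfa["delta"])
    return bool(current & nfa["accept"])

# Hypothetical NFA with a choice on 0: "ends with 01".
ENDS_01 = {
    "start": "q0", "accept": {"q2"},
    "delta": {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
              ("q1", "1"): {"q2"}},
}

# Hypothetical NFA with an ε-move: 0*1* (zeros, then ones).
ZEROS_THEN_ONES = {
    "start": "a", "accept": {"b"},
    "delta": {("a", "0"): {"a"}, ("a", ""): {"b"}, ("b", "1"): {"b"}},
}
```

If the set of current states ever becomes empty, every path has died and the string is rejected.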

Notes on NFAs

  • Direct from a regex: Translate patterns piece-by-piece without tricky state discovery.

  • Local “gadgets”: Build tiny NFAs for literal, union (\(|\)), concatenation, Kleene star (\(^*\)) and wire together with \(\varepsilon\).

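  • NFAs and DFAs have the same power: both recognize exactly the regular languages. (The subset construction turns any NFA into a DFA; Kleene’s Theorem adds that regular expressions describe the same class.)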

  • Build a DFA from NFA algorithmically if you need one.
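
The usual algorithm for the last bullet is the subset construction: each DFA state is the set of NFA states that could be active. A minimal sketch, assuming the same hypothetical NFA encoding as a dictionary of sets with `""` for \(\varepsilon\); the function names and the example NFA are illustrative only.

```python
from collections import deque

def eps_closure(states, delta):
    """Frozen set of states reachable from `states` via ε-moves (label '')."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)

def nfa_to_dfa(nfa, alphabet):
    """Subset construction; the empty set plays the role of the SINK state."""
    start = eps_closure({nfa["start"]}, nfa["delta"])
    dfa_delta, seen, queue = {}, {start}, deque([start])
    while queue:
        subset = queue.popleft()
        for a in alphabet:
            moved = set()
            for q in subset:
                moved |= nfa["delta"].get((q, a), set())
            target = eps_closure(moved, nfa["delta"])
            dfa_delta[(subset, a)] = target
            if target not in seen:          # only reachable subsets are built
                seen.add(target)
                queue.append(target)
    accept = {subset for subset in seen if subset & nfa["accept"]}
    return {"states": seen, "start": start, "accept": accept, "delta": dfa_delta}

def dfa_accepts(dfa, s):
    state = dfa["start"]
    for ch in s:
        state = dfa["delta"][(state, ch)]
    return state in dfa["accept"]

# Hypothetical NFA for "ends with 01".
ENDS_01 = {
    "start": "q0", "accept": {"q2"},
    "delta": {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
              ("q1", "1"): {"q2"}},
}
dfa = nfa_to_dfa(ENDS_01, ["0", "1"])
```

Only subsets reachable from the start are created, so the result is often much smaller than the worst case of \(2^n\) states.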

Regex to NFA Gadgets

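  1. Literal ‘a’: a single edge labeled a from the start state s to the accept state f.
  2. Union: R | S. A new start s has ε-edges into [ R ] and into [ S ]; the accept states of [ R ] and [ S ] each have an ε-edge into a new accept f (four ε-edges in total).
  3. Concatenation: RS. One ε-edge joins the accept state of [ R ] to the start of [ S ]; s is the start of [ R ] and f is the accept of [ S ].
  4. Kleene star: R*. A new start s and a new accept f, with four ε-edges: s to f (the empty string), s into [ R ], the accept state of [ R ] back to the start of [ R ] (repeat), and the accept state of [ R ] to f.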
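
The four gadgets can be written as small functions that return NFA fragments and wire them together with \(\varepsilon\)-edges. This is a sketch under assumptions: the fragment layout (a start, one accept state, and a list of edges with `""` for \(\varepsilon\)) and all function names are hypothetical, chosen for illustration.

```python
from itertools import count

_ids = count()                      # fresh state numbers

def literal(a):
    s, f = next(_ids), next(_ids)
    return {"start": s, "accept": f, "edges": [(s, a, f)]}

def union(r, t):
    s, f = next(_ids), next(_ids)
    edges = r["edges"] + t["edges"] + [
        (s, "", r["start"]), (s, "", t["start"]),
        (r["accept"], "", f), (t["accept"], "", f)]
    return {"start": s, "accept": f, "edges": edges}

def concat(r, t):
    edges = r["edges"] + t["edges"] + [(r["accept"], "", t["start"])]
    return {"start": r["start"], "accept": t["accept"], "edges": edges}

def star(r):
    s, f = next(_ids), next(_ids)
    edges = r["edges"] + [
        (s, "", f),                      # skip R entirely (empty string)
        (s, "", r["start"]),             # enter R
        (r["accept"], "", r["start"]),   # go around again
        (r["accept"], "", f)]            # leave R
    return {"start": s, "accept": f, "edges": edges}

def accepts(nfa, text):
    """Simulate the fragment by tracking all reachable states."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for (p, a, r) in nfa["edges"]:
                if p == q and a == "" and r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen
    current = closure({nfa["start"]})
    for ch in text:
        step = {r for (p, a, r) in nfa["edges"] if p in current and a == ch}
        current = closure(step)
    return nfa["accept"] in current

# (ab|ba)* assembled purely from the gadgets.
ab = concat(literal("a"), literal("b"))
ba = concat(literal("b"), literal("a"))
ab_or_ba_star = star(union(ab, ba))
```

Each gadget only touches the start and accept states of its parts, which is why the translation needs no clever state discovery.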

Example 1

^(ab|ba)*$

Example 2

^(0|1)*11(0|1)*$
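
Before drawing the NFAs, it can help to probe the two patterns with Python's standard `re` module. This is only a sanity check on what each language contains; the variable names are illustrative.

```python
import re

EX1 = re.compile(r"^(ab|ba)*$")          # Example 1: blocks of ab or ba
EX2 = re.compile(r"^(0|1)*11(0|1)*$")    # Example 2: contains 11

def matches(pattern, s):
    """True if the whole string s is in the pattern's language."""
    return pattern.match(s) is not None

for s in ["", "abba", "aba"]:
    print(repr(s), matches(EX1, s))
for s in ["0110", "0101", "11"]:
    print(repr(s), matches(EX2, s))
```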

Example 3

Task: Sketch an NFA for ^1(01|10)*0$ using only the four gadgets.

Plan: literal 1 … union(01,10) … star … literal 0.

Last Puzzle:

\(0^n1^n\)

Beyond Regular: Context Free Grammars

Why Regex/DFAs Can’t Do \(0^n1^n\)

Language: \(\{\,0^n1^n : n\ge 0\,\}\) (some number of 0s followed by the same number of 1s)

Finite-state limitation:

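Assume a DFA accepts \(\{\,0^n1^n : n\ge 0\,\}\), and let \(k\) be its number of states.

Among the \(k+1\) prefixes \(0^0, 0^1, \dots, 0^k\), two must land in the same state (pigeonhole); call them \(0^i\) and \(0^j\) with \(i < j\).

Now feed \(1^i\) after each. Both runs end in the same state, so the DFA gives the same verdict on \(0^i1^i\) (which should be accepted) and \(0^j1^i\) (which should be rejected). Contradiction, so no such DFA exists.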
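
The argument is constructive: given any concrete DFA, the colliding prefixes can be found by reading 0s until a state repeats. The sketch below uses a hypothetical six-state machine `ATTEMPT` that handles \(0^n1^n\) correctly only for \(n \le 2\); the encoding and names are assumptions made for illustration.

```python
def run(dfa, s):
    state = dfa["start"]
    for ch in s:
        state = dfa["delta"][(state, ch)]
    return state in dfa["accept"]

def colliding_prefixes(dfa):
    """Return (i, j), i < j, such that 0^i and 0^j reach the same state."""
    first_seen = {}                    # state -> number of 0s read on first visit
    state = dfa["start"]
    for n in range(len(dfa["states"]) + 1):    # k+1 prefixes, k states
        if state in first_seen:
            return first_seen[state], n
        first_seen[state] = n
        state = dfa["delta"][(state, "0")]

# Hypothetical attempt: counts up to two 0s, then gives up in "dead".
ATTEMPT = {
    "states": {"z0", "z1", "z2", "o1", "acc", "dead"},
    "start": "z0", "accept": {"z0", "acc"},
    "delta": {
        ("z0", "0"): "z1",   ("z0", "1"): "dead",
        ("z1", "0"): "z2",   ("z1", "1"): "acc",
        ("z2", "0"): "dead", ("z2", "1"): "o1",
        ("o1", "0"): "dead", ("o1", "1"): "acc",
        ("acc", "0"): "dead", ("acc", "1"): "dead",
        ("dead", "0"): "dead", ("dead", "1"): "dead",
    },
}

i, j = colliding_prefixes(ATTEMPT)
good = "0" * i + "1" * i       # in the language
bad = "0" * j + "1" * i        # not in the language
same_verdict = run(ATTEMPT, good) == run(ATTEMPT, bad)
```

Because `good` and `bad` always receive the same verdict, the machine must be wrong on at least one of them, whatever its states look like.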

CFG Review

A context-free grammar (CFG) has:

  • Terminals (alphabet),
  • Nonterminals (variables),
  • Start symbol,
  • Productions like \(S \to 0S1 \mid \varepsilon\).

Notes:

  • A derivation rewrites nonterminals to produce a string.
  • A parse tree is evidence of how a string was generated.
  • Every regular language is context-free, but not conversely: Regular Languages \(\subsetneq\) Context-Free Languages.

Grammar for \(0^n1^n\) ?

Grammar:
\(S \to 0S1 \ \mid\ \varepsilon\)

Derivations:

  • For 01: \[ S \Rightarrow 0S1 \Rightarrow 01 \]
  • For 000111: \[ S \Rightarrow 0S1 \Rightarrow 00S11 \Rightarrow 000S111 \Rightarrow 000111 \]
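
The derivations above follow one pattern: apply \(S \to 0S1\) exactly \(n\) times, then finish with \(S \to \varepsilon\). A minimal sketch with hypothetical helper names, including a recursive membership check that mirrors the two productions:

```python
def derive(n):
    """Sentential forms when S -> 0S1 is applied n times, then S -> ε."""
    steps = ["S"]
    for _ in range(n):
        steps.append(steps[-1].replace("S", "0S1"))
    steps.append(steps[-1].replace("S", ""))
    return steps

def in_language(s):
    """Check membership by undoing the productions from the outside in."""
    if s == "":
        return True                                   # S -> ε
    # S -> 0S1: strip a leading 0 and a trailing 1, then recurse.
    return s[0] == "0" and s[-1] == "1" and in_language(s[1:-1])

print(" => ".join(derive(3)))
```

The recursion in `in_language` has the same nesting as the parse tree: each call corresponds to one use of \(S \to 0S1\).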

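Parse tree shape for 0011: the root S has children 0, S, 1; that middle S again has children 0, S, 1; the innermost S has the single child \(\varepsilon\).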

Closure Facts

If \(L_1, L_2\) are context-free:

  • Union: context-free via new start \(S \to S_1 \mid S_2\)
  • Concatenation: context-free via \(S \to S_1 S_2\)
  • Kleene star: context-free via new start \(S \to S_1 S \mid \varepsilon\)

(The intersection of a context-free language with a regular language is context-free, but context-free languages are not closed under intersection or complement in general.)
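
A standard witness for the failure of intersection, stated here without the proof that the last language is not context-free:

\[ L_1 = \{\,a^i b^i c^j : i,j \ge 0\,\}, \qquad L_2 = \{\,a^i b^j c^j : i,j \ge 0\,\}, \qquad L_1 \cap L_2 = \{\,a^n b^n c^n : n \ge 0\,\} \]

Both \(L_1\) and \(L_2\) are context-free (each concatenates a \(0^n1^n\)-style language with a regular one), but their intersection is not. Complement must then fail as well: since \(L_1 \cap L_2 = (L_1^c \cup L_2^c)^c\) and union is closed, closure under complement would force closure under intersection.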

JSON as a Grammar

A Basic JSON CFG

https://www.json.org/json-en.html

We ignore lexical details and focus on structure.

  • Value → Object | Array | String | Number | true | false | null
  • Object → { Members } | {}
  • Members → Pair | Pair , Members
  • Pair → String : Value
  • Array → [ Elements ] | []
  • Elements → Value | Value , Elements

Example: Is {"a":[1,2,{"b":null}]} in the language?
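
One way to answer the question is a recursive-descent recognizer with one function per nonterminal. The sketch below is an illustration under assumptions: the tokenizer is deliberately simplified (the lexical details are ignored, as above), and every function name is hypothetical.

```python
import re

# Simplified tokens: strings, plain numbers, the three literals, punctuation.
TOKEN = re.compile(
    r'\s*("(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?|true|false|null|[{}\[\]:,])')

def tokenize(text):
    text, tokens, pos = text.strip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"bad input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def expect(toks, i, symbol):
    if i >= len(toks) or toks[i] != symbol:
        raise ValueError(f"expected {symbol!r} at token {i}")
    return i + 1

def parse_value(toks, i):
    """Value -> Object | Array | String | Number | true | false | null."""
    if i >= len(toks):
        raise ValueError("unexpected end of input")
    if toks[i] == "{":
        return parse_object(toks, i)
    if toks[i] == "[":
        return parse_array(toks, i)
    if toks[i] in {"}", "]", ":", ","}:
        raise ValueError(f"unexpected {toks[i]!r}")
    return i + 1                       # String, Number, true, false, null

def parse_object(toks, i):
    """Object -> { Members } | {}"""
    i = expect(toks, i, "{")
    if i < len(toks) and toks[i] == "}":
        return i + 1
    while True:                        # Members -> Pair | Pair , Members
        if i >= len(toks) or not toks[i].startswith('"'):
            raise ValueError("expected a String key")
        i = expect(toks, i + 1, ":")   # Pair -> String : Value
        i = parse_value(toks, i)
        if i < len(toks) and toks[i] == ",":
            i += 1
            continue
        return expect(toks, i, "}")

def parse_array(toks, i):
    """Array -> [ Elements ] | []"""
    i = expect(toks, i, "[")
    if i < len(toks) and toks[i] == "]":
        return i + 1
    while True:                        # Elements -> Value | Value , Elements
        i = parse_value(toks, i)
        if i < len(toks) and toks[i] == ",":
            i += 1
            continue
        return expect(toks, i, "]")

def in_json_language(text):
    try:
        toks = tokenize(text)
        return parse_value(toks, 0) == len(toks)
    except ValueError:
        return False

print(in_json_language('{"a":[1,2,{"b":null}]}'))
```

The nesting of calls (Object, then Array, then Object again) traces out the parse tree for the example string.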