Representation as Encoding — Day 1: Binary & Size

DSCI 220, 2025 W1

October 8, 2025

Announcements

Binary Numbers

Goals

  • Read/write binary.
  • Reason about bit length and capacity.
  • Prove: \(b\) bits yield \(2^b\) distinct codes, with maximum unsigned value \(2^b-1\).
  • Recognize overflow.

DON’T FORGET THE MAGIC TRICK!

Place Value

  A \(k\)-bit binary string encodes a number by place value:

\[(b_{k-1}\dots b_1 b_0)_2=\sum_{i=0}^{k-1} b_i 2^i,\quad b_i\in\{0,1\}\]

We are not flummoxed by this, because decimal works the same way:

\[(d_{k-1}\dots d_1 d_0)_{10}=\sum_{i=0}^{k-1} d_i 10^i,\quad d_i\in\{0,1,\ldots,9\}\]
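
The two sums above are the same recipe with a different base. A minimal sketch of that recipe in Python follows; `place_value` is a hypothetical helper name chosen for illustration, and digits are assumed to be passed as a string.

```python
def place_value(digits, base):
    """Evaluate a digit string by summing digit * base**position."""
    total = 0
    # reversed() puts the least significant digit at index 0, matching b_0 / d_0.
    for i, ch in enumerate(reversed(digits)):
        total += int(ch) * base**i
    return total

print(place_value("1011", 2))   # 8 + 0 + 2 + 1 = 11
print(place_value("1011", 10))  # 1000 + 0 + 10 + 1 = 1011
print(int("1011", 2), bin(11))  # Python built-ins: 11 0b1011
```

The built-ins `int(s, 2)` and `bin(n)` do the same conversions and are usually what you would reach for in practice.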

Quantify the Length

How many unique values can we store in \(n\) bits?

How many bits do we need to represent \(m\) distinct values?
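
One way to explore both questions is to count by brute force and compare against a formula. The sketch below assumes unsigned codes; `count_codes` and `bits_needed` are hypothetical helper names, not part of any library.

```python
from itertools import product

def count_codes(n):
    """Count the distinct n-bit strings by listing every one of them."""
    return sum(1 for _ in product("01", repeat=n))

def bits_needed(m):
    """Smallest b with 2**b >= m, i.e. ceil(log2(m)), for m >= 1."""
    return (m - 1).bit_length()

# The brute-force count agrees with 2**n for small n.
for n in range(5):
    assert count_codes(n) == 2**n

print(count_codes(3))   # 8 codes: 000, 001, ..., 111
print(bits_needed(8))   # 3, since 2**3 = 8 codes cover 8 values
print(bits_needed(9))   # 4, since 3 bits are one code short
```

Counting from 0 up to \(N\) uses \(N+1\) values, so that case would be `bits_needed(N + 1)`.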

Capacity Quickies

How many codes with…

  • 8 bits?
  • 12 bits?

How many bits to count to…

  • \(500\)?
  • \(5000\)?
  • \(1000000\)?

Proof Moment

Claim: With \(b\) bits we can represent \(2^b\) values.
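
A proof sketch by induction on \(b\) (one standard argument, not necessarily the one given in lecture):

Base case: with \(b=1\) the codes are 0 and 1, so there are \(2^1\) of them.

Inductive step: every \((b+1)\)-bit code is a \(b\)-bit code followed by a 0 or a 1, and different choices give different codes, so

\[\#\text{codes}(b+1)=2\cdot\#\text{codes}(b)=2\cdot 2^b=2^{b+1}.\]

The largest unsigned value sets every bit to 1:

\[\sum_{i=0}^{b-1}2^i=2^b-1.\]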

Technical Aside

  • int (pure Python): arbitrary precision (variable width)

  • NumPy / Pandas arrays: fixed-width datatypes (int8, int32, int64, uint64, …)

Demo:
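
A minimal sketch of what such a demo might look like, assuming NumPy is installed. The specific values are chosen for illustration and are not taken from the lecture.

```python
import numpy as np

# Pure Python int: grows to whatever width the value needs.
big = 2**64
print(big, big.bit_length())      # 18446744073709551616 65

# NumPy uint8: exactly 8 bits per element, so values live in 0..255.
a = np.array([254, 255], dtype=np.uint8)
b = a + np.uint8(1)               # 255 + 1 does not fit in 8 bits
print(b.tolist(), b.dtype)        # [255, 0] uint8
print(a.itemsize * 8)             # 8 bits per element
```

The Python integer simply gets wider, while the fixed-width array silently wraps 255 + 1 around to 0.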

Overflow

Consider addition, with 4 bits:
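
A small sketch of 4-bit unsigned addition. It assumes the extra carry bit is simply dropped (wraparound), which is one common behavior for fixed-width storage; `add_4bit` is a hypothetical helper name.

```python
BITS = 4
MASK = (1 << BITS) - 1            # 0b1111, keeps only the low 4 bits

def add_4bit(x, y):
    """Add two 4-bit unsigned values; return (stored result, overflowed?)."""
    true_sum = x + y
    stored = true_sum & MASK      # assumption: the carry into bit 4 is dropped
    return stored, true_sum > MASK

for x, y in [(0b0101, 0b0110), (0b1011, 0b0110)]:
    stored, overflowed = add_4bit(x, y)
    print(f"{x:04b} + {y:04b} = {stored:04b}  overflow={overflowed}")
# 0101 + 0110 = 1011  overflow=False   (5 + 6 = 11 fits)
# 1011 + 0110 = 0001  overflow=True    (11 + 6 = 17 needs 5 bits)
```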


 


 

Punchline: Overflow occurs when the number you want to represent is too big to fit into the space you’ve reserved for it.

Signed Integers


Unsigned integers are represented by their encodings as discussed above.


\(b\) bits typically represent values \([0..2^b-1]\).


Signed integers use a different encoding (which we won’t discuss), but we can still guess the range of values that can be encoded:


\(b\) bits typically represent values ____________.
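
One way to check a guess is to ask NumPy what range each of its fixed-width types reports. This is a sketch assuming NumPy is installed; it shows what these particular types hold, not how the signed encoding works.

```python
import numpy as np

# Ask NumPy for the smallest and largest value each fixed-width type can hold.
for dtype in (np.uint8, np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(dtype)
    print(f"{info.bits:2d} bits  {np.dtype(dtype).name:6s} {info.min} .. {info.max}")
```

Comparing the `uint8` and `int8` rows shows how the same 8 bits are split differently between negative and non-negative values.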

Activity

For each, compute the number of bits needed or the maximum value:

  1. Seat numbers in a 500-seat hall.
  2. Maximum value of a 12-bit unsigned integer.
  3. Unique IDs for the world population (\(\approx 8\cdot10^9\) people).
  4. Timestamps for the next \(100\) years at 1-second resolution.
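
After working these by hand, answers can be checked with a short script. This is a sketch under stated assumptions: `bits_for` is a hypothetical helper, and a year is taken as 365.25 days with leap seconds ignored.

```python
import math

def bits_for(n_values):
    """ceil(log2(n_values)): bits needed to give n_values items distinct codes."""
    return math.ceil(math.log2(n_values))

# Assumption: a year is taken as 365.25 days, ignoring leap seconds.
seconds_in_100_years = int(100 * 365.25 * 24 * 60 * 60)

print(bits_for(500))                    # 1. seat numbers
print((1 << 12) - 1)                    # 2. 12-bit unsigned max
print(bits_for(8 * 10**9))              # 3. world population IDs
print(bits_for(seconds_in_100_years))   # 4. timestamps
```

`math.log2` works in floating point, which is fine for these sizes; for very large exact powers of two, the integer expression `(n_values - 1).bit_length()` avoids rounding concerns.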

Exit Ticket

In words: what does \(\lceil\log_2 N\rceil\) tell you?