Randomized Algorithms
* Occupancy problems 3.1, 3.6,
2.2.2, skim appendices
* Lower bounds for randomized algorithms.
* Spencer's graduation/tenure game.
==============================================================================
Occupancy problems:
------------------
Say we put m balls randomly into n boxes: what can we say
about the
distribution that comes up? Motivation behind universal hashing.
Let's look at some things we can say, and also how much
independence
we are using to say them. Let m=n for simplicity.
Let B_i be the size of the ith bin. X_j = indicator for ball j
falling into ith bin.
What is E[B_i]? 1
What is Var[B_i]?
1 - 1/n
-> Var[B_i] = sum_j Var[X_j] = n*(1/n -
1/n^2) = 1 - 1/n.
What does a single bin look like, distribution of bins?
Pr[get zero balls] = (1 - 1/n)^n = (approx) 1/e
-> expect 1/e fraction of bins to be empty
Pr[exactly 1 ball] = n(1/n)(1-1/n)^{n-1} = (approx) 1/e
Pr[exactly 2 balls]= (n choose 2)(1/n^2)(1-1/n)^{n-2} =
(approx) 1/2e
Pr[get exactly i balls]=(n choose i)(1/n)^i(1-1/n)^{n-i}
<= 1/(e*i!)
What can we say about largest bin? 1/(e*i!) = 1/n^c at i
= O(log(n)/loglog(n)).
So, with high probability, largest bin has size at most
O(log(n)/loglog(n)).
In fact, we expect some bin to be that full. (pick i so have 1/n
chance that bin has >= i balls. Prob that bin i is at least this full
given that previous bins are not this full is only
larger. -> martingales)
-> Note: this argument uses full
independence. What if we only have
pairwise independence. What can we say about size of largest bin?
Chebyshev: sigma is about 1, so Pr[bin size < t] <
1/t^2. Expected
maximum is at most O(sqrt(n)).
Q:
Can this happen, say with matrix-based hashing? Say we
hash
all of {0,1}^n into {0,1}^n, using a random n x n matrix???
Coupon-collector's problem
--------------------------
What is the expected number of balls until to fill all
bins?
1 way to analyze: likely all filled after n*ln(n) + cn:
prob a given
bin is empty after k*n balls is 1/e^k, so at k = ln(n) +
c, this is 1/n^c.
or, let X_i be the time to fill a new bin when there are
i empty ones
remaining.
Total time X = sum_i X_i. Each new ball as i/n chance of
filling a new bin, so X_i is geometrically distributed
and E[X_i] = n/i.
So, expected time is Sum_i n/i = n*H_n.
Note: at this point, w.h.p., largest bin has only O(log
n) balls.
Lower bounds for randomized algorithms
======================================
Consider some problem we want to solve, like
sorting. What does it
mean to give a lower bound on performance?
for
all alg A, exists input I s.t. C(A,I) > b.
Can think of as game between algorithm designer and
adversary, where
alg has to go first.
Ex. "comparison-based sorting": output is some
permutation of input,
and only access to input is via comparing two values in
input.
Theorem: any deterministic comparison-based sorting alg
must perform
Omega(n log n) comparisons.
Proof: If there exists two different permutations of
input set that yeild
the same sequence of results to the comparisons made,
then sorter
fails.
1. There
are n! different possible permutations to start with.
2. Each
comparison splits set of still-possible permutations into
two pieces. Pick outcome
corresponding to larger piece.
3. Size of
set drops by at most a factor of 2 each time, and sorter
can't finish until size=1.
So number of steps is at least lg(n!)
= Omega(n log n).
Specific
input is one following this path.
What about randomized algorithms? View randomized alg as
a
distribution on deterministic algs A(x,y). Then we want to say that for any
distribution on deterministic algs, there exist an input
such that
expected cost is high.
1) for
all dist Q, exists I such that E[C(A_Q, I)] > b
Often easiest way to do this is to have adversary go
first, and be
randomized, and assume alg is deterministic. That is, to show:
2) exists
dist P such that for all A, E[C(A,I_P)] > b
Can you see why this is sufficient?
-> look
at E[C(A_Q, I_P)], for some Q and for P of (2). (2) says
this must be larger than b, since its larger for all A,
and if for
some Q this is larger than b, then clearly exists I s.t.
it's larger
than b.
(Using fact that if E[X] > y, then exists legal value of X
that is > y)
In fact, in the case where algs=circuits, so have finite
number, and
any distrib is OK, then min-max theorem of game theory
says
max_P
min_A E[C(A, I_P)] == min_Q max_I E[C(A_Q, I)]
adv goes
first
alg goes first
Apply to sorting: adversary picks random
permutation. So, asking:
what is a lower bound on the depth of a random leaf in a
binary tree
of n! leaves.
At most 1/2 of leaves at depth less than lg(n!), so
average depth is at least 1/2 of that.
Apply to hashing
----------------
Theorem: for any family H of functions h: M-->N, (|M|
= m, |N| = n),
there exist x != y in M s.t.
Prob[
h(x) = h(y) ] >= 1/n - 1/m.
h<-H
Proof: show that for any h, for random x!=y,
Prob[
h(x) = h(y) ] >= 1/n - 1/m.
x!=y
B_j
= number of points in M that map to j.
Number of ordered
pairs
<x,y> (x != y) that collide is:
Sum_j
B_j(B_j - 1) = Sum_j(B_j)^2 - m.
Minimum
occurs when all B_j are equal.
Why?
No
randomness, but consider random variable X = B_j, j=1,...,n
with
equal probability. Variance = Avg[(B_j)^2] - Avg[B_j]^2
Notice
that variance can't be negative.
So,
Avg[(B_j)^2]
>= (m/n)^2. Minimized when all
equal (variance=0).
So,
Sum_j B_j(B_j - 1) >= n(m/n)^2 - m = m^2(1/n - 1/m).
There
are m(m-1) ordered pairs x,y, so the probability that a
random pair collides is
at least:
[m^2/(m(m-1))]
* (1/n - 1/m) >= 1/n - 1/m.
Spencer's "graduation game"
---------------------------
Begin with some set of people at various positions.
2 players: Paul (partitioner) splits people into two
groups
Carole (chooser) selects one
group to be removed, other advances.
Start:
people at various years until graduation.
Paul wins if
somebody
graduates, Carole wins if nobody graduates.
[Note: similar game: 20 questions with k lies
allowed. questioner
splits the set, and the chooser gives answer. Those that do not agree
with answer move up one. position of element corresponds to # of lies
if element is the correct one. questioner's goal is to reduce the
number of elements below the k-boundary to 1 as quickly
as possible.]
Question: given a starting configuration, who wins? Note: this is a
full information 2-player game. Let's say that at the start, there
are a_k people at distance k from graduation.
Theorem 1: If sum_k a_k / 2^k < 1, then Carole has a
winning strategy.
Theorem 2: If sum_k a_k / 2^k >= 1, then Paul has a
winning strategy.
Proof of theorem 1.
Suppose Carole plays randomly.
What is the
expected number of people that reach the goal? For instance, for
fixed person who is initially at level k, what is the
probability that
this person reaches the goal? Therefore, no matter what Paul's
strategy is, a random Carole has a non-zero chance of
winning.
Therefore, Paul cannot have a winning strategy and so
Carole does.
Actually, we can figure out Carole's winning
strategy. Any ideas?