Method
- Run two copies of the Markov chain, $X_t$ and $Y_t$
- Each considered in isolation is a copy of the MC (that is, both have the MC distribution)
- but they are not independent: they make dependent choices at each step
- in fact, after a while they are almost certainly the same
- Start $Y_t$ in the stationary distribution, $X_t$ anywhere
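- Coupling argument: writing $\epsilon = \Pr[X_t \ne Y_t]$,
    $\Pr[X_t = j] = \Pr[X_t = j \mid X_t = Y_t]\,\Pr[X_t = Y_t] + \Pr[X_t = j \mid X_t \ne Y_t]\,\Pr[X_t \ne Y_t]$
    $\phantom{\Pr[X_t = j]} = \Pr[Y_t = j \mid X_t = Y_t]\,\Pr[X_t = Y_t] + \epsilon\,\Pr[X_t = j \mid X_t \ne Y_t]$
- So just need to make $\epsilon$ (which is r.p.d.) small enough.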
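The bound this argument is after can be written out in one step. The following is a sketch under one assumption of notation: $\pi_j$ (a symbol not used in the notes) denotes the stationary probability of state $j$, so that $\Pr[Y_t = j] = \pi_j$ for every $t$ because $Y_t$ starts in the stationary distribution.

```latex
\begin{align*}
\Pr[X_t = j] - \pi_j
  &= \Pr[X_t = j] - \Pr[Y_t = j] \\
  &= \Pr[X_t = j,\ X_t \ne Y_t] - \Pr[Y_t = j,\ X_t \ne Y_t],
\end{align*}
% the events {X_t = j, X_t = Y_t} and {Y_t = j, X_t = Y_t} coincide,
% so their probabilities cancel in the difference
\[
  \bigl|\Pr[X_t = j] - \pi_j\bigr| \;\le\; \Pr[X_t \ne Y_t].
\]
```

So any coupling in which the two chains have probably met by time $t$ certifies that $X_t$ is close to stationary at time $t$.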
$n$-bit Hypercube walk: at each step, set a random bit to a random value
- At step $t$, pick a random bit $b$ and a random value $v$
- both chains set bit $b$ to value $v$
- after $O(n \log n)$ steps, probably all bits matched.
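The hypercube coupling is easy to simulate. The sketch below is illustrative only: the function name `hypercube_coupling_time` is made up here, and starting the second chain at the opposite corner (rather than in the stationary distribution) is an assumption chosen because it is the worst case for matching. Once a bit has been picked it agrees in both chains forever, so the coupling time is a coupon-collector time.

```python
import math
import random

def hypercube_coupling_time(n, rng):
    """Steps until two coupled n-bit walks agree on every bit."""
    x = [0] * n                  # X starts at the all-zeros corner
    y = [1] * n                  # Y starts at the opposite corner (assumed worst case)
    unmatched = set(range(n))
    steps = 0
    while unmatched:
        b = rng.randrange(n)     # same random bit for both chains
        v = rng.randrange(2)     # same random value for both chains
        x[b] = v
        y[b] = v
        unmatched.discard(b)     # bit b now agrees from here on
        steps += 1
    assert x == y
    return steps

rng = random.Random(0)
n = 64
times = [hypercube_coupling_time(n, rng) for _ in range(200)]
avg = sum(times) / len(times)
print(avg, n * math.log(n))      # average coupling time next to n ln n
```

The average should land within a small constant factor of $n \ln n$, matching the $O(n \log n)$ claim.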
Counting $k$-colorings when $k > 2\Delta + 1$ ($\Delta$ = maximum degree)
· The reduction from (approximate) uniform generation:
    o compute the ratio of the number of colorings of $G$ to the number of colorings of $G - e$
    o Recurse, counting $G - e$ colorings
    o Base case: $k^n$ colorings of the empty graph
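The reduction is a telescoping product: add the edges back one at a time and multiply $k^n$ by each ratio. The sketch below illustrates only that bookkeeping. It computes each ratio exactly by brute force on a tiny graph, which is a stand-in assumed here for the sampling-based estimate; the function names are hypothetical.

```python
from itertools import product

def count_colorings(n, edges, k):
    """Brute-force count of proper k-colorings (tiny graphs only)."""
    return sum(
        all(col[u] != col[v] for u, v in edges)
        for col in product(range(k), repeat=n)
    )

def count_by_ratios(n, edges, k):
    """k^n times the product of the ratios #col(G_i) / #col(G_i - e_i)."""
    estimate = float(k ** n)                  # base case: the empty graph
    for i in range(len(edges)):
        with_e = count_colorings(n, edges[: i + 1], k)
        without_e = count_colorings(n, edges[:i], k)
        estimate *= with_e / without_e        # a sampler would estimate this ratio
    return estimate

triangle = [(0, 1), (1, 2), (0, 2)]
print(count_colorings(3, triangle, 3), count_by_ratios(3, triangle, 3))
```

For the triangle with 3 colors the ratios are 18/27, 12/18 and 6/12, so the product recovers the 6 proper colorings.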
· Bounding the ratio:
    o note $G - e$ colorings outnumber $G$ colorings
    o By how much? Let $L$ be the colorings in the difference (those in which $u$ and $v$, the endpoints of $e$, get the same color)
    o to make an $L$ coloring a $G$ coloring, change $u$ to one of its $k - \Delta \ge \Delta + 1$ legal colors
    o Each $G$-coloring arises in at most one way from this
    o So each $L$ coloring has at least $\Delta + 1$ neighbors unique to it
    o So $|L|$ is at most a $1/(\Delta + 1)$ fraction of the number of $G$ colorings.
    o So can estimate the ratio with few samples
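To see why few samples suffice, the counting argument can be turned into an explicit lower bound on the ratio. This is a sketch; $\Omega(H)$ is notation assumed here for the set of proper $k$-colorings of a graph $H$.

```latex
\[
  |\Omega(G - e)| = |\Omega(G)| + |L|,
  \qquad
  |L| \le \frac{|\Omega(G)|}{\Delta + 1},
\]
\[
  \frac{|\Omega(G)|}{|\Omega(G - e)|}
  \;\ge\; \frac{1}{1 + \frac{1}{\Delta + 1}}
  \;=\; \frac{\Delta + 1}{\Delta + 2}
  \;\ge\; \frac{1}{2}.
\]
```

A uniform sample from the colorings of $G - e$ is therefore a coloring of $G$ with probability at least $1/2$, so the fraction of samples that are $G$-colorings concentrates quickly around the ratio.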
· The chain:
    o Pick random vertex, random color, try to recolor
    o loops, so aperiodic
    o Chain is time-reversible, so the stationary distribution is uniform.
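One step of the chain can be sketched as follows. The adjacency-list representation and the names `recolor_step` and `is_proper` are assumptions made for illustration, not part of the notes.

```python
import random

def recolor_step(coloring, adj, k, rng):
    """One step: pick a random vertex and color, recolor if that stays proper."""
    v = rng.randrange(len(coloring))
    c = rng.randrange(k)
    if all(coloring[u] != c for u in adj[v]):
        coloring[v] = c          # accepted move
    # otherwise the chain stays put (a self-loop)

def is_proper(coloring, adj):
    """True if no edge has both endpoints the same color."""
    return all(coloring[v] != coloring[u] for v in range(len(adj)) for u in adj[v])

# 4-cycle: maximum degree 2, so k = 6 > 2*2 + 1 colors
adj = [[1, 3], [0, 2], [1, 3], [0, 2]]
coloring = [0, 1, 0, 1]
rng = random.Random(1)
for _ in range(1000):
    recolor_step(coloring, adj, 6, rng)
print(coloring, is_proper(coloring, adj))
```

Rejected proposals are exactly the self-loops that make the chain aperiodic, and every accepted move keeps the coloring proper.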
· Coupling:
    o choose random vertex $v$ (same for both)
    o based on $X_t$ and $Y_t$, choose a bijection $g$ of colors
    o choose random color $c$
    o apply $c$ to $v$ in $X_t$ (if can), $g(c)$ to $v$ in $Y_t$ (if can).
    o What bijection?
        § Let $A$ be vertices that agree in color, $D$ that disagree.
        § if $v \in D$, let $g$ be identity
        § if $v \in A$, let $N$ be neighbors of $v$
        § let $C_X$ be colors that $N$ has in $X$ but not $Y$ ($X$ can't use them at $v$)
        § let $C_Y$ similar, wlog larger than $C_X$
        § $g$ should swap each $C_X$ with some $C_Y$, leave other colors fixed. Result: if $X$ doesn't change, $Y$ doesn't
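The bijection for a vertex in $A$ can be built directly from the two neighborhoods. The sketch below makes some assumptions: colors are integers $0..k-1$, colorings are lists, the pairing order (sorted) is arbitrary, and the name `color_bijection` is hypothetical. It pairs as many colors as the smaller set allows, which covers the "wlog" in the notes.

```python
def color_bijection(v, x, y, adj, k):
    """Bijection g on colors for a vertex v whose color agrees in x and y."""
    nx = {x[u] for u in adj[v]}       # colors x forbids at v
    ny = {y[u] for u in adj[v]}       # colors y forbids at v
    cx = sorted(nx - ny)              # C_X: forbidden in x only
    cy = sorted(ny - nx)              # C_Y: forbidden in y only
    g = list(range(k))                # start from the identity
    for a, b in zip(cx, cy):          # pair up as many colors as possible
        g[a], g[b] = b, a             # swap the pair
    return g

# star with center 0; leaves 2 and 3 disagree between the two colorings
adj = [[1, 2, 3], [0], [0], [0]]
x = [0, 1, 1, 2]
y = [0, 1, 3, 4]
g = color_bijection(0, x, y, adj, 7)
print(g)
```

Here $C_X = \{2\}$ and $C_Y = \{3, 4\}$, so $g$ swaps 2 and 3 and fixes everything else. Every color that $X$ rejects at the center (1 or 2) is mapped to a color that $Y$ also rejects.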
· Convergence:
    o Let $d'(v)$ be the number of neighbors of $v$ in the opposite set, so $\sum_{v \in A} d'(v) = \sum_{v \in D} d'(v) = m'$
    o Let $\delta = |D|$
    o Note at each step, $\delta$ changes by $0, \pm 1$
    o When does it increase?
        § $v$ must be in $A$, but move to $D$
        § happens if only one MC accepts new color
        § If $c$ not in $C_X$ or $C_Y$, then $g(c) = c$ and both change (or both reject)
        § If $c \in C_X$, then $g(c) \in C_Y$ so neither moves
        § So must have $c \in C_Y$
        § But $|C_Y| \le d'(v)$, so the probability this happens is at most $\sum_{v \in A} \frac{1}{n} \cdot \frac{d'(v)}{k} = \frac{m'}{kn}$
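    o When does it decrease?
        § must have $v \in D$, only one moves
        § sufficient that pick color not in either neighborhood of $v$,
        § total neighborhood size $2\Delta$, but that counts the $d'(v)$ elements of $A$ twice.
        § so Prob. at least $\sum_{v \in D} \frac{1}{n} \cdot \frac{k - (2\Delta - d'(v))}{k} = \frac{(k - 2\Delta)\,\delta}{kn} + \frac{m'}{kn}$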
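    o Deduce that the expected change in $\delta$ is at most the difference of the above, namely $-\frac{k - 2\Delta}{kn}\,\delta = -a\delta$, where $a = \frac{k - 2\Delta}{kn}$.
    o So after $t$ steps, $E[\delta_t] \le (1 - a)^t \delta_0 \le (1 - a)^t n$.
    o Thus, the probability that $\delta_t > 0$ is at most $(1 - a)^t n$.
    o But now note $a > 1/n^2$, so $O(n^2 \log n)$ steps reduce this to a one-over-polynomial chance.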
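The final bound can be turned into a concrete step count. The sketch below assumes the drift constant is $a = (k - 2\Delta)/(kn)$, as the difference of the two probabilities above suggests, and solves $(1 - a)^t n \le$ `failure_prob` for $t$. The function name and parameters are hypothetical.

```python
import math

def coupling_steps(n, k, max_degree, failure_prob):
    """Smallest t with (1 - a)^t * n <= failure_prob, a = (k - 2*Delta)/(k*n)."""
    if k <= 2 * max_degree + 1:
        raise ValueError("analysis needs k > 2*Delta + 1")
    a = (k - 2 * max_degree) / (k * n)
    # (1 - a)^t * n <= p   <=>   t >= ln(n / p) / -ln(1 - a)
    t = math.ceil(math.log(n / failure_prob) / -math.log1p(-a))
    return a, t

a, t = coupling_steps(n=100, k=10, max_degree=4, failure_prob=0.01)
print(a, t, 100 ** 2 * math.log(100))
```

For these values $a = 0.002 > 1/n^2$, and the required number of steps is well below $n^2 \ln n$.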
Note: the coupling depends on the state, but who cares
- From a worm's eye view, each chain is a random walk
- so all arguments hold
Counting vs. generating:
- we showed that by generating, can count
- by counting, can generate:
PRAM
- P processors, each with a RAM, local registers
- global memory of $M$ locations
- each processor can in one step do a RAM op or read/write to one global memory location
- synchronous parallel steps
- various conflict resolutions (CREW, EREW, CRCW)
- not realistic, but explores ``degree of parallelism''
Randomization in parallel:
- load balancing
- symmetry breaking
- isolating solutions
Classes:
- NC: poly processors, polylog steps
- RNC: with randomization; polylog runtime, Monte Carlo
- ZNC: Las Vegas NC
- immune to choice of conflict resolution
Practical observations:
- very little can be done in $o(\log n)$ with poly processors
- lots can be done in $\Theta(\log n)$
- often concerned about work, which is processors times time
- algorithm is ``optimal'' if work equals best sequential
Basic operations
Quicksort in parallel:
- n processors
- each takes one item, compares to splitter
- count number of predecessors less than splitter
- determines location of item in split
- total time $O(\log n)$
- combine: $O(\log n)$ per layer with $n$ processors
- problem: $\Omega(\log^2 n)$ time bound
- problem: $n \log^2 n$ work
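The split step, counting predecessors smaller than the splitter to find each item's location, is a prefix sum. The sketch below simulates it sequentially; the round-by-round scan is the Hillis-Steele style scan, chosen here as an assumption because it makes the $\lceil \log_2 n \rceil$ synchronous rounds explicit (it is not work-optimal). Function names are hypothetical.

```python
def prefix_sums(flags):
    """Inclusive scan in ceil(log2 n) synchronous rounds."""
    sums = list(flags)
    n = len(sums)
    shift, rounds = 1, 0
    while shift < n:
        # every "processor" i reads the old array and writes a new one
        sums = [sums[i] + (sums[i - shift] if i >= shift else 0) for i in range(n)]
        shift *= 2
        rounds += 1
    return sums, rounds

def parallel_split(items, splitter):
    """Place items < splitter first, the rest after, keeping relative order."""
    less = [1 if a < splitter else 0 for a in items]
    rank_less, rounds = prefix_sums(less)
    rank_rest, _ = prefix_sums([1 - f for f in less])
    total_less = rank_less[-1] if items else 0
    out = [None] * len(items)
    for i, a in enumerate(items):          # one processor per item
        pos = rank_less[i] - 1 if less[i] else total_less + rank_rest[i] - 1
        out[pos] = a
    return out, rounds

out, rounds = parallel_split([5, 2, 9, 1, 7, 3, 8, 6], 6)
print(out, rounds)
```

With 8 items the scan takes 3 rounds, and each item writes itself to its computed position in a single step.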
Parallel recursion:
- paradigm: reduce problem size from $n$ to $\sqrt{n}$ in $O(\log n)$ time.
- total time $O(\log n + \log \sqrt{n} + \cdots) = O(\log n)$
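The total is a geometric series. As a worked sketch, assuming each round shrinks a problem of size $m$ to size $\sqrt{m}$ at cost $O(\log m)$:

```latex
\[
  \log n + \log n^{1/2} + \log n^{1/4} + \cdots
  \;=\; \log n \sum_{i \ge 0} 2^{-i}
  \;\le\; 2 \log n .
\]
```

The same argument works whenever the logarithm of the problem size shrinks by a constant factor per round.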
More processors:
- $n^2$ processors
- do all comparisons
- count number of items smaller than me: $O(\log n)$
- put into place
- result: $O(\log n)$ time with $n^2$ processors
- or, $O(n)$ time with $n$ processors
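The all-comparisons method is a rank sort. A minimal sequential sketch follows; the loops stand in for the $n^2$ processors, and breaking ties by index is an assumption added here so that equal items get distinct positions.

```python
def rank_sort(items):
    """Sort by counting, for each item, how many items must precede it."""
    n = len(items)
    out = [None] * n
    for i in range(n):                      # one group of n processors per item
        # all n comparisons for item i; ties broken by index so ranks are distinct
        rank = sum(
            1 for j in range(n)
            if items[j] < items[i] or (items[j] == items[i] and j < i)
        )
        out[rank] = items[i]                # put into place
    return out

print(rank_sort([3, 1, 2, 3, 0]))
```

On a PRAM the comparisons take one step and each count is a sum of $n$ bits, which is where the $O(\log n)$ comes from.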
BoxSort:
- n processors
- Choose $\sqrt{n}$ random splitters
- sort them in $O(\log n)$ time
- insert items in splitters: $O(\log n)$ time
- solve each piece separately, recursively
Intuition:
- expected subproblem size $O(\sqrt{n})$
- so expected time spent on a branch is $O(\log n)$ as above
- problem: many branches: need high probability result.
- solution: analyze each path, show $O(\log n)$ time whp
- thus max path is $O(\log n)$
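The recursive structure of BoxSort can be sketched sequentially. Several things here are assumptions for illustration: about $\sqrt{n}$ splitters sampled without replacement, binary search for the insertion step, a small `cutoff` below which the piece is sorted directly, and a fallback when splitting makes no progress. In the parallel algorithm the buckets would be handled simultaneously.

```python
import bisect
import math
import random

def box_sort(items, rng, cutoff=16):
    """Sequential sketch of BoxSort; each bucket would be sorted in parallel."""
    n = len(items)
    if n <= cutoff:
        return sorted(items)                     # small base case
    splitters = sorted(rng.sample(items, math.isqrt(n)))   # about sqrt(n) splitters
    buckets = [[] for _ in range(len(splitters) + 1)]
    for a in items:
        # binary search among the splitters: O(log n) per item
        buckets[bisect.bisect_left(splitters, a)].append(a)
    if max(len(b) for b in buckets) == n:
        return sorted(items)                     # no progress (e.g. all keys equal)
    out = []
    for b in buckets:                            # independent subproblems
        out.extend(box_sort(b, rng, cutoff))
    return out

data_rng = random.Random(2)
data = [data_rng.randrange(1000) for _ in range(500)]
result = box_sort(data, random.Random(3))
print(result[:5], result[-5:])
```

The running time on a PRAM is governed by the deepest recursion path, which is why the analysis needs a high-probability bound per path rather than an expectation.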
High probability:
- consider item $x$
- claim: a splitter lies within $\alpha\sqrt{n}$ of $x$ on each side
- since the probability of not is at most $(1 - \alpha\sqrt{n}/n)^{\sqrt{n}} \le e^{-\alpha}$
- fix $\gamma$, $d < 1/\gamma$
- define $\tau_k = d^k$
- define $\rho_k = n^{\gamma^k}$
- note a size-$\rho_k$ problem takes $\gamma^k \log n$ time
- argue at most $d^k$ size-$\rho_k$ problems whp
- deduce runtime $\sum_k d^k \gamma^k \log n = \sum_k (d\gamma)^k \log n = O(\log n)$
- note: as problem shrinks, allowing more divergence in quantity for whp result
- minor detail: ``whp'' dies for small problems
- OK: if problem size $\log n$, finish in $\log n$ time with $\log n$ processors
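Two small calculations sit behind the bullets above. They are sketched here with assumed symbols: $\alpha$ for the slack in the splitter claim, $\gamma < 1$ for the per-level shrink factor of the logarithm of the problem size, and $d$ with $d\gamma < 1$. These names are illustrative choices and may differ from the original notation.

```latex
% Splitter claim: each of the \sqrt{n} splitters misses a window of
% \alpha\sqrt{n} items with probability 1 - \alpha\sqrt{n}/n, and 1 - x \le e^{-x}:
\[
  \Bigl(1 - \frac{\alpha}{\sqrt{n}}\Bigr)^{\sqrt{n}}
  \;\le\; \bigl(e^{-\alpha/\sqrt{n}}\bigr)^{\sqrt{n}}
  \;=\; e^{-\alpha}.
\]
% Runtime: a geometric series with ratio d\gamma < 1:
\[
  \sum_{k \ge 0} d^k \gamma^k \log n
  \;=\; \frac{\log n}{1 - d\gamma}
  \;=\; O(\log n).
\]
```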