Randomized Algorithms
* more on expanders and eigenvalues
* approximate counting via rapid mixing: overview
==============================================================================
Previously gave some intuition about why expanders have
an eigenvalue
gap.
Actually, reverse proof is much easier. An in fact, some of the
constructions explicitly create graphs with an eigenvalue
gap. So,
could have defined expanders as "graphs with an
eigenvalue gap" and
then said "here's a neat expansion property that
they have".
Theorem: if G is d-regular, then for all |W| <= n/2,
|N(W)
- W| >= |W|*(d-lambda)/2d
lambda = second-largest eigenvalue of A(G).
Proof: Will prove there are a lot of edges between W and
V-W. In particular:
E
= E(W,V-W) >= |W|*|V-W|*(d-lambda)/n
where E(A,B) = # edges between A and B. This implies what we want because:
Since |W|<=n/2 we have e(W,V-W) >=
|W|*(d-lambda)/2. Now just use the
fact that each node in V-W has at most d neighbors in W.
Define vector f such that f_v are equal (and positive)
over v in W,
f_v are equal (and negative) over v in V-W, and the sum
of entries in
f is 0. E.g., f_v = |V-W| for v in W, f_v =
-|W| for v in V-W.
If we write f in terms of eigenvectors, it has zero
component along
largest since sum of entries is 0.
IDEA: in one step of random walk, f has to shrink since
there's an
eigenvalue gap, but the only way for f to shrink is to
miz between W
and V-W, which means there have to be a lot of
edges. Formally:
f = c_2e_2 + ... + c_ne_n
Af =
c_2*lambda_2*e_2 + ... + c_n*lambda_n*e_n.
Now, take dot product with f.
Af.f =
(c_2)^2*lambda_2 + ... + (c_n)^2*lambda_n <= lambda_2*(f.f)
RHS is lambda_2*sum_v (f_v)^2
LHS is sum_v [ f_v * [sum_{u in N(v)} f_u] ]. If f_v was constant,
this
would just be d*sum_v[(f_v)^2].
But f is not exactly
constant,
so instead we have:
d*sum_v[(f_v)^2] - E*[|V-W|^2 + |W|^2 + 2|V-W|*|W|]
= d*sum_v[(f_v)^2] - E*n^2.
and voila.
---------------------------------------------------------------------------
For the other direction, I still don't know an easy
proof, but here's
another piece of intuition. Turns out that can reduce the
proof to the
following probabilistic puzzle.
Consider a probability distribution P on a regular
degree-d graph G.
Say we pick two vertices independently according to P.
Let A = Prob(the two vertices are neighbors)
Let B = Prob(the two vertices are the same)
Claim 1: for any P, A/B <= d.
E.g., if P is uniform, then A=d/n and B=1/n so A/B=d.
Argue
that any other P just decreases the ratio?
Follows
from (P_u)^2 + (P_v)^2 >= 2*P_u*P_v.
Sum
over all edges. LHS gives d*B, RHS
gives A.
Claim 2: suppose we require P_v = 0 on at least half of
the vertices
of G. Then,
if G is an expander, A/B <= d - c for some constant c>0.
Proving this
is the hard part. Some intuition:
say P was flat on the
non-zero
part W. In this case, B =
1/|W|. A = d/|W| - (# edges
leaving W)
So, A/B = d
- (# edges leaving W)/|W| = d-c since graph is an
expander. I
don't know an intuitive proof in general.
Let's just see why Claim 2 implies what we want: namely,
lambda_2 <= d - c.
Let f = eigenvector for 2nd largest eigenvalue
lambda. WLOG, f is
positive on at most half of entries (else look at -f).
Define g so
that g_v = f_v if f_v > 0 and g_v = 0 otherwise. Let's scale g so
that the sum of entries is 1. g is going to be the probability dist P.
We know A(G)f = lambda f. Let's take dot product of both sides with g.
Get: lambda = [A(G)f.g]/[f.g]
Denominator = B
Numerator = Sum_v( g_v * (sum_{u in N(v)} f_u) ) <= A
(replacing
negative entries with zero only increases this)
So, all one has to do is prove Claim 2.
I'm still thinking about an intuitive proof of this. Extra credit if
you can find one....
=============================================================================
Approximate counting
--------------------
There are a lot of interesting problems that fall into
the following
framework.
We have some universe U of points, some subset S of which
are special.
We want to know |S| (or in a continuous domain, the
volume of S), or equivalently, the probability that a
randomly drawn
point from U belongs to S.
E.g., S = some polygonal object in the plane, U = a
bounding rectangle.
What is the volume of S?
E.g., U = strings of length n, S = those that satisfy a
given DNF formula F.
How many solutions are there to F?
One easy thing to do: pick a bunch of samples in U, and
observe how
many fall into S (assume we can verify whether a point is
in S in poly time),
and then take the ratio.
But, what if |S|/|U| is small? In this case, this does not
approximate |S| very well. Can we do better?
In fact, many of these counting problems are
#P-hard. E.g., reporting
the number of solutions to a given DNF formula is
#P-hard. Computing
the permanent of a matrix (which for A(G) corresponds to
asking how many
perfect matchings there are in G) is #P-hard.
Goal: fpras: for any epsilon>0, produce an answer A
such that
A
is in [(1-epsilon)|S|, (1+epsilon)|S|] in time poly in
n
and 1/epsilon. (n = description
length of points in U).
Idea: suppose we can create a sequence of
polynomially-many sets:
S=S_0
contained in S_1 contained in ... S_m ... contained in U
where
|S_i|/|S_{i-1}| is at most poly(n).
Then, we can reduce the
counting
problem to the random sampling problem:
--> pick
a random point from S_1, and see if it's in S. Since
|S|/|S_1|
is at least 1/poly, we can get a good relative-error
estimate
of this ratio. Say |S| =
p_1*|S_1|. We find q_1 with
(1-epsilon/m)p_1
< q_1 < (1+epsilon/m)p_1.
Then do for |S_1|/|S_2|,
etc.
Finally,
output q_1*q_2*...*q_m*|U| which is at least
(1-epsilon)*|S|
and at most (1+2epsilon)*|S|.
So, if we can set up this sequence of sets, then the
problem boils
down to: how do we pick a random element of S_i? We'll do this by
setting up some appropriate neighborhood structure and
doing a random
walk, and showing that this walk is rapidly mixing.