← All blueprints

Expander Graphs and Their Applications

3 Random walks on expander graphs

Lemma 42 Jensen’s inequality
#

For a concave \(\varphi \) and a random variable \(X\), \(\mathbb {E}[\varphi (X)]\le \varphi (\mathbb {E}X)\).

Lemma 43 Markov’s inequality
#

For a nonnegative random variable \(X\) and \(a{\gt}0\), \(\Pr [X\ge a]\le \mathbb {E}X/a\).

Definition 44 3.1, Random walk

A random walk on \(G\) is the Markov chain \((X_0,X_1,\dots )\) with \(X_0\) drawn from an initial distribution and \(X_{i+1}\) uniform among the neighbours of \(X_i\); this induces distributions \(\pi _i\) on \(V\).

Definition 45 Normalized adjacency matrix

For \(d\)-regular \(G\), \(\hat A=\tfrac 1d A\). The uniform vector is \(\mathbf u=(1,\dots ,1)/n\).

Claim 46 Properties of \(\hat A\)

\(\hat A\) is the transition matrix of the random walk; it is symmetric and doubly stochastic; its eigenvalues are \(\hat\lambda _i=\lambda _i/d\), so \(\hat\lambda _1=1\) and \(\max (|\hat\lambda _2|,|\hat\lambda _n|)\le \alpha \); \(\hat A^t\) is the \(t\)-step transition matrix; and \(\hat A\mathbf u=\mathbf u\).

Proof

\(A\) is symmetric with constant row sum \(d\), so \(\hat A=\tfrac 1dA\) is symmetric with row/column sums \(1\). Scaling preserves eigenvectors and divides eigenvalues by \(d\); powers give \(t\)-step probabilities, and \(\hat A\mathbf u=\mathbf u\) since rows sum to \(1\).

Definition 47 Entropies
#

For a distribution \(\mathbf p\) on \([n]\): Shannon entropy \(H(\mathbf p)=-\sum _ip_i\log p_i\); Rényi \(2\)-entropy \(H_2(\mathbf p)=-2\log \| \mathbf p\| _2\); min-entropy \(H_\infty (\mathbf p)=-\log \| \mathbf p\| _\infty \).

Lemma 48 3.4, \(\ell _2\) shrinkage per step

For a probability vector \(\mathbf p\) in an \((n,d,\alpha )\)-graph, \(\| \hat A\mathbf p-\mathbf u\| _2\le \alpha \| \mathbf p-\mathbf u\| _2\le \alpha \).

Proof

\(\hat A\mathbf u=\mathbf u\) and \(\mathbf p-\mathbf u\perp \mathbf u\), so it lies in the span of eigenvectors with \(|\hat\lambda |\le \alpha \). Thus \(\| \hat A\mathbf p-\mathbf u\| _2=\| \hat A(\mathbf p-\mathbf u)\| _2 \le \alpha \| \mathbf p-\mathbf u\| _2\le \alpha \).

Theorem 49 3.3, \(\ell _2\) convergence

For any distribution \(\mathbf p\) and \(t\ge 1\), \(\| \hat A^t\mathbf p-\mathbf u\| _2\le \alpha ^t\).

Proof

Induct on \(t\) using Lemma 48, noting \(\hat A^{t-1}\mathbf p\) is again a distribution.

Theorem 50 3.2, \(\ell _1\) convergence

For any distribution \(\mathbf p\) and \(t\ge 1\), \(\| \hat A^t\mathbf p-\mathbf u\| _1\le \sqrt n\, \alpha ^t\).

Proof

Cauchy–Schwarz gives \(\| x\| _1\le \sqrt n\| x\| _2\); apply with \(x=\hat A^t\mathbf p-\mathbf u\).

Proposition 51 3.5, Entropy ordering

\(H_\infty (\mathbf p)\le H_2(\mathbf p)\le 2H_\infty (\mathbf p)\).

Proof

Since \(\| \mathbf p\| _\infty \ge \| \mathbf p\| _2^2\ge \| \mathbf p\| _\infty ^2\), take \(-\log \).

Theorem 52 Rényi \(2\)-entropy increases under a walk

Writing \(\mathbf p=\mathbf u+\mathbf f\), \(\mathbf f\perp \mathbf u\), \(\mu =\| \mathbf f\| /\| \mathbf p\| \), one has \(H_2(\hat A\mathbf p)\ge H_2(\mathbf p)-\log (1-(1-\alpha ^2)\mu ^2)\).

Proof

\(\hat A\mathbf p=\mathbf u+\hat A\mathbf f\) with \(\hat A\mathbf f\perp \mathbf u\), so \(\| \hat A\mathbf p\| ^2=\| \mathbf u\| ^2+\| \hat A\mathbf f\| ^2 \le (1-(1-\alpha ^2)\mu ^2)\| \mathbf p\| ^2\). Take \(-\log \).

Definition 53 Projection onto a set
#

For \(B\subseteq V\), \(P=P_B\) is the diagonal \(0/1\) matrix with \(P_{ii}=1\) iff \(i\in B\).

Lemma 54 3.7, Confined-walk probability

Let \((B,t)\) be the event that a uniformly-started length-\(t\) walk stays in \(B\). Then \(\Pr [(B,t)]=\| (P\hat A)^tP\mathbf u\| _1\).

Proof

The entries of \((P\hat A)^tP\) count walks confined to \(B\); summing against \(\mathbf u=\tfrac 1n\mathbf1\) gives the probability over a uniform start.

Lemma 55 3.8, Projection contraction

\(\| P\hat A P\mathbf v\| _2\le (\beta +\alpha )\| \mathbf v\| _2\) for all \(\mathbf v\), where \(\beta =|B|/n\).

Proof

WLOG \(\mathbf v=P\mathbf v\ge 0\), \(\sum v_i=1\), \(\mathbf v=\mathbf u+\mathbf z\), \(\mathbf z\perp \mathbf u\). Then \(\| P\hat A\mathbf v\| _2\le \| P\mathbf u\| _2+\| P\hat A\mathbf z\| _2\); Cauchy–Schwarz on the \(\le \beta n\)-support gives \(\| P\mathbf u\| _2\le \beta \| \mathbf v\| _2\), and \(\| P\hat A\mathbf z\| _2\le \alpha \| \mathbf z\| _2\le \alpha \| \mathbf v\| _2\).

Theorem 56 3.6, AKS confinement bound

For \(B\) of density \(\beta \) in an \((n,d,\alpha )\)-graph, \(\Pr [(B,t)]\le (\beta +\alpha )^t\).

Proof

By Lemma 54, \(\Pr [(B,t)]=\| (P\hat A)^tP\mathbf u\| _1\le \sqrt n\| (P\hat A P)^t\mathbf u\| _2\); iterate Lemma 55 and use \(\sqrt n\| \mathbf u\| _2=1\).

Theorem 57 3.9, AKS two-sided bound
#

If \(\beta {\gt}6\alpha \) then \(\beta (\beta +2\alpha )^t\ge \Pr [(B,t)]\ge \beta (\beta -2\alpha )^t\).

Theorem 58 3.10, Time-dependent confinement

For \(K\subseteq \{ 0,\dots ,t\} \) and \(B\) of density \(\beta \), \(\Pr [X_i\in B\ \forall i\in K]\le (\beta +\alpha )^{|K|-1}\).

Proof

Insert projections only at steps in \(K\), with free \(\hat A\) propagation between; \(|K|\) projection factors give the bound.

Theorem 59 3.11, Varying-set confinement

For sets \(B_0,\dots ,B_t\) of densities \(\beta _i\), \(\Pr [X_i\in B_i\ \forall i]\le \prod _{i=0}^{t-1}(\sqrt{\beta _i\beta _{i+1}}+\alpha )\).

Proof

As in Theorem 56, with per-step bound \(\| P_{i+1}\hat A P_i\mathbf v\| _2\le (\sqrt{\beta _i\beta _{i+1}}+\alpha )\| \mathbf v\| _2\).

Definition 60 \(\mathbf{BPP}\)
#

\(\mathcal L\in \mathbf{BPP}\) if a randomized polynomial-time \(A\) using \(k\) bits errs (on every \(x\)) with probability \(\le \beta \le \tfrac 1{10}\).

Theorem 61 Randomness-efficient one-sided error reduction

For \(\mathcal L\in \mathbf{RP}\) with bad set of density \(\beta \), the AND over a length-\(t\) walk on a \((2^k,d,\alpha )\)-graph errs with probability \(\le (\beta +\alpha )^t\), using \(k+O(t)\) random bits.

Proof

The reduced algorithm errs iff the walk stays in the bad set, i.e. event \((B,t)\), bounded by \((\beta +\alpha )^t\); the walk uses \(k\) bits for the start and \(O(1)\) per step.

Theorem 62 Randomness-efficient two-sided error reduction

For \(\mathcal L\in \mathbf{BPP}\) with \(\alpha +\beta \le \tfrac 18\), the majority vote over a length-\(t\) walk errs with probability \(O(2^{-t/2})\), using \(k+O(t)\) random bits.

Proof

Failure means some \(K\) with \(|K|\ge (t+1)/2\) has all \(X_i\in B\); by Theorem 58 each has probability \(\le (\beta +\alpha )^{(t-1)/2}\), and a union bound over \(\le 2^t\) sets \(K\) gives \(O(2^{-t/2})\).

Definition 63 Clique and clique number

A clique is a set of pairwise-adjacent vertices; \(\omega (G)\) is the maximum clique size.

Theorem 64 3.12, FGL clique hardness

There are constants \(1{\gt}\delta _1{\gt}\delta _2{\gt}0\) such that it is \(\mathbf{NP}\)-hard to decide whether \(\omega (G)\le \delta _2n\) or \(\omega (G)\ge \delta _1n\).

Construction 65 Tensor-power graph \(H\)

Given \(n\)-vertex \(G\) and \(t=\log n\), \(H\) has vertex set \(V^t\); two tuples are adjacent iff the union of their coordinates induces a clique in \(G\).

Claim 66 3.14.1, Clique structure of \(H\)

Every clique of \(H\) lies in some \(S^t\) with \(S\) a maximal clique of \(G\); in particular \(\omega (H)=\omega (G)^t\).

Proof

If \(S\) is a clique then \(S^t\) is a clique of \(H\), so \(\omega (H)\ge \omega (G)^t\); conversely the vertices appearing in a clique \(S'\) of \(H\) form a clique \(S\) with \(|S'|\le |S|^t\).

Lemma 67 3.14, Clique inapproximability, \(\mathbf{NP}\subseteq \mathbf{RP}\)

If a polynomial-time \(A\) satisfies \(n^{-\epsilon }\le A(G)/\omega (G)\le n^{\epsilon }\) on every graph, then \(\mathbf{NP}\subseteq \mathbf{RP}\).

Proof

Sample \(m=\mathrm{poly}(n)\) random vertices of \(H\), induce \(H'\), run \(A\), threshold at \(\tfrac 12\delta _1^tm\). If \(\omega (G)\ge \delta _1n\), a clique of \(H\) of size \((\delta _1n)^t\) meets \(H'\) in \(\ge \tfrac 12\delta _1^tm\) vertices w.h.p.; if \(\omega (G)\le \delta _2n\), each of \(\le 2^n\) maximal cliques contributes \(\le 2\delta _2^tm\) w.h.p. (Chernoff). With \(t=\log n\) the gap is polynomial, separating the cases, so \(A\) decides the problem of Theorem 64.

Construction 68 Derandomized sampling via walks

Replace random \(t\)-tuples by all length-\((t-1)\) walks of an \((n,d,\alpha )\)-graph \(\mathcal G\) on \(V(G)\); then \(m=nd^{t-1}\).

Claim 69 3.15, Derandomized upper bound

If \(\omega (G)\le \delta _2n\) then \(\omega (H')\le (\delta _2+2\alpha )^tm\).

Proof

Cliques of \(H'\) are walks confined to a clique of density \(\le \delta _2\); Theorem 57 caps the confinement probability.

Claim 70 3.16, Derandomized lower bound

If \(\omega (G)\ge \delta _1n\) then \(\omega (H')\ge (\delta _1-2\alpha )^tm\).

Proof

A clique of density \(\ge \delta _1\) retains a \((\delta _1-2\alpha )^t\) fraction of walks by Theorem 57.

Theorem 71 3.13, Clique inapproximability, \(\mathbf{NP}=\mathbf{P}\)

There is \(\epsilon {\gt}0\) such that a polynomial-time \(A\) with \(n^{-\epsilon }\le A(G)/\omega (G)\le n^{\epsilon }\) on every graph implies \(\mathbf{NP}=\mathbf{P}\).

Proof

Derandomize Lemma 67 via Construction 68; Claims 6970 separate the cases deterministically.