8 Explicit constructions of expander codes

Lemma 34 Lemma 15: distance from eigenvalue separation

If \(\mathcal{S}\) is a linear code of rate \(r\), block length \(d\), and minimum relative distance \(\epsilon \), and \(B\) is the edge-vertex incidence graph of a \(d\)-regular graph with second-largest eigenvalue \(\lambda \), then the code \(\mathcal{C}(B,\mathcal{S})\) has rate at least \(2r-1\) and minimum relative distance at least

\[ \left(\frac{\epsilon -\lambda /d}{1-\lambda /d}\right)^2. \]

Proof ▶

Let \(G\) be the \(d\)-regular graph from which \(B\) is derived and \(n\) its number of vertices, so \(\mathcal{C}(B,\mathcal{S})\) has \(dn/2\) variables and \(n\) constraints. The rate bound \(2r-1\) follows as in Theorem 23 (here \(c=2\)): \(cr-(c-1)=2r-1\).

For the distance, consider a nonzero codeword and let \(X\) be the set of \(\gamma n\) constraints (vertices of \(G\)) incident to the support variables. By Lemma 21, the support edges lie in a subgraph containing at most \(\tfrac {dn}{2}\bigl(\gamma ^2 + \tfrac {\lambda }{d}(\gamma -\gamma ^2)\bigr)\) edges, so the average number of support variables per constraint in \(X\) is

\[ \frac{2\cdot \frac{dn}{2}\bigl(\gamma ^2+\frac{\lambda }{d}(\gamma -\gamma ^2)\bigr)}{\gamma n} = d\! \left(\gamma + \Bigl(\tfrac {\lambda }{d}\Bigr)(1-\gamma )\right). \]

If \(d\bigl(\gamma +(\lambda /d)(1-\gamma )\bigr) {\lt} \epsilon d\), i.e.

\begin{equation} d\Bigl(\gamma + (\lambda /d)(1-\gamma )\Bigr) {\lt} \epsilon d, \label{eq:lem15-key} \end{equation}

then some constraint sees fewer than \(\epsilon d\) support variables, so its restriction is a nonzero word of weight below the minimum distance of \(\mathcal{S}\) – impossible. Inequality 1 holds for \(\gamma {\lt} (1-\lambda /d)/(\epsilon -\lambda /d)\cdot (\epsilon -\lambda /d)\); solving for the relative weight \(\gamma ^2+(\lambda /d)(\gamma -\gamma ^2)\) shows \(\mathcal{C}(B,\mathcal{S})\) has no nonzero codeword of relative weight \(\bigl((\epsilon -\lambda /d)/(1-\lambda /d)\bigr)^2\) or less.

Lemma 35 Lemma 17: progress of one parallel round

Let \(\mathcal{S}\) be a linear code of rate \(r\), block length \(d\), and minimum relative distance \(\epsilon \), and let \(B\) be the edge-vertex incidence graph of a \(d\)-regular graph with second-largest eigenvalue \(\lambda \). If a parallel decoding round for \(\mathcal{C}(B,\mathcal{S})\) is given as input a word of relative distance \(\alpha \) from a codeword, then it outputs a word of relative distance at most

\[ \alpha \left(\frac{2}{3} + \frac{16\alpha }{\epsilon ^2} + \frac{4\lambda }{\epsilon d}\right) \]

from that codeword.

Proof ▶

(Here a round flips a variable iff each constraint containing it, seeing its restriction within \(d\epsilon /4\) of a codeword of \(\mathcal{S}\), sends it a “flip” signal.) Let \(G\) be the \(d\)-regular graph defining \(B\), with \(n\) vertices, so \(\mathcal{C}(B,\mathcal{S})\) has \(dn/2\) variables and \(n\) constraints. Let \(X\) be the set of \(\alpha \, dn/2\) corrupt variables.

A corrupt output variable is one not in \(X\) that receives a flip, or one in \(X\) that does not. Call a constraint confused if it sends a flip to a variable not in \(X\), and unhelpful if it contains a variable of \(X\) but sends it no flip. For a constraint to be confused it must have at least \(3d\epsilon /4\) variables of \(X\) as neighbours; each variable of \(X\) neighbours two constraints, so there are at most \(\frac{2\alpha \, dn/2}{\frac34 d\epsilon }=\frac{4\alpha n}{3\epsilon }\) confused constraints, each sending at most \(d\epsilon /4\) flips, so at most \(\frac{4\alpha n}{3\epsilon }\cdot \frac{d\epsilon }{4}=\frac{dn}{2}\cdot \frac{2\alpha }{3}\) variables outside \(X\) receive flips.

For a constraint to be unhelpful it must contain more than \(d\epsilon /4\) variables of \(X\), so there are at most \(\frac{2\alpha \, dn/2}{d\epsilon /4}=\frac{4\alpha n}{\epsilon }\) unhelpful constraints. By Lemma 21, the number of variables both of whose neighbouring constraints are unhelpful (a superset of the variables of \(X\) that fail to receive a flip) is at most

\[ \frac{dn}{2}\left(\Bigl(\frac{4\alpha }{\epsilon }\Bigr)^2 + \frac{\lambda }{d}\Bigl(\frac{4\alpha }{\epsilon }\Bigr)\right). \]

Adding the two contributions, the number of corrupt output variables is at most \(\frac{dn}{2}\bigl(\frac{2\alpha }{3} + \frac{16\alpha ^2}{\epsilon ^2} + \frac{4\alpha \lambda }{\epsilon d}\bigr)\), which is the claimed relative distance \(\alpha \bigl(\frac23 + \frac{16\alpha }{\epsilon ^2} + \frac{4\lambda }{\epsilon d}\bigr)\) times \(\frac{dn}{2}\).

Proposition 36 Proposition 16: one round is a constant-depth circuit

For \(\mathcal{S}\) fixed, a parallel decoding round for \(\mathcal{C}(B,\mathcal{S})\) can be implemented as a linear-size Boolean circuit of constant depth.

Proof ▶

Since \(\mathcal{S}\) has fixed block length \(d\), deciding for each constraint whether its restriction lies within \(d\epsilon /4\) of a codeword of \(\mathcal{S}\), and which variables to signal, is a Boolean function of a constant number (\(d\)) of inputs, hence a constant-depth subcircuit. Deciding whether a variable flips is a function of the constant number (\(c=2\)) of signals it receives. There is one such gadget per constraint and per variable, so the whole round has linear size and constant depth.

Lemma 37 Gilbert–Varshamov bound

For every \(\epsilon \) with \(0{\lt}\epsilon {\lt}1/2\) and all sufficiently long block lengths, there exists a binary linear code of minimum relative distance \(\epsilon \) and rate \(1-H(\epsilon )\).

Theorem 38 Theorem 19: explicit asymptotically good expander codes

For all \(\epsilon \) such that \(1-2H(\epsilon ){\gt}0\), there exists a polynomial-time constructible family of expander codes of rate \(1-2H(\epsilon )\) and minimum relative distance arbitrarily close to \(\epsilon ^2\), in which an \(\alpha {\lt} \epsilon ^2/48\) fraction of error can be corrected by a circuit of size \(O(n\log n)\) and depth \(O(\log n)\). Moreover, the action of this circuit can be simulated in linear time on a Pointer Machine or RAM under the uniform cost model.

Proof ▶

By the Gilbert–Varshamov bound (Lemma 37) there exist, for all sufficiently long block lengths, linear codes \(\mathcal{S}\) of minimum relative distance \(\epsilon \) and rate \(1-H(\epsilon )\); fix one. By Theorem 18 we can build \(d\)-regular expander graphs with second-largest eigenvalue \(\lambda _d \approx 2\sqrt{d-1}\), so for \(d\) large \(\lambda _d/d \to 0\). Choose \(d\) large enough that \(\alpha {\lt} \epsilon ^2(1/3 - 4\lambda _d/(\epsilon d))/16\) and \(\epsilon {\gt} 12\lambda _d/d\), and let \(B\) be the edge-vertex incidence graph of the degree-\(d\) graph.

By Lemma 34, \(\mathcal{C}(B,\mathcal{S})\) has rate at least \(2r-1 = 1-2H(\epsilon )\) and minimum relative distance at least \(\bigl((\epsilon -\lambda _d/d)/(1-\lambda _d/d)\bigr)^2\), which is arbitrarily close to \(\epsilon ^2\) as \(d\to \infty \). By Lemma 35 a parallel decoding round transforms a word of relative distance \(\alpha \) into one of relative distance at most \(\alpha \beta \) with \(\beta = 2/3 + 16\alpha /\epsilon ^2 + \lambda _d/(4\epsilon \cdot \tfrac 14)\); more precisely \(\beta = (2/3 + 16\alpha /\epsilon ^2 + 4\lambda _d/(\epsilon d))\). Our choice of \(d\) and \(\alpha \) makes \(\beta {\lt}1\), so iterating \(\log _{1/\beta }(\alpha n)\) rounds corrects all \(\alpha n\) errors.

Each round is a constant-depth linear-size circuit (Proposition 36); stacking the \(O(\log n)\) rounds gives a circuit of size \(O(n\log n)\) and depth \(O(\log n)\). The graph \(B\) is polynomial-time (indeed logarithmic-space) constructible by Theorem 18, and the circuit’s action can be simulated in linear time on a Pointer Machine or RAM under the uniform cost model.