12 Error correcting codes

Definition 220 Error correcting code

A code is \(C\subseteq \{ 0,1\} ^n\); its distance is \(\mathrm{dist}(C)=\min _{x\neq y\in C}d_H(x,y)\) and its rate is \(\log |C|/n\).

Claim 221 Nearest-codeword decoding

Decoding to the nearest codeword corrects up to \(\lfloor (\mathrm{dist}(C)-1)/2\rfloor \) bit flips.

Proof ▶

If \(\le \lfloor (\mathrm{dist}(C)-1)/2\rfloor \) bits flip, the triangle inequality makes the sent codeword strictly closer to the received word than any other.

Definition 222 12.1, Asymptotically good and efficient codes

A family \(C_n\) is asymptotically good if \(\mathrm{dist}(C){\gt}\delta n\) and rate \({\gt}r\) for fixed \(\delta ,r{\gt}0\), and efficient if encoding/decoding (up to \(\delta n/2\) errors) is polynomial-time.

Definition 223 Linear code

A linear code is a subspace \(C\subseteq \mathbb {F}_2^n\).

Lemma 224 Distance equals minimum weight

The distance of a linear code equals the minimum weight of a nonzero codeword.

Proof ▶

\(d_H(x,y)=\operatorname {wt}(x+y)\) over \(\mathbb {F}_2\), and \(x+y\) ranges over nonzero codewords.

Definition 225 Hamming ball volume

\(v(n,r)=\sum _{i=0}^r\binom ni\).

Definition 226 Binary entropy function

✓

\(H(\delta )=-\delta \log \delta -(1-\delta )\log (1-\delta )\); \(\tfrac 1n\log \binom n{\delta n}\to H(\delta )\).

Theorem 227 12.2, Gilbert–Varshamov

There is a (linear) length-\(n\) code of distance \(\ge d\) and size \(\ge 2^n/v(n,d)\).

Proof ▶

Greedily add codewords, each removing \(\le v(n,d)\) points within distance \(d\), for \(\ge 2^n/v(n,d)\) steps. Linearly: build a parity-check matrix column by column so that no \({\lt}d\) columns are dependent; this is possible while \(\sum _{r{\lt}d}\binom {j-1}r{\lt}2^m\), giving a code of dimension \(\ge n-m\).

Corollary 228 12.3, Asymptotic Gilbert–Varshamov

For \(\delta \le 1/2\) there are codes of normalized distance \(\delta \) and rate \(\ge 1-H(\delta )\).

Proof ▶

The rate is \(\ge 1-\tfrac 1n\log \binom n{\delta n}\to 1-H(\delta )\).

Theorem 229 12.4, Sphere-packing bound

Any length-\(n\), distance-\(d\) code has \(|C|\le 2^n/v(n,d/2)\).

Proof ▶

The radius-\(d/2\) balls around codewords are disjoint, so their number is \(\le 2^n/v(n,d/2)\).

Corollary 230 12.5, Sphere-packing rate

A code of relative distance \(\delta \) has rate \(\le 1-H(\delta /2)\).

Proof ▶

Take logarithms in Theorem 229.

Theorem 231 12.6, MRRW bound

A code of relative distance \(\delta \) has rate \(\le H(1/2-\sqrt{\delta (1-\delta )})\).

Construction 232 Code from a bipartite graph

For a bipartite \(G\) (\(n\) left variables, \(m\) right constraints), \(C(G)=\{ x: Ax=0\} \) where \(A\) is the bipartite adjacency matrix (each constraint = mod-2 sum of incident variables).

Definition 233 12.7, Left vertex expansion ratio

For a \(k\)-left-regular bipartite \(G\), \(L(G,d)=\min _{0{\lt}|S|\le d}|\Gamma (S)|/|S|\).

Theorem 234 12.8, Sipser–Spielman distance

If \(L(G,d){\gt}k/2\) then \(\mathrm{dist}(C(G))\ge d\).

Proof ▶

Expansion \({\gt}k/2\) gives every \(S\) with \(|S|\le d\) a unique neighbour (average \(S\)-degree of \(\Gamma (S)\) is \({\lt}2\)). So if a nonzero codeword had support \(|S|{\lt}d\), some constraint would be violated; hence min weight \(\ge d\).

Construction 235 Belief-propagation decoding

Given \(y\notin C\), flip any bit \(i\) for which \(|A(y+e_i)|{\lt}|Ay|\) (more violated than satisfied constraints), repeating until a codeword is reached.

Theorem 236 12.9, Sipser–Spielman decoding

If \(L(G,d){\gt}\tfrac 34k\) and \(d_H(y,x)\le d/2\) for a codeword \(x\), then belief-propagation returns \(x\) in a linear number of steps.

Proof ▶

Let \(A\) be the error set, \(U\)/\(S\) its unsatisfied/satisfied neighbours. Then \(|S|+|U|=|\Gamma (A)|{\gt}\tfrac 34k|A|\) and \(|U|+2|S|\le k|A|\), giving \(|U|{\gt}\tfrac 12k|A|\); so some variable has a majority of violated constraints and is flipped, decreasing \(|U|\). Since \(|A|\) changes by \(1\) and starts \(\le d/2\), it never reaches \(d\), so the algorithm converges to \(x\).

Claim 237 Good codes from lossless expanders

The lossless expanders of the conductor chapter give explicit constant-rate codes with \(L(G,d){\gt}0.99k\) for \(d=\Omega (n)\), hence efficient asymptotically good codes.