2 Preliminaries
2.1 Codes and basic parameters
For a finite alphabet \(\Sigma \) and \(n \in \mathbb {N}\), a code of length \(n\) is a subset \(\mathcal{C}\subseteq \Sigma ^n\). We also view \(\mathcal{C}\) as an encoding map \(\mathcal{C}: \Sigma ^k \to \Sigma ^n\) where \(k = \log _{|\Sigma |}|\mathcal{C}|\).
For a finite field \(\mathbb {F}\), a code \(\mathcal{C}\subseteq \mathbb {F}^n\) is linear if it is an \(\mathbb {F}\)-linear subspace of \(\mathbb {F}^n\). Its dimension is its dimension as a subspace.
For \(x,y \in \Sigma ^n\), the Hamming distance \(\Delta (x,y)\) is the number of coordinates \(\alpha \in [n]\) on which \(x_\alpha \neq y_\alpha \). (Standard; recalled so later nodes may depend on it.)
For \(x \in \mathbb {F}^n\), the Hamming weight \(\operatorname {weight}(x)\) is the number of nonzero coordinates of \(x\), i.e. \(\operatorname {weight}(x) = \Delta (x,0)\).
For a code \(\mathcal{C}\subseteq \Sigma ^n\), the rate is \(R = \frac{\log _{|\Sigma |}|\mathcal{C}|}{n} = \frac{k}{n}\), where \(k = \log _{|\Sigma |}|\mathcal{C}|\).
For a code \(\mathcal{C}\subseteq \Sigma ^n\) with \(|\mathcal{C}| \geq 2\), the (relative) distance is \(\delta = \frac{1}{n}\min _{c \neq c' \in \mathcal{C}} \Delta (c,c')\). For a linear code this equals \(\frac{1}{n}\min _{0 \neq c \in \mathcal{C}}\operatorname {weight}(c)\).
A code \(\mathcal{C}\subseteq \mathbb {F}_2^n\) is a random linear code of dimension \(k\) if it is chosen uniformly among all subspaces of \(\mathbb {F}_2^n\) of dimension \(k\). (More generally over \(\mathbb {F}_q\).)
2.2 The field trace and inner products
Let \(q = 2^{k_0}\) and \(\mathbb {F}_q\) be the field with \(q\) elements, viewed as a degree \(k_0\) extension of \(\mathbb {F}_2\). The field trace is \(\operatorname {Tr}(x) = x + x^2 + \cdots + x^{2^{k_0-1}}\). It is \(\mathbb {F}_2\)-linear with image \(\mathbb {F}_2\).
For \(q = 2^{k_0}\) there exists a self-dual basis \(\nu _1,\dots ,\nu _{k_0}\) of \(\mathbb {F}_q\) over \(\mathbb {F}_2\), i.e. \(\operatorname {Tr}(\nu _i \nu _j) = \mathbf{1}[i=j]\).
For \(x,y \in \mathbb {F}_2^N\) set \(\langle x, y\rangle = \sum _{\alpha =1}^N x_\alpha y_\alpha \in \mathbb {F}_2\). For \(\mathbb {F}_q\) with \(q {\gt} 2\) a power of \(2\) and \(x,y \in \mathbb {F}_q^n\) set \(\langle x, y\rangle = \operatorname {Tr}\! \left(\sum _\alpha x_\alpha y_\alpha \right)\). Fixing a self-dual basis (Lemma 9) to identify \(\mathbb {F}_q\) with \(\mathbb {F}_2^{k_0}\), these agree: \(\operatorname {Tr}(\alpha \cdot \beta ) = \langle \alpha , \beta \rangle \) for \(\alpha ,\beta \in \mathbb {F}_q\cong \mathbb {F}_2^{k_0}\).
Fix a self-dual basis \(\nu _1,\dots ,\nu _{k_0}\). If \(\alpha = \sum _i \alpha _i \nu _i\) and \(\beta = \sum _j \beta _j \nu _j\) with \(\alpha _i,\beta _j \in \mathbb {F}_2\), then \(\operatorname {Tr}(\alpha \cdot \beta ) = \langle (\alpha _1,\dots ,\alpha _{k_0}), (\beta _1,\dots ,\beta _{k_0})\rangle \).
By \(\mathbb {F}_2\)-bilinearity of \((\alpha ,\beta ) \mapsto \operatorname {Tr}(\alpha \beta )\),
using the self-dual property \(\operatorname {Tr}(\nu _i\nu _j) = \mathbf{1}[i=j]\).
For a linear code \(\mathcal{C}\subseteq \mathbb {F}^n\) the dual code is
Over \(\mathbb {F}_q\) this uses the trace pairing of Definition 10.
2.3 The Gilbert–Varshamov bound and the goal
The binary entropy function is \(h_2(x) = x\log _2(1/x) + (1-x)\log _2(1/(1-x))\) for \(x \in (0,1)\), with \(h_2(0)=h_2(1)=0\).
Let \(\delta \in [0,1/2]\) and \(\eta \in (0, 1 - h_2(\delta ))\). Then for any \(n {\gt} 1/\eta \) there exists a linear code \(\mathcal{C}\subseteq \mathbb {F}_2^n\) with rate \(R \geq 1 - h_2(\delta ) - \eta \) and relative distance at least \(\delta \).
Since \(1 - h_2(1/2 - \varepsilon ) = \Theta (\varepsilon ^2)\) as \(\varepsilon \to 0\), we say a family of low-rate codes \(\mathcal{C}_{N,\varepsilon } \subseteq \mathbb {F}_2^N\) approaches the GV bound if \(\mathcal{C}_{N,\varepsilon }\) has rate \(\Omega (\varepsilon ^2)\) and relative distance \(1/2 - O(\varepsilon )\), the asymptotics being as \(\varepsilon \to 0\) and \(N \to \infty \).
2.4 Concatenated codes and default parameters
Throughout, \(\mathcal{C}_{\mathrm{out}}\subseteq \mathbb {F}_q^n\) and \(\mathcal{C}_{\mathrm{in}}\subseteq \mathbb {F}_2^{n_0}\) are linear codes of rate \(\varepsilon \), with \(q = 2^{k_0}\), \(k = \varepsilon n\), \(k_0 = \varepsilon n_0 = \log _2 q\), \(N = n_0 n\) and \(K = k_0 k\). Identifying \(\mathbb {F}_q\cong \mathbb {F}_2^{k_0}\), the concatenated code \(\mathcal{C}= \mathcal{C}_{\mathrm{out}}\circ \mathcal{C}_{\mathrm{in}}\subseteq \mathbb {F}_2^N\) is the encoding map \(\mathcal{C}: \mathbb {F}_2^K \to \mathbb {F}_2^N\) given, for a message \(m\) interpreted in \(\mathbb {F}_q^k\) with \(c = \mathcal{C}_{\mathrm{out}}(m)\), by
Its rate is the product of the rates and its distance is at least the product of the distances of \(\mathcal{C}_{\mathrm{out}}\) and \(\mathcal{C}_{\mathrm{in}}\).
2.5 Weight distribution of the inner code
Fix \(0 {\lt} \tau {\lt} \varepsilon \). The inner code \(\mathcal{C}_{\mathrm{in}}\subseteq \mathbb {F}_2^{n_0}\) (of rate \(\varepsilon \)) is \(\tau \)-nice if for every \(i \in \{ 1,\dots ,n_0\} \), the number of codewords \(c \in \mathcal{C}_{\mathrm{in}}^\perp \) with \(\operatorname {weight}(c) = i\) is at most
(The \(\varepsilon \) is dropped from the name because it is the common rate of \(\mathcal{C}_{\mathrm{in}}\) and \(\mathcal{C}_{\mathrm{out}}\) throughout.)
Let \(\mathcal{C}_{\mathrm{in}}\subseteq \mathbb {F}_2^{n_0}\) be a random linear code of dimension \(k_0 = \varepsilon n_0\). Then with probability at least \(1 - n_0 \cdot 2^{-\tau n_0}\), the code \(\mathcal{C}_{\mathrm{in}}\) is \(\tau \)-nice.
For \(i \in \{ 1,\dots ,n_0\} \) let \(E_i\) be the event that \(\mathcal{C}_{\mathrm{in}}^\perp \) has at most \(\binom {n_0}{i}2^{-n_0(\varepsilon - \tau )}\) codewords of weight \(i\). Since \(\mathcal{C}_{\mathrm{in}}^\perp \) is a random subspace of dimension \(n_0 - k_0\), a fixed nonzero vector lies in \(\mathcal{C}_{\mathrm{in}}^\perp \) with probability at most \(2^{-k_0} = 2^{-n_0\varepsilon }\); hence the expected number of weight-\(i\) codewords of \(\mathcal{C}_{\mathrm{in}}^\perp \) is at most \(\binom {n_0}{i}2^{-n_0\varepsilon }\). By Markov’s inequality (Lemma 23), the number exceeds \(2^{\tau n_0}\binom {n_0}{i}2^{-n_0\varepsilon } = \binom {n_0}{i}2^{-n_0(\varepsilon -\tau )}\) with probability at most \(2^{-\tau n_0}\), so \(\Pr [E_i] \geq 1 - 2^{-\tau n_0}\). By a union bound (Lemma 25) over the \(n_0\) values of \(i\), \(\Pr [\mathcal{C}_{\mathrm{in}}\text{ is } \tau \text{-nice}] \geq 1 - n_0 2^{-\tau n_0}\).
Let \(\mathcal{C}\subseteq \mathbb {F}_2^n\) be a random linear code of dimension \(Rn\) and let \(x,y \in \mathbb {F}_2^n \setminus \{ 0\} \) with \(x \neq y\). Then
A random linear code of dimension \(Rn\) contains a fixed \(d\)-dimensional subspace of \(\mathbb {F}_2^n\) with probability \(\prod _{i=0}^{d-1} \frac{2^{Rn}-2^i}{2^n - 2^i}\). Since \(x \neq y\) are nonzero and distinct they span a \(2\)-dimensional subspace, so
where \(\Pr [x \in \mathcal{C}] = \frac{2^{Rn}-1}{2^n-1}\) for each nonzero \(x\).
2.6 Fourier analysis over \(\mathbb {F}_2^{k_0 n}\)
Identifying \(\mathbb {F}_q\cong \mathbb {F}_2^{k_0}\) via a self-dual basis, for \(\varphi : \mathbb {F}_q^n \to \mathbb {R}\) and \(\omega \in \mathbb {F}_q^n\), the Fourier transform is
For any \(\varphi : \mathbb {F}_q^n \to \mathbb {R}\) and any \(x\), \(\varphi (x) = \sum _{\omega \in \mathbb {F}_q^n}\hat{\varphi }(\omega )(-1)^{\langle x, \omega \rangle }\).
Using the orthogonality relation \(\sum _{\omega }(-1)^{\langle \omega , z\rangle } = q^n \cdot \mathbf{1}[z = 0]\) (a sum of a nontrivial character over the group vanishes), we compute
For an \(\mathbb {F}_q\)-subspace \(V \subseteq \mathbb {F}_q^n\) and any \(g \in \mathbb {F}_q^n\),
By definition \(\widehat{\mathbf{1}_V}(g) = \frac{1}{q^n}\sum _{x \in V} (-1)^{\langle g, x\rangle }\). The map \(x \mapsto (-1)^{\langle g, x\rangle }\) is a character of the group \(V\); it is trivial precisely when \(g \in V^\perp \). Hence the inner sum equals \(|V|\) if \(g \in V^\perp \) and \(0\) otherwise, giving \(\frac{|V|}{q^n}\mathbf{1}_{V^\perp }(g)\).
2.7 Standard probabilistic and combinatorial facts
For a nonnegative random variable \(Z\) and \(t {\gt} 0\), \(\Pr [Z \geq t] \leq \mathbb {E}[Z]/t\).
For a random variable \(Z\) with finite variance and \(t {\gt} 0\), \(\Pr [|Z - \mathbb {E}[Z]| \geq t] \leq \operatorname {Var}[Z]/t^2\).
For events \(A_1,\dots ,A_t\), \(\Pr [\bigcup _i A_i] \leq \sum _i \Pr [A_i]\).
The Poisson distribution \(\operatorname {Poi}(\lambda )\) has \(\Pr [\operatorname {Poi}(\lambda ) = r] = \frac{\lambda ^r e^{-\lambda }}{r!}\) for \(r \in \mathbb {N}\).
The Bernoulli distribution \(\operatorname {Ber}(p)\) takes value \(1\) with probability \(p\) and \(0\) with probability \(1-p\).
For probability distributions \(\mathcal{P},\mathcal{Q}\) on a finite set, \(\Delta _{\mathrm{TV}}(\mathcal{P},\mathcal{Q}) = \frac12\sum _\sigma |\mathcal{P}(\sigma ) - \mathcal{Q}(\sigma )|\).
If \(Y \sim \operatorname {Poi}(\lambda )\) then \(\Pr [Y \text{ odd}] = \tfrac 12(1 - e^{-2\lambda })\).
Summing the Poisson pmf over odd values,
using \(\sum _{j\geq 0}\frac{\lambda ^{2j+1}}{(2j+1)!} = \sinh \lambda = \frac{e^\lambda - e^{-\lambda }}{2}\).
If \(r \sim \operatorname {Poi}(\Lambda )\) balls are thrown independently and uniformly into \(M\) bins, then the occupancy counts \(Y_1,\dots ,Y_M\) are independent, each distributed as \(\operatorname {Poi}(\Lambda /M)\).
If \(r \sim \operatorname {Poi}(\lambda )\) then \(\Pr [r \geq \lambda + t] \leq \exp \! \big(\frac{-t^2}{\lambda + t}\big)\). In particular \(\Pr [r \geq \lambda + 100\sqrt{\lambda }] \leq \exp (-100^2/2)\).
For \(\alpha \in [0,1/2]\) and \(n \in \mathbb {N}\), \(\sum _{i=0}^{\lfloor \alpha n\rfloor }\binom {n}{i} \leq 2^{nh_2(\alpha )}\).
For \(\alpha \in (0,1/2]\) we have \(\frac{\alpha }{1-\alpha }\leq 1\), so for \(i \leq \alpha n\) the factor \((\alpha /(1-\alpha ))^{i - \alpha n}\geq 1\). Hence
using the binomial theorem \(\sum _i \binom {n}{i}t^i = (1+t)^i\) with \(t = \alpha /(1-\alpha )\), so \((1+t)^n = (1-\alpha )^{-n}\). The cases \(\alpha = 0\) is trivial.
For \(\alpha \in [0,1]\) with \(\alpha n \in \mathbb {N}\), \(\binom {n}{\alpha n} \geq \frac{1}{n+1}2^{nh_2(\alpha )} \geq 2^{(h_2(\alpha ) - \frac{\log _2(2n)}{2n})n}\).
Among the \(n+1\) terms \(\binom {n}{i}\) the term \(\binom {n}{\alpha n}\) is the largest at \(i = \alpha n \leq n/2\) (or by symmetry), and \(\binom {n}{\alpha n} \alpha ^{\alpha n}(1-\alpha )^{(1-\alpha )n}\) is the largest term of \(1 = (\alpha + (1-\alpha ))^n = \sum _i \binom {n}{i}\alpha ^i(1-\alpha )^{n-i}\), whence \(\binom {n}{\alpha n}\alpha ^{\alpha n}(1-\alpha )^{(1-\alpha )n}\geq \frac{1}{n+1}\), i.e. \(\binom {n}{\alpha n}\geq \frac{1}{n+1}2^{nh_2(\alpha )}\). Finally \(\frac{1}{n+1}\geq 2^{-\log _2(2n)}\) for \(n \geq 1\) gives the second inequality.
For all \(x \in [0,1]\), \(h_2(x) \leq -x\log _2 x + 2x\).
We have \(h_2(x) = -x\log _2 x - (1-x)\log _2(1-x)\). It suffices to show \(-(1-x)\log _2(1-x) \leq 2x\) for \(x \in [0,1]\). Writing \(u = 1-x \in [0,1]\) this is \(-u\log _2 u \leq 2(1-u)\). Since \(\ln u \geq 1 - 1/u\) gives \(-\ln u \leq \frac{1-u}{u}\), we get \(-u\ln u \leq 1-u\), so \(-u\log _2 u = \frac{-u\ln u}{\ln 2}\leq \frac{1-u}{\ln 2}\leq 2(1-u)\), as \(1/\ln 2 {\lt} 2\).
For \(x \in [0,1]\), \(h_2^{-1}(1 - x^2) \leq \frac12 - \frac{\sqrt{\ln 2}}{2}x\).
For \(x \in [0,\tfrac 14]\), \(h_2^{-1}(1 - x) \geq \frac12 - 5x^2\).
The number of ways to choose a multiset of \(s\) even positive integers (an ordered assignment of even values to \(\leq s/2\) chosen positions summing to a prescribed even total \(S\) with \(s = S\)) is at most \(2^{3S/2}\); concretely, the number of nonnegative integer compositions counted in the proof of Claim 4.3 is \(\binom {S/2 + S - 1}{S} \leq 2^{3S/2}\).
By stars and bars, the number of ways to distribute is \(\binom {S/2 + S - 1}{S} \leq 2^{S/2 + S - 1}\leq 2^{3S/2}\), using \(\binom {a+b}{b}\leq 2^{a+b}\).