When Do Low-Rate Concatenated Codes Approach the Gilbert–Varshamov Bound?

2 Preliminaries

2.1 Codes and basic parameters

Definition 1 Code
#

For a finite alphabet \(\Sigma \) and \(n \in \mathbb {N}\), a code of length \(n\) is a subset \(\mathcal{C}\subseteq \Sigma ^n\). We also view \(\mathcal{C}\) as an encoding map \(\mathcal{C}: \Sigma ^k \to \Sigma ^n\) where \(k = \log _{|\Sigma |}|\mathcal{C}|\).

Definition 2 Linear code

For a finite field \(\mathbb {F}\), a code \(\mathcal{C}\subseteq \mathbb {F}^n\) is linear if it is an \(\mathbb {F}\)-linear subspace of \(\mathbb {F}^n\). Its dimension is its dimension as a subspace.

Definition 3 Hamming distance
#

For \(x,y \in \Sigma ^n\), the Hamming distance \(\Delta (x,y)\) is the number of coordinates \(\alpha \in [n]\) on which \(x_\alpha \neq y_\alpha \). (Standard; recalled so later nodes may depend on it.)

Definition 4 Hamming weight
#

For \(x \in \mathbb {F}^n\), the Hamming weight \(\operatorname {weight}(x)\) is the number of nonzero coordinates of \(x\), i.e. \(\operatorname {weight}(x) = \Delta (x,0)\).

Definition 5 Rate

For a code \(\mathcal{C}\subseteq \Sigma ^n\), the rate is \(R = \frac{\log _{|\Sigma |}|\mathcal{C}|}{n} = \frac{k}{n}\), where \(k = \log _{|\Sigma |}|\mathcal{C}|\).

Definition 6 Relative distance

For a code \(\mathcal{C}\subseteq \Sigma ^n\) with \(|\mathcal{C}| \geq 2\), the (relative) distance is \(\delta = \frac{1}{n}\min _{c \neq c' \in \mathcal{C}} \Delta (c,c')\). For a linear code this equals \(\frac{1}{n}\min _{0 \neq c \in \mathcal{C}}\operatorname {weight}(c)\).

Definition 7 Random linear code

A code \(\mathcal{C}\subseteq \mathbb {F}_2^n\) is a random linear code of dimension \(k\) if it is chosen uniformly among all subspaces of \(\mathbb {F}_2^n\) of dimension \(k\). (More generally over \(\mathbb {F}_q\).)

2.2 The field trace and inner products

Definition 8 Field trace
#

Let \(q = 2^{k_0}\) and \(\mathbb {F}_q\) be the field with \(q\) elements, viewed as a degree \(k_0\) extension of \(\mathbb {F}_2\). The field trace is \(\operatorname {Tr}(x) = x + x^2 + \cdots + x^{2^{k_0-1}}\). It is \(\mathbb {F}_2\)-linear with image \(\mathbb {F}_2\).

Lemma 9 Self-dual trace basis

For \(q = 2^{k_0}\) there exists a self-dual basis \(\nu _1,\dots ,\nu _{k_0}\) of \(\mathbb {F}_q\) over \(\mathbb {F}_2\), i.e. \(\operatorname {Tr}(\nu _i \nu _j) = \mathbf{1}[i=j]\).

Definition 10 Inner products

For \(x,y \in \mathbb {F}_2^N\) set \(\langle x, y\rangle = \sum _{\alpha =1}^N x_\alpha y_\alpha \in \mathbb {F}_2\). For \(\mathbb {F}_q\) with \(q {\gt} 2\) a power of \(2\) and \(x,y \in \mathbb {F}_q^n\) set \(\langle x, y\rangle = \operatorname {Tr}\! \left(\sum _\alpha x_\alpha y_\alpha \right)\). Fixing a self-dual basis (Lemma 9) to identify \(\mathbb {F}_q\) with \(\mathbb {F}_2^{k_0}\), these agree: \(\operatorname {Tr}(\alpha \cdot \beta ) = \langle \alpha , \beta \rangle \) for \(\alpha ,\beta \in \mathbb {F}_q\cong \mathbb {F}_2^{k_0}\).

Lemma 11 Trace as a dot product

Fix a self-dual basis \(\nu _1,\dots ,\nu _{k_0}\). If \(\alpha = \sum _i \alpha _i \nu _i\) and \(\beta = \sum _j \beta _j \nu _j\) with \(\alpha _i,\beta _j \in \mathbb {F}_2\), then \(\operatorname {Tr}(\alpha \cdot \beta ) = \langle (\alpha _1,\dots ,\alpha _{k_0}), (\beta _1,\dots ,\beta _{k_0})\rangle \).

Proof

By \(\mathbb {F}_2\)-bilinearity of \((\alpha ,\beta ) \mapsto \operatorname {Tr}(\alpha \beta )\),

\[ \operatorname {Tr}(\alpha \cdot \beta ) = \sum _{i,j} \alpha _i \beta _j \operatorname {Tr}(\nu _i\nu _j) = \sum _{i,j}\alpha _i\beta _j\, \mathbf{1}[i=j] = \sum _i \alpha _i\beta _i = \langle \alpha , \beta \rangle , \]

using the self-dual property \(\operatorname {Tr}(\nu _i\nu _j) = \mathbf{1}[i=j]\).

Definition 12 Dual code

For a linear code \(\mathcal{C}\subseteq \mathbb {F}^n\) the dual code is

\[ \mathcal{C}^\perp = \Big\{ g \in \mathbb {F}^n : \sum _{\alpha \in [n]} g_\alpha c_\alpha = 0 \ \ \forall c \in \mathcal{C}\Big\} . \]

Over \(\mathbb {F}_q\) this uses the trace pairing of Definition 10.

2.3 The Gilbert–Varshamov bound and the goal

Definition 13 Binary entropy function
#

The binary entropy function is \(h_2(x) = x\log _2(1/x) + (1-x)\log _2(1/(1-x))\) for \(x \in (0,1)\), with \(h_2(0)=h_2(1)=0\).

Theorem 14 GV bound, Theorem 2.1

Let \(\delta \in [0,1/2]\) and \(\eta \in (0, 1 - h_2(\delta ))\). Then for any \(n {\gt} 1/\eta \) there exists a linear code \(\mathcal{C}\subseteq \mathbb {F}_2^n\) with rate \(R \geq 1 - h_2(\delta ) - \eta \) and relative distance at least \(\delta \).

Definition 15 Approaching the GV bound at low rate, Goal 1

Since \(1 - h_2(1/2 - \varepsilon ) = \Theta (\varepsilon ^2)\) as \(\varepsilon \to 0\), we say a family of low-rate codes \(\mathcal{C}_{N,\varepsilon } \subseteq \mathbb {F}_2^N\) approaches the GV bound if \(\mathcal{C}_{N,\varepsilon }\) has rate \(\Omega (\varepsilon ^2)\) and relative distance \(1/2 - O(\varepsilon )\), the asymptotics being as \(\varepsilon \to 0\) and \(N \to \infty \).

2.4 Concatenated codes and default parameters

Definition 16 Concatenated code

Throughout, \(\mathcal{C}_{\mathrm{out}}\subseteq \mathbb {F}_q^n\) and \(\mathcal{C}_{\mathrm{in}}\subseteq \mathbb {F}_2^{n_0}\) are linear codes of rate \(\varepsilon \), with \(q = 2^{k_0}\), \(k = \varepsilon n\), \(k_0 = \varepsilon n_0 = \log _2 q\), \(N = n_0 n\) and \(K = k_0 k\). Identifying \(\mathbb {F}_q\cong \mathbb {F}_2^{k_0}\), the concatenated code \(\mathcal{C}= \mathcal{C}_{\mathrm{out}}\circ \mathcal{C}_{\mathrm{in}}\subseteq \mathbb {F}_2^N\) is the encoding map \(\mathcal{C}: \mathbb {F}_2^K \to \mathbb {F}_2^N\) given, for a message \(m\) interpreted in \(\mathbb {F}_q^k\) with \(c = \mathcal{C}_{\mathrm{out}}(m)\), by

\[ \mathcal{C}(m) = \big(\mathcal{C}_{\mathrm{in}}(c_1), \mathcal{C}_{\mathrm{in}}(c_2), \dots , \mathcal{C}_{\mathrm{in}}(c_n)\big) \in \mathbb {F}_2^N. \]

Its rate is the product of the rates and its distance is at least the product of the distances of \(\mathcal{C}_{\mathrm{out}}\) and \(\mathcal{C}_{\mathrm{in}}\).

2.5 Weight distribution of the inner code

Definition 17 \(\tau \)-niceness of the inner code, Definition 2.2

Fix \(0 {\lt} \tau {\lt} \varepsilon \). The inner code \(\mathcal{C}_{\mathrm{in}}\subseteq \mathbb {F}_2^{n_0}\) (of rate \(\varepsilon \)) is \(\tau \)-nice if for every \(i \in \{ 1,\dots ,n_0\} \), the number of codewords \(c \in \mathcal{C}_{\mathrm{in}}^\perp \) with \(\operatorname {weight}(c) = i\) is at most

\[ \binom {n_0}{i} \cdot 2^{-n_0(\varepsilon - \tau )} . \]

(The \(\varepsilon \) is dropped from the name because it is the common rate of \(\mathcal{C}_{\mathrm{in}}\) and \(\mathcal{C}_{\mathrm{out}}\) throughout.)

Lemma 18 Random inner codes are nice, Lemma 2.3

Let \(\mathcal{C}_{\mathrm{in}}\subseteq \mathbb {F}_2^{n_0}\) be a random linear code of dimension \(k_0 = \varepsilon n_0\). Then with probability at least \(1 - n_0 \cdot 2^{-\tau n_0}\), the code \(\mathcal{C}_{\mathrm{in}}\) is \(\tau \)-nice.

Proof

For \(i \in \{ 1,\dots ,n_0\} \) let \(E_i\) be the event that \(\mathcal{C}_{\mathrm{in}}^\perp \) has at most \(\binom {n_0}{i}2^{-n_0(\varepsilon - \tau )}\) codewords of weight \(i\). Since \(\mathcal{C}_{\mathrm{in}}^\perp \) is a random subspace of dimension \(n_0 - k_0\), a fixed nonzero vector lies in \(\mathcal{C}_{\mathrm{in}}^\perp \) with probability at most \(2^{-k_0} = 2^{-n_0\varepsilon }\); hence the expected number of weight-\(i\) codewords of \(\mathcal{C}_{\mathrm{in}}^\perp \) is at most \(\binom {n_0}{i}2^{-n_0\varepsilon }\). By Markov’s inequality (Lemma 23), the number exceeds \(2^{\tau n_0}\binom {n_0}{i}2^{-n_0\varepsilon } = \binom {n_0}{i}2^{-n_0(\varepsilon -\tau )}\) with probability at most \(2^{-\tau n_0}\), so \(\Pr [E_i] \geq 1 - 2^{-\tau n_0}\). By a union bound (Lemma 25) over the \(n_0\) values of \(i\), \(\Pr [\mathcal{C}_{\mathrm{in}}\text{ is } \tau \text{-nice}] \geq 1 - n_0 2^{-\tau n_0}\).

Lemma 19 Negative correlation in a random linear code, Lemma 2.4

Let \(\mathcal{C}\subseteq \mathbb {F}_2^n\) be a random linear code of dimension \(Rn\) and let \(x,y \in \mathbb {F}_2^n \setminus \{ 0\} \) with \(x \neq y\). Then

\[ \Pr _{\mathcal{C}}[x,y \in \mathcal{C}] \leq \Pr _\mathcal{C}[x \in \mathcal{C}]\cdot \Pr _\mathcal{C}[y \in \mathcal{C}]. \]
Proof

A random linear code of dimension \(Rn\) contains a fixed \(d\)-dimensional subspace of \(\mathbb {F}_2^n\) with probability \(\prod _{i=0}^{d-1} \frac{2^{Rn}-2^i}{2^n - 2^i}\). Since \(x \neq y\) are nonzero and distinct they span a \(2\)-dimensional subspace, so

\[ \Pr [x,y \in \mathcal{C}] = \frac{(2^{Rn}-1)(2^{Rn}-2)}{(2^n-1)(2^n-2)} \leq \left(\frac{2^{Rn}-1}{2^n-1}\right)^2 = \Pr [x \in \mathcal{C}]\cdot \Pr [y \in \mathcal{C}], \]

where \(\Pr [x \in \mathcal{C}] = \frac{2^{Rn}-1}{2^n-1}\) for each nonzero \(x\).

2.6 Fourier analysis over \(\mathbb {F}_2^{k_0 n}\)

Definition 20 Fourier transform

Identifying \(\mathbb {F}_q\cong \mathbb {F}_2^{k_0}\) via a self-dual basis, for \(\varphi : \mathbb {F}_q^n \to \mathbb {R}\) and \(\omega \in \mathbb {F}_q^n\), the Fourier transform is

\[ \hat{\varphi }(\omega ) = \frac{1}{q^n}\sum _{x \in \mathbb {F}_q^n}\varphi (x) (-1)^{\operatorname {Tr}(\langle \omega , x\rangle )} = \frac{1}{2^{k_0 n}}\sum _{x \in \mathbb {F}_2^{k_0 n}}\varphi (x) (-1)^{\langle \omega , x\rangle } . \]
Lemma 21 Fourier inversion, Equation (2)

For any \(\varphi : \mathbb {F}_q^n \to \mathbb {R}\) and any \(x\), \(\varphi (x) = \sum _{\omega \in \mathbb {F}_q^n}\hat{\varphi }(\omega )(-1)^{\langle x, \omega \rangle }\).

Proof

Using the orthogonality relation \(\sum _{\omega }(-1)^{\langle \omega , z\rangle } = q^n \cdot \mathbf{1}[z = 0]\) (a sum of a nontrivial character over the group vanishes), we compute

\[ \sum _{\omega }\hat{\varphi }(\omega )(-1)^{\langle x, \omega \rangle } = \frac{1}{q^n}\sum _y \varphi (y)\sum _\omega (-1)^{\langle \omega , x+y\rangle } = \frac{1}{q^n}\sum _y \varphi (y)\, q^n\, \mathbf{1}[x = y] = \varphi (x). \]
Lemma 22 Fourier transform of a subspace indicator, Equation (3)

For an \(\mathbb {F}_q\)-subspace \(V \subseteq \mathbb {F}_q^n\) and any \(g \in \mathbb {F}_q^n\),

\[ \widehat{\mathbf{1}_V}(g) = \frac{|V|}{q^n}\, \mathbf{1}_{V^\perp }(g). \]
Proof

By definition \(\widehat{\mathbf{1}_V}(g) = \frac{1}{q^n}\sum _{x \in V} (-1)^{\langle g, x\rangle }\). The map \(x \mapsto (-1)^{\langle g, x\rangle }\) is a character of the group \(V\); it is trivial precisely when \(g \in V^\perp \). Hence the inner sum equals \(|V|\) if \(g \in V^\perp \) and \(0\) otherwise, giving \(\frac{|V|}{q^n}\mathbf{1}_{V^\perp }(g)\).

2.7 Standard probabilistic and combinatorial facts

Lemma 23 Markov’s inequality
#

For a nonnegative random variable \(Z\) and \(t {\gt} 0\), \(\Pr [Z \geq t] \leq \mathbb {E}[Z]/t\).

Lemma 24 Chebyshev’s inequality
#

For a random variable \(Z\) with finite variance and \(t {\gt} 0\), \(\Pr [|Z - \mathbb {E}[Z]| \geq t] \leq \operatorname {Var}[Z]/t^2\).

Lemma 25 Union bound
#

For events \(A_1,\dots ,A_t\), \(\Pr [\bigcup _i A_i] \leq \sum _i \Pr [A_i]\).

Lemma 26 Poisson distribution
#

The Poisson distribution \(\operatorname {Poi}(\lambda )\) has \(\Pr [\operatorname {Poi}(\lambda ) = r] = \frac{\lambda ^r e^{-\lambda }}{r!}\) for \(r \in \mathbb {N}\).

Lemma 27 Bernoulli distribution
#

The Bernoulli distribution \(\operatorname {Ber}(p)\) takes value \(1\) with probability \(p\) and \(0\) with probability \(1-p\).

Lemma 28 Total variation distance
#

For probability distributions \(\mathcal{P},\mathcal{Q}\) on a finite set, \(\Delta _{\mathrm{TV}}(\mathcal{P},\mathcal{Q}) = \frac12\sum _\sigma |\mathcal{P}(\sigma ) - \mathcal{Q}(\sigma )|\).

Lemma 29 Parity of a Poisson variable

If \(Y \sim \operatorname {Poi}(\lambda )\) then \(\Pr [Y \text{ odd}] = \tfrac 12(1 - e^{-2\lambda })\).

Proof

Summing the Poisson pmf over odd values,

\[ \Pr [Y \text{ odd}] = \sum _{j=0}^\infty \frac{\lambda ^{2j+1}e^{-\lambda }}{(2j+1)!} = e^{-\lambda }\cdot \frac{e^{\lambda } - e^{-\lambda }}{2} = \tfrac 12\big(1 - e^{-2\lambda }\big), \]

using \(\sum _{j\geq 0}\frac{\lambda ^{2j+1}}{(2j+1)!} = \sinh \lambda = \frac{e^\lambda - e^{-\lambda }}{2}\).

Lemma 30 Poissonisation / Poisson splitting
#

If \(r \sim \operatorname {Poi}(\Lambda )\) balls are thrown independently and uniformly into \(M\) bins, then the occupancy counts \(Y_1,\dots ,Y_M\) are independent, each distributed as \(\operatorname {Poi}(\Lambda /M)\).

Lemma 31 Poisson Chernoff tail
#

If \(r \sim \operatorname {Poi}(\lambda )\) then \(\Pr [r \geq \lambda + t] \leq \exp \! \big(\frac{-t^2}{\lambda + t}\big)\). In particular \(\Pr [r \geq \lambda + 100\sqrt{\lambda }] \leq \exp (-100^2/2)\).

Lemma 32 Entropy upper bound on binomial sums

For \(\alpha \in [0,1/2]\) and \(n \in \mathbb {N}\), \(\sum _{i=0}^{\lfloor \alpha n\rfloor }\binom {n}{i} \leq 2^{nh_2(\alpha )}\).

Proof

For \(\alpha \in (0,1/2]\) we have \(\frac{\alpha }{1-\alpha }\leq 1\), so for \(i \leq \alpha n\) the factor \((\alpha /(1-\alpha ))^{i - \alpha n}\geq 1\). Hence

\[ \sum _{i \leq \alpha n}\binom {n}{i} \leq \sum _{i \leq \alpha n}\binom {n}{i} \left(\tfrac {\alpha }{1-\alpha }\right)^{i-\alpha n} \leq \left(\tfrac {\alpha }{1-\alpha }\right)^{-\alpha n} \sum _{i=0}^n \binom {n}{i}\left(\tfrac {\alpha }{1-\alpha }\right)^{i} = \alpha ^{-\alpha n}(1-\alpha )^{-(1-\alpha )n} = 2^{nh_2(\alpha )}, \]

using the binomial theorem \(\sum _i \binom {n}{i}t^i = (1+t)^i\) with \(t = \alpha /(1-\alpha )\), so \((1+t)^n = (1-\alpha )^{-n}\). The cases \(\alpha = 0\) is trivial.

Lemma 33 Entropy lower bound on a binomial

For \(\alpha \in [0,1]\) with \(\alpha n \in \mathbb {N}\), \(\binom {n}{\alpha n} \geq \frac{1}{n+1}2^{nh_2(\alpha )} \geq 2^{(h_2(\alpha ) - \frac{\log _2(2n)}{2n})n}\).

Proof

Among the \(n+1\) terms \(\binom {n}{i}\) the term \(\binom {n}{\alpha n}\) is the largest at \(i = \alpha n \leq n/2\) (or by symmetry), and \(\binom {n}{\alpha n} \alpha ^{\alpha n}(1-\alpha )^{(1-\alpha )n}\) is the largest term of \(1 = (\alpha + (1-\alpha ))^n = \sum _i \binom {n}{i}\alpha ^i(1-\alpha )^{n-i}\), whence \(\binom {n}{\alpha n}\alpha ^{\alpha n}(1-\alpha )^{(1-\alpha )n}\geq \frac{1}{n+1}\), i.e. \(\binom {n}{\alpha n}\geq \frac{1}{n+1}2^{nh_2(\alpha )}\). Finally \(\frac{1}{n+1}\geq 2^{-\log _2(2n)}\) for \(n \geq 1\) gives the second inequality.

Lemma 34 Linear upper bound on \(h_2\)

For all \(x \in [0,1]\), \(h_2(x) \leq -x\log _2 x + 2x\).

Proof

We have \(h_2(x) = -x\log _2 x - (1-x)\log _2(1-x)\). It suffices to show \(-(1-x)\log _2(1-x) \leq 2x\) for \(x \in [0,1]\). Writing \(u = 1-x \in [0,1]\) this is \(-u\log _2 u \leq 2(1-u)\). Since \(\ln u \geq 1 - 1/u\) gives \(-\ln u \leq \frac{1-u}{u}\), we get \(-u\ln u \leq 1-u\), so \(-u\log _2 u = \frac{-u\ln u}{\ln 2}\leq \frac{1-u}{\ln 2}\leq 2(1-u)\), as \(1/\ln 2 {\lt} 2\).

Lemma 35 Inverse entropy: quadratic upper bound

For \(x \in [0,1]\), \(h_2^{-1}(1 - x^2) \leq \frac12 - \frac{\sqrt{\ln 2}}{2}x\).

Lemma 36 Inverse entropy: linear lower bound

For \(x \in [0,\tfrac 14]\), \(h_2^{-1}(1 - x) \geq \frac12 - 5x^2\).

Lemma 37 Counting bounded even compositions

The number of ways to choose a multiset of \(s\) even positive integers (an ordered assignment of even values to \(\leq s/2\) chosen positions summing to a prescribed even total \(S\) with \(s = S\)) is at most \(2^{3S/2}\); concretely, the number of nonnegative integer compositions counted in the proof of Claim 4.3 is \(\binom {S/2 + S - 1}{S} \leq 2^{3S/2}\).

Proof

By stars and bars, the number of ways to distribute is \(\binom {S/2 + S - 1}{S} \leq 2^{S/2 + S - 1}\leq 2^{3S/2}\), using \(\binom {a+b}{b}\leq 2^{a+b}\).