Dependency graph

Legend

Boxes: definitions
Ellipses: theorems and lemmas
Blue border: the statement of this result is ready to be formalized; all prerequisites are done
Orange border: the statement of this result is not ready to be formalized; the blueprint needs more work
Blue background: the proof of this result is ready to be formalized; all prerequisites are done
Green border: the statement of this result is formalized
Green background: the proof of this result is formalized
Dark green background: the proof of this result and all its ancestors are formalized
Dark green border: this is in Mathlib

Definition 369 \(j\)th-level siblings, Def. 16.5.1

A pair \((i,i')\) are \(j\)th-level siblings if they are in the same block at the \(j\)th level and \(i' = i + 2^{t-j}\). For \(i \in [n = 2^t]\) and \(0 \le j \le t\), let \[ S_{i,j} \stackrel{\text{def}}{=} \{ k\in [n]\mid i\equiv k \pmod{2^{t-j}}\} . \]
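The residue classes \(S_{i,j}\) can be enumerated directly, which makes it easy to see that \(|S_{i,j}| = 2^j\). The sketch below is illustrative only: the function name `residue_class` and the 1-based indexing of \([n]\) are assumptions of this example, and it does not model the "same block" condition for siblings.

```python
def residue_class(i, j, t):
    """Return S_{i,j}: all k in [n] = {1, ..., 2^t} with k = i (mod 2^(t-j))."""
    n = 1 << t              # n = 2^t
    modulus = 1 << (t - j)  # 2^(t-j)
    return [k for k in range(1, n + 1) if (k - i) % modulus == 0]


if __name__ == "__main__":
    t = 3  # n = 8
    for j in range(t + 1):
        print(j, residue_class(1, j, t))
```

For \(t = 3\) and \(i = 1\), level \(j = 1\) gives the pair \(\{1, 5\}\), whose two elements differ by \(2^{t-j} = 4\), matching the sibling offset in the definition.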

Lemma 370 Eq. (16.12)

\(P_n(Z) = (Z_1^{(t)},\ldots,Z_n^{(t)})\).


Proof ▶

By induction on \(t\) (i.e., on \(n=2^t\)). For \(t=0\) (\(n=1\)), \(P_1(Z)=Z=(Z_1^{(0)})\) trivially.

For \(t\ge 1\), write \(Z=(U,V)\) with \(U=(Z_1,\ldots,Z_{n/2})\), \(V=(Z_{n/2+1},\ldots,Z_n)\); by Definition~\ref{def:c16-basic-polarizing-matrix}, \(P_n(Z)=(P_{n/2}(U+V),P_{n/2}(V))\). By definition, for \(i\le n/2\), \(Z_i^{(1)}=Z_i+Z_{i+n/2}=U_i+V_i\), i.e., the first block at level \(1\) consists of \(U+V\); for \(i>n/2\), \(Z_i^{(1)}=V_{i-n/2}\) carries \(V\) unchanged.

Applying the inductive hypothesis (for \(t-1\)) to \(U+V\) gives \(P_{n/2}(U+V)=((U+V)_1^{(t-1)},\ldots,(U+V)_{n/2}^{(t-1)})\), and since level-\(j\) variables (\(j\ge 1\)) for indices \(\le n/2\) are computed entirely from \(Z^{(1)}|_{\{1,\ldots,n/2\}}=U+V\) by the same recursive rule shifted by one level, \((U+V)_i^{(t-1)}=Z_i^{(t)}\) for \(i\le n/2\). Symmetrically, \(P_{n/2}(V)_i=Z_{n/2+i}^{(t)}\) for \(i\le n/2\). Combining, \(P_n(Z)=(Z_1^{(t)},\ldots,Z_n^{(t)})\).
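The identity can be checked exhaustively for small \(n\). The sketch below makes two assumptions that are not spelled out in this section: arithmetic is over \(\mathbb{F}_2\) (so addition is XOR), and the level-by-level rule is "within each block of the current level, add the second half into the first half and leave the second half unchanged", which is how the proof describes level \(1\). The names `P` and `levelwise` are hypothetical.

```python
import itertools


def P(z):
    """Recursive map P_n(Z) = (P_{n/2}(U + V), P_{n/2}(V)) over F_2."""
    n = len(z)
    if n == 1:
        return list(z)
    half = n // 2
    u, v = z[:half], z[half:]
    return P([a ^ b for a, b in zip(u, v)]) + P(v)


def levelwise(z):
    """Compute (Z_1^(t), ..., Z_n^(t)) one level at a time (assumed rule)."""
    cur = list(z)
    n = len(cur)
    block = n
    while block > 1:
        half = block // 2
        nxt = list(cur)
        for start in range(0, n, block):
            for i in range(start, start + half):
                # first half of each block: sum with its sibling at offset `half`
                nxt[i] = cur[i] ^ cur[i + half]
        cur = nxt
        block = half
    return cur


if __name__ == "__main__":
    n = 8
    ok = all(P(list(z)) == levelwise(list(z))
             for z in itertools.product([0, 1], repeat=n))
    print("P_n(Z) equals the level-t variables for all Z:", ok)
```

The loop in `levelwise` mirrors the induction: after the first pass the two halves hold \(U+V\) and \(V\), and every later pass acts on each half independently.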

Definition 371 Local Polarization, Def. 16.5.2

A sequence of random variables \(X_0,\ldots,X_j,\ldots\) with \(X_j \in [0,1]\) is locally polarizing if:

(Unbiased) For every \(j\) and \(a \in [0,1]\): \(\mathbb{E}[X_{j+1}\mid X_j=a]=a\).
(Variance in the middle) For every \(\tau>0\) there exists \(\theta=\theta(\tau)>0\) such that for all \(j\): if \(X_j \in (\tau,1-\tau)\) then \(|X_{j+1}-X_j|\ge\theta\).
(Suction at the ends) For every \(c<\infty\) there exists \(\tau=\tau(c)>0\) such that (i) if \(X_j\le\tau\) then \(\Pr[X_{j+1}\le X_j/c]\ge 1/2\); and (ii) if \(1-X_j\le\tau\) then \(\Pr[1-X_{j+1}\le(1-X_j)/c]\ge 1/2\).

The sequence is simple if for every sequence \(a_0,\ldots,a_j\), conditioned on \(X_0=a_0,\ldots,X_j=a_j\), there are two values \(a^+,a^{\ell}\) such that \(X_{j+1}\) equals \(a^+\) with probability \(1/2\) and \(a^{\ell}\) with probability \(1/2\).
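As a concrete illustration, consider the process that moves from \(x\) to \(x^2\) or to \(2x - x^2\) with probability \(1/2\) each. This example is an assumption of the sketch (it is the usual erasure-probability recursion, not something defined in this section); the helper names are hypothetical. The code checks the three conditions for this one process using exact rational arithmetic.

```python
from fractions import Fraction


def step_values(x):
    """The two equally likely next values from state x (a simple sequence)."""
    return (x * x, 2 * x - x * x)


def conditional_mean(x):
    """E[X_{j+1} | X_j = x] for the two-point step above."""
    lo, hi = step_values(x)
    return (lo + hi) / 2


def jump_size(x):
    """|X_{j+1} - X_j|, which equals x(1 - x) for both branches."""
    lo, hi = step_values(x)
    assert x - lo == hi - x
    return hi - x


if __name__ == "__main__":
    x = Fraction(1, 3)
    print(conditional_mean(x) == x)   # unbiased
    print(jump_size(x))               # x(1 - x), bounded below on (tau, 1 - tau)
    c = 10
    small = Fraction(1, 20)           # any x <= 1/c works as tau(c) here
    print(step_values(small)[0] <= small / c)  # suction at the low end
```

For this process one can take \(\theta(\tau) = \tau(1-\tau)\) and \(\tau(c) = 1/c\): the low branch satisfies \(x^2 \le x/c\) whenever \(x \le 1/c\), and the high branch satisfies \(1-(2x-x^2) = (1-x)^2 \le (1-x)/c\) whenever \(1-x \le 1/c\).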


Lemma 159 List-decoding capacity: existence, Thm 7.4.2(i)

Let \(q \geq 2\) be an integer, \(0 {\lt} \rho {\lt} 1 - \frac1q\) a real number, and \(L \geq 1\) an integer. Then there exists a \((\rho , L)\)-list decodable code with rate

\[ R \leq 1 - H_q(\rho ) - \frac1L. \]

Proof ▶

We use the probabilistic method: we pick a random code and show it satisfies the required property with nonzero probability, by showing the probability of a “bad event” is small.

Pick a code \(C\) at random where \(|C| = q^k\) with \(k \leq \left(1 - H_q(\rho ) - \frac1L\right) n\), i.e., for every message \(\mathbf{m} \in [q]^k\), choose \(C(\mathbf{m})\) uniformly and independently at random from \([q]^n\).

Given \(\mathbf{y} \in [q]^n\) and distinct \(\mathbf{m}_0, \ldots , \mathbf{m}_L \in [q]^k\), say the tuple \((\mathbf{y}, \mathbf{m}_0, \ldots , \mathbf{m}_L)\) defines a bad event if

\[ C(\mathbf{m}_i) \in B(\mathbf{y}, \rho n) \text{ for all } 0 \leq i \leq L. \]

A code is \((\rho ,L)\)-list decodable if and only if no bad event occurs (a bad event witnesses \(L+1\) codewords all within distance \(\rho n\) of some \(\mathbf{y}\), exceeding the list-size bound \(L\)).

Fix \(\mathbf{y}\) and \(\mathbf{m}_0, \ldots , \mathbf{m}_L\). For each fixed \(i\), by the random choice of \(C\),

\[ \Pr [C(\mathbf{m}_i) \in B(\mathbf{y}, \rho n)] = \frac{\mathrm{Vol}_q(\rho n, n)}{q^n} \leq q^{-n(1-H_q(\rho ))}. \]

Since the random choices of codewords for distinct messages are independent,

\[ \Pr \Big[ \bigwedge _{i=0}^L C(\mathbf{m}_i) \in B(\mathbf{y}, \rho n) \Big] = \prod _{i=0}^L \Pr [C(\mathbf{m}_i) \in B(\mathbf{y}, \rho n)] \leq \left(q^{-n(1-H_q(\rho ))}\right)^{L+1} = q^{-n(L+1)(1-H_q(\rho ))}. \]

By the union bound over the \(q^n\) choices of \(\mathbf{y}\) and the \(\binom {q^k}{L+1}\) choices of \(L+1\) distinct messages,

\[ \Pr [\text{there is a bad event}] \leq q^n \binom {q^k}{L+1} q^{-n(L+1)(1-H_q(\rho ))} \leq q^n \cdot q^{k(L+1)} \cdot q^{-n(L+1)(1-H_q(\rho ))} = q^{-n(L+1)\left[1 - H_q(\rho ) - \frac{1}{L+1} - R\right]}, \]

using \(\binom {a}{b} \leq a^b\) and \(k = Rn\). By assumption \(R \leq 1 - H_q(\rho ) - 1/L\), so \(1 - H_q(\rho ) - R \geq 1/L\), hence

\[ 1 - H_q(\rho ) - \frac{1}{L+1} - R \geq \frac1L - \frac{1}{L+1} = \frac{1}{L(L+1)}, \]

and so

\[ \Pr [\text{there is a bad event}] \leq q^{-n(L+1) \cdot \frac{1}{L(L+1)}} = q^{-n/L} {\lt} 1. \]

Since the probability of a bad event is strictly less than \(1\), by the probabilistic method there exists a code \(C\) with no bad event, i.e., a \((\rho , L)\)-list decodable code of rate \(R \leq 1 - H_q(\rho ) - 1/L\).
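The exponent arithmetic in the union bound can be sanity-checked numerically. The sketch below computes \(\log_q\) of the bound \(q^n \cdot q^{k(L+1)} \cdot q^{-n(L+1)(1-H_q(\rho))}\) and compares it with \(-n/L\); the parameter values and function names are illustrative assumptions, not part of the proof.

```python
import math


def entropy_q(x, q):
    """q-ary entropy H_q(x) for 0 <= x <= 1."""
    if x == 0:
        return 0.0
    if x == 1:
        return math.log(q - 1, q)
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))


def log_q_bad_event_bound(q, rho, L, n, k):
    """log_q of q^n * q^(k(L+1)) * q^(-n(L+1)(1 - H_q(rho)))."""
    return n + k * (L + 1) - n * (L + 1) * (1 - entropy_q(rho, q))


if __name__ == "__main__":
    q, rho, L, n = 2, 0.1, 10, 1000
    # largest dimension allowed by the rate assumption R <= 1 - H_q(rho) - 1/L
    k = math.floor((1 - entropy_q(rho, q) - 1 / L) * n)
    bound = log_q_bad_event_bound(q, rho, L, n, k)
    print(k, bound, bound <= -n / L)
```

A negative value of `bound` means the probability of any bad event is below \(1\), which is all the probabilistic method needs.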


Lemma 429 Locality of \(C^*\)

With notation as in Construction 426, every codeword symbol of \(C^*\) has a local recovery set of size \(r\); that is, \(C^*\) satisfies property (ii) of Definition~\ref{def:c19-lrc} with locality \(r\).

Proof ▶

Let \(U = \{1, \omega, \omega^2, \ldots, \omega^r\}\), the set of \((r+1)\)'th roots of unity in \(\mathbb{F}_q^*\). Since \(\mathbb{F}_q^*\) is cyclic of order \(q - 1\) and \(|U| = r+1\) divides \(q-1\), the distinct multiplicative cosets \(\alpha U = \{\alpha, \alpha\omega, \ldots, \alpha\omega^r\}\), for \(\alpha\) ranging over coset representatives, partition \(\mathbb{F}_q^*\) into \((q-1)/(r+1)\) cosets of size \(r+1\) each; it suffices to prove the local recovery property on each individual coset \(\alpha U\).

Fix a codeword symbol location \(\alpha\omega^{i_0} \in \alpha U\) coming from a polynomial \(f(X) = \sum_{i=0}^{k'-1} f_i X^i\) with \(f_i = 0\) whenever \(i \equiv r \pmod{r+1}\) (as in Construction~\ref{constr:c19-cstar}). Reducing \(f(X)\) modulo \(X^{r+1} - \alpha^{r+1}\) gives a polynomial \(g^{(\alpha)}(X)\) of degree at most \(r\) that agrees with \(f\) on all of \(\alpha U\): each point of \(\alpha U\) is a root of \(X^{r+1} - \alpha^{r+1}\), by Lemma~\ref{lem:c19-roots-unity-product} applied with this \(\alpha\), so \(f(X) \equiv g^{(\alpha)}(X) \pmod{X^{r+1} - \alpha^{r+1}}\) implies that \(f\) and \(g^{(\alpha)}\) take the same values at every root of \(X^{r+1}-\alpha^{r+1}\).

Reduction modulo \(X^{r+1}-\alpha^{r+1}\) only combines coefficients whose indices are congruent modulo \(r+1\), so the coefficient of \(X^r\) in \(g^{(\alpha)}(X)\) is a linear combination (with coefficients that are powers of \(\alpha^{r+1}\)) of the coefficients \(f_r, f_{2r+1}, \ldots\) of \(f\). All of these vanish by the choice of \(f\), hence so does the coefficient of \(X^r\). Thus \(g^{(\alpha)}(X)\) has degree at most \(r - 1\) (strictly less than \(r\)).

By Lemma~\ref{lem:c19-roots-unity-product}, the \(r+1\) points of \(\alpha U\) are precisely the roots of \(X^{r+1} - \alpha^{r+1} = \prod_{i=0}^{r} (X - \alpha\omega^i)\). A polynomial \(g^{(\alpha)}\) of degree at most \(r-1\) evaluated at these \(r+1\) distinct points satisfies a single nontrivial linear constraint \(\sum_{i=0}^{r} \lambda_{\alpha\omega^i}\, g^{(\alpha)}(\alpha\omega^i) = 0\) with all \(\lambda_{\alpha\omega^i} \ne 0\). Indeed, the \((r+1)\times r\) Vandermonde evaluation matrix of polynomials of degree \(\le r-1\) at the \(r+1\) points \(\{\alpha\omega^i\}\) has rank \(r\), so its left null space is \(1\)-dimensional; and the coefficient \(\lambda_{\alpha\omega^i}\) corresponding to a given point is (up to a common scalar) the inverse of the value at \(\alpha\omega^i\) of the degree-\(r\) polynomial \(\prod_{i' \ne i}(X - \alpha\omega^{i'})\) vanishing at the other \(r\) points of \(\alpha U\). That value is nonzero, since otherwise this degree-\(r\) polynomial would have \(r+1\) roots.

Since \(f(X)\) and \(g^{(\alpha)}(X)\) agree on \(\alpha U\), the same linear constraint \(\sum_{i=0}^{r} \lambda_{\alpha\omega^i} f(\alpha\omega^i) = 0\) holds among the codeword symbols indexed by \(\alpha U\), with every coefficient nonzero. In particular, for each \(i_0\), the codeword symbol at \(\alpha\omega^{i_0}\) is recoverable (via this linear equation) from the other \(r\) codeword symbols at \(\alpha U \setminus \{\alpha\omega^{i_0}\}\). As \(\alpha U\) ranges over all cosets partitioning \(\mathbb{F}_q^*\), every codeword location has a local recovery set of size \(r\), as required.
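The recovery argument can be exercised on a toy instance. The sketch below assumes a prime field \(\mathbb{F}_{13}\) with \(r = 3\) (so \(r+1 = 4\) divides \(q-1 = 12\)) and evaluates \(f\) at all of \(\mathbb{F}_q^*\); these parameters and all function names are assumptions of the example, since Construction 426 is not reproduced here. It uses the explicit weights \(\lambda_x = x\), which is an observation made for this sketch: \(\sum_{x \in \alpha U} x^{i+1} = 0\) whenever \(r+1\) does not divide \(i+1\).

```python
import random

Q = 13  # prime field size (assumption for this sketch)
R = 3   # locality r; r + 1 = 4 divides Q - 1 = 12


def evaluate(coeffs, x):
    """Horner evaluation of sum_i coeffs[i] * x^i over F_Q."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc


def random_message_poly(k_prime, rng):
    """Random f of degree < k_prime with f_i = 0 whenever i = r (mod r+1)."""
    return [0 if i % (R + 1) == R else rng.randrange(Q) for i in range(k_prime)]


def coset(x0):
    """The coset x0 * U containing x0, U = (r+1)-th roots of unity in F_Q^*."""
    roots = [u for u in range(1, Q) if pow(u, R + 1, Q) == 1]
    return [(x0 * u) % Q for u in roots]


def recover(codeword, x0):
    """Recover the symbol at x0 from the other r symbols of its coset,
    using the constraint sum_{x in coset} x * f(x) = 0 (weights lambda_x = x)."""
    others = [x for x in coset(x0) if x != x0]
    s = sum(x * codeword[x] for x in others) % Q
    return (-s * pow(x0, -1, Q)) % Q


if __name__ == "__main__":
    rng = random.Random(0)
    f = random_message_poly(9, rng)
    codeword = {x: evaluate(f, x) for x in range(1, Q)}
    print(all(recover(codeword, x) == codeword[x] for x in codeword))
```

Each call to `recover` reads exactly \(r = 3\) other symbols, all from the same coset, which is the locality property the lemma states.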


Proposition 95 Entropy bounds on the volume of a Hamming ball, Prop 3.3.3

Let \(q \ge 2\) be an integer and \(0 \le p \le 1 - \frac{1}{q}\) be a real number. Then:

(i) \(\mathrm{Vol}_q(pn, n) \le q^{H_q(p)n}\); and
(ii) for large enough \(n\), \(\mathrm{Vol}_q(pn,n) \ge q^{H_q(p)n - o(n)}\).

Proof ▶

Proof of (i). Starting from the binomial expansion of \(1 = (p + (1-p))^n\),

\begin{align*} 1 & = \sum _{i=0}^n \binom {n}{i} p^i (1-p)^{n-i} \\ & = \sum _{i=0}^{pn} \binom {n}{i} p^i (1-p)^{n-i} + \sum _{i=pn+1}^{n} \binom {n}{i} p^i (1-p)^{n-i}\\ & \ge \sum _{i=0}^{pn} \binom {n}{i} p^i (1-p)^{n-i} \\ & = \sum _{i=0}^{pn} \binom {n}{i} (q-1)^i \left(\frac{p}{q-1}\right)^i (1-p)^{n-i}\\ & = \sum _{i=0}^{pn} \binom {n}{i} (q-1)^i (1-p)^n \left(\frac{p}{(q-1)(1-p)}\right)^i . \end{align*}

Since \(0 \le p \le 1 - 1/q\) implies \(\frac{p}{(q-1)(1-p)} \le 1\), and \(i \le pn\) in the sum, raising a number in \([0,1]\) to a larger power can only decrease it, so \(\left(\frac{p}{(q-1)(1-p)}\right)^i \ge \left(\frac{p}{(q-1)(1-p)}\right)^{pn}\) for every \(i \le pn\). Hence

\begin{align*} 1 & \ge \sum _{i=0}^{pn} \binom {n}{i} (q-1)^i (1-p)^n \left(\frac{p}{(q-1)(1-p)}\right)^{pn}\\ & = \sum _{i=0}^{pn} \binom {n}{i} (q-1)^i \left(\frac{p}{q-1}\right)^{pn} (1-p)^{(1-p)n} \\ & \ge \mathrm{Vol}_q(pn,n) \cdot q^{-H_q(p)n}, \end{align*}

where the last line uses \(q^{-H_q(p)n} = \left(\frac{p}{q-1}\right)^{pn} (1-p)^{(1-p)n}\), which follows directly from the definition of \(H_q\). Rearranging gives \(\mathrm{Vol}_q(pn,n) \le q^{H_q(p)n}\), which proves (i).

Proof of (ii). By Stirling’s approximation for \(n!\),

\[ \binom {n}{pn} = \frac{n!}{(pn)!((1-p)n)!} {\gt} \frac{(n/e)^n}{(pn/e)^{pn}((1-p)n/e)^{(1-p)n}} \cdot \frac{e^{\lambda _1(n) - \lambda _2(pn) - \lambda _2((1-p)n)}}{\sqrt{2\pi p(1-p) n}} = \frac{1}{p^{pn}(1-p)^{(1-p)n}} \cdot \ell (n), \]

where \(\ell (n) = \dfrac {e^{\lambda _1(n) - \lambda _2(pn) - \lambda _2((1-p)n)}}{\sqrt{2\pi p(1-p)n}}\) (with \(\lambda _1, \lambda _2\) the error terms of Stirling’s approximation). Then, keeping only the last term of the sum defining \(\mathrm{Vol}_q(pn,n)\),

\[ \mathrm{Vol}_q(pn,n) \ge \binom {n}{pn}(q-1)^{pn} {\gt} \frac{(q-1)^{pn}}{p^{pn}(1-p)^{(1-p)n}} \cdot \ell (n) \ge q^{H_q(p)n - o(n)}, \]

where the last inequality follows from the definition of \(H_q(\cdot )\) and the fact that \(\ell (n) = q^{-o(n)}\) for large enough \(n\). This proves (ii) and completes the proof.
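Both bounds are easy to observe numerically. The sketch below computes \(\mathrm{Vol}_q(pn, n)\) exactly with integer arithmetic and compares \(\log_q\) of it with \(H_q(p)\,n\); the parameter choices are illustrative and assume \(pn\) is an integer. The function names are hypothetical.

```python
import math


def entropy_q(x, q):
    """q-ary entropy H_q(x) for 0 <= x <= 1."""
    if x == 0:
        return 0.0
    if x == 1:
        return math.log(q - 1, q)
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))


def vol(q, radius, n):
    """Exact volume of a q-ary Hamming ball of the given radius in dimension n."""
    return sum(math.comb(n, i) * (q - 1) ** i for i in range(radius + 1))


def log_gap(q, p, n):
    """H_q(p) * n - log_q Vol_q(pn, n); part (i) says this is >= 0."""
    radius = round(p * n)
    return entropy_q(p, q) * n - math.log(vol(q, radius, n), q)


if __name__ == "__main__":
    for n in (20, 200, 2000):
        gap = log_gap(3, 0.5, n)
        print(n, gap, gap / n)
```

Part (i) corresponds to the gap being nonnegative, and part (ii) to `gap / n` shrinking as \(n\) grows.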


Theorem 164 List decoding from random errors, Thm 7.5.1

Let \(\varepsilon {\gt} 0\) be a real and \(q \geq 2^{\Omega (1/\varepsilon )}\) be an integer. Then the following is true for every \(0 {\lt} \delta {\lt} 1 - 1/q\) and large enough \(n\). Let \(C \subseteq \{ 0, 1, \ldots , q-1\} ^n\) be a code with relative distance \(\delta \) and let \(S \subseteq [n]\) be such that \(|S| = (1-\rho )n\), where \(0 {\lt} \rho \leq \delta - \varepsilon \).

Then, for all \(\mathbf{c} \in C\) and all but a \(q^{-\Omega (\varepsilon n)}\) fraction of error patterns \(\mathbf{e} \in \{ 0, 1, \ldots , q-1\} ^n\) such that

\[ \mathbf{e}_S = \mathbf{0} \quad \text{and} \quad \mathrm{wt}(\mathbf{e}) = \rho n, \]

the only codeword within Hamming distance \(\rho n\) of \(\mathbf{c} + \mathbf{e}\) is \(\mathbf{c}\) itself.

Proof ▶

Fix \(\mathbf{c} \in C\) for the rest of the proof, and let \(d = \delta n\). Let \(\mathcal{E}_S\) be the set of all error patterns \(\mathbf{e}\) with \(\mathbf{e}_S = \mathbf{0}\) and \(\mathrm{wt}(\mathbf{e}) = \rho n\); since \(\mathbf{e}_S = \mathbf{0}\) and \(\mathrm{wt}(\mathbf{e}) = \rho n = |[n] \setminus S|\), every position of \([n] \setminus S\) must be nonzero in \(\mathbf{e}\), and each such position has \(q-1\) nonzero choices, so

\[ |\mathcal{E}_S| = (q-1)^{\rho n}. \]

Call \(\mathbf{e} \in \mathcal{E}_S\) bad if there exists another codeword \(\mathbf{c}' \neq \mathbf{c}\) with \(\Delta (\mathbf{c}', \mathbf{c} + \mathbf{e}) \leq \rho n\). We must show the number of bad error patterns is at most \(q^{-\Omega (\varepsilon n)}|\mathcal{E}_S|\).

For a bad \(\mathbf{e}\), let \(c(\mathbf{e}) \neq \mathbf{c}\) be its associated codeword (Definition 162), and let \(A \subseteq [n]\) be the set of positions where \(c(\mathbf{e})\) agrees with \(\mathbf{c} + \mathbf{e}\). Write \(|A| = \alpha n\). Since \(\Delta (c(\mathbf{e}), \mathbf{c}+\mathbf{e}) \leq \rho n\), they agree on at least \((1-\rho )n\) positions, so

\[ \alpha \geq 1 - \rho \geq 1 - \delta + \varepsilon . \]

Fix \(A\) with \(|A| = \alpha n\); we count how many bad \(\mathbf{e}\) map to this \(A\), and later aggregate over all (at most \(2^n\)) choices of \(A\). Let \(A_1 = A \cap S\) and \(A_2 = A \setminus A_1\), and write \(|A_1| = \beta n\), so \(|A_2| = (\alpha - \beta ) n\) and \(\beta \leq \alpha \).

We overestimate the number of \(\mathbf{e}\) mapping to \((A_1, A_2)\) as follows.

First, the values of \(\mathbf{e}\) on \([n] \setminus (S \cup A_2)\) must all be nonzero (since \(\mathbf{e}\) is supported exactly on \([n]\setminus S\)), so the number of possible values of \(\mathbf{e}_{[n]\setminus (S \cup A_2)}\) is at most

\[ (q-1)^{n - |S| - |A_2|} \leq q^{\, n - (1-\rho )n - (\alpha -\beta )n}. \]
Fix a nonzero choice \(\mathbf{x}\) for \(\mathbf{e}_{[n]\setminus (S \cup A_2)}\). Since \(A_1 \subseteq S\), we have \(\mathbf{e}_{A_1} = \mathbf{0}\), so \((\mathbf{c} + \mathbf{e})_{A_1} = \mathbf{c}_{A_1}\); since \(A_1 \subseteq A\), \(c(\mathbf{e})\) agrees with \(\mathbf{c}+\mathbf{e}\) on \(A_1\), so \(c(\mathbf{e})_{A_1} = \mathbf{c}_{A_1}\) is already determined (as \(\mathbf{c}\) is fixed). By Lemma 163 applied to \(C\) (an \((n, k, d)_q\) code), fixing the values of \(c(\mathbf{e})\) on any \(n - d + 1 = (1-\delta )n + 1\) positions determines \(c(\mathbf{e})\) completely; since \(|A_1| = \beta n\) positions of \(c(\mathbf{e})\) are already fixed (equal to \(\mathbf{c}_{A_1}\)), fixing a further \((1-\delta )n + 1 - \beta n\) positions of \(c(\mathbf{e})_{A_2}\) determines \(c(\mathbf{e})\) completely, and hence determines \(\mathbf{e}\) completely (via \(\mathbf{e} = c(\mathbf{e}) - \mathbf{c}\) on \(A\), together with the already-fixed values elsewhere). Thus the number of choices for \(\mathbf{e}\) compatible with this \(\mathbf{x}\) is at most

\[ q^{(1-\delta )n + 1 - \beta n}. \]

Combining, the number of bad \(\mathbf{e}\) mapping to \((A_1, A_2)\) is at most

\[ q^{\, n - (1-\rho )n - (\alpha -\beta )n} \cdot q^{(1-\delta )n + 1 - \beta n} = q^{\, \rho n - \alpha n + (1-\delta ) n + 1} \leq q^{\, \rho n - \varepsilon n + 1} = q^{-\varepsilon n + 1} |\mathcal{E}_S|, \]

where the inequality uses \(\alpha \geq 1 - \delta + \varepsilon \) (so \(-\alpha n \leq -(1 - \delta + \varepsilon ) n\), and \(\rho n - \alpha n + (1-\delta )n \leq \rho n - \varepsilon n\)), and the last equality uses \(|\mathcal{E}_S| = (q-1)^{\rho n} \leq q^{\rho n}\) together with absorbing the base change into the asymptotic notation (as in the book).

Finally, summing over all (at most \(2^n\)) choices of \(A = (A_1, A_2)\), the total number of bad error patterns is at most

\[ 2^n \cdot q^{-\varepsilon n + 1} |\mathcal{E}_S| = q^{n/\log _2 q} \cdot q^{-\varepsilon n + 1} |\mathcal{E}_S| = q^{\, n/\log _2 q - \varepsilon n + 1} |\mathcal{E}_S| \leq q^{- \varepsilon n/4} |\mathcal{E}_S|, \]

where the last inequality holds because for \(q \geq 2^{\Omega (1/\varepsilon )}\) (so that \(\log _2 q\) is at least a sufficiently large multiple of \(1/\varepsilon \)) and \(n\) large enough, \(n/\log _2 q + 1 {\lt} 3\varepsilon n /4\). Hence the fraction of bad error patterns in \(\mathcal{E}_S\) is at most \(q^{-\Omega (\varepsilon n)}\), i.e., for all but a \(q^{-\Omega (\varepsilon n)}\) fraction of \(\mathbf{e} \in \mathcal{E}_S\), \(\mathbf{e}\) is not bad, which means \(\mathbf{c}\) is the unique codeword within Hamming distance \(\rho n\) of \(\mathbf{c} + \mathbf{e}\), as required.


Theorem 209 Existence of a good ensemble of inner codes, Thm 10.3.1

Let \(\varepsilon {\gt} 0\). There exists an ensemble of inner codes \(C_{in}^1, C_{in}^2,\ldots ,C_{in}^N\) of rate \(\frac{1}{2}\), where \(N = q^k - 1\), such that for at least \((1-\varepsilon )N\) values of \(i\), the code \(C_{in}^i\) has relative distance \(\ge H_q^{-1}\big(\frac{1}{2}-\varepsilon \big)\). In fact, this ensemble may be taken to be the Wozencraft ensemble \(\{ C_{in}^\alpha \} _{\alpha \in \mathbb {F}_{q^k}^*}\) of Definition 207.

Proof ▶

Fix \(\mathbf{y} = (\mathbf{y}_1,\mathbf{y}_2) \in \mathbb {F}_{q^k}^{2} \setminus \{ \mathbf{0}\} \); note this forces \(\mathbf{y}_1 \ne \mathbf{0}\) or \(\mathbf{y}_2 \ne \mathbf{0}\) (not both zero). We first claim \(\mathbf{y} \in C_{in}^\alpha \) for at most one \(\alpha \in \mathbb {F}_{q^k}^*\). Note that if \(\mathbf{y} \in C_{in}^\alpha \) then \(\mathbf{y}_2 = \alpha \cdot \mathbf{y}_1\). We case on \(\mathbf{y}\):

If \(\mathbf{y}_1 \ne \mathbf{0}\) and \(\mathbf{y}_2 \ne \mathbf{0}\), then \(\mathbf{y} \in C_{in}^\alpha \) exactly for \(\alpha = \mathbf{y}_2/\mathbf{y}_1\), a single value.
If \(\mathbf{y}_1 \ne \mathbf{0}\) and \(\mathbf{y}_2 = \mathbf{0}\), then \(\mathbf{y} \notin C_{in}^\alpha \) for every \(\alpha \in \mathbb {F}_{q^k}^*\), since \(\alpha \mathbf{y}_1 \ne \mathbf{0}\) whenever \(\alpha , \mathbf{y}_1 \in \mathbb {F}_{q^k}^*\).
If \(\mathbf{y}_1 = \mathbf{0}\) and \(\mathbf{y}_2 \ne \mathbf{0}\), then \(\mathbf{y} \notin C_{in}^\alpha \) for every \(\alpha \in \mathbb {F}_{q^k}^*\), since \(\alpha \mathbf{y}_1 = \mathbf{0} \ne \mathbf{y}_2\).

This proves the claim.

Now assume \(\mathrm{wt}(\mathbf{y}) {\lt} H_q^{-1}\big(\frac{1}{2}- \varepsilon \big)\cdot 2k\). If \(\mathbf{y} \in C_{in}^\alpha \) for some \(\alpha \), then (by the claim just proved) \(C_{in}^\alpha \) is “bad”, i.e. has relative distance \({\lt} H_q^{-1}\big(\frac{1}{2}-\varepsilon \big)\), for at most this one value of \(\alpha \). So the total number of bad values of \(\alpha \) is at most the number of nonzero vectors of weight \({\lt} H_q^{-1}(\frac12-\varepsilon )\cdot 2k\) in \(\mathbb {F}_q^{2k}\), which by the upper bound on the volume of a Hamming ball (Proposition 95, part (i)) is at most

\[ \mathrm{Vol}_q\Big(H_q^{-1}\big(\tfrac 12-\varepsilon \big)\cdot 2k,\, 2k\Big) \le q^{H_q\left(H_q^{-1}(\frac12-\varepsilon )\right)\cdot 2k} = q^{(\frac12-\varepsilon )\cdot 2k} = \frac{q^k}{q^{2\varepsilon k}} {\lt} \varepsilon (q^k-1) = \varepsilon N, \]

where the strict inequality holds for large enough \(k\). Thus for at least \((1-\varepsilon )N\) values of \(\alpha \in \mathbb {F}_{q^k}^*\), the code \(C_{in}^\alpha \) has relative distance at least \(H_q^{-1}\big(\frac12-\varepsilon \big)\), as desired. Each \(C_{in}^\alpha \) has rate \(k/(2k) = 1/2\) by construction, completing the proof.
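How large \(k\) must be for the strict inequality \(q^{k}/q^{2\varepsilon k} < \varepsilon (q^k - 1)\) depends on \(q\) and \(\varepsilon\). The sketch below searches for the smallest such \(k\) by direct evaluation; the function name `smallest_k` and the search cap are assumptions of this example, and floating-point evaluation is only meant for moderate \(k\).

```python
def counting_inequality_holds(q, eps, k):
    """Check q^((1 - 2*eps) * k) < eps * (q^k - 1)."""
    return q ** ((1 - 2 * eps) * k) < eps * (q ** k - 1)


def smallest_k(q, eps, k_max=10_000):
    """Smallest k in [1, k_max] for which the counting inequality holds."""
    for k in range(1, k_max + 1):
        if counting_inequality_holds(q, eps, k):
            return k
    return None  # not found within the (arbitrary) search cap


if __name__ == "__main__":
    for q, eps in [(2, 0.1), (2, 0.05), (4, 0.1)]:
        print(q, eps, smallest_k(q, eps))
```

Rearranging the inequality shows it needs roughly \(q^{2\varepsilon k} > 1/\varepsilon\), i.e. \(k > \log_q(1/\varepsilon)/(2\varepsilon)\), so the threshold grows as \(\varepsilon\) shrinks.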


Theorem 342 Efficiently achieving the capacity of \(\mathrm{BSC}_p\), Theorem 15.4.1

For every constant \(p\) and \(0 {\lt} \varepsilon {\lt} 1-H(p)\), there exists a linear code \(C^*\) of block length \(N\) and rate at least \(1-H(p)-\varepsilon \), such that

(a) \(C^*\) can be constructed in time \(\mathrm{poly}(N)+ 2^{O(\varepsilon ^{-5})}\);
(b) \(C^*\) can be encoded in time \(O(N^2)\); and
(c) there exists a \(\mathrm{poly}(N)+N\cdot 2^{O(\varepsilon ^{-5})}\) time decoding algorithm that has an error probability of at most \(2^{-\Omega (\varepsilon ^6 N)}\) over the \(\mathrm{BSC}_p\).

Proof ▶

Set \(\gamma =\varepsilon ^3\), let \(C_{in}\) be as in Construction~\ref{constr:c15-inner-code} and \(C_{out}\) as in Construction~\ref{constr:c15-outer-code}, and let \(C^*=C_{out}\circ C_{in}\). By Proposition~\ref{prop:c15-rate-of-concatenation} and Proposition~\ref{prop:c15-outer-code-rate} (which shows \(R\ge 1-\varepsilon /2\)) together with Construction~\ref{constr:c15-inner-code} (which shows \(r\ge 1-H(p)-\varepsilon /2\)), \(C^*\) has rate at least \((1-\varepsilon /2)(1-H(p)-\varepsilon /2) \ge 1-H(p)-\varepsilon \).

(a) Construction time. With \(\gamma =\varepsilon ^3\), the parameter \(k\) of Construction~\ref{constr:c15-inner-code} is \(k=\Theta (\log (1/\gamma )/\varepsilon ^2)=\Theta (\log (1/\varepsilon )/\varepsilon ^2)\), and since \(p\) is constant, \(n= \Theta (k)\). By Proposition~\ref{prop:c15-inner-code-complexity}, \(C_{in}\) can be constructed in time \(2^{O(n^2)}=2^{O(\varepsilon ^{-4}\log ^2(1/\varepsilon ))}\). Since for any constant \(a {\gt} 0\), \(\varepsilon ^{-a}\log ^{O(1)}(1/\varepsilon )=O(\varepsilon ^{-a-1})\) (applying this with \(a=4\)), this construction time is \(2^{O(\varepsilon ^{-5})}\). The outer code \(C_{out}\) (Construction~\ref{constr:c15-outer-code}) is constructed in time \(\mathrm{poly}(N)\), via Theorem~\ref{thm:c10-zyablov} and Lemma~\ref{lem:c15-folding}. Hence the overall construction time for \(C^*\) is \(\mathrm{poly}(N)+ 2^{O(\varepsilon ^{-5})}\).

(b) Encoding time. By Proposition~\ref{prop:c15-encoding-decoding-time}, encoding \(C^*\) takes time \(O(N^2)\).

(c) Decoding time and error probability. By Proposition~\ref{prop:c15-inner-code-complexity}, \(T_{in}(k)=2^{O(k)}\); since \(k\le n\), \(2^{O(k)}\le 2^{O(n^2)}=2^{O(\varepsilon ^{-5})}\) as computed above. By Proposition~\ref{prop:c15-encoding-decoding-time} (using \(T_{out}(N)=\mathrm{poly}(N)\), which holds for the GMD-based \(D_{out}\) of Construction~\ref{constr:c15-outer-code}), the decoding time is \(\mathrm{poly}(N)+N\cdot 2^{O(k)}\le \mathrm{poly}(N)+N\cdot 2^{O(\varepsilon ^{-5})}\). For the error probability, by Proposition~\ref{prop:c15-decoding-error-probability}, decoding fails with probability at most \(2^{-\Omega (\gamma N/n)}\). Since \(\gamma = \varepsilon ^3\) and \(n=\Theta (\varepsilon ^{-2}\log (1/\varepsilon ))\), this exponent is

\[ \Omega \left(\frac{\gamma N}{n}\right) = \Omega \left(\frac{\varepsilon ^3}{\varepsilon ^{-2}\log (1/\varepsilon )} \cdot N\right) =\Omega \left(\frac{\varepsilon ^5}{\log (1/\varepsilon )}\cdot N\right), \]

and since \(\varepsilon ^5/\log (1/\varepsilon )\ge \varepsilon ^6\) for all small enough \(\varepsilon \) (as \(\varepsilon \log (1/\varepsilon )\to 0\)), this gives a decoding error probability of at most \(2^{-\Omega (\varepsilon ^6 N)}\).
