10 From Large to Small Alphabets: Code Concatenation

Recall Open Question 8.3.2 (cross-ref, Ch. 8): is there an explicit asymptotically good binary code, i.e., one with rate \(R{\gt}0\) and relative distance \(\delta {\gt}0\)? Table 10.1 of the book recalls the strongly explicit codes seen so far – the Hamming code (Section 2.4), the Hadamard code (Section 2.6), and “binary-RS” codes obtained by writing the symbols of a Reed–Solomon code over \(\mathbb {F}_{2^s}\) as bit strings – and observes that each has one of rate or relative distance extremely good at the expense of the other, with binary-RS codes achieving \(R\delta = \Theta (1/\log n)\), the best of the three but still short of \(R\delta = \Omega (1)\). This chapter introduces code concatenation, a general way of building a code over a small alphabet out of a large-alphabet “outer” code and a small-alphabet “inner” code, and uses it to construct explicit (Section 10.2) and then strongly explicit (Section 10.3) asymptotically good binary codes.

10.1 Code Concatenation: The Basic Idea

Definition 202 Concatenated code

Fix \(q \ge 2\), \(k \ge 1\) and \(Q = q^k\). Consider two codes, called the outer code and inner code:

\[ C_{out} : [Q]^K \to [Q]^N, \qquad C_{in} : [q]^k \to [q]^n. \]

Since the alphabet size of \(C_{out}\) equals \(Q = q^k\), the number of possible messages of \(C_{in}\), we may fix a bijection between \([Q]\) and \([q]^k\) and thereby use \(C_{in}\) to encode any single symbol of a codeword of \(C_{out}\). The concatenated code \(C_{out}\circ C_{in} : [q]^{kK} \to [q]^{nN}\) is then defined, for \(\mathbf{m} = (m_1,\ldots ,m_K) \in [Q]^K\), by

\[ C_{out}\circ C_{in}(\mathbf{m}) = \big(C_{in}(C_{out}(\mathbf{m})_1), \ldots , C_{in}(C_{out}(\mathbf{m})_N)\big), \]

where \(C_{out}(\mathbf{m}) = (C_{out}(\mathbf{m})_1,\ldots , C_{out}(\mathbf{m})_N)\). (Unlike string concatenation, this is a recursive construction: the outer code is used to reduce the block length we need from the inner code.)

Theorem 203 Parameters of concatenated codes, Thm 10.1.1

If \(C_{out}\) is an \((N,K,D)_{q^k}\) code and \(C_{in}\) is an \((n,k,d)_q\) code, then \(C_{out}\circ C_{in}\) is an \((nN,kK,dD)_q\) code. In particular, if \(C_{out}\) (resp. \(C_{in}\)) has rate \(R\) (resp. \(r\)) and relative distance \(\delta _{out}\) (resp. \(\delta _{in}\)), then \(C_{out}\circ C_{in}\) has rate \(Rr\) and relative distance \(\delta _{out}\cdot \delta _{in}\).

Proof ▶

The claim on block length, dimension and alphabet follows directly from Definition 202; the claim on rate and relative distance follows immediately once the claim on block length, dimension and distance \(\ge dD\) is established. So it remains to show \(C_{out}\circ C_{in}\) has distance at least \(dD\).

Consider arbitrary \(\mathbf{m}_1 \ne \mathbf{m}_2 \in [Q]^K\). Since \(C_{out}\) has distance \(D\),

\[ \Delta \big(C_{out}(\mathbf{m}_1), C_{out}(\mathbf{m}_2)\big) \ge D. \]

Define \(S = \{ i \in [N] \mid C_{out}(\mathbf{m}_1)_i \ne C_{out}(\mathbf{m}_2)_i \} \); by the displayed inequality, \(|S| \ge D\). For each \(i \in S\), since \(C_{out}(\mathbf{m}_1)_i \ne C_{out}(\mathbf{m}_2)_i\) and \(C_{in}\) has distance \(d\),

\[ \Delta \big(C_{in}(C_{out}(\mathbf{m}_1)_i), C_{in}(C_{out}(\mathbf{m}_2)_i)\big) \ge d. \]

Since \(C_{out}\circ C_{in}(\mathbf{m}_1)\) and \(C_{out}\circ C_{in}(\mathbf{m}_2)\) disagree by at least \(d\) coordinates within the block corresponding to inner-code position \(i\), for each of the at least \(D\) positions \(i \in S\), we get

\[ \Delta \big(C_{out}\circ C_{in}(\mathbf{m}_1), C_{out}\circ C_{in}(\mathbf{m}_2)\big) \ge dD. \]

As \(\mathbf{m}_1 \ne \mathbf{m}_2\) were arbitrary, \(C_{out}\circ C_{in}\) has distance at least \(dD\), completing the proof.

Lemma 204 Concatenation of linear codes is linear

If \(C_{out} \subseteq \mathbb {F}_{q^k}^N\) is an \(\mathbb {F}_{q^k}\)-linear code and \(C_{in} \subseteq \mathbb {F}_q^n\) is an \(\mathbb {F}_q\)-linear code of dimension \(k\), then (using an \(\mathbb {F}_q\)-linear identification of \(\mathbb {F}_{q^k}\) with \(\mathbb {F}_q^k\) to instantiate the bijection of Definition 202) the code \(C_{out}\circ C_{in} \subseteq \mathbb {F}_q^{nN}\) is \(\mathbb {F}_q\)-linear.

Proof ▶

Fix an \(\mathbb {F}_q\)-linear bijection \(\varphi : \mathbb {F}_{q^k} \to \mathbb {F}_q^k\), e.g. by fixing an \(\mathbb {F}_q\)-basis of \(\mathbb {F}_{q^k}\) and mapping each element to its coordinate vector, and use \(\varphi \) (applied coordinatewise) to realize the bijection between \([Q]=[q^k]\) and \([q]^k\) required in Definition 202. Let \(\mathbf{G}_{out}\) be a \(K\times N\) generator matrix for \(C_{out}\) over \(\mathbb {F}_{q^k}\) and \(\mathbf{G}_{in}\) a \(k\times n\) generator matrix for \(C_{in}\) over \(\mathbb {F}_q\). The outer encoding map \(\mathbf{m}\mapsto \mathbf{m}\mathbf{G}_{out}\) is \(\mathbb {F}_{q^k}\)-linear and, since \(\mathbb {F}_q \subseteq \mathbb {F}_{q^k}\), is in particular \(\mathbb {F}_q\)-linear once source and target are viewed as \(\mathbb {F}_q\)-vector spaces via \(\varphi \) applied coordinatewise. For each coordinate, encoding a symbol \(c \in \mathbb {F}_{q^k}\) under \(C_{in}\) is, via \(\varphi \), the \(\mathbb {F}_q\)-linear map \(\varphi (c) \mapsto \varphi (c)\mathbf{G}_{in}\). The map \(\mathbf{m} \mapsto C_{out}\circ C_{in}(\mathbf{m})\) is thus the coordinatewise application, to the \(\mathbb {F}_q\)-linear image \(\varphi (C_{out}(\mathbf{m}))\) of an \(\mathbb {F}_q\)-linear map of \(\mathbf{m}\), of the \(\mathbb {F}_q\)-linear map \(\varphi (c)\mapsto \varphi (c)\mathbf{G}_{in}\); a composition of \(\mathbb {F}_q\)-linear maps is \(\mathbb {F}_q\)-linear, so \(C_{out}\circ C_{in}\) is \(\mathbb {F}_q\)-linear. Explicitly, a generator matrix for \(C_{out}\circ C_{in}\) is obtained from \(\mathbf{G}_{out}\) by replacing each entry \(\mathbf{G}_{out}[i,j] \in \mathbb {F}_{q^k}\) with the \(k\times n\) block \(\varphi (\mathbf{G}_{out}[i,j])\, \mathbf{G}_{in}\).

10.2 Zyablov Bound

We now instantiate the outer and inner codes of Theorem 203 to obtain a new lower bound on the rate given a relative distance, called the Zyablov bound.

Lemma 205 Rate–distance tradeoff of GV-inner/Singleton-outer concatenation

Fix a prime power \(q\) and \(\varepsilon {\gt}0\). Let \(C_{out}\) be a \(q^k\)-ary code meeting the Singleton bound (Theorem 110) with rate \(R\), i.e. with relative distance \(\delta _{out} \ge 1-R\) (for instance a Reed–Solomon code, Definition 128, by Theorem 131), and let \(C_{in}\) be a \(q\)-ary code meeting the Gilbert–Varshamov bound (Theorem 105) with rate \(r{\gt}0\), i.e. with relative distance \(\delta _{in} \ge H_q^{-1}(1-r) - \varepsilon \). Then \(C := C_{out}\circ C_{in}\) has rate \(rR\) and relative distance \(\delta = (1-R)\big(H_q^{-1}(1-r)-\varepsilon \big)\). Consequently, expressing \(R\) as a function of \(\delta \) and \(r\), and optimizing over the choice of \(r\), the rate of such a concatenated code satisfies

\[ \mathscr {R} \ge \lim _{\varepsilon \to 0} \left\{ \max _{0 {\lt} r {\lt} 1 - H_q(\delta +\varepsilon )} r \left( 1 - \frac{\delta }{H_q^{-1}(1-r) - \varepsilon } \right) \right\} , \]

where the constraint \(r {\lt} 1-H_q(\delta +\varepsilon )\) is exactly what is needed to ensure the maximized quantity is positive. This lower bound on the rate (as a function of the relative distance \(\delta \)) is called the Zyablov bound.

Proof ▶

By Theorem 203, \(C = C_{out}\circ C_{in}\) has rate \(Rr\) and relative distance \(\delta _{out}\cdot \delta _{in} \ge (1-R)\big(H_q^{-1}(1-r)-\varepsilon \big)\). Setting \(\delta = (1-R)\big(H_q^{-1}(1-r)-\varepsilon \big)\) and solving for \(R\) gives

\[ R = 1 - \frac{\delta }{H_q^{-1}(1-r) - \varepsilon }. \]

Substituting into the rate \(Rr\) of \(C\), and then maximizing over the free parameter \(0 {\lt} r {\lt} 1\) (subject to the constraint, coming from Theorem 105, that \(\delta _{in} = H_q^{-1}(1-r) - \varepsilon \) needs to be a valid relative distance, i.e. \(\delta _{in}{\gt}0\), which after rearranging using monotonicity of \(H_q\) gives \(r {\lt} 1 - H_q(\delta +\varepsilon )\)) and letting \(\varepsilon \to 0\) yields the stated bound on the rate \(\mathscr {R}\) achievable by concatenated codes of relative distance \(\delta \).

Theorem 206 Explicit codes on the Zyablov bound, Thm 10.2.1

For every prime power \(q\), there exists an explicit \(q\)-ary code that achieves the Zyablov bound. Specifically, for every \(\varepsilon {\gt}0\) there exists an algorithm that, given \(\delta \in \big[0, 1-\frac{1}{q}\big]\) and \(n\), outputs, in time polynomial in \(n\), the generator matrix of a code of block length \(n\) and rate

\[ \mathscr {R} \ge \max _{0 {\lt} r {\lt} 1 - H_q(\delta +\varepsilon )} r \left( 1 - \frac{\delta }{H_q^{-1}(1-r) - \varepsilon } \right). \]

Proof ▶

Fix \(\varepsilon {\gt} 0\), \(\delta \) and \(n\). Let \(C_{out}\) be an \([N,K]_Q\) Reed–Solomon code (Definition 128) evaluated over \(\mathbb {F}_Q^*\) where \(Q = q^k\) and \(N = Q-1\); by Theorem 131, \(C_{out}\) meets the Singleton bound, and it has polynomial-time (indeed strongly explicit) construction given \(N\) and \(K\), so \(k = \Theta (\log N)\).

It remains to construct an inner code \(C_{in}\) of block length \(n' = O(k) = O(\log N)\) meeting the Gilbert–Varshamov bound (which exists by Theorem 105). We do not have a poly\((k)\)-time construction of such a code (that would resolve the general open question of an explicit code on the GV bound), but since \(k = O(\log N)\), it suffices to construct \(C_{in}\) in time exponential in \(k\) while remaining polynomial in \(N\). Concretely, one may either:

perform an exhaustive search among all \(q^{O(kn')}\) candidate generator matrices for one meeting the GV bound (which exists by Theorem 105); using \(k = \Theta (n')\) this search takes \(q^{O(k^2)} = N^{O(\log N)}\) time, which is quasi-polynomial in \(N\); or
construct \(C_{in}\) directly in \(q^{O(n')} = q^{O(k)} = N^{O(1)}\) time.

Using the latter (polynomial-in-\(N\)) construction for \(C_{in}\), and combining \(C_{out}\) and \(C_{in}\) via Definition 202, the overall construction of the generator matrix of \(C = C_{out}\circ C_{in}\) takes time polynomial in \(N\), hence polynomial in \(n = n' N\). By Lemma 205, letting \(\varepsilon \to 0\) along this construction, \(C\) has rate at least the claimed bound, completing the proof.

10.3 Advanced Concatenation and Strongly Explicit Constructions

Theorem 206 answers the question of an explicit code on the Zyablov bound, but its construction requires a brute-force search for a good inner code, and is thus not strongly explicit. In this section we describe a family of codes due to Justesen – the Justesen codes – that generalizes the concatenation idea of Section 10.1 and achieves the parameters of the Zyablov bound strongly explicitly. The key insight is to use \(N\) different inner codes (one per outer-codeword coordinate) drawn from a small, strongly explicit ensemble of codes, most of which lie on the GV bound; this sidesteps the need to search for a single good inner code.

Definition 207 Wozencraft ensemble

Fix a prime power \(q\) and a positive integer \(k\), and let \(N = q^k - 1\). For \(\alpha \in \mathbb {F}_{q^k}^*\), define the inner code \(C_{in}^\alpha : \mathbb {F}_q^k \to \mathbb {F}_q^{2k}\) by

\[ C_{in}^\alpha (x) = (x, \alpha x), \]

where \(x\) and \(\alpha x\) are elements of \(\mathbb {F}_{q^k}\), viewed as vectors in \(\mathbb {F}_q^k\) via the standard identification. The Wozencraft ensemble is the collection of codes \(\{ C_{in}^\alpha \} _{ \alpha \in \mathbb {F}_{q^k}^*}\), indexed by the \(N\) elements of \(\mathbb {F}_{q^k}^*\).

Lemma 208 Wozencraft codes are linear and strongly explicit

For every \(\alpha \in \mathbb {F}_{q^k}^*\), the code \(C_{in}^\alpha \) of Definition 207 is \(\mathbb {F}_q\)-linear and strongly explicit.

Proof ▶

Theorem 209 Existence of a good ensemble of inner codes, Thm 10.3.1

Let \(\varepsilon {\gt} 0\). There exists an ensemble of inner codes \(C_{in}^1, C_{in}^2,\ldots ,C_{in}^N\) of rate \(\frac{1}{2}\), where \(N = q^k - 1\), such that for at least \((1-\varepsilon )N\) values of \(i\), the code \(C_{in}^i\) has relative distance \(\ge H_q^{-1}\big(\frac{1}{2}-\varepsilon \big)\). In fact, this ensemble may be taken to be the Wozencraft ensemble \(\{ C_{in}^\alpha \} _{\alpha \in \mathbb {F}_{q^k}^*}\) of Definition 207.

Proof ▶

Fix \(\mathbf{y} = (\mathbf{y}_1,\mathbf{y}_2) \in \mathbb {F}_{q^k}^{2} \setminus \{ \mathbf{0}\} \); note this forces \(\mathbf{y}_1 \ne \mathbf{0}\) or \(\mathbf{y}_2 \ne \mathbf{0}\) (not both zero). We first claim \(\mathbf{y} \in C_{in}^\alpha \) for at most one \(\alpha \in \mathbb {F}_{q^k}^*\). Note that if \(\mathbf{y} \in C_{in}^\alpha \) then \(\mathbf{y}_2 = \alpha \cdot \mathbf{y}_1\). We case on \(\mathbf{y}\):

If \(\mathbf{y}_1 \ne \mathbf{0}\) and \(\mathbf{y}_2 \ne \mathbf{0}\), then \(\mathbf{y} \in C_{in}^\alpha \) exactly for \(\alpha = \mathbf{y}_2/\mathbf{y}_1\), a single value.
If \(\mathbf{y}_1 \ne \mathbf{0}\) and \(\mathbf{y}_2 = \mathbf{0}\), then \(\mathbf{y} \notin C_{in}^\alpha \) for every \(\alpha \in \mathbb {F}_{q^k}^*\), since \(\alpha \mathbf{y}_1 \ne \mathbf{0}\) whenever \(\alpha , \mathbf{y}_1 \in \mathbb {F}_{q^k}^*\).
If \(\mathbf{y}_1 = \mathbf{0}\) and \(\mathbf{y}_2 \ne \mathbf{0}\), then \(\mathbf{y} \notin C_{in}^\alpha \) for every \(\alpha \in \mathbb {F}_{q^k}^*\), since \(\alpha \mathbf{y}_1 = \mathbf{0} \ne \mathbf{y}_2\).

This proves the claim.

Now assume \(\mathrm{wt}(\mathbf{y}) {\lt} H_q^{-1}\big(\frac{1}{2}- \varepsilon \big)\cdot 2k\). If \(\mathbf{y} \in C_{in}^\alpha \) for some \(\alpha \), then (by the claim just proved) \(C_{in}^\alpha \) is “bad”, i.e. has relative distance \({\lt} H_q^{-1}\big(\frac{1}{2}-\varepsilon \big)\), for at most this one value of \(\alpha \). So the total number of bad values of \(\alpha \) is at most the number of nonzero vectors of weight \({\lt} H_q^{-1}(\frac12-\varepsilon )\cdot 2k\) in \(\mathbb {F}_q^{2k}\), which by the upper bound on the volume of a Hamming ball (Proposition 95, part (i)) is at most

\[ \mathrm{Vol}_q\Big(H_q^{-1}\big(\tfrac 12-\varepsilon \big)\cdot 2k,\, 2k\Big) \le q^{H_q\left(H_q^{-1}(\frac12-\varepsilon )\right)\cdot 2k} = q^{(\frac12-\varepsilon )\cdot 2k} = \frac{q^k}{q^{2\varepsilon k}} {\lt} \varepsilon (q^k-1) = \varepsilon N, \]

where the strict inequality holds for large enough \(k\). Thus for at least \((1-\varepsilon )N\) values of \(\alpha \in \mathbb {F}_{q^k}^*\), the code \(C_{in}^\alpha \) has relative distance at least \(H_q^{-1}\big(\frac12-\varepsilon \big)\), as desired. Each \(C_{in}^\alpha \) has rate \(k/(2k) = 1/2\) by construction, completing the proof.

10.3.1 The Justesen code

Definition 210 Justesen code

Fix a prime power \(q\), a positive integer \(k\), and \(0{\lt}R{\lt}1\). Let \(C_{out}\) be the \([N,K]_{q^k}\) Reed–Solomon code (Definition 128) of rate \(R\), evaluated over \(\mathbb {F}_{q^k}^*\), where \(N = q^k-1\); it has relative distance \(\delta _{out} = 1-R\). Let \(\{ C_{in}^\alpha \} _{\alpha \in \mathbb {F}_{q^k}^*}\) be the Wozencraft ensemble of Definition 207, and write \(C_{in}^1,\ldots , C_{in}^N\) for its \(N\) codes in some fixed order matching the \(N\) coordinates of \(C_{out}\). The Justesen code is the concatenated code

\[ C^* \overset {\mathrm{def}}{=} C_{out} \circ (C_{in}^1,\ldots ,C_{in}^N), \]

defined, for a message \(\mathbf{m}\in (\mathbb {F}_{q^k})^K\) with \((c_1,\ldots ,c_N)=C_{out}(\mathbf{m})\), by \(C^*(\mathbf{m}) = \big(C_{in}^1(c_1),\ldots ,C_{in}^N(c_N)\big)\). Since each \(C_{in}^i\) has rate \(\tfrac 12\), \(C^*\) has rate \(\tfrac {R}{2}\).

Proposition 211 Distance of the Justesen code, Prop 10.3.2

Let \(\varepsilon {\gt}0\). The Justesen code \(C^*\) of Definition 210 has relative distance at least \((1-R-\varepsilon )\cdot H_q^{-1}\big(\frac12-\varepsilon \big)\).

Proof ▶

Consider arbitrary \(\mathbf{m}^1 \ne \mathbf{m}^2 \in (\mathbb {F}_{q^k})^K\). By the distance of the outer Reed–Solomon code, \(|S| \ge (1-R)N\) where

\[ S = \{ i \mid C_{out}(\mathbf{m}^1)_i \ne C_{out}(\mathbf{m}^2)_i \} . \]

Call the \(i\)-th inner code \(C_{in}^i\) good if it has relative distance at least \(d/(2k)\) where \(d \overset {\mathrm{def}}{=} H_q^{-1}\big(\frac12-\varepsilon \big)\cdot 2k\), and bad otherwise. By Theorem 209, at most \(\varepsilon N\) of the \(N\) inner codes are bad. Let \(S_g\) be the good indices in \(S\) and \(S_b\) the bad indices in \(S\); since \(|S_b| \le \varepsilon N\),

\[ |S_g| = |S| - |S_b| \ge (1-R-\varepsilon )N. \]

For each good \(i \in S\), since \(C_{out}(\mathbf{m}^1)_i \ne C_{out}(\mathbf{m}^2)_i\) and \(C_{in}^i\) has relative distance at least \(d/(2k)\),

\[ \Delta \big(C_{in}^i(C_{out}(\mathbf{m}^1)_i), C_{in}^i(C_{out}(\mathbf{m}^2)_i)\big) \ge d. \]

Since \(C^*(\mathbf{m}^1)\) and \(C^*(\mathbf{m}^2)\) disagree in at least \(d\) coordinates within the block corresponding to each of the at least \(|S_g| \ge (1-R-\varepsilon )N\) good positions in \(S\), the distance of \(C^*\) is at least

\[ (1-R-\varepsilon )N \cdot d = (1-R-\varepsilon )\, N \cdot H_q^{-1} \big(\tfrac 12-\varepsilon \big)\cdot 2k. \]

Dividing by the block length \(2kN\) of \(C^*\) gives relative distance at least \((1-R-\varepsilon )\cdot H_q^{-1}\big(\frac12-\varepsilon \big)\), as desired.

Corollary 212 The Justesen code is asymptotically good and strongly explicit, Cor 10.3.3

The concatenated code \(C^*\) of Definition 210 is an asymptotically good code and is strongly explicit.

Proof ▶

Asymptotically good. For any fixed \(0{\lt}R{\lt}1\), \(C^*\) has rate \(R/2 {\gt} 0\), a constant independent of the block length. Fixing \(\varepsilon {\lt} \frac12(1-R)\), Proposition 211 shows the relative distance of \(C^*\) is bounded below by the positive constant \((1-R-\varepsilon )H_q^{-1}(\frac12-\varepsilon ) {\gt} 0\), independent of the block length. Hence both the rate and relative distance of the family \(\{ C^*\} \) (as \(k\), and hence the block length \(2kN=2k(q^k-1)\), grows) are bounded away from \(0\), so \(C^*\) is asymptotically good.

Strongly explicit. By Lemma 208, each code \(C_{in}^\alpha \) in the Wozencraft ensemble is strongly explicit, i.e. every entry of its (size \(k\times 2k\)) generator matrix can be computed in time polynomial in \(\log (2k)\). The Reed–Solomon outer code \(C_{out}\) has the explicit Vandermonde generator matrix of Construction 132, whose \((i,j)\)-th entry \(\alpha _j^{\, i}\) is likewise computable in time polynomial in \(\log N\) given standard succinct representations of \(\mathbb {F}_{q^k}\)-arithmetic. Composing these two strongly explicit descriptions as in the generator matrix construction of Lemma 204, each entry of a generator matrix of \(C^* = C_{out}\circ (C_{in}^1,\ldots , C_{in}^N)\) can be computed in time polynomial in \(\log (2kN)\), i.e. polynomial in the logarithm of the block length of \(C^*\). Hence \(C^*\) is strongly explicit.

10.4 Summary of Concatenation

By concatenating an outer code of distance \(D\) with an inner code of distance \(d\), we obtain a code of designed distance \(Dd\) (Theorem 203). Taking the outer code to meet the Singleton bound and the inner code to meet the Gilbert–Varshamov bound gives the Zyablov bound (Lemma 205), which can be made explicit (Theorem 206) and, via the Justesen construction with a strongly explicit ensemble of inner codes on (most of) the GV bound (the Wozencraft ensemble, Theorem 209), even strongly explicit (Corollary 212). This gives a strongly explicit family of asymptotically good binary codes, answering Question 8.3.2 (cross-ref, Ch. 8). The remaining natural question of whether concatenated codes can be efficiently decoded up to half their designed distance is deferred to Chapter 14 (cross-ref).