← All blueprints

Essential Coding Theory — Blueprint

7 Bridging the Gap Between Shannon and Hamming: List Decoding

Definition 153 List decodable code, Def 7.2.1

Given \(0 \leq \rho \leq 1\) and \(L \geq 1\), a code \(C \subseteq \Sigma ^n\) is \((\rho , L)\)-list decodable if for every received word \(\mathbf{y} \in \Sigma ^n\),

\[ \big| \{ c \in C \mid \Delta (\mathbf{y}, c) \leq \rho n \} \big| \leq L. \]

A list-decoding algorithm for \(C\), given an error parameter \(\rho \), a received word \(\mathbf{y}\), is required to output all codewords of \(C\) within relative Hamming distance \(\rho \) of \(\mathbf{y}\); in particular, if the fraction of errors that occurred is at most \(\rho \), the transmitted codeword is guaranteed to appear in the output list. If \(C\) is \((\rho , L)\)-list decodable then any such algorithm outputs at most \(L\) codewords. The list-decodability of an infinite family of codes \(\mathscr {C}\) (with polynomial-sized lists) is the limit of \(\rho \) over all polynomials \(L\) such that all sufficiently long codes \(C \in \mathscr {C}\) are \((\rho , L)\)-list decodable.

Theorem 154 Johnson Bound, Thm 7.3.1

Let \(C \subseteq [q]^n\) be a code of distance \(d\). If \(\rho {\lt} J_q\! \left(\frac{d}{n}\right)\), then \(C\) is a \((\rho , qdn)\)-list decodable code, where the function \(J_q(\delta )\) is defined as

\[ J_q(\delta ) = \left(1 - \frac{1}{q}\right)\left(1 - \sqrt{1 - \frac{q\delta }{q-1}}\right). \]
Proof

We prove the theorem here only for the binary case, \(q = 2\); the general case is left by the book as an exercise. We use double counting: we bound the same quantity above and below, and derive the desired bound from the resulting inequality.

For notational convenience, let \(e = \rho n\). We must show that for every binary code \(C \subseteq \{ 0,1\} ^n\) with distance \(d\) (i.e., \(\Delta (\mathbf{c}_1, \mathbf{c}_2) \geq d\) for all distinct \(\mathbf{c}_1 \neq \mathbf{c}_2 \in C\)) and every \(\mathbf{y} \in \{ 0,1\} ^n\),

\[ |B(\mathbf{y}, e) \cap C| \leq 2dn. \]

Fix arbitrary \(C\) and \(\mathbf{y}\). Let \(\mathbf{c}_1, \ldots , \mathbf{c}_M \in B(\mathbf{y}, e)\); we show \(M \leq 2dn\). Define \(\mathbf{c}_i' = \mathbf{c}_i - \mathbf{y}\) for \(1 \leq i \leq M\), and let

\[ S = \sum _{i {\lt} j} \Delta (\mathbf{c}_i', \mathbf{c}_j'). \]

We bound \(S\) above and below.

Lower bound on \(S\). Since \(\mathbf{c}_i \in B(\mathbf{y}, e)\), we have \(\mathrm{wt}(\mathbf{c}_i') \leq e\) for all \(1 \leq i \leq M\). Since \(\Delta (\mathbf{c}_i, \mathbf{c}_j) \geq d\) for \(i \neq j\), subtracting \(\mathbf{y}\) from both coordinates preserves Hamming distance, so \(\Delta (\mathbf{c}_i', \mathbf{c}_j') \geq d\) for every \(i \neq j\). Hence

\[ S \geq \binom {M}{2} d. \]

Upper bound on \(S\). Consider the \(n \times M\) matrix whose columns are \(\mathbf{c}_1'^{\, T}, \ldots , \mathbf{c}_M'^{\, T}\). Let \(m_i\) be the number of \(1\)’s in row \(i\), for \(1 \leq i \leq n\). Row \(i\) contributes \(m_i(M - m_i)\) to \(S\), since this is exactly the number of \(0\)-\(1\) pairs in that row (each such pair contributes exactly \(1\) to the Hamming distance sum \(S\)). Hence

\[ S = \sum _{i=1}^n m_i (M - m_i). \]

Define \(\bar e\) by \(\bar e M = \sum _{i=1}^n m_i\). Since \(\sum _{i=1}^n m_i = \sum _{j=1}^M \mathrm{wt}(\mathbf{c}_j') \leq eM\), we get \(\bar e \leq e\). By the Cauchy-Schwarz inequality (taking \(\mathbf{x} = (m_1, \ldots , m_n)\), \(\mathbf{z} = (1/n, \ldots , 1/n)\)),

\[ \left( \frac{\sum _{i=1}^n m_i}{n} \right)^2 \leq \left( \sum _{i=1}^n m_i^2 \right) \frac{1}{n}, \]

i.e., \(\sum _{i=1}^n m_i^2 \geq (\bar e M)^2 / n\). Thus

\[ S = M \sum _{i=1}^n m_i - \sum _{i=1}^n m_i^2 \leq M^2 \bar e - \frac{(M \bar e)^2}{n} = M^2 \left( \bar e - \frac{\bar e^2}{n} \right). \]

Combining the two bounds,

\[ M^2 \left( \bar e - \frac{\bar e^2}{n} \right) \geq \binom {M}{2} d = \frac{M(M-1)}{2} d, \]

which, dividing by \(M {\gt} 0\) and rearranging, gives

\[ M \leq \frac{dn}{dn - 2n\bar e + 2\bar e^2} = \frac{2dn}{(n - 2\bar e)^2 - n(n-2d)} \leq \frac{2dn}{(n-2e)^2 - n(n-2d)}, \]

where the middle equality is a direct algebraic identity, and the last inequality holds because \(\bar e \leq e \leq n/2\), so \((n - 2\bar e)^2 \geq (n-2e)^2\) and hence the denominator only shrinks.

Finally, since \(\rho = e/n {\lt} J_2(d/n) = \frac12\left(1 - \sqrt{1 - 2d/n}\right)\), we get \(n - 2e {\gt} \sqrt{n(n-2d)}\), i.e., \((n-2e)^2 {\gt} n(n-2d)\). Since \(n, e\) are integers, \((n-2e)^2 - n(n-2d) \geq 1\), and therefore \(M \leq 2dn/1 = 2dn\), as required.

Lemma 155 Properties of \(J_q\), Lemma 7.3.2
#

Let \(q \geq 2\) be an integer and let \(0 \leq x \leq 1 - \frac1q\). Then the following inequalities hold:

\[ J_q(x) \geq 1 - \sqrt{1-x} \geq \frac{x}{2}, \]

where the second inequality is strict for \(x {\gt} 0\).

Proof

We first prove

\[ \left(1 - \frac1q\right)\left(1 - \sqrt{1 - \frac{xq}{q-1}}\right) \geq 1 - \sqrt{1-x}. \]

Both sides are zero at \(x=0\). The derivative of the left-hand side is \(\frac{1}{2\sqrt{1 - xq/(q-1)}}\) and of the right-hand side is \(\frac{1}{2\sqrt{1-x}}\); the former is always at least as large as the latter (since \(xq/(q-1) \geq x\)), so the left-hand side increases at least as rapidly as the right-hand side as \(x\) increases from \(0\), which proves the inequality (this is exactly \(J_q(x) \geq 1 - \sqrt{1-x}\)).

For the second inequality, since \(x \geq 0\),

\[ 1 - x + \frac{x^2}{4} \geq 1-x, \]

which implies \(\left(1 - \frac{x}{2}\right)^2 \geq 1-x\), i.e., \(1 - \frac{x}{2} \geq \sqrt{1-x}\) (both sides nonnegative for \(x \leq 1\)), which in turn gives \(1 - \sqrt{1-x} \geq x/2\). Both inequalities used here are strict for \(x{\gt}0\), so \(1 - \sqrt{1-x} {\gt} x/2\) for every \(x {\gt} 0\), as desired.

Theorem 156 Alphabet-Free Johnson Bound, Thm 7.3.3

For every \(q\)-ary code with block length \(n\) and distance \(d\), if \(e \leq n - \sqrt{n(n-d)}\), then the code is \((e/n, qnd)\)-list decodable.

Proof

Let \(\delta = d/n\) and \(\rho = e/n\). The hypothesis \(e \leq n - \sqrt{n(n-d)}\) is equivalent to

\[ \rho \leq 1 - \sqrt{\frac{n-d}{n}} = 1 - \sqrt{1-\delta }. \]

By Lemma 155 (with \(x = \delta \)), \(J_q(\delta ) \geq 1 - \sqrt{1-\delta } \geq \rho \). Hence \(\rho \leq J_q(\delta ) = J_q(d/n)\), so by the Johnson Bound (Theorem 154), the code is \((\rho , qdn)\)-list decodable, i.e., \((e/n, qnd)\)-list decodable.

Theorem 157 List-decoding trade-off, Thm 7.4.1

Let \(q \geq 2\), \(0 \leq \rho {\lt} 1 - \frac1q\), and \(\varepsilon {\gt} 0\) be a small enough real. Then the following holds for codes of large enough block length \(n\):

  1. If \(R \leq 1 - H_q(\rho ) - \varepsilon \), then there exists a \(\big(\rho , O(\frac1\varepsilon )\big)\)-list decodable code.

  2. If \(R \geq 1 - H_q(\rho ) + \varepsilon \), every \((\rho , L)\)-list decodable code has \(L \geq q^{\Omega (\varepsilon n)}\).

Proof

Immediate from Theorem 161 by taking \(L = \lceil 1/\varepsilon \rceil \): part (i) of Theorem 161 then gives a \((\rho , L)\)-list decodable code, i.e., a \((\rho , O(1/\varepsilon ))\)-list decodable code, of rate \(R \leq 1 - H_q(\rho ) - 1/L \leq 1 - H_q(\rho ) - \varepsilon /2\) (for \(\varepsilon \) small enough this is at most \(1 - H_q(\rho ) - \varepsilon \) up to renaming constants), giving (i); and part (ii) of Theorem 161 directly gives (ii).

Definition 158 List-decoding capacity

The list-decoding capacity (of the \(q\)-ary worst-case error channel), as a function of the fraction of errors \(\rho \), is \(1 - H_q(\rho )\).

Lemma 159 List-decoding capacity: existence, Thm 7.4.2(i)

Let \(q \geq 2\) be an integer, \(0 {\lt} \rho {\lt} 1 - \frac1q\) a real number, and \(L \geq 1\) an integer. Then there exists a \((\rho , L)\)-list decodable code with rate

\[ R \leq 1 - H_q(\rho ) - \frac1L. \]
Proof

We use the probabilistic method: we pick a random code and show it satisfies the required property with nonzero probability, by showing the probability of a “bad event” is small.

Pick a code \(C\) at random where \(|C| = q^k\) with \(k \leq \left(1 - H_q(\rho ) - \frac1L\right) n\), i.e., for every message \(\mathbf{m} \in [q]^k\), choose \(C(\mathbf{m})\) uniformly and independently at random from \([q]^n\).

Given \(\mathbf{y} \in [q]^n\) and distinct \(\mathbf{m}_0, \ldots , \mathbf{m}_L \in [q]^k\), say the tuple \((\mathbf{y}, \mathbf{m}_0, \ldots , \mathbf{m}_L)\) defines a bad event if

\[ C(\mathbf{m}_i) \in B(\mathbf{y}, \rho n) \text{ for all } 0 \leq i \leq L. \]

A code is \((\rho ,L)\)-list decodable if and only if no bad event occurs (a bad event witnesses \(L+1\) codewords all within distance \(\rho n\) of some \(\mathbf{y}\), exceeding the list-size bound \(L\)).

Fix \(\mathbf{y}\) and \(\mathbf{m}_0, \ldots , \mathbf{m}_L\). For each fixed \(i\), by the random choice of \(C\),

\[ \Pr [C(\mathbf{m}_i) \in B(\mathbf{y}, \rho n)] = \frac{\mathrm{Vol}_q(\rho n, n)}{q^n} \leq q^{-n(1-H_q(\rho ))}. \]

Since the random choices of codewords for distinct messages are independent,

\[ \Pr \Big[ \bigwedge _{i=0}^L C(\mathbf{m}_i) \in B(\mathbf{y}, \rho n) \Big] = \prod _{i=0}^L \Pr [C(\mathbf{m}_i) \in B(\mathbf{y}, \rho n)] \leq \left(q^{-n(1-H_q(\rho ))}\right)^{L+1} = q^{-n(L+1)(1-H_q(\rho ))}. \]

By the union bound over the \(q^n\) choices of \(\mathbf{y}\) and the \(\binom {q^k}{L+1}\) choices of \(L+1\) distinct messages,

\[ \Pr [\text{there is a bad event}] \leq q^n \binom {q^k}{L+1} q^{-n(L+1)(1-H_q(\rho ))} \leq q^n \cdot q^{k(L+1)} \cdot q^{-n(L+1)(1-H_q(\rho ))} = q^{-n(L+1)\left[1 - H_q(\rho ) - \frac{1}{L+1} - R\right]}, \]

using \(\binom {a}{b} \leq a^b\) and \(k = Rn\). By assumption \(R \leq 1 - H_q(\rho ) - 1/L\), so \(1 - H_q(\rho ) - R \geq 1/L\), hence

\[ 1 - H_q(\rho ) - \frac{1}{L+1} - R \geq \frac1L - \frac{1}{L+1} = \frac{1}{L(L+1)}, \]

and so

\[ \Pr [\text{there is a bad event}] \leq q^{-n(L+1) \cdot \frac{1}{L(L+1)}} = q^{-n/L} {\lt} 1. \]

Since the probability of a bad event is strictly less than \(1\), by the probabilistic method there exists a code \(C\) with no bad event, i.e., a \((\rho , L)\)-list decodable code of rate \(R \leq 1 - H_q(\rho ) - 1/L\).

Lemma 160 List-decoding capacity: necessity, Thm 7.4.2(ii)

Let \(q \geq 2\) be an integer and \(0 {\lt} \rho {\lt} 1 - \frac1q\) a real number. For every \((\rho , L)\) code \(C\) of rate \(1 - H_q(\rho ) + \varepsilon \), it is necessary that \(L \geq 2^{\Omega (\varepsilon n)}\).

Proof

We show, via the probabilistic method, that there exists \(\mathbf{y} \in [q]^n\) such that \(|C \cap B(\mathbf{y}, \rho n)|\) is exponentially large (in particular \(q^{\Omega (\varepsilon n)}\)); since a \((\rho , L)\)-list decodable code must have \(|C \cap B(\mathbf{y}, \rho n)| \leq L\) for every \(\mathbf{y}\), this forces \(L \geq q^{\Omega (\varepsilon n)} = 2^{\Omega (\varepsilon n)}\) (as \(q \geq 2\) is a constant).

Pick \(\mathbf{y} \in [q]^n\) uniformly at random. Fix \(\mathbf{c} \in C\); then

\[ \Pr [\mathbf{c} \in B(\mathbf{y}, \rho n)] = \Pr [\mathbf{y} \in B(\mathbf{c}, \rho n)] = \frac{\mathrm{Vol}_q(\rho n, n)}{q^n} \geq q^{-n(1-H_q(\rho )) - o(n)}. \]

By linearity of expectation,

\[ \mathbb {E}\big[ |C \cap B(\mathbf{y}, \rho n)| \big] = \sum _{\mathbf{c} \in C} \Pr [\mathbf{c} \in B(\mathbf{y}, \rho n)] \geq |C| \cdot q^{-n(1-H_q(\rho )) - o(n)} = q^{n[R - 1 + H_q(\rho ) - o(1)]} \geq q^{\Omega (\varepsilon n)}, \]

using \(|C| = q^{Rn}\) and \(R \geq 1 - H_q(\rho ) + \varepsilon \). Since the expectation of \(|C \cap B(\mathbf{y}, \rho n)|\) over uniformly random \(\mathbf{y}\) is at least \(q^{\Omega (\varepsilon n)}\), there must exist a specific \(\mathbf{y}\) with \(|C \cap B(\mathbf{y}, \rho n)| \geq q^{\Omega (\varepsilon n)}\), as desired.

Theorem 161 List-Decoding Capacity, Thm 7.4.2

Let \(q \geq 2\) be an integer, and \(0 {\lt} \rho {\lt} 1 - \frac1q\) be a real number.

  1. Let \(L \geq 1\) be an integer; then there exists a \((\rho , L)\)-list decodable code with rate \(R \leq 1 - H_q(\rho ) - \frac1L\).

  2. For every \((\rho , L)\) code of rate \(1 - H_q(\rho ) + \varepsilon \), it is necessary that \(L \geq 2^{\Omega (\varepsilon n)}\).

Proof

Part (i) is Lemma 159 and part (ii) is Lemma 160.

Definition 162 Codeword associated with an error pattern, Def 7.5.3

Given a code \(C\), a fixed codeword \(\mathbf{c} \in C\), and an error pattern \(\mathbf{e}\) such that some codeword of \(C\) lies within Hamming distance \(\rho n\) of \(\mathbf{c} + \mathbf{e}\), the error pattern \(\mathbf{e}\) is associated with a codeword \(c(\mathbf{e})\), namely the closest codeword of \(C\) that lies within Hamming distance \(\rho n\) from \(\mathbf{c} + \mathbf{e}\).

Lemma 163 Agreement determines the codeword, Lemma 7.5.2

Let \(C\) be an \((n,k,d)_q\) code. If we fix the values in \(n-d+1\) out of the \(n\) positions in a possible codeword, then at most one codeword in \(C\) can agree with the fixed values.

Proof

Suppose two distinct codewords \(\mathbf{c}_1 \neq \mathbf{c}_2 \in C\) both agree with the fixed values on the same set of \(n-d+1\) positions. Then \(\mathbf{c}_1\) and \(\mathbf{c}_2\) agree with each other on those \(n-d+1\) positions, so they can disagree on at most \(n - (n-d+1) = d-1\) positions, i.e., \(\Delta (\mathbf{c}_1, \mathbf{c}_2) \leq d - 1 {\lt} d\). This contradicts the minimum distance of \(C\) being \(d\). Hence at most one codeword of \(C\) can agree with the fixed values on any fixed set of \(n-d+1\) positions.

Theorem 164 List decoding from random errors, Thm 7.5.1

Let \(\varepsilon {\gt} 0\) be a real and \(q \geq 2^{\Omega (1/\varepsilon )}\) be an integer. Then the following is true for every \(0 {\lt} \delta {\lt} 1 - 1/q\) and large enough \(n\). Let \(C \subseteq \{ 0, 1, \ldots , q-1\} ^n\) be a code with relative distance \(\delta \) and let \(S \subseteq [n]\) be such that \(|S| = (1-\rho )n\), where \(0 {\lt} \rho \leq \delta - \varepsilon \).

Then, for all \(\mathbf{c} \in C\) and all but a \(q^{-\Omega (\varepsilon n)}\) fraction of error patterns \(\mathbf{e} \in \{ 0, 1, \ldots , q-1\} ^n\) such that

\[ \mathbf{e}_S = \mathbf{0} \quad \text{and} \quad \mathrm{wt}(\mathbf{e}) = \rho n, \]

the only codeword within Hamming distance \(\rho n\) of \(\mathbf{c} + \mathbf{e}\) is \(\mathbf{c}\) itself.

Proof

Fix \(\mathbf{c} \in C\) for the rest of the proof, and let \(d = \delta n\). Let \(\mathcal{E}_S\) be the set of all error patterns \(\mathbf{e}\) with \(\mathbf{e}_S = \mathbf{0}\) and \(\mathrm{wt}(\mathbf{e}) = \rho n\); since \(\mathbf{e}_S = \mathbf{0}\) and \(\mathrm{wt}(\mathbf{e}) = \rho n = |[n] \setminus S|\), every position of \([n] \setminus S\) must be nonzero in \(\mathbf{e}\), and each such position has \(q-1\) nonzero choices, so

\[ |\mathcal{E}_S| = (q-1)^{\rho n}. \]

Call \(\mathbf{e} \in \mathcal{E}_S\) bad if there exists another codeword \(\mathbf{c}' \neq \mathbf{c}\) with \(\Delta (\mathbf{c}', \mathbf{c} + \mathbf{e}) \leq \rho n\). We must show the number of bad error patterns is at most \(q^{-\Omega (\varepsilon n)}|\mathcal{E}_S|\).

For a bad \(\mathbf{e}\), let \(c(\mathbf{e}) \neq \mathbf{c}\) be its associated codeword (Definition 162), and let \(A \subseteq [n]\) be the set of positions where \(c(\mathbf{e})\) agrees with \(\mathbf{c} + \mathbf{e}\). Write \(|A| = \alpha n\). Since \(\Delta (c(\mathbf{e}), \mathbf{c}+\mathbf{e}) \leq \rho n\), they agree on at least \((1-\rho )n\) positions, so

\[ \alpha \geq 1 - \rho \geq 1 - \delta + \varepsilon . \]

Fix \(A\) with \(|A| = \alpha n\); we count how many bad \(\mathbf{e}\) map to this \(A\), and later aggregate over all (at most \(2^n\)) choices of \(A\). Let \(A_1 = A \cap S\) and \(A_2 = A \setminus A_1\), and write \(|A_1| = \beta n\), so \(|A_2| = (\alpha - \beta ) n\) and \(\beta \leq \alpha \).

We overestimate the number of \(\mathbf{e}\) mapping to \((A_1, A_2)\) as follows.

  1. First, the values of \(\mathbf{e}\) on \([n] \setminus (S \cup A_2)\) must all be nonzero (since \(\mathbf{e}\) is supported exactly on \([n]\setminus S\)), so the number of possible values of \(\mathbf{e}_{[n]\setminus (S \cup A_2)}\) is at most

    \[ (q-1)^{n - |S| - |A_2|} \leq q^{\, n - (1-\rho )n - (\alpha -\beta )n}. \]
  2. Fix a nonzero choice \(\mathbf{x}\) for \(\mathbf{e}_{[n]\setminus (S \cup A_2)}\). Since \(A_1 \subseteq S\), we have \(\mathbf{e}_{A_1} = \mathbf{0}\), so \((\mathbf{c} + \mathbf{e})_{A_1} = \mathbf{c}_{A_1}\); since \(A_1 \subseteq A\), \(c(\mathbf{e})\) agrees with \(\mathbf{c}+\mathbf{e}\) on \(A_1\), so \(c(\mathbf{e})_{A_1} = \mathbf{c}_{A_1}\) is already determined (as \(\mathbf{c}\) is fixed). By Lemma 163 applied to \(C\) (an \((n, k, d)_q\) code), fixing the values of \(c(\mathbf{e})\) on any \(n - d + 1 = (1-\delta )n + 1\) positions determines \(c(\mathbf{e})\) completely; since \(|A_1| = \beta n\) positions of \(c(\mathbf{e})\) are already fixed (equal to \(\mathbf{c}_{A_1}\)), fixing a further \((1-\delta )n + 1 - \beta n\) positions of \(c(\mathbf{e})_{A_2}\) determines \(c(\mathbf{e})\) completely, and hence determines \(\mathbf{e}\) completely (via \(\mathbf{e} = c(\mathbf{e}) - \mathbf{c}\) on \(A\), together with the already-fixed values elsewhere). Thus the number of choices for \(\mathbf{e}\) compatible with this \(\mathbf{x}\) is at most

    \[ q^{(1-\delta )n + 1 - \beta n}. \]

Combining, the number of bad \(\mathbf{e}\) mapping to \((A_1, A_2)\) is at most

\[ q^{\, n - (1-\rho )n - (\alpha -\beta )n} \cdot q^{(1-\delta )n + 1 - \beta n} = q^{\, \rho n - \alpha n + (1-\delta ) n + 1} \leq q^{\, \rho n - \varepsilon n + 1} = q^{-\varepsilon n + 1} |\mathcal{E}_S|, \]

where the inequality uses \(\alpha \geq 1 - \delta + \varepsilon \) (so \(-\alpha n \leq -(1 - \delta + \varepsilon ) n\), and \(\rho n - \alpha n + (1-\delta )n \leq \rho n - \varepsilon n\)), and the last equality uses \(|\mathcal{E}_S| = (q-1)^{\rho n} \leq q^{\rho n}\) together with absorbing the base change into the asymptotic notation (as in the book).

Finally, summing over all (at most \(2^n\)) choices of \(A = (A_1, A_2)\), the total number of bad error patterns is at most

\[ 2^n \cdot q^{-\varepsilon n + 1} |\mathcal{E}_S| = q^{n/\log _2 q} \cdot q^{-\varepsilon n + 1} |\mathcal{E}_S| = q^{\, n/\log _2 q - \varepsilon n + 1} |\mathcal{E}_S| \leq q^{- \varepsilon n/4} |\mathcal{E}_S|, \]

where the last inequality holds because for \(q \geq \Omega (1/\varepsilon )\) and \(n\) large enough, \(n/\log _2 q + 1 {\lt} 3\varepsilon n /4\). Hence the fraction of bad error patterns in \(\mathcal{E}_S\) is at most \(q^{-\Omega (\varepsilon n)}\), i.e., for all but a \(q^{-\Omega (\varepsilon n)}\) fraction of \(\mathbf{e} \in \mathcal{E}_S\), \(\mathbf{e}\) is not bad, which means \(\mathbf{c}\) is the unique codeword within Hamming distance \(\rho n\) of \(\mathbf{c} + \mathbf{e}\), as required.