1 The Fundamental Question
A code of block length \(n\) over an alphabet \(\Sigma \) is a subset \(C \subseteq \Sigma ^n\). We use \(q\) to denote the alphabet size \(|\Sigma |\) (which need not be a constant and may depend on \(n\)).
Equivalently, writing \(|C| = M\), we may view \(C\) as a mapping
where \([M] \overset {\mathrm{def}}{=} \{ 1,2,\dots ,M\} \); this is the convention we use whenever we speak of “the codeword corresponding to message \(\mathbf m\),” written \(C(\mathbf m)\).
Given a code \(C \subseteq \Sigma ^n\) with \(|\Sigma |=q\), its dimension is
The parity code \(C_\oplus \) is the binary code (\(\Sigma =\{ 0,1\} \)) of block length \(5\) with message length \(4\) defined by
where \(\oplus \) denotes XOR. That is, \(C_\oplus \) appends the parity of the message bits to the message. It has parameters \(q=2\), \(k=4\), \(n=5\), \(R=4/5\).
The repetition code \(C_{3,rep}\) is the binary code of block length \(12\) with message length \(4\) defined by
It has parameters \(q=2\), \(k=4\), \(n=12\), \(R=1/3\).
The rate of a code with dimension \(k\) and block length \(n\) is
Since \(k \le n\) (and we always assume \(k{\gt}0\), \(n{\lt}\infty \)), we have \(0 {\lt} R \le 1\).
Let \(C \subseteq \Sigma ^n\). An encoding function for \(C\) is an injective map \(E : [|C|] \to \Sigma ^n\) whose image is \(C\).
Let \(C \subseteq \Sigma ^n\) be a code. A mapping \(D : \Sigma ^n \to [|C|]\) is called a decoding function for \(C\).
Given two vectors \(\mathbf u, \mathbf v \in \Sigma ^n\), the Hamming distance between \(\mathbf u\) and \(\mathbf v\), denoted \(\Delta (\mathbf u,\mathbf v)\), is the number of positions in which \(\mathbf u\) and \(\mathbf v\) differ. The relative Hamming distance is
An \(n\)-symbol \(t\)-Error Channel over the alphabet \(\Sigma \) is a function \(\mathrm{Ch} : \Sigma ^n \to \Sigma ^n\) that satisfies \(\Delta (\mathbf v, \mathrm{Ch}(\mathbf v)) \le t\) for every \(\mathbf v \in \Sigma ^n\).
Let \(C \subseteq \Sigma ^n\) be a code and let \(t \ge 1\) be an integer. \(C\) is said to be a \(t\)-error-correcting code if there exists a decoding function \(D\) such that for every message \(\mathbf m \in [|C|]\) and every \(t\)-error channel \(\mathrm{Ch}\) we have \(D(\mathrm{Ch}(C(\mathbf m))) = \mathbf m\).
Let \(C \subseteq \Sigma ^n\) be a code and let \(t \ge 1\) be an integer. \(C\) is said to be a \(t\)-error-detecting code if there exists a detecting procedure \(D\) such that for every message \(\mathbf m\) and every received vector \(\mathbf v \in \Sigma ^n\) satisfying \(\Delta (C(\mathbf m),\mathbf v) \le t\), it holds that \(D\) outputs \(1\) if \(\mathbf v = C(\mathbf m)\) and \(0\) otherwise; i.e.
An \(n\)-symbol \(t\)-Erasure Channel over the alphabet \(\Sigma \) is a function \(\mathrm{Ch} : \Sigma ^n \to (\Sigma \cup \{ ?\} )^n\) that satisfies \(\Delta (\mathbf v,\mathrm{Ch}(\mathbf v)) \le t\) for every \(\mathbf v \in \Sigma ^n\) (viewing both arguments of \(\Delta (\cdot ,\cdot )\) as elements of \((\Sigma \cup \{ ?\} )^n\)), and such that for every \(i\in [n]\) with \(\mathbf v_i \neq \mathrm{Ch}(\mathbf v)_i\) we have \(\mathrm{Ch}(\mathbf v)_i = {?}\). A coordinate \(i\) with \(\mathrm{Ch}(\mathbf v)_i = {?}\) is called an erasure.
Let \(C \subseteq \Sigma ^n\) be a code and let \(t \ge 1\) be an integer. \(C\) is said to be a \(t\)-erasure-correcting code if there exists a decoding function \(D\) such that for every message \(\mathbf m \in [|C|]\) and every \(t\)-erasure channel \(\mathrm{Ch}\) we have \(D(\mathrm{Ch}(C(\mathbf m))) = \mathbf m\).
\(C_{3,rep}\) is a \(1\)-error-correcting code.
Consider the decoding function that, given a received word \(\mathbf y \in \{ 0,1\} ^{12}\), splits it into four consecutive blocks \(\mathbf y_1,\mathbf y_2,\mathbf y_3,\mathbf y_4\) of three bits each and outputs, for each block, the majority bit among its three bits as the corresponding message bit.
Suppose a codeword \(C_{3,rep}(x_1,x_2,x_3,x_4)\) is transmitted over a \(1\)-error channel, i.e. the total number of bit flips among the \(12\) transmitted bits is at most \(1\). Since a single error can affect at most one of the four (disjoint) three-bit blocks, at least three of the four blocks are received with no error at all, and hence all three bits in each such block equal the corresponding \(x_i\), so their majority is \(x_i\). The (at most one) remaining block has at most one of its three bits flipped from \(x_i\); since the other two bits of that block still equal \(x_i\), the majority of the block is still \(x_i\). Hence in all cases the decoding function recovers \((x_1,x_2,x_3,x_4)\) exactly, so \(C_{3,rep}\) is \(1\)-error-correcting.
Input: received word \(\mathbf y = (y_1,y_2,y_3,y_4,y_5)\). Output: \(1\) if \(\mathbf y \in C_\oplus \) and \(0\) otherwise.
\(b \leftarrow y_1 \oplus y_2 \oplus y_3 \oplus y_4 \oplus y_5\).
Return \(1 \oplus b\).
The parity code \(C_\oplus \) can detect an odd number of errors: for every codeword \(\mathbf c \in C_\oplus \) and every received word \(\mathbf y \in \{ 0,1\} ^5\) with \(\Delta (\mathbf c,\mathbf y)\) odd, Algorithm 15 on input \(\mathbf y\) outputs \(0\).
Write \(\mathbf c = (x_1,x_2,x_3,x_4,x_1\oplus x_2\oplus x_3\oplus x_4)\). The XOR of the five coordinates of \(\mathbf c\) is \(x_1\oplus x_2 \oplus x_3 \oplus x_4 \oplus (x_1\oplus x_2\oplus x_3 \oplus x_4) = 0\). If \(\mathbf y = \mathbf c \oplus \mathbf e\) (coordinatewise XOR) for an error pattern \(\mathbf e \in \{ 0,1\} ^5\) with \(\mathrm{wt}(\mathbf e) = t = \Delta (\mathbf c,\mathbf y)\) non-zero flipped positions, then the XOR of the coordinates of \(\mathbf y\) equals the XOR of the coordinates of \(\mathbf c\) XORed with the XOR of the coordinates of \(\mathbf e\), i.e.
The XOR of a \(0/1\)-vector’s coordinates equals the parity of its number of \(1\)’s, so \(b\) equals \(t \bmod 2\). If \(t\) is odd then \(b=1\) and the algorithm returns \(1\oplus b = 0\), as required.
Let \(C \subseteq \Sigma ^n\). The minimum distance (or just distance) of \(C\), denoted \(\Delta (C)\), is
The relative minimum distance of \(C\) is
Given a code \(C \subseteq \Sigma ^n\), the maximum likelihood decoding (MLD) function \(D_{MLD} : \Sigma ^n \to C\) outputs, for every received word \(\mathbf y \in \Sigma ^n\), a codeword closest to \(\mathbf y\) in Hamming distance (with ties broken arbitrarily):
Input: received word \(\mathbf y \in \Sigma ^n\). Output: \(D_{MLD}(\mathbf y)\).
Pick an arbitrary \(\mathbf c \in C\) and set \(\mathbf z \leftarrow \mathbf c\).
For every \(\mathbf c' \in C\) with \(\mathbf c' \neq \mathbf c\): if \(\Delta (\mathbf c',\mathbf y) {\lt} \Delta (\mathbf z,\mathbf y)\) then \(\mathbf z \leftarrow \mathbf c'\).
Return \(\mathbf z\).
Given a code \(C \subseteq \Sigma ^n\), the following are equivalent:
\(C\) has minimum distance \(d \ge 2\).
If \(d\) is odd, \(C\) can correct \((d-1)/2\) errors.
\(C\) can detect \(d-1\) errors.
\(C\) can correct \(d-1\) erasures.
We freely use the elementary fact that Hamming distance satisfies the triangle inequality \(\Delta (\mathbf x,\mathbf z) \le \Delta (\mathbf x,\mathbf y) + \Delta (\mathbf y,\mathbf z)\) for all \(\mathbf x,\mathbf y,\mathbf z \in \Sigma ^n\) (if \(\mathbf x_i \neq \mathbf z_i\) at some position \(i\), then \(\mathbf x_i \neq \mathbf y_i\) or \(\mathbf y_i \neq \mathbf z_i\), so every position counted on the left is counted at least once on the right).
We first show that (1) implies (2), (3) and (4); we then show that if (1) fails, none of (2), (3), (4) can hold.
(1)\(\Rightarrow \)(2). Assume \(C\) has distance \(d=2t+1\). We claim the MLD function \(D_{MLD}\) (Definition 18, e.g. via Algorithm 19) corrects every error pattern of at most \(t\) errors. Suppose not: let \(\mathbf c_1\) be the transmitted codeword and \(\mathbf y\) the received word, so \(\Delta (\mathbf y,\mathbf c_1) \le t\), and suppose \(D_{MLD}(\mathbf y) = \mathbf c_2 \neq \mathbf c_1\). By definition of MLD, \(\Delta (\mathbf y,\mathbf c_2) \le \Delta (\mathbf y,\mathbf c_1)\). Then
contradicting that \(C\) has distance \(d\).
(1)\(\Rightarrow \)(3). We exhibit a detection procedure for \(d-1\) errors: on input \(\mathbf y\), check (by exhaustive search over \(C\)) whether \(\mathbf y \in C\), output \(1\) if so and \(0\) otherwise. If no error occurred, \(\mathbf y = \mathbf c_1\) (the transmitted codeword) and the algorithm outputs \(1\), as required. If \(1 \le \Delta (\mathbf y,\mathbf c_1) \le d-1\), then since \(C\) has distance \(d\), \(\mathbf y \notin C\) (otherwise \(\mathbf y\) and \(\mathbf c_1\) would be two codewords at distance \({\lt}d\)), so the algorithm outputs \(0\), as required.
(1)\(\Rightarrow \)(4). Let \(\mathbf y \in (\Sigma \cup \{ ?\} )^n\) be a received word with at most \(d-1\) erasures. We claim there is a unique \(\mathbf c=(c_1,\dots ,c_n) \in C\) agreeing with \(\mathbf y\) on every unerased position. If not, there are distinct \(\mathbf c_1,\mathbf c_2 \in C\) both agreeing with \(\mathbf y\) on the unerased positions; then \(\mathbf c_1,\mathbf c_2\) agree on all positions \(i\) with \(y_i \neq {?}\), so \(\Delta (\mathbf c_1,\mathbf c_2) \le |\{ i : y_i = {?}\} | \le d-1\), contradicting that \(C\) has distance \(d\). Given uniqueness, the algorithm that exhaustively searches \(C\) for the codeword agreeing with \(\mathbf y\) on the unerased positions correctly recovers the transmitted codeword.
\(\neg \)(1)\(\Rightarrow \neg \)(2). Suppose \(C\) has distance \(d-1\) (so property (1) fails for the value \(d\) used above). We show no decoding function can correct all error patterns of weight \(\le (d-1)/2\) (with \(d\) odd, so \((d-1)/2\) is an integer). Let \(\mathbf c_1 \neq \mathbf c_2 \in C\) with \(\Delta (\mathbf c_1,\mathbf c_2) = d-1\). Since \(d\) is odd, there is a vector \(\mathbf y\) with \(\Delta ( \mathbf y,\mathbf c_1) = \Delta (\mathbf y,\mathbf c_2) = (d-1)/2\): split the \(d-1\) differing positions between \(\mathbf c_1\) and \(\mathbf c_2\) into two halves of size \((d-1)/2\) each, let \(\mathbf y\) agree with \(\mathbf c_1\) on one half of the differing positions and with \(\mathbf c_2\) on the other half, and agree with both on all positions where they already agree. Since \(\mathbf y\) could have arisen from either \(\mathbf c_1\) or \(\mathbf c_2\) being transmitted with at most \((d-1)/2\) errors, no decoding function can correctly recover the transmitted codeword in both cases.
\(\neg \)(1)\(\Rightarrow \neg \)(3). With \(\mathbf c_1,\mathbf c_2\) as above (\(\Delta (\mathbf c_1,\mathbf c_2)=d-1\)), let \(\mathbf y = \mathbf c_2\). If \(\mathbf c_1\) is transmitted and \(\mathbf y=\mathbf c_2\) is received (which requires \(d-1\) errors, within the claimed detection radius \(d-1\)), then either the algorithm declares "no error" (incorrect, since \(\mathbf y \neq \mathbf c_1\)) or it declares "error" even though, when \(\mathbf c_2\) is transmitted with \(0\) errors, the same received word \(\mathbf y = \mathbf c_2\) must be accepted. So no detection procedure can be correct on both scenarios.
\(\neg \)(1)\(\Rightarrow \neg \)(4). Let \(\mathbf y\) be the received word obtained by erasing exactly the positions where \(\mathbf c_1\) and \(\mathbf c_2\) differ (a total of \(d-1\) erasures). Then \(\mathbf y\) is consistent with both \(\mathbf c_1\) and \(\mathbf c_2\) having been transmitted, so no algorithm correcting \(d-1\) erasures can work in this case.
Let \(C\) be a code with minimum distance \(d\), for \(d\) even. Then \(C\) can correct \(d/2-1\) errors but cannot correct \(d/2\) errors. Consequently, if a code \(C\) is \(t\)-error correctable (for the largest possible such \(t\)), then \(C\) has distance either \(2t+1\) or \(2t+2\).
\(C\) corrects \(d/2-1\) errors. The argument for (1)\(\Rightarrow \)(2) in the proof of Proposition 20 used only that \(2t \le d-1\) to derive the contradiction \(\Delta (\mathbf c_1, \mathbf c_2) \le 2t \le d-1\); taking \(t = d/2-1\) gives \(2t = d-2 \le d-1\), so the same MLD-based argument shows \(C\) corrects \(d/2-1\) errors.
\(C\) cannot correct \(d/2\) errors. Let \(\mathbf c_1 \neq \mathbf c_2 \in C\) with \(\Delta (\mathbf c_1,\mathbf c_2)=d\). Split the \(d\) differing positions into two halves of size \(d/2\) each (possible as \(d\) is even), and let \(\mathbf y\) agree with \(\mathbf c_1\) on one half of the differing positions and with \(\mathbf c_2\) on the other, and with both elsewhere. Then \(\Delta (\mathbf y,\mathbf c_1) = \Delta (\mathbf y,\mathbf c_2) = d/2\), so \(\mathbf y\) is consistent with at most \(d/2\) errors having occurred regardless of whether \(\mathbf c_1\) or \(\mathbf c_2\) was transmitted, and no decoding function can be correct in both cases.
Consequence. If \(C\) is \(t\)-error correctable for the largest possible \(t\) and \(C\) has distance \(d\), then by Proposition 20 (odd \(d\)) or the first part above (even \(d\)), \(t \ge \lceil d/2 \rceil - [d\text{ even}]\), i.e. \(t = (d-1)/2\) if \(d\) is odd and \(t=d/2-1\) if \(d\) is even; in either case \(d \in \{ 2t+1,2t+2\} \).
The Hamming code \(C_H\) is the binary code of block length \(7\) with message length \(4\) defined, for \((x_1,x_2,x_3,x_4) \in \{ 0,1\} ^4\), by
It has parameters \(q=2\), \(k=4\), \(n=7\), \(R=4/7\).
Let \(q \ge 2\). Given a vector \(\mathbf v \in \{ 0,1,\dots ,q-1\} ^n\), its Hamming weight, denoted \(\mathrm{wt}(\mathbf v)\), is the number of non-zero symbols in \(\mathbf v\).
For any \(\mathbf u,\mathbf v \in \{ 0,1\} ^n\), where \(\mathbf u + \mathbf v\) denotes coordinatewise XOR,
For each coordinate \(i \in [n]\), \(u_i \neq v_i\) if and only if \(u_i \oplus v_i = 1\), i.e. if and only if \((\mathbf u+\mathbf v)_i \neq 0\). Hence the set of positions where \(\mathbf u\) and \(\mathbf v\) differ coincides with the set of non-zero positions of \(\mathbf u+\mathbf v\), so the two sets have the same size: \(\Delta (\mathbf u,\mathbf v) = \mathrm{wt}(\mathbf u+\mathbf v)\).
\(C_H\) has distance \(3\).
We prove the claimed distance using two facts:
and
Proof of (a). Let \(\mathbf x=(x_1,x_2,x_3,x_4)\) be the message vector and write the three parity bits as \(p_1 = x_2\oplus x_3\oplus x_4\), \(p_2 = x_1\oplus x_3\oplus x_4\), \(p_3 = x_1\oplus x_2\oplus x_4\), so \(\mathrm{wt}(C_H(\mathbf x)) = \mathrm{wt}(\mathbf x) + \mathrm{wt} (p_1,p_2,p_3)\).
If \(\mathrm{wt}(\mathbf x)=0\) then \(\mathbf x=\mathbf0\) and \(C_H(\mathbf x)=\mathbf0\), which we exclude.
If \(\mathrm{wt}(\mathbf x)=1\), a direct check of the four basis vectors gives: \(\mathbf x=(1,0,0,0) \Rightarrow (p_1,p_2,p_3) = (0,1,1)\); \(\mathbf x=(0,1,0,0) \Rightarrow (1,0,1)\); \(\mathbf x=(0,0,1,0) \Rightarrow (1,1,0)\); \(\mathbf x=(0,0,0,1) \Rightarrow (1,1,1)\). In every case at least two of \(p_1,p_2,p_3\) equal \(1\), so \(\mathrm{wt}(C_H(\mathbf x)) \ge 1+2 = 3\).
If \(\mathrm{wt}(\mathbf x)=2\), a direct check of the six weight-\(2\) vectors gives: \((1,1,0,0)\Rightarrow (1,1,0)\); \((1,0,1,0)\Rightarrow (1,0,1)\); \((1,0,0,1)\Rightarrow (1,0,0)\); \((0,1,1,0)\Rightarrow (0,1,1)\); \((0,1,0,1)\Rightarrow (0,1,0)\); \((0,0,1,1)\Rightarrow (0,0,1)\). In every case at least one of \(p_1,p_2,p_3\) equals \(1\), so \(\mathrm{wt}(C_H(\mathbf x)) \ge 2+1=3\).
If \(\mathrm{wt}(\mathbf x) \ge 3\) then already \(\mathrm{wt}(C_H(\mathbf x)) \ge \mathrm{wt}(\mathbf x) \ge 3\).
Hence \(\min _{\mathbf c\neq \mathbf0} \mathrm{wt}(\mathbf c) \ge 3\). Since \(C_H(1,0,0,0) = (1,0,0,0,0,1,1)\) has weight \(3\), equality holds, proving (a).
Proof of (b). Let \(\mathbf x = (x_1,x_2,x_3,x_4)\) and \(\mathbf y=(y_1,y_2,y_3,y_4)\) be two messages. Since each coordinate of \(C_H\) is an XOR of a subset of the message coordinates, and XOR is associative and commutative, \(C_H(\mathbf x) + C_H(\mathbf y) = C_H(\mathbf x + \mathbf y)\) (coordinatewise XOR on both sides; e.g. the last coordinate gives \((x_1\oplus x_2\oplus x_4) \oplus (y_1\oplus y_2\oplus y_4) = (x_1\oplus y_1)\oplus (x_2\oplus y_2) \oplus (x_4\oplus y_4)\), which is exactly the last coordinate of \(C_H(\mathbf x+\mathbf y)\), and likewise for the other coordinates). By Lemma 24,
where the last equality holds because \(\{ \mathbf x+\mathbf y \mid \mathbf x \neq \mathbf y \in \{ 0,1\} ^4\} = \{ \mathbf z \in \{ 0,1\} ^4 \mid \mathbf z \neq \mathbf0\} \) (for any nonzero \(\mathbf z\), take \(\mathbf x=\mathbf0,\mathbf y=\mathbf z\); and \(\mathbf x\neq \mathbf y\) forces \(\mathbf x+\mathbf y \neq \mathbf0\)). This proves (b).
Combining (a) and (b), \(C_H\) has distance \(3\).
A binary code \(C \subseteq \{ 0,1\} ^n\) of dimension \(k\) is called a binary linear code if there is a \(k\times n\) matrix \(G\) over \(\{ 0,1\} \) (addition being XOR, multiplication being AND) such that \(C = \{ \mathbf x \cdot G \mid \mathbf x \in \{ 0,1\} ^k\} \), and, writing \(C(\mathbf x) \overset {\mathrm{def}}{=} \mathbf x \cdot G\) (viewing \(\mathbf x\) as a row vector), the map \(\mathbf x \mapsto C(\mathbf x)\) is a bijection from \(\{ 0,1\} ^k\) onto \(C\). The matrix \(G\) is called a generator matrix for \(C\).
For any binary linear code \(C\) (not necessarily distinct) messages \(\mathbf x,\mathbf y\), \(C(\mathbf x) + C(\mathbf y) = C(\mathbf x + \mathbf y)\) (coordinatewise XOR throughout).
Let \(G\) be a generator matrix for \(C\). Using distributivity of AND over XOR and associativity/commutativity of XOR,
For any binary linear code \(C\), its minimum distance equals the minimum Hamming weight of any non-zero codeword:
Let \(C\) have dimension \(k\). Exactly as in the proof of part (b) of Proposition 25, but using Lemma 27 in place of the direct XOR computation for \(C_H\): for messages \(\mathbf x,\mathbf y \in \{ 0,1\} ^k\), Lemma 27 gives \(C(\mathbf x)+C(\mathbf y) = C(\mathbf x + \mathbf y)\), and Lemma 24 gives \(\Delta (C(\mathbf x),C(\mathbf y)) = \mathrm{wt}(C(\mathbf x)+C(\mathbf y))\). Hence
where the third equality uses that \(\{ \mathbf x + \mathbf y \mid \mathbf x \neq \mathbf y \in \{ 0,1\} ^k\} = \{ \mathbf z \in \{ 0,1\} ^k \mid \mathbf z \neq \mathbf0\} \) and the last equality uses that \(\mathbf x \mapsto C(\mathbf x)\) is a bijection \(\{ 0,1\} ^k \to C\) (Definition 26).
For any vector \(\mathbf x \in [q]^n\) and integer \(e \ge 0\), the Hamming ball of radius \(e\) centered at \(\mathbf x\) is
Every binary code with block length \(n\), dimension \(k\), distance \(d=3\) satisfies
Let \(C\) have distance \(3\). First, for any two distinct codewords \(\mathbf c_1,\mathbf c_2 \in C\), \(B(\mathbf c_1,1) \cap B(\mathbf c_2,1) = \varnothing \): if \(\mathbf y\) were in both, then \(\Delta ( \mathbf y,\mathbf c_1) \le 1\) and \(\Delta (\mathbf y,\mathbf c_2) \le 1\), so by the triangle inequality \(\Delta (\mathbf c_1,\mathbf c_2) \le 2 {\lt} 3\), a contradiction.
Next, for every \(\mathbf x \in \{ 0,1\} ^n\), \(|B(\mathbf x,1)| = n+1\): \(B(\mathbf x,1)\) consists of \(\mathbf x\) itself together with the \(n\) vectors obtained by flipping exactly one coordinate of \(\mathbf x\), and these \(n+1\) vectors are pairwise distinct.
The union \(\bigcup _{\mathbf c \in C} B(\mathbf c,1)\) is a subset of \(\{ 0,1\} ^n\), so \(\left|\bigcup _{\mathbf c \in C} B(\mathbf c,1)\right| \le 2^n\). Since the balls are pairwise disjoint,
using \(|C| = 2^k\). Hence \(2^k(n+1) \le 2^n\), i.e. \(2^k \le \frac{2^n}{n+1}\), and taking \(\log _2\) gives \(k \le n - \log _2(n+1)\).
A code \(C \subseteq \Sigma ^n\) with dimension \(k\) and distance \(d\) is called an \((n,k,d)_\Sigma \) code, also written an \((n,k,d)_{|\Sigma |}\) code.
For every \((n,k,d)_q\) code,
Let \(e = \lfloor (d-1)/2 \rfloor \) and let \(C\) be an \((n,k,d)_q\) code. As in the proof of Theorem 30, for any two distinct codewords \(\mathbf c_1,\mathbf c_2 \in C\), \(B(\mathbf c_1,e) \cap B(\mathbf c_2,e) = \varnothing \): otherwise \(\mathbf y \in B( \mathbf c_1,e) \cap B(\mathbf c_2,e)\) would give, by the triangle inequality, \(\Delta (\mathbf c_1,\mathbf c_2) \le 2e \le d-1\), contradicting that \(C\) has distance \(d\).
We claim that for all \(\mathbf x \in [q]^n\),
Indeed, any vector in \(B(\mathbf x,e)\) differs from \(\mathbf x\) in exactly \(i\) positions for some \(0 \le i \le e\); there are \(\binom n i\) ways to choose which \(i\) positions differ, and for each such choice, in each differing position the vector may take any of the \(q-1\) values other than the corresponding coordinate of \(\mathbf x\), giving \((q-1)^i\) choices; summing over \(i\) gives the claim.
Now \(\bigcup _{\mathbf c \in C} B(\mathbf c,e) \subseteq [q]^n\), so \(\left|\bigcup _{\mathbf c \in C} B(\mathbf c,e)\right| \le q^n\). Since the balls are pairwise disjoint and \(|C| = q^k\),
Combining, \(q^k \sum _{i=0}^{e}\binom n i (q-1)^i \le q^n\), and taking \(\log _q\) of both sides gives the desired bound:
Every code of distance \(d\) over an alphabet of size \(q\) has rate \(R\) at most
where \(e = \lfloor (d-1)/2 \rfloor \).
A code that meets the Hamming bound of Theorem 32 with equality is called a perfect code: i.e. an \((n,k,d)_q\) code \(C\) is perfect if
Equivalently, the Hamming balls of radius \(e\) centered at the codewords of \(C\) partition the entire ambient space \([q]^n\). The \((7,4,3)_2\) Hamming code \(C_H\) (Definition 22) is an example of a perfect code, since \(2^4 = 2^7/(7+1)\).
Let \(\{ n_i\} _{i\ge 1}\) be an increasing sequence of block lengths and suppose there exist sequences \(\{ k_i\} _{i\ge 1}\), \(\{ d_i\} _{i \ge 1}\) and \(\{ q_i\} _{i\ge 1}\) such that for all \(i \ge 1\) there exists an \((n_i,k_i,d_i)_{q_i}\) code \(C_i\). Then the sequence \(\mathscr C = \{ C_i\} _{i\ge 1}\) is a family of codes. The rate of \(\mathscr C\) is
when the limit exists. The relative distance of \(\mathscr C\) is
when the limit exists. If \(q_i = q\) for all \(i \ge 1\), then \(\mathscr C\) is referred to as a family of \(q\)-ary codes.
An algorithm related to a code of block length \(n\) is called efficient if it runs in time polynomial in \(n\). (Throughout the book, unless mentioned otherwise, a “code” implicitly refers to a family of codes, so that asymptotic run time as \(n \to \infty \) is meaningful.)