2 A Look at Some Nicely Behaved Codes: Linear Codes

Definition 37 Group, Def 2.1.1

A group \(\mathbb {G}\) is given by a pair \((S,\circ )\), where \(S\) is the set of elements and \(\circ \) is a function \(S\times S \to S\) satisfying:

Closure: for every \(a,b\in S\), \(a\circ b \in S\).
Associativity: for every \(a,b,c\in S\), \(a\circ (b\circ c) = (a\circ b)\circ c\).
Identity: there exists a special element \(e\in S\) such that for every \(a \in S\), \(a\circ e = e \circ a = a\).
Inverse: for every \(a \in S\) there exists a unique \(a^{-1}\in S\) such that \(a\circ a^{-1} = a^{-1}\circ a = e\).

If \(\mathbb {G}=(S,\circ )\) satisfies all properties except the existence of inverses then \(\mathbb {G}\) is called a monoid. \(\mathbb {G}\) is commutative if for every \(a,b\in S\), \(a\circ b = b \circ a\).

Definition 38 Field, Def 2.1.2

A field \(\mathbb {F}\) is given by a triple \((S,+,\cdot )\), where \(S\) is the set of elements and \(+,\cdot \) are functions \(S\times S \to S\) such that:

\((S,+)\) forms a commutative group with identity element \(0\in S\).
\((S\setminus \{ 0\} ,\cdot )\) forms a commutative group with identity element \(1\in S\setminus \{ 0\} \).
\(\cdot \) distributes over \(+\): for every \(a,b,c\in S\), \(a\cdot (b+c) = a\cdot b + a\cdot c\).

Definition 39 Finite field

A finite field is a field \(\mathbb {F} = (S,+,\cdot )\) whose set of elements \(S\) is finite. We write \(|\mathbb {F}| = |S|\) for its size, and denote a finite field of size \(q\) by \(\mathbb {F}_q\).

Theorem 40 Size of Finite Fields, Thm 2.1.3

Every finite field has size \(p^s\) for some prime \(p\) and integer \(s\ge 1\). Conversely, for every prime \(p\) and integer \(s\ge 1\) there exists a field of size \(p^s\).

Lemma 41 \(\mathbb {F}_p\) is a field, Lem 2.1.4

Let \(p\) be a prime. Then \(\mathbb {F}_p = (\{ 0,1,\dots ,p-1\} , +_p, \cdot _p)\), where \(+_p\) and \(\cdot _p\) are addition and multiplication modulo \(p\), is a field.

Theorem 42 Uniqueness of Finite Fields, Thm 2.1.5

For every prime power \(q\) there is a unique finite field with \(q\) elements, up to isomorphism. Thus \(\mathbb {F}_q\) unambiguously denotes “the” finite field on \(q\) elements.

Definition 43 Vector Space, Def 2.2.1

A vector space \(V\) over a field \(\mathbb {F}\) is given by a triple \((T,+,\cdot )\) such that \((T,+)\) forms a commutative group and \(\cdot \), referred to as the scalar product, is a function \(\mathbb {F}\times T \to T\) such that for every \(a,b\in \mathbb {F}\) and \(\mathbf{u},\mathbf{v}\in T\): \((a+b)\cdot \mathbf{u} = a\cdot \mathbf{u} + b\cdot \mathbf{u}\) and \(a\cdot (\mathbf{u}+\mathbf{v}) = a\cdot \mathbf{u} + a\cdot \mathbf{v}\).

Definition 44 Linear Subspace, Def 2.2.2

A non-empty subset \(S \subseteq \mathbb {F}_q^n\) is a linear subspace if the following hold:

For every \(\mathbf{x},\mathbf{y}\in S\), \(\mathbf{x}+\mathbf{y}\in S\) (vector addition over \(\mathbb {F}_q\), done componentwise).
For every \(a\in \mathbb {F}_q\) and \(\mathbf{x}\in S\), \(a\cdot \mathbf{x}\in S\) (componentwise scalar multiplication over \(\mathbb {F}_q\)).

Lemma 45 Zero vector in a subspace, Rem 2.2.3

Every linear subspace \(S \subseteq \mathbb {F}_q^n\) contains the zero vector \(\mathbf{0}\).

Definition 46 Span, Def 2.2.4

Given a set \(B = \{ \mathbf{v}_1,\dots ,\mathbf{v}_\ell \} \subseteq \mathbb {F}_q^n\), the span of \(B\) is the set of vectors

\[ \left\{ \sum _{i=1}^\ell a_i\cdot \mathbf{v}_i \, \middle |\, a_i \in \mathbb {F}_q \text{ for every } i \in [\ell ]\right\} . \]

Definition 47 Linear (in)dependence of vectors, Def 2.2.5

Vectors \(\mathbf{v}_1,\dots ,\mathbf{v}_k \in \mathbb {F}_q^n\) are linearly independent if for every \(1\le i \le k\) and every \((k-1)\)-tuple \((a_1,\dots ,a_{i-1},a_{i+1},\dots ,a_k)\in \mathbb {F}_q^{k-1}\),

\[ \mathbf{v}_i \ne a_1\mathbf{v}_1 + \dots + a_{i-1}\mathbf{v}_{i-1} + a_{i+1}\mathbf{v}_{i+1} + \dots + a_k\mathbf{v}_k. \]

In other words, \(\mathbf{v}_i\) is not in the span of \(\{ \mathbf{v}_1,\dots ,\mathbf{v}_{i-1},\mathbf{v}_{i+1},\dots ,\mathbf{v}_k\} \) for every \(1\le i \le k\). The vectors are linearly dependent if they are not linearly independent.

Definition 48 Rank of a matrix, Def 2.2.6

The rank of a matrix in \(\mathbb {F}_q^{k\times n}\) is the maximum number of linearly independent rows (equivalently, the maximum number of linearly independent columns). A matrix in \(\mathbb {F}_q^{k\times n}\) with rank \(\min (k,n)\) is said to have full rank.

Lemma 49 Size of the span of independent vectors

The span of \(k\) linearly independent vectors over \(\mathbb {F}_q\) has size exactly \(q^k\).

Lemma 50 Dimension of an orthogonal complement

Let \(S \subseteq \mathbb {F}_q^n\) be a linear subspace of dimension \(k\). Then \(N = \{ \mathbf{y}\in \mathbb {F}_q^n \mid \langle \mathbf{x},\mathbf{y}\rangle = 0 \text{ for every } \mathbf{x}\in S\} \) is itself a linear subspace of \(\mathbb {F}_q^n\), of dimension \(n-k\).

Theorem 51 Structure of a Linear Subspace, Thm 2.2.7

If \(S \subseteq \mathbb {F}_q^n\) is a linear subspace then:

\(|S| = q^k\) for some \(k\ge 0\). The parameter \(k\) is called the dimension of \(S\).
There exists at least one set of linearly independent vectors \(\mathbf{v}_1,\dots ,\mathbf{v}_k \in S\), called basis elements, such that every \(\mathbf{x}\in S\) can be expressed as \(\mathbf{x} = a_1\mathbf{v}_1+\dots +a_k\mathbf{v}_k\) where \(a_i\in \mathbb {F}_q\) for \(1\le i \le k\). In other words, there exists a full-rank \(k\times n\) matrix \(G\) (a generator matrix) with rows \(\mathbf{v}_1,\dots ,\mathbf{v}_k\), with entries from \(\mathbb {F}_q\), such that every \(\mathbf{x}\in S\) satisfies \(\mathbf{x} = (a_1,\dots ,a_k)\cdot G\) for some \((a_1,\dots ,a_k)\in \mathbb {F}_q^k\).
There exists a full-rank \((n-k)\times n\) matrix \(H\) (a parity check matrix) such that for every \(\mathbf{x}\in S\), \(H\mathbf{x}^T = \mathbf{0}\).
\(G\) and \(H\) are orthogonal, that is, \(G\cdot H^T = \mathbf{0}\).

Proof ▶

Property 1. For the sake of contradiction, assume \(q^k {\lt} |S| {\lt} q^{k+1}\) for some \(k\ge 0\). We construct a set of linearly independent vectors \(B\subseteq S\) with \(|B| = k+1\) greedily: pick \(\mathbf{v}_1\) to be any non-zero vector of \(S\) (possible since \(|S| {\gt} q^k \ge 1\)), and set \(B \leftarrow \{ \mathbf{v}_1\} \). Inductively, after step \(t {\lt} k+1\), \(|B| = t\) and, by Lemma 49, the span of \(B\) has size \(q^t \le q^k {\lt} |S|\); hence there exists \(\mathbf{v}_{t+1}\in S\setminus B\) linearly independent of \(B\), and we set \(B \leftarrow B\cup \{ \mathbf{v}_{t+1}\} \). Continuing until \(|B| = k+1\), the span of \(B\) is contained in \(S\) (as \(S\) is closed under linear combinations) and, by Lemma 49, has size \(q^{k+1} {\gt} |S|\), a contradiction. Hence \(|S| = q^k\) for some \(k\ge 0\).

Property 2. We may pick \(B = \{ \mathbf{v}_1,\dots ,\mathbf{v}_k\} \) to be any set of \(k\) linearly independent vectors in \(S\): the same greedy argument as in Property 1 shows that such a set exists (when \(|B| = t {\lt} k\), the span of \(B\) has size \(q^t {\lt} q^k = |S|\) so a vector outside the span can always be added, and the process must terminate once \(|B|=k\) since then the span already has \(q^k=|S|\) elements). The span of \(B\) is contained in \(S\), and by Lemma 49 it has size \(q^k = |S|\), so the span of \(B\) equals \(S\). Taking \(G\) to be the \(k\times n\) matrix with rows \(\mathbf{v}_1,\dots ,\mathbf{v}_k\) gives the desired generator matrix.

Property 3. By Lemma 50, the set \(N = \{ \mathbf{y}\in \mathbb {F}_q^n \mid \langle \mathbf{x},\mathbf{y}\rangle = 0 \text{ for every } \mathbf{x}\in S\} \) is a linear subspace of dimension \(n-k\). By Property 2 applied to \(N\), there exists a full-rank \((n-k)\times n\) matrix \(H\) whose rows form a basis of \(N\); by definition of \(N\), for every \(\mathbf{x}\in S\) and every row \(\mathbf{y}\) of \(H\), \(\langle \mathbf{x},\mathbf{y}\rangle = 0\), i.e. \(H\mathbf{x}^T = \mathbf{0}\).

Property 4. Each row \(\mathbf{v}_i\) of \(G\) is an element of \(S\) (by Property 2), so by Property 3, \(H\mathbf{v}_i^T = \mathbf{0}\) for every \(i\in [k]\). Stacking these \(k\) equations gives \(H\cdot G^T = \mathbf{0}\). Taking transposes, \(G\cdot H^T = (H\cdot G^T)^T = \mathbf{0}^T = \mathbf{0}\).

Lemma 52 Lem 2.2.8

Given a matrix \(G\) of dimension \(k\times n\) that is a generator matrix of subspace \(S_1\), and a matrix \(H\) of dimension \((n-k)\times n\) that is a parity check matrix of subspace \(S_2\), such that \(G H^T = \mathbf{0}\), then \(S_1 = S_2\).

Proof ▶

We first show \(S_1 \subseteq S_2\). Given any \(\mathbf{c}\in S_1\), there exists \(\mathbf{x}\in \mathbb {F}_q^k\) such that \(\mathbf{c} = \mathbf{x}G\). Then

\[ H\cdot \mathbf{c}^T = H\cdot (\mathbf{x}G)^T = HG^T\mathbf{x}^T = (GH^T)^T\mathbf{x}^T = \mathbf{0}, \]

which implies \(\mathbf{c}\in S_2\). To complete the proof, note that as \(H\) has full rank, its null space (which is \(S_2\)) has dimension \(n-(n-k) = k\), by the rank-nullity theorem. Now as \(G\) has full rank, the dimension of \(S_1\) is also \(k\). Thus, as \(S_1\subseteq S_2\) and both have the same size \(q^k\), it must be the case that \(S_1 = S_2\).

Definition 53 Linear Codes, Def 2.3.1

Let \(q\) be a prime power (i.e. \(q=p^s\) for some prime \(p\) and integer \(s\ge 1\)). \(C\subseteq \mathbb {F}_q^n\) is a linear code if it is a linear subspace of \(\mathbb {F}_q^n\). If \(C\) has dimension \(k\) and distance \(d\) then it is referred to as an \([n,k,d]_q\), or just an \([n,k]_q\), code.

Definition 54 Generator Matrix, Def 2.3.2

If \(C\) is an \([n,k]_q\) linear code then there exists a matrix \(G\in \mathbb {F}_q^{k\times n}\) of rank \(k\) satisfying

\[ C = \{ \mathbf{x}\cdot G \mid \mathbf{x}\in \mathbb {F}_q^k\} . \]

\(G\) is referred to as a generator matrix of \(C\): the code \(C\) is the set of all possible linear combinations of the rows of \(G\). Neither the generator matrix nor the parity check matrix of a code is unique; however, all generator matrices of \(C\) have the same \(k\times n\) dimensions. (We also allow matrices \(M\in \mathbb {F}_q^{m\times n}\) that are not of full row rank to generate a code \(C=\{ \mathbf{x}\cdot M \mid \mathbf{x}\in \mathbb {F}_q^m\} \), though we reserve the phrase “generator matrix” for full rank matrices.)

Definition 55 Parity Check Matrix, Def 2.3.2

If \(C\) is an \([n,k]_q\) linear code then there exists a matrix \(H\in \mathbb {F}_q^{(n-k)\times n}\) of rank \(n-k\) satisfying

\[ C = \{ \mathbf{y}\in \mathbb {F}_q^n \mid H\cdot \mathbf{y}^T = \mathbf{0}\} . \]

\(H\) is referred to as a parity check matrix of \(C\). All parity check matrices of \(C\) have the same \((n-k)\times n\) dimensions.

Proposition 56 Succinct Representation, Prop 2.3.3

Any \([n,k]_q\) linear code can be represented with \(\min (nk, n(n-k))\) symbols from \(\mathbb {F}_q\).

Proof ▶

If \(k\le n-k\), store the \(k\times n\) generator matrix \(G\) of \(C\), using \(nk\) symbols from \(\mathbb {F}_q\); this determines \(C\) since \(C=\{ \mathbf{x}G\mid \mathbf{x}\in \mathbb {F}_q^k\} \). Otherwise (\(k {\gt} n-k\)), store instead the \((n-k)\times n\) parity check matrix \(H\) of \(C\), using \(n(n-k)\) symbols from \(\mathbb {F}_q\); this determines \(C\) since \(C=\{ \mathbf{y}\mid H\mathbf{y}^T=\mathbf{0}\} \). In either case, \(C\) is represented using \(\min (nk,n(n-k))\) symbols from \(\mathbb {F}_q\).

Proposition 57 Encoding Complexity, Prop 2.3.4

For any \([n,k]_q\) linear code, given its generator matrix, encoding can be done with \(O(nk)\) operations over \(\mathbb {F}_q\).

Proof ▶

Given a message \(\mathbf{m}\in \mathbb {F}_q^k\), the corresponding codeword is \(C(\mathbf{m}) = \mathbf{m}\cdot G\), where \(G\) is the \(k\times n\) generator matrix of \(C\). Computing this matrix-vector product requires computing \(n\) coordinates, each a sum of \(k\) products of elements of \(\mathbb {F}_q\), for a total of \(O(nk)\) operations over \(\mathbb {F}_q\).

Proposition 58 Error Detection Complexity, Prop 2.3.5

For any \([n,k]_q\) linear code, given its parity check matrix, error detection can be performed in \(O(n(n-k))\) operations over \(\mathbb {F}_q\).

Proof ▶

Given a received word \(\mathbf{y}\in \mathbb {F}_q^n\) and the \((n-k)\times n\) parity check matrix \(H\) of \(C\), compute \(H\cdot \mathbf{y}^T\); this matrix-vector product costs \(O(n(n-k))\) operations over \(\mathbb {F}_q\). By definition of the parity check matrix, \(\mathbf{y}\in C\) if and only if \(H\cdot \mathbf{y}^T = \mathbf{0}\), which can be checked with \(O(n-k)\) further operations, for a total of \(O(n(n-k))\) operations over \(\mathbb {F}_q\).

2.3.1 On the Distance of a Linear Code

Proposition 59 Prop 2.3.6

For every \([n,k,d]_q\) code \(C\), \(d = \min _{\substack {\mathbf{c}\in C,\\ \mathbf{c}\ne \mathbf{0}}} \mathrm{wt}(\mathbf{c})\).

Proof ▶

We show \(d\) is no more than, and no less than, the minimum weight of a non-zero codeword.

\((\le )\): Let \(\mathbf{c}'\) be a non-zero codeword of \(C\) of minimum weight. Since \(\mathbf{0}\in C\) (Lemma 45), \(\Delta (\mathbf{0},\mathbf{c}') = \mathrm{wt}(\mathbf{c}')\), so \(d\le \mathrm{wt}(\mathbf{c}')\), i.e. \(d\) is at most the minimum weight.

\((\ge )\): Let \(\mathbf{c}_1 \ne \mathbf{c}_2 \in C\) be such that \(\Delta (\mathbf{c}_1,\mathbf{c}_2) = d\). Note that \(\mathbf{c}_1 - \mathbf{c}_2 \in C\): this is because \(-\mathbf{c}_2 = -1\cdot \mathbf{c}_2 \in C\), where \(-1\) is the additive inverse of \(1\) in \(\mathbb {F}_q\), and \(\mathbf{c}_1-\mathbf{c}_2 = \mathbf{c}_1+(-\mathbf{c}_2)\in C\) by closure of \(C\) under addition. Now \(\mathrm{wt}(\mathbf{c}_1-\mathbf{c}_2) = \Delta (\mathbf{c}_1,\mathbf{c}_2) = d\), since the non-zero symbols of \(\mathbf{c}_1-\mathbf{c}_2\) occur exactly at the positions where \(\mathbf{c}_1,\mathbf{c}_2\) differ. Since \(\mathbf{c}_1\ne \mathbf{c}_2\), \(\mathbf{c}_1-\mathbf{c}_2\ne \mathbf{0}\), so the minimum Hamming weight of a non-zero codeword of \(C\) is at most \(d\).

Proposition 60 Prop 2.3.7

For every \([n,k,d]_q\) code \(C\) with parity check matrix \(H\), \(d\) equals the size of the smallest subset of columns of \(H\) that is linearly dependent.

Proof ▶

By Proposition 59, it suffices to show that the minimum weight of a non-zero codeword of \(C\) equals the minimum number \(t\) of linearly dependent columns of \(H\); we show \(t\le d\) and \(t\ge d\).

Write the columns of \(H\) as \(H^1,\dots ,H^n\).

\((t\le d)\): Let \(\mathbf{c}\ne \mathbf{0}\in C\) with \(\mathrm{wt}(\mathbf{c}) = d\). Since \(H\cdot \mathbf{c}^T=\mathbf{0}\), working through the matrix multiplication gives \(\sum _{i=1}^n c_i H^i = \mathbf{0}\), where \(\mathbf{c}=(c_1,\dots ,c_n)\). We may skip the terms with \(c_i=0\), so the columns \(H^i\) with \(c_i\ne 0\) (there are exactly \(d\) of them, as \(\mathrm{wt}(\mathbf{c})=d\)) are linearly dependent. Hence \(t\le d\).

\((t\ge d)\): Let \(H^{i_1},\dots ,H^{i_t}\) be a minimum-size linearly dependent set of columns of \(H\). Then there exist non-zero elements \(c'_{i_1},\dots ,c'_{i_t}\in \mathbb {F}_q\) (non-zero, since no proper subset of the \(t\) columns is linearly dependent, by minimality of \(t\)) such that \(c'_{i_1}H^{i_1}+\dots +c'_{i_t}H^{i_t} = \mathbf{0}\). Extend \(c'_{i_1},\dots ,c'_{i_t}\) to a vector \(\mathbf{c}'\in \mathbb {F}_q^n\) by setting \(c'_j = 0\) for \(j\notin \{ i_1,\dots ,i_t\} \). Then \(H\cdot (\mathbf{c}')^T = \mathbf{0}\), so \(\mathbf{c}'\in C\); this implies, by Proposition 59, that \(d\le \mathrm{wt}(\mathbf{c}') = t\) (recalling that \(t\) is the minimum number of linearly dependent columns of \(H\)).

Definition 61 Binary Hamming Codes, Def 2.4.1

For any positive integer \(r\), define the matrix \(\mathbf{H}_r \in \mathbb {F}_2^{r\times (2^r-1)}\) to be the \(r\times (2^r-1)\) matrix whose \(i\)th column \(\mathbf{H}_r^i\) is the binary representation of \(i\), for \(1\le i \le 2^r-1\) (a vector in \(\{ 0,1\} ^r\)).

The \([2^r-1, 2^r-r-1]_2\) Hamming code, denoted \(C_{H,r}\), is the code with parity check matrix \(\mathbf{H}_r\). In other words,

\[ C_{H,r} = \{ \mathbf{c}\in \{ 0,1\} ^{2^r-1} \mid \mathbf{H}_r\cdot \mathbf{c}^T = \mathbf{0}\} . \]

Proposition 62 Prop 2.4.2

The Hamming code \([2^r-1, 2^r-r-1, 3]_2\) has distance 3.

Proof ▶

No two columns of \(\mathbf{H}_r\) are linearly dependent: over \(\mathbb {F}_2\), two columns \(\mathbf{H}_r^i, \mathbf{H}_r^j\) (\(i\ne j\)) are dependent if and only if \(\mathbf{H}_r^i + \mathbf{H}_r^j = \mathbf{0}\), i.e. \(\mathbf{H}_r^i = \mathbf{H}_r^j\); but this is impossible since \(\mathbf{H}_r^i,\mathbf{H}_r^j\) are the binary representations of the distinct integers \(i\ne j\), and so differ in at least one bit. Thus, by Proposition 60, the distance is at least 3. It is at most 3 since, e.g., \(\mathbf{H}_r^1+\mathbf{H}_r^2+\mathbf{H}_r^3=\mathbf{0}\) (as the binary representations of \(1,2,3\) satisfy \(1\oplus 2 = 3\)). Hence the distance is exactly 3.

Corollary 63 Hamming codes are perfect

The Hamming code \([2^r-1,2^r-r-1,3]_2\) is a perfect code: it meets the Hamming bound for \(d=3\) codes with equality.

Proof ▶

By Proposition 62, the Hamming code has distance \(d=3\). The Hamming bound for \(d=3\) codes of block length \(n\) states that \(k \le n - \log _2(n+1)\). Substituting \(n=2^r-1\) gives \(k\le 2^r-1-\log _2(2^r) = 2^r-r-1\), which is exactly the dimension of the Hamming code \(C_{H,r}\). Hence the Hamming code meets the Hamming bound with equality, i.e. it is a perfect code.

Algorithm 64 Naive Decoder for Hamming Code, Alg 2.5.1

Input: received word \(\mathbf{y}\in \{ 0,1\} ^n\), \(n=2^r-1\).

Output: \(\mathbf{c}\) if \(\Delta (\mathbf{y},\mathbf{c})\le 1\) for some \(\mathbf{c}\in C_{H,r}\), else Fail.

If \(\mathbf{y}\in C_{H,r}\), return \(\mathbf{y}\).
For \(i=1,\dots ,n\): let \(\mathbf{y}' \leftarrow \mathbf{y}+\mathbf{e}_i\), where \(\mathbf{e}_i\) is the \(i\)th standard basis vector; if \(\mathbf{y}'\in C_{H,r}\), return \(\mathbf{y}'\).
Return Fail.

Lemma 65 Correctness of Algorithm 2.5.1

Algorithm 64 correctly outputs \(\mathbf{c}\in C_{H,r}\) whenever \(\Delta (\mathbf{y},\mathbf{c})\le 1\) for some codeword \(\mathbf{c}\), and outputs Fail otherwise.

Proof ▶

Since \(C_{H,r}\) has distance 3 (Proposition 62), it can correct up to \(\lfloor (3-1)/2\rfloor = 1\) error, and whenever a codeword within distance 1 of \(\mathbf{y}\) exists it is unique (two codewords within distance 1 of a common point would be within distance 2 of each other, contradicting distance \(\ge 3\)).

If \(\Delta (\mathbf{y},\mathbf{c})=0\) then \(\mathbf{y}=\mathbf{c}\in C_{H,r}\), and Step 1 succeeds. If \(\Delta (\mathbf{y},\mathbf{c})=1\), then \(\mathbf{y}=\mathbf{c}+\mathbf{e}_i\) for the unique position \(i\) where \(\mathbf{y}\) and \(\mathbf{c}\) differ; then \(\mathbf{y}'=\mathbf{y}+\mathbf{e}_i=\mathbf{c}\in C_{H,r}\) is found at iteration \(i\) of the loop in Step 2 and returned. By uniqueness of the nearest codeword, no other trial \(\mathbf{y}+\mathbf{e}_j\) (\(j\ne i\)) is also in \(C_{H,r}\), so the algorithm cannot output an incorrect codeword. If no \(\mathbf{c}\in C_{H,r}\) with \(\Delta (\mathbf{y},\mathbf{c})\le 1\) exists, none of the checks in Steps 1-2 succeed, and the algorithm correctly outputs Fail.

Proposition 66 Running time of Algorithm 2.5.1

Algorithm 64 runs in time \(O(n^2\log n)\).

Proof ▶

If each membership check \(\mathbf{y}'\in C_{H,r}\) can be performed in time \(T(n)\), Algorithm 64 performs \(n+1\) such checks, for total time \(O(n\cdot T(n))\). Since \(C_{H,r}\) is a linear code of dimension \(k=n-O(\log n)\) (i.e. \(n-k=r+1=O(\log n)\)), Proposition 58 gives \(T(n)=O(n(n-k))=O(n\log n)\). Thus the overall running time is \(O(n\cdot n\log n)=O(n^2\log n)\).

Algorithm 67 Decoder for Any Linear Code, Alg 2.5.2

Input: received word \(\mathbf{y}\in \mathbb {F}_q^n\); a linear code \(C\) that is \([n,k,2t+1]_q\).

Output: \(\mathbf{c}\in C\) if \(\Delta (\mathbf{y},\mathbf{c})\le t\), else Fail.

For \(i=0,\dots ,t\): for each \(S\subseteq [n]\) with \(|S|=i\): for each \(\mathbf{z}\in \mathbb {F}_q^n\) supported on \(S\) with \(\mathrm{wt}(\mathbf{z})=i\): if \(\mathbf{y}-\mathbf{z}\in C\), return \(\mathbf{y}-\mathbf{z}\).
Return Fail.

Proposition 68 Running time of Algorithm 2.5.2

Algorithm 67, run on an \([n,k,2t+1]_q\) code \(C\), runs in time \(O(n^{t+2}q^t)\) operations over \(\mathbb {F}_q\), which is \(n^{O(t)}\) operations for \(q\) polynomial in \(n\).

Proof ▶

The number of error patterns \(\mathbf{z}\in \mathbb {F}_q^n\) with \(\mathrm{wt}(\mathbf{z})\le t\) considered by the algorithm is \(\sum _{i=0}^t \binom {n}{i}(q-1)^i = O((nq)^t)\). By Proposition 58, each membership check \(\mathbf{y}-\mathbf{z}\in C\) can be performed with \(O(n^2)\) operations over \(\mathbb {F}_q\) using the parity check matrix of \(C\). Hence the algorithm runs with \(O((nq)^t\cdot n^2) = O(n^{t+2}q^t)\) operations over \(\mathbb {F}_q\), which for \(q=n^{O(1)}\) is \(n^{O(t)}\).

Lemma 69 Syndrome locates the error

Let \(\mathbf{y} = \mathbf{c}+\mathbf{e}_i\) where \(\mathbf{c}\in C_{H,r}\) and \(\mathbf{e}_i\) is the unit vector with its sole non-zero entry at position \(i\) (with the convention \(\mathbf{e}_0=\mathbf{0}\), corresponding to no error). Then \(\mathbf{H}_r\cdot \mathbf{y}^T = \mathbf{H}_r^i\), the \(i\)th column of \(\mathbf{H}_r\) (with the convention that \(\mathbf{H}_r^0 = \mathbf{0}\)).

Proof ▶

Since \(\mathbf{c}\in C_{H,r}\), \(\mathbf{H}_r\cdot \mathbf{c}^T = \mathbf{0}\). Hence

\[ \mathbf{H}_r\cdot \mathbf{y}^T = \mathbf{H}_r\cdot \mathbf{c}^T + \mathbf{H}_r\cdot \mathbf{e}_i^T = \mathbf{H}_r\cdot \mathbf{e}_i^T = \mathbf{H}_r^i, \]

where the last equality holds because \(\mathbf{H}_r\cdot \mathbf{e}_i^T\) is exactly the \(i\)th column of \(\mathbf{H}_r\) (or \(\mathbf{0}\), by convention, if \(i=0\), i.e. if there was no error).

Algorithm 70 Efficient Decoder for Hamming Code, Alg 2.5.3

Input: received word \(\mathbf{y}\).

Output: \(\mathbf{c}\) if \(\Delta (\mathbf{y},\mathbf{c})\le 1\), else Fail.

\(\mathbf{b} \leftarrow H_r\cdot \mathbf{y}^T\).
Let \(i\in [n]\cup \{ 0\} \) be the number whose binary representation is \(\mathbf{b}\) (with \(i=0\) if \(\mathbf{b}=\mathbf{0}\)).
If \(\mathbf{y}-\mathbf{e}_i \in C_H\) (with \(\mathbf{e}_0=\mathbf{0}\)), return \(\mathbf{y}-\mathbf{e}_i\).
Return Fail.

Theorem 71 Thm 2.5.1

The \([n=2^r-1, 2^r-r-1, 3]_2\) Hamming code is 1-error correctable. Furthermore, decoding can be performed in time \(O(n\log n)\).

Proof ▶

Suppose \(\mathbf{y}=\mathbf{c}+\mathbf{e}_i\) for some \(\mathbf{c}\in C_{H,r}\) and \(i\in \{ 0\} \cup [n]\) (i.e. \(\Delta (\mathbf{y},\mathbf{c})\le 1\)). By Lemma 69, \(\mathbf{b} = \mathbf{H}_r\cdot \mathbf{y}^T = \mathbf{H}_r^i\), whose binary representation is exactly \(i\) (or \(\mathbf{b}=\mathbf{0}\) when \(i=0\)); hence Step 2 of Algorithm 70 correctly recovers \(i\), and Step 3 correctly recovers \(\mathbf{c} = \mathbf{y}-\mathbf{e}_i\). Since \(C_{H,r}\) has distance 3 (Proposition 62), the codeword \(\mathbf{c}\) within distance 1 of \(\mathbf{y}\) (if it exists) is unique, so this correctly decodes \(\mathbf{y}\) whenever \(\Delta (\mathbf{y},\mathbf{c})\le 1\) for some \(\mathbf{c}\in C_{H,r}\).

For the running time: \(\mathbf{H}_r\) is an \(r\times n\) matrix with \(n=2^r-1\), so \(r=\Theta (\log n)\). Step 1 is a matrix-vector product costing \(O(rn) = O(n\log n)\) operations over \(\mathbb {F}_2\). Step 2 costs \(O(r)=O(\log n)\) operations. Step 3 is a membership check in the \([n,n-r]_2\) code \(C_H\), which by Proposition 58 costs \(O(n\cdot r) = O(n\log n)\) operations. Hence Algorithm 70 runs in time \(O(n\log n)\) overall.

Definition 72 Dual of a code, Def 2.6.1

Let \(H\) be a parity check matrix of a code \(C\). The code generated by \(H\) is called the dual of \(C\), and is denoted \(C^\perp \). If \(C\) is an \([n,k]_q\) code, then \(C^\perp \) is an \([n,n-k]_q\) code, directly from the definition.

Definition 73 Simplex Code, Def 2.6.2

For a positive integer \(r\), the Simplex Code \(C_{Sim,r}\) is the code generated by \(\mathbf{H}_r\). Equivalently, \(C_{Sim,r} = C_{H,r}^\perp \).

Definition 74 Hadamard Code, Def 2.6.2

For a positive integer \(r\), the Hadamard Code \(C_{Had,r}\) is the code generated by the \(r\times 2^r\) matrix \(\overline{\mathbf{H}}_r\) obtained by adding an all-zero column to (say, in front of the columns of) \(\mathbf{H}_r\).

Proposition 75 Prop 2.6.3

\(C_{Sim,r}\) and \(C_{Had,r}\) both have distance \(2^{r-1}\).

Proof ▶

We first show the result for \(C_{Had,r}\). In fact we show something stronger: every non-zero codeword of \(C_{Had,r}\) has weight exactly \(2^{r-1}\); the claimed distance then follows from Proposition 59.

Consider a message \(\mathbf{x}\ne \mathbf{0}\) with \(i\)th entry \(x_i\). It is encoded as

\[ \mathbf{c} = (x_1,x_2,\dots ,x_r)\cdot (H_r^0, H_r^1,\dots ,H_r^{2^r-1}), \]

where \(H_r^j\) is the binary representation of \(0\le j\le 2^r-1\) (so the columns of \(\overline{\mathbf{H}}_r\) are exactly all vectors of \(\{ 0,1\} ^r\)). Note that the \(j\)th bit of \(\mathbf{c}\) is \(\langle \mathbf{x}, H_r^j\rangle \). Fix an index \(i\) with \(x_i=1\) (which exists since \(\mathbf{x}\ne \mathbf{0}\)) and group all the columns of \(\overline{\mathbf{H}}_r\) into pairs \((\mathbf{u},\mathbf{v})\) such that \(\mathbf{v}=\mathbf{u}+\mathbf{e}_i\) (i.e. \(\mathbf{u},\mathbf{v}\) agree everywhere except position \(i\)); this partitions all \(2^r\) columns into \(2^{r-1}\) disjoint pairs. Then,

\[ \langle \mathbf{x},\mathbf{v}\rangle = \langle \mathbf{x},\mathbf{u}+\mathbf{e}_i\rangle = \langle \mathbf{x},\mathbf{u}\rangle + \langle \mathbf{x},\mathbf{e}_i\rangle = \langle \mathbf{x},\mathbf{u}\rangle + x_i = \langle \mathbf{x},\mathbf{u}\rangle + 1. \]

Thus \(\langle \mathbf{x},\mathbf{v}\rangle \) is the negation of \(\langle \mathbf{x},\mathbf{u}\rangle \), i.e. exactly one of \(\langle \mathbf{x},\mathbf{v}\rangle ,\langle \mathbf{x},\mathbf{u}\rangle \) equals 1. Since the choice of pair \((\mathbf{u},\mathbf{v})\) was arbitrary, exactly one coordinate of \(\mathbf{c}\) in each of the \(2^{r-1}\) pairs is 1, so \(\mathrm{wt}(\mathbf{c}) = 2^{r-1}\).

For the simplex code, observe that every codeword of \(C_{Had,r}\) is obtained by padding a \(0\) in front of a codeword of \(C_{Sim,r}\) (since \(\overline{\mathbf{H}}_r\) is \(\mathbf{H}_r\) with a zero column prepended), so every non-zero codeword of \(C_{Sim,r}\) also has weight \(2^{r-1}\). By Proposition 59, the distance of \(C_{Sim,r}\) is also \(2^{r-1}\).

Proposition 76 Parameters of Simplex and Hadamard codes

\(C_{Sim,r}\) is a \([2^r-1, r, 2^{r-1}]_2\) code and \(C_{Had,r}\) is a \([2^r, r, 2^{r-1}]_2\) code.

Proof ▶

The block lengths \(2^r-1\) (for \(C_{Sim,r}\)) and \(2^r\) (for \(C_{Had,r}\)) follow immediately from the number of columns of the respective generator matrices \(\mathbf{H}_r\) and \(\overline{\mathbf{H}}_r\). The columns of \(\mathbf{H}_r\) include the binary representations of \(1,2,4,\dots ,2^{r-1}\), which are exactly the \(r\) standard basis vectors \(\mathbf{e}_1,\dots ,\mathbf{e}_r\in \{ 0,1\} ^r\); the corresponding \(r\times r\) sub-matrix of \(\mathbf{H}_r\) is (a column permutation of) the identity matrix, so \(\mathbf{H}_r\) has rank \(r\), giving \(C_{Sim,r}\) dimension \(r\). Likewise, \(\overline{\mathbf{H}}_r\), obtained from \(\mathbf{H}_r\) by appending a zero column, still contains this same rank-\(r\) sub-matrix, so it also has rank \(r\), giving \(C_{Had,r}\) dimension \(r\). The common distance \(2^{r-1}\) follows from Proposition 75.

We remark (without a formal node, as it is a summary observation rather than a new claim) that the family of Hamming codes has rate approaching \(1\) and relative distance approaching \(0\) as \(r\to \infty \), while the families of Simplex/Hadamard codes have rate approaching \(0\) and relative distance \(1/2\).