23 Complexity of Coding Problems

23.1 Nearest Codeword Problem (NCP)

Definition 493 Nearest Codeword Problem (NCP), Problem 23.1.1

The Nearest Codeword Problem is the following decision problem.

Input:

(F,G,v,t)$ where $GF^k×n$, $vF^n$ and $tZ^≥0$. \item \textbf{Output}: \textsc{Yes} if there exists $xF^k$ such that $Δ(v,xG)≤t$, and \textsc{No} otherwise. $

(Here Δ$ denotes Hamming distance and $xG$ ranges over the codewords of the linear code generated by $G$.) $

Definition

Definition 494 Maximum Cut Problem (MaxCut), Problem 23.1.2

The Maximum Cut Problem is the following decision problem.

Input: a graph

H=(V,E)$ with $E ⊆V2$, and an integer $ℓ$. \item \textbf{Output}: \textsc{Yes} if there exists a cut in $H$ of size at least $ℓ$, i.e., a set $S⊆V$ such that $Cut(S) {eE ∣|e∩S|=1}$ satisfies $|Cut(S)|≥ℓ$, and \textsc{No} otherwise. $

Definition

Theorem 495 MaxCut is NP$-complete, Theorem 23.1.3$

MaxCut (Problem 494) is NP$-complete. \begin{proof} \notready \end{proof} $

Theorem 496 The Nearest Codeword Problem is NP$-complete, Theorem 23.1.4$

The Nearest Codeword Problem (Problem 493) is NP$-complete. \begin{proof} \uses{def:c23-ncp, def:c23-maxcut, thm:c23-maxcut-npc} \textbf{Membership in $\mathsf{NP}$.} Given an instance \end{proof}$(F,G,v,t)$, a certificate is a vector $xF^k$; the verifier checks $Δ(v,xG)≤t$ in polynomial time and accepts iff this holds. Clearly a \textsc{Yes} instance has such a certificate and the verifier runs in polynomial time, so NCP is in $NP$. \textbf{$\mathsf{NP}$-hardness.} We reduce MaxCut to NCP with $F=F_2$. Given a graph $H$ on vertex set $V=[k]$ with edge set $E$, let $n=|E|$. Let $GF_2^k×n$ be the incidence matrix of $H$: rows indexed by $V$, columns indexed by $E$, and $G_u,e=1$ iff vertex $u$ is incident to edge $e$ (and $0$ otherwise). Set $v=1_n$ (the all-ones vector) and $t=n-ℓ$. This gives the NCP instance $(F_2,G,v,t)$. There is a one-to-one correspondence between $xF_2^k$ and subsets $S⊆V$ via $x_i=1$ iff $iS$. For this $x$, the vector $xG$ is exactly the characteristic vector of $Cut(S)$: for an edge $e={i,j}$, the coordinate $(xG)_e = x_i+x_j 2$ equals $1$ iff exactly one of $i,j$ is in $S$, i.e., iff $eCut(S)$. Hence \[ \Delta (\mathbf{x}\mathbf{G},\mathbf{1}_n) = n - |\mathrm{Cut}(S)|. \] Consequently there exists $xF_2^k$ with $Δ(xG,1_n)≤n-ℓ$ if and only if there exists $S⊆V$ with $|Cut(S)|≥ℓ$. Thus $(F_2,G,v,t)$ is a \textsc{Yes} instance of NCP if and only if $(H,ℓ)$ is a \textsc{Yes} instance of MaxCut. Since this reduction is computable in polynomial time and MaxCut is $NP$-hard (Theorem~ \ref{thm:c23-maxcut-npc}), NCP is $NP$-hard. Combining membership in $NP$ and $NP$-hardness, NCP is $NP$-complete. $

Proof ▶

23.2 Decoding with Preprocessing

Definition 497 Nearest Codeword Problem with Preprocessing (NCPP){G_n}_n$, Problem 23.2.1$

Fix a family of codes {C_n}_n$ given by generator matrices ${G_n}_n$, with $G_nF^k(n)×n$. The problem $NCPP{G_n}_n$ is the following decision problem. \begin{itemize} \item \textbf{Input}: \end{itemize}$(v,t)$ where $vF^n$ and $tZ^≥0$. \item \textbf{Output}: \textsc{Yes} if there exists $xF^k(n)$ such that $Δ(v,xG_n)≤t$, and \textsc{No} otherwise. $

Definition 498 Non-uniform algorithm with advice for NCPP

A non-uniform algorithmic solution to NCPP{G_n}_n$ consists of an algorithm $A(·,·;·)$ together with a sequence of advice strings ${B_n}_n$ such that $A(v,t;B_n)$ correctly solves $NCPP{G_n}_n$ for every $(v,t)$ with $vF^n$, subject to: \begin{enumerate} \item \end{enumerate}$A$ runs in time polynomial in $n$ (in particular $B_n$ has length polynomial in $n$); \item the time needed to \emph{compute} $B_n$ from $G_n$ may be unboundedly large. $

Theorem 499 Theorem 23.2.2

There exists a family of codes with generators

{G_n}_n$ such that $NCPP{G_n}_n$ (Definition~ \ref{def:c23-ncpp}) is $NP$-hard. Specifically there is a polynomial time reduction from MaxCut to $NCPP{G_n}_n$. \begin{proof} \uses{def:c23-maxcut, thm:c23-maxcut-npc} For every positive integer \end{proof}$k$ and $n=k(k-1)$, define $G_nF_2^k×n$ by indexing rows by $[k]$ and columns by ordered pairs $(i,j)$ with $i≠j$, $i,j[k]$, and setting $G_n[r,(i,j)]=1$ if $r=i$ or $r=j$, and $0$ otherwise. Thus $G_n$ has two (identical) columns for every undirected edge ${i,j}$ of the complete graph on $k$ vertices, indexed by $(i,j)$ and $(j,i)$; in particular for every $xF_2^k$ and every ${i,j}$, $(xG_n)_(i,j) = (xG_n)_(j,i)$, since both columns of $G_n$ coincide. Given an instance $(H,ℓ)$ of MaxCut with $H=(V,E)$, $V=[k]$, we produce an instance of $NCPP{G_n}_n$ as follows. Define $vF_2^n$ by: if ${i,j}E$ then $v_(i,j)=v_(j,i)=1$; otherwise (assuming $i<j$) set $v_(i,j)=0$ and $v_(j,i)=1$. Set $t = n2+|E|-2ℓ$. This gives the instance $(v,t)$ of $NCPP{G_n}_n$. As in the proof of Theorem~ \ref{thm:c23-ncp-npc}, identify $xF_2^k$ with a subset $S⊆V$ (via $x_i=1$ iff $iS$), and let $c(S)=|{{i,j}E ∣x_i+x_j=1}|$ denote the number of edges of $H$ cut by $S$. Let $w=xG_n$; by the observation above, $w_(i,j)=w_(j,i)$ for all $i≠j$. We claim $Δ(w,v) = n2+|E|-2c(S)$. Indeed, for a pair ${i,j}E$ we have $w_(i,j)=w_(j,i)$ while by construction $v_(i,j)≠v_(j,i)$; hence exactly one of the two coordinates $(i,j),(j,i)$ contributes a mismatch, so pairs not in $E$ contribute a total of $n2-|E|$ to $Δ(w,v)$ (there are $n2 - |E|$ such unordered pairs). For a pair ${i,j}E$: if ${i,j}$ is cut by $S$ then $w_(i,j)=w_(j,i)=v_(i,j)=v_(j,i)=1$, contributing $0$; if ${i,j}$ is not cut by $S$ then $w_(i,j)=w_(j,i)=0$ while $v_(i,j)=v_(j,i)=1$, contributing $2$ mismatches. Summing over the $|E|$ edges of $H$ contributes $2(|E|-c(S))$. Altogether \[ \Delta (\mathbf{w},\mathbf{v}) = \Big(\tfrac n2 - |E|\Big) + 2\big(|E|-c(S)\big) = \tfrac n2 + |E| - 2c(S). \] Thus $Δ(w,v)≤t = n2+|E|-2ℓ$ iff $c(S)≥ℓ$. Hence a vector $x$ with $Δ(xG_n,v)≤t$ exists iff a cut of size at least $ℓ$ exists in $H$, so $(v,t)$ is a \textsc{Yes} instance of $NCPP{G_n}_n$ iff $(H,ℓ)$ is a \textsc{Yes} instance of MaxCut. As the map $(H,ℓ)↦(v,t)$ is computable in polynomial time, and MaxCut is $NP$-hard (Theorem~ \ref{thm:c23-maxcut-npc}), this shows $NCPP{G_n}_n$ is $NP$-hard. $

Proof ▶

Theorem

23.3 Approximate NCP

Definition 500 Gap Nearest Codeword Problem (GapNCP), Problem 23.3.1

For a real number

g≥1$, let \[ \mathrm{Gap}_g\mathrm{NCP}_{\mathrm{Yes}} = \{ (\mathbb {F},\mathbf{G},\mathbf{v},t)\mid \exists \, \mathbf{x}\in \mathbb {F}^k \text{ s.t. } \Delta (\mathbf{v},\mathbf{x}\mathbf{G})\le t\} \] and \[ \mathrm{Gap}_g\mathrm{NCP}_{\mathrm{No}} = \{ (\mathbb {F},\mathbf{G},\mathbf{v},t)\mid \forall \, \mathbf{x}\in \mathbb {F}^k,\ \Delta (\mathbf{v},\mathbf{x}\mathbf{G}){\gt} g\cdot t\} . \] The problem $Gap_gNCP$ is the restriction of NCP to inputs from $Gap_gNCP_Yes ∪Gap_gNCP_No$: it outputs \textsc{Yes} if $(F,G,v,t)Gap_gNCP_Yes$, and \textsc{No} otherwise (i.e. on $Gap_gNCP_No$ instances). $

Definition

Definition 501 Gap Maximum Cut Problem

For a real number

g≥1$, let \[ \mathrm{Gap}_g\mathrm{CUT}_{\mathrm{Yes}} = \{ (H,\ell )\mid \exists \text{ cut in } H \text{ of size at least } \ell \} \] and \[ \mathrm{Gap}_g\mathrm{CUT}_{\mathrm{No}} = \{ (H,\ell )\mid \text{every cut in } H \text{ has size at most } \ell /g\} . \] $Gap_gCUT$ is the restriction of MaxCut to $Gap_gCUT_Yes∪Gap_gCUT_No$. $

Definition

Theorem 502 Theorem 23.3.2

There exists

g>1$ such that $Gap_gCUT$ is $NP$-hard. \begin{proof} \notready \end{proof} $

Theorem

Lemma 503 Lemma 23.3.3

There exists

g>1$ such that $Gap_gNCP$ is $NP$-hard. Furthermore the hardness holds even when restricted to instances $(F,G,v,t)$ where $F=F_2$ and the target vector $v=1_n$. \begin{proof} \notready \end{proof} $

Lemma

Theorem 504 Theorem 23.3.4

For every

g_1<∞$, $Gap_g_1NCP$ is $NP$-hard. \begin{proof} \uses{lem:c23-gapncp-nphard-base} \textbf{Main reduction.} We show that for every \end{proof}$g$, $Gap_gNCP$ reduces to $Gap_g^2NCP$, provided the target vector is $v=1_n$. Given such a reduction, the theorem follows: by Lemma~ \ref{lem:c23-gapncp-nphard-base}, $Gap_g_0NCP$ (with $v=1_n$) is $NP$-hard for some $g_0>1$. Composing the reduction with itself $s$ times shows $Gap_g_0NCP$ reduces to $Gap_g_0^2^sNCP$ for every positive integer $s$. Choosing $s$ large enough that $g_0^2^s>g_1$, we get that $Gap_g_0NCP$ reduces to $Gap_g_1NCP$, so the latter is $NP$-hard. \textbf{The reduction from $\mathrm{Gap}_g\mathrm{NCP}$ to $\mathrm{Gap}_{g^2}\mathrm{NCP}$.} Let $(F_2,G,v,t)$ with $v=1_n$ be an instance of $Gap_gNCP$, and let $C$ be the code of block length $n$ generated by $G$. Let $R⊆F_2^n$ be the $n$-fold repetition code $R={0^n,1_n}$, and let $I=F_2^n$ be the identity code (generated by the $n×n$ identity matrix). Given two codes $C,D$, let $C⊗D$ denote the code whose codewords are $n×n$ matrices all of whose rows are codewords of $C$ and all of whose columns are codewords of $D$. Let $C_R = C⊗R$ and $C_I = I⊗C$, and set \[ C_2 = C_R + C_I = \{ \mathbf{c}_R+\mathbf{c}_I \mid \mathbf{c}_R\in C_R,\ \mathbf{c}_I\in C_I\} , \] a linear code of block length $n^2$ whose generator matrix $G_2$ can be computed in polynomial time from $G$ (since $G_R,G_I$ are computable in polynomial time, and the generator matrix of a direct sum of two codes with known generators is computable in polynomial time). The reduction outputs $(F_2,G_2,1_n^2,t^2)$. Every matrix in $C_R$ has identical rows, each equal to some codeword $w_aC$ (rows are codewords of $C$, and columns must be codewords of $R={0^n,1_n}$, forcing every row to be the same). Every matrix in $C_I$ has arbitrary rows so long as every column is a codeword of $C$. \textit{Completeness.} We show: if there exists $wC$ with $Δ(w,1_n)≤t$ then there exists $w_2C_2$ with $Δ(w_2,1_n^2)≤t^2$. Regard row vectors as column vectors via transpose; write $v=1_n$. Let $w_2 = v^T·w + w^T·(v-w)$ (an $n×n$ matrix). By construction $v^T·wC_R$ (every row equals $wC$) and $w^T·(v-w)C_I$ (every column is a scalar multiple of $w$, hence in $C$, using linearity), so $w_2C_2$. Let $S={i ∣w_i=0}$, so $|S|=Δ(w,1_n)≤t$. A direct coordinatewise check shows $(w_2)_i,j=0$ if and only if $i,jS$ (both summands vanish exactly on $S×S$, while for $iS$ or $jS$ at least one summand is nonzero). Hence $w_2$ has at most $|S|^2≤t^2$ zero coordinates, giving $Δ(w_2,1_n^2)≤t^2$. \textit{Soundness.} Conversely suppose $w_2C_2$ satisfies $Δ(w_2,1_n^2)≤g^2t^2$ (i.e. $w_2$ has at most $g^2t^2$ zero coordinates). We show there exists $wC$ with $Δ(w,1_n)≤gt$. Write $w_2=w_R+w_I$ with $w_RC_R$, $w_IC_I$. By the structure of $C_R$, $w_R$ consists of $n$ identical rows, each equal to some codeword $w_aC$. If $Δ(w_a,1_n)≤gt$ we are done (take $w=w_a$). Otherwise $w_a$ has strictly more than $gt$ zero coordinates; let $T$ be the set of (more than $gt$) columns where $w_R$ is entirely zero. Restricting $w_2$ to the columns in $T$ gives a matrix each of whose columns equals $w_I$ on that column, hence is a codeword of $C$ (by definition of $C_I$). If every such column $w’$ satisfies $Δ(w’,1_n)>gt$, then each of the more than $gt$ columns in $T$ contributes more than $gt$ zero coordinates to $w_2$, so $w_2$ has strictly more than $g^2t^2$ zero coordinates, contradicting our hypothesis. Hence some column $w_bT$ is a codeword of $C$ with $Δ(w_b,1_n)≤gt$; take $w=w_b$. Combining completeness and soundness: $(F_2,G,1_n,t)Gap_gNCP_Yes$ implies $(F_2,G_2,1_n^2,t^2)Gap_g^2NCP_Yes$, and $(F_2,G,1_n,t)Gap_gNCP_No$ implies (by the contrapositive of the soundness argument) $(F_2,G_2,1_n^2,t^2)Gap_g^2NCP_No$. This gives the desired polynomial-time reduction from $Gap_gNCP$ to $Gap_g^2NCP$, concluding the proof. $

Proof ▶

Theorem

23.4 Distance bounded decoding

Definition 505 Gap Distance Bounded Decoding (GapDBD), Problem 23.4.1

For a matrix

GF^k×n$ let $d(G)$ denote the minimum distance of the code generated by $G$, i.e. $d(G) = _xF^k∖{0^k} wt(xG)$. For a real number $g≥1$, let \[ \mathrm{Gap}_g\mathrm{DBD}_{\mathrm{Yes}} = \{ (\mathbb {F},\mathbf{G},\mathbf{v},t) \mid d(\mathbf{G})\ge g\cdot t \text{ and } \exists \, \mathbf{x}\in \mathbb {F}_2^k \text{ s.t. } \Delta (\mathbf{v},\mathbf{x}\mathbf{G})\le t\} \] and \[ \mathrm{Gap}_g\mathrm{DBD}_{\mathrm{No}} = \{ (\mathbb {F},\mathbf{G},\mathbf{v},t) \mid d(\mathbf{G})\ge g\cdot t \text{ and } \forall \, \mathbf{x}\in \mathbb {F}_2^k,\ \Delta (\mathbf{v},\mathbf{x}\mathbf{G}){\gt}g\cdot t\} . \] The problem $Gap_gDBD$ is the restriction of NCP to $Gap_gDBD_Yes∪Gap_gDBD_No$: output \textsc{Yes} if $(F,G,v,t)Gap_gDBD_Yes$ and \textsc{No} otherwise. $

Definition

Lemma 506 Lemma 23.4.3

There exists

0<ε<12$ and a randomized algorithm that, on input an integer $k$, runs in time polynomial in $k$ and outputs, with probability $1-o(1)$, integers $k_0,n_0$, a matrix $G_0F_2^k_0×n_0$ generating a code $C_0$ of minimum distance \[ d_0 = \frac{k}{2}\cdot \Big\lfloor (k-1)^{\frac{2}{1-2\varepsilon }}\Big\rfloor \] and a vector $wF^n_0$ such that \[ \big|\{ \mathbf{y}\in \mathbb {F}_2^{k_0}\mid \Delta (\mathbf{y}\mathbf{G}_0,\mathbf{w})\le (1-\varepsilon )d_0\} \big| \ge 4^k. \] \begin{proof} \notready \end{proof} $

Lemma

Lemma 507 Lemma 23.4.4

For integers

k,k_0,N$ let $S⊆F_2^k_0$ be a set of size at least $N$ and let $zF_2^k$ be a fixed vector. Then for a uniformly random $AF_2^k_0×k$, \[ \Pr _{\mathbf{A}}[\exists \, \mathbf{y}\in S \text{ s.t. } \mathbf{y}\mathbf{A}=\mathbf{z}] \ge 1-\frac{2^k}{N}. \] \begin{proof} Since the event is monotone in \end{proof}$S$, we may assume $S$ consists of exactly $N$ nonzero vectors (discard the zero vector if present and any extra elements; this can only decrease the probability of the event, so proving the bound for this sub-collection suffices). For $yS$ fixed and nonzero, the map $A↦yA$ is a surjective linear map from $F_2^k_0×k$ (as $A$ ranges uniformly) onto $F_2^k$, so $yA$ is uniformly distributed over $F_2^k$ and $[yA=z]=2^-k$. Moreover for distinct nonzero $y_1≠y_2S$, since the only nonzero scalar of $F_2$ is $1$, $y_1$ and $y_2$ are linearly independent over $F_2$. Hence $(y_1A,y_2A)$ is uniformly distributed over $F_2^k×F_2^k$, so $y_1A$ and $y_2A$ are independent, and $[y_1A=z, y_2A=z]=2^-2k$. Let $N_z = |{yS ∣yA=z}|$. By linearity of expectation, $E[N_z] = N·2^-k$, and \[ \mathbb {E}[N_{\mathbf{z}}^2] = N\cdot 2^{-k} + N(N-1)\cdot 2^{-2k}. \] Hence \[ \mathrm{Var}(N_{\mathbf{z}}) = \mathbb {E}[N_{\mathbf{z}}^2] - \mathbb {E}[N_{\mathbf{z}}]^2 = N\cdot 2^{-k}+N(N-1)2^{-2k} - N^2 2^{-2k} = N\cdot 2^{-k}(1-2^{-k}) \le N\cdot 2^{-k}. \] By Chebyshev's inequality, \[ \Pr [N_{\mathbf{z}}=0] \le \Pr \big[|N_{\mathbf{z}}-\mathbb {E}[N_{\mathbf{z}}]|\ge \mathbb {E}[N_{\mathbf{z}}]\big] \le \frac{\mathrm{Var}(N_{\mathbf{z}})}{\mathbb {E}[N_{\mathbf{z}}]^2} \le \frac{N\cdot 2^{-k}}{N^2\cdot 2^{-2k}} = \frac{2^k}{N}. \] Thus $[∃ yS s.t. yA=z] = [N_z≥1] = 1-[N_z=0] ≥1-2^kN$, as desired. $

Proof ▶

Lemma

Theorem 508 Theorem 23.4.2

There exists

g’>1$ such that $Gap_g’DBD$ is $NP$-hard under randomized reductions. Specifically there exists $g$ and a randomized polynomial time reduction $R$ from $Gap_gNCP$ to $Gap_g’DBD$ such that for every instance $IGap_gNCP_Yes∪Gap_gNCP_No$ we have \begin{enumerate} \item \end{enumerate}$R(I)Gap_g’DBD_Yes∪Gap_g’DBD_No$ with probability $1-o(1)$, and \item with probability $1-o(1)$, $R(I)Gap_g’DBD_Yes$ if and only if $IGap_gNCP_Yes$. $

Proof ▶

Let

ε(0,12)$ be as given by Lemma~ \ref{lem:c23-high-distance-agreement-code}; we may assume $ε=1/s$ for some integer $s$ (reducing $ε$ if necessary while keeping it positive). Set \[ g' = \frac{1}{1-\varepsilon /2}, \qquad g = \frac{2}{\varepsilon }. \] We show $Gap_gNCP$ reduces to $Gap_g’DBD$; the existence of a suitable $g’>1$ for which $Gap_g’DBD$ is $NP$-hard under randomized reductions then follows from Theorem~ \ref{thm:c23-gapncp-nphard-all-g} (applied with $g_1=g$, so that $Gap_gNCP$ is $NP$-hard). \textbf{The reduction.} Let $(F_2,G,v,t)$ be an instance of $Gap_gNCP$ with $GF_2^k×n$. Run the randomized algorithm of Lemma~ \ref{lem:c23-high-distance-agreement-code} on input $k$ to obtain (with probability $1-o(1)$) integers $n_0,k_0$, a matrix $G_0F_2^k_0×n_0$ generating a code $C_0$ of minimum distance $d_0 = t·g = 2tε$ (padding by repetition if necessary to hit this exact target distance, without changing the salient properties), and a vector $wF_2^n_0$ such that the set $S {yF_2^k_0 ∣Δ(yG_0,w)≤(1-ε)d_0}$ satisfies $|S|≥4^k$. Let $AF_2^k_0×k$ be a uniformly random matrix. Set $G’ = [G_0 ∣AG] F_2^k_0×(n_0+n)$, $v’=(w,v)F_2^n_0+n$, and $t’ = t+(1-ε)d_0$. The reduction outputs $(F_2,G’,v’,t’)$. \textbf{Parameter check.} We have $t’ = ε2d_0 + (1-ε)d_0 = (1-ε2)d_0 = d_0/g’$, i.e. $d_0 = g’ t’$. \textbf{Soundness.} Suppose $(F_2,G,v,t)Gap_gNCP_No$, i.e. for every $xF_2^k$, $Δ(v,xG)>gt$. Then for every $yF_2^k_0$, taking $x=yA$ gives $Δ(v,yAG)>gt$. Since Hamming distance of concatenated vectors satisfies $Δ((w,v),y[G_0∣AG]) ≥Δ(v,yAG)$, we get $Δ(v’,yG’) ≥gt = d_0 = g’t’$ for every $y$, so $(F_2,G’,v’,t’)Gap_g’DBD_No$ (recalling $d(G’)≥d(G_0)=d_0=g’t’$, since restricting any nonzero codeword of $G’$ to the $G_0$-columns already gives a nonzero codeword of $C_0$, of weight at least $d_0$). This holds with probability $1$ (deterministically, given the outcome of Lemma~ \ref{lem:c23-high-distance-agreement-code}’s randomized algorithm). \textbf{Completeness.} Suppose $(F_2,G,v,t)Gap_gNCP_Yes$, i.e. there is $zF_2^k$ with $Δ(v,zG)≤t$. By Lemma~ \ref{lem:c23-random-linear-hit} (applied with the set $S$ above, of size $≥4^k$, and the fixed vector $z$), with probability at least $1-2^k/4^k = 1-o(1)$ there exists $yS$ with $yA=z$. Condition on this event (which, together with the success event of Lemma~ \ref{lem:c23-high-distance-agreement-code}, holds with probability $1-o(1)$ by the union bound). For this $y$, since $yS$ we have $Δ(yG_0,w)≤(1-ε)d_0$, and since $yA=z$ we have $Δ(v,yAG) = Δ(v,zG)≤t$. Hence \[ \Delta (\mathbf{v}',\mathbf{y}\mathbf{G}') = \Delta (\mathbf{w},\mathbf{y}\mathbf{G}_0) + \Delta (\mathbf{v},\mathbf{y}\mathbf{A}\mathbf{G}) \le (1-\varepsilon )d_0 + t = t', \] so $(F_2,G’,v’,t’)Gap_g’DBD_Yes$ with probability $1-o(1)$. Combining soundness (probability $1$) and completeness (probability $1-o(1)$), and the probability-$1-o(1)$ success of the preprocessing step of Lemma~ \ref{lem:c23-high-distance-agreement-code}, both conclusions (1) and (2) of the theorem hold with probability $1-o(1)$, as desired. $

Proof ▶

Theorem

23.5 Minimum distance problem

Definition 509 Gap Minimum Distance (GapMinDist), Problem 23.5.1

Let

F$ be a field, $GF^k×n$ a generator matrix of a code of dimension $k$ and block length $n$, and $d(G)$ its minimum distance. For a real number $g≥1$, let \[ \mathrm{Gap}_g\mathrm{MinDist}_{\mathrm{Yes}} = \{ (\mathbb {F},\mathbf{G},\overline{d}) \mid d(\mathbf{G})\le \overline{d}\} \] and \[ \mathrm{Gap}_g\mathrm{MinDist}_{\mathrm{No}} = \{ (\mathbb {F},\mathbf{G},\overline{d}) \mid d(\mathbf{G}){\gt} g\cdot \overline{d}\} . \] The problem $Gap_gMinDist$ is the restriction of the natural decision problem to $Gap_gMinDist_Yes∪Gap_gMinDist_No$: output \textsc{Yes} if $(F,G,d)Gap_gMinDist_Yes$ and \textsc{No} otherwise. $

Definition

Lemma 510 Lemma 23.5.2(i): Gap_gDBD$ reduces to $Gap_gMinDist

For every g>1$ there is a deterministic polynomial time reduction from $Gap_gDBD$ to $Gap_gMinDist$. \begin{proof} \uses{def:c23-gap-dbd, def:c23-gap-mindist} Given an instance \end{proof}$(F,G,v,t)$ of $Gap_gDBD$ with $GF^k×n$, form the $(k+1)×n$ matrix $G’ =

$ (i.e. $G$ with the row $v$ adjoined), and output $(F,G’,d)$ with $d=t$. This is computable in polynomial time. The nonzero codewords generated by $G’$ are exactly the vectors $c·v+xG$ for $cF$, $xF^k$, not both $c=0,x=0$. Those with $c=0$ are exactly the nonzero codewords of the code generated by $G$, of weight at least $d(G)$. Those with $c≠0$ may be rescaled (dividing by $c$, which does not change Hamming weight) to the form $v + xG$ for $xF^k$ arbitrary (reparametrizing $x ↦x/c$); as $x$ ranges over $F^k$ so does $-x$, so ${wt(v+xG) : xF^k} = {wt(v-xG):x} = {Δ(v,xG):xF^k}$. Hence \[ d(\mathbf{G}') = \min \Big(d(\mathbf{G}),\ \min _{\mathbf{x}\in \mathbb {F}^k}\Delta (\mathbf{v},\mathbf{x}\mathbf{G})\Big). \] Now suppose $(F,G,v,t)Gap_gDBD_Yes$, so $d(G)≥gt$ and there exists $x$ with $Δ(v,xG)≤t$; then $d(G’) ≤t = d$, so $(F,G’,d)Gap_gMinDist_Yes$. Suppose instead $(F,G,v,t)Gap_gDBD_No$, so $d(G)≥gt$ and for every $x$, $Δ(v,xG)>gt$; then $d(G’) = (d(G), _xΔ(v,xG)) > gt = gd$, so $(F,G’,d)Gap_gMinDist_No$. This proves correctness of the reduction. $

Proof ▶

Lemma 511 Lemma 23.5.2(ii): gap amplification for GapMinDist

For every g>1$ and $g’<∞$ there is a deterministic polynomial time reduction from $Gap_gMinDist$ to $Gap_g’MinDist$. \begin{proof} \uses{def:c23-gap-mindist} Given an instance \end{proof}$(F,G,d)$ of $Gap_gMinDist$ and a positive integer $ℓ$, let $G^⊗ℓ$ denote the generator matrix of the $ℓ$-fold tensor power of the code generated by $G$ with itself. Using the fact that tensoring multiplies minimum distances, $d(G^⊗ℓ) = d(G)^ℓ$, the reduction outputs $(F,G^⊗ℓ,d^ ℓ)$, computable in polynomial time (for fixed $ℓ$) from $(F,G,d)$. If $d(G)≤d$ then $d(G^⊗ℓ) = d(G)^ℓ≤d^ ℓ$, so YES instances map to YES instances. If $d(G)>gd$ then $d(G^⊗ℓ) = d(G)^ℓ> (gd)^ℓ= g^ℓd^ ℓ$, so NO instances of $Gap_gMinDist$ map to instances $(F,G^⊗ℓ,d^ ℓ)$ with $d(G^⊗ℓ) > g^ℓd^ ℓ$. Choosing $ℓ$ large enough that $g^ℓ≥g’$, this shows such instances lie in $Gap_g’MinDist_No$. This proves the reduction is correct. $

Proof ▶

Theorem 512 Theorem 23.5.3

For every constant

g<∞$, the problem $Gap_gMinDist$ is $NP$-hard under randomized reductions. Consequently, unless all of $NP$ has randomized polynomial time algorithms, there are no (randomized) polynomial time algorithms to approximate the minimum distance of a linear code given its generator matrix to within a multiplicative factor of $g$. \begin{proof} \uses{lem:c23-dbd-to-mindist, lem:c23-mindist-gap-amplification, thm:c23-gapdbd-nphard-randomized} By Theorem~ \ref{thm:c23-gapdbd-nphard-randomized}, there is \end{proof}$g_0’>1$ such that $Gap_g_0’DBD$ is $NP$-hard under randomized reductions. By Lemma~ \ref{lem:c23-dbd-to-mindist} (with $g=g_0’$), $Gap_g_0’DBD$ reduces (deterministically, in polynomial time) to $Gap_g_0’MinDist$, so $Gap_g_0’MinDist$ is $NP$-hard under randomized reductions. Finally, by Lemma~ \ref{lem:c23-mindist-gap-amplification}, for every constant $g<∞$, $Gap_g_0’MinDist$ reduces (deterministically, in polynomial time) to $Gap_gMinDist$. Composing the randomized reduction with the deterministic one gives a randomized polynomial time reduction establishing that $Gap_gMinDist$ is $NP$-hard, for every constant $g<∞$. If a randomized polynomial time algorithm could approximate the minimum distance of a linear code to within a factor of $g$, it would decide $Gap_gMinDist$ in randomized polynomial time, and composing with the reduction above would give a randomized polynomial time algorithm for an $NP$-hard problem, hence for all of $NP$. $

Proof ▶

Theorem

23.6 Conclusions

Proposition 513 Open Question 23.6.1

The status of the problem

Gap_1/2DBD$ (Definition~ \ref{def:c23-gap-dbd}) — whether it lies in $P$ or is $NP$-hard — is, as of the writing of the book, an open problem. \notready $

Proposition