6 The decoding error under adversarial stragglers

Proposition 39 Adversarial error via singular values [Proposition V.1]

Let \(A \in \mathbb {R}^{N \times m}\) be any assignment matrix on \(N\) data points in which each data point is replicated exactly \(d\) times and each machine holds exactly \(\ell \) data points. Let \(\sigma _2\) be the second largest singular value of \(A\). For any set \(S\) of \(s\) stragglers there exist decoding coefficients \(w = w(S)\) such that

\[ \tfrac 1N|\alpha - \mathbb {1}|_2^2 \le \tfrac 1N\Big(\tfrac {\sigma _2}{d}\Big)^2 \tfrac {sm}{m-s}. \]

Proof ▶

For the straggler set \(S\) choose \(w_i = \tfrac {m}{d(m-s)}\) for \(i \notin S\) and \(w_i = 0\) for \(i \in S\). Then with \(z := w - \tfrac 1d\mathbb {1}\),

\[ |\alpha - \mathbb {1}|_2^2 = \big|Aw - \tfrac 1d A\mathbb {1}\big|_2^2 = |Az|_2^2, \]

using \(A\mathbb {1}= d\mathbb {1}\) (each column has \(\ell \) ones... more precisely \(\tfrac 1d A\mathbb {1}= \mathbb {1}\) since each row of \(A\) has \(d\) ones). The matrix \(A\) has top singular value \(\sigma _1 = \sqrt{\ell d}\) with top right singular vector \(\mathbb {1}/\sqrt m\) (since \(A^TA\) has top eigenvector \(\mathbb {1}/\sqrt m\) and top eigenvalue \(\ell d\)). Because \(z \perp \mathbb {1}\) and \(|z|_2^2 = \tfrac 1{d^2}\big(s + (m-s)\tfrac {s^2}{(m-s)^2}\big) = \tfrac {ms}{(m-s)d^2}\), we get

\[ |Az|_2^2 \le \sigma _2^2|z|_2^2 \le \Big(\tfrac {\sigma _2}{d}\Big)^2\tfrac {sm}{m-s}, \]

as desired.

Corollary 40 Adversarial error for graph schemes [Corollary V.2]

Let \(A\) be a graph assignment scheme for a \(d\)-regular graph \(G\) with spectral expansion \(\lambda \). For any set of \(\lfloor pm\rfloor \) stragglers there exist decoding coefficients \(w\) with

\[ \tfrac 1n|\alpha - \mathbb {1}|_2^2 \le \tfrac {2d-\lambda }{2d}\cdot \tfrac {p}{1-p}. \]

Proof ▶

With \(\mathcal{A}(G)\) the adjacency matrix, \(\lambda _1(\mathcal{A}(G)) = d\) and \(\lambda _2(\mathcal{A}(G)) = d - \lambda \). Since each column of \(A\) has exactly two ones, \(A^TA = \mathcal{A}(G) + dI\), so \(\sigma _2(A) = \sqrt{\lambda _2(A^TA)} = \sqrt{2d - \lambda }\). Applying Proposition 39 with \(s = \lfloor pm\rfloor \) and \(\sigma _2^2 = 2d - \lambda \),

\[ \tfrac 1n|\alpha - \mathbb {1}|_2^2 \le \tfrac 1n\, \tfrac {2d-\lambda }{d^2}\, \tfrac {p}{1-p}\, m = \tfrac {2d-\lambda }{2d}\cdot \tfrac {p}{1-p}, \]

using \(d = 2m/n\).

Corollary 41 Adversarial error for good expanders [Corollary V.3]

Let \(A\) be a graph assignment scheme for a \(d\)-regular graph \(G\) with spectral gap \(\lambda = d - o(d)\). For any set of \(\lfloor pm\rfloor \) stragglers there exist decoding coefficients \(w\) with

\[ \tfrac 1n|\alpha - \mathbb {1}|_2^2 \le \tfrac {1+o(1)}{2}\cdot \tfrac {p}{1-p}. \]

Proof ▶

Plugging \(\lambda = d - o(d)\) into Corollary 40 gives \(\tfrac {2d-\lambda }{2d}\cdot \tfrac {p}{1-p} = \tfrac {1+o(1)}{2}\cdot \tfrac {p}{1-p}\).

Proposition 42 Tightness of the adversarial bound [Remark V.4]

For any graph assignment scheme with \(\lfloor pm\rfloor \) stragglers and replication factor \(d\), the stragglers can be chosen so that at least \(\tfrac {mp}{d}\) data blocks are held only at straggling machines, whence for any decoding coefficients, \(\tfrac 1n|\alpha - \mathbb {1}|_2^2 \ge \tfrac {mp}{dn} = \tfrac {p}{2}\), using \(nd = 2m\).

Proof ▶

Choosing the \(\lfloor pm\rfloor \) straggling edges greedily leaves at least \(\tfrac {mp}{d}\) vertices (blocks) with no incident non-straggling edge; for such a block \(i\), no choice of \(w\) supported on non-straggling machines can make \(\alpha _i\) nonzero, so \(\alpha _i = 0\) contributes \(1\) to \(|\alpha - \mathbb {1}|_2^2\). Hence \(|\alpha - \mathbb {1}|_2^2 \ge \tfrac {mp}{d}\) and dividing by \(n\) with \(nd = 2m\) gives the bound.