Approximate Gradient Coding with Optimal Decoding

1 Preliminaries: settled library facts

Lemma 1 Cyclic property of trace
#

For real matrices \(A \in \mathbb {R}^{n \times m}\) and \(B \in \mathbb {R}^{m \times n}\) we have \(\operatorname {Tr}(AB) = \operatorname {Tr}(BA)\). (The paper calls this the “circular law of trace”.)

Lemma 2 Markov’s inequality
#

If \(X \ge 0\) is a nonnegative random variable and \(a {\gt} 0\), then \(\Pr [X \ge a] \le \mathbb {E}[X]/a\).

Lemma 3 Jensen’s inequality
#

If \(\varphi \) is convex and \(X\) is an integrable random variable, then \(\varphi (\mathbb {E}[X]) \le \mathbb {E}[\varphi (X)]\).

Lemma 4 Multiplicative Chernoff bound for Bernoulli sums
#

Let \(X_1, \dots , X_t\) be i.i.d. Bernoulli random variables with mean \(q\) and let \(\delta \in (0,1)\). Then

\[ \Pr \! \left[\sum _{i=1}^t X_i \le (1-\delta )\, qt\right] \le \exp \! \left(-\tfrac {\delta ^2 qt}{2}\right). \]
Lemma 5 Stirling’s formula bound
#

For every positive integer \(k\), \(\left(\dfrac {k!}{2^{k/2}(k/2)!}\right)^{1/k} \le \sqrt{k}\).

Definition 6 Convex, strongly convex, and Lipschitz-gradient functions
#

A differentiable function \(f : \mathbb {R}^k \to \mathbb {R}\) is convex if \(f(y) \ge f(x) + \langle \nabla f(x), y - x\rangle \) for all \(x,y\); it is \(\mu \)-strongly convex if \(f - \tfrac {\mu }{2}|\cdot |_2^2\) is convex; and it has an \(L\)-Lipschitz gradient if \(|\nabla f(x) - \nabla f(y)|_2 \le L|x-y|_2\) for all \(x,y\).

Lemma 7 Trace inequality for PSD matrices
#

If \(A\) and \(B\) are positive semidefinite matrices, then \(\operatorname {Tr}(AB) \le |A|_2 \, \operatorname {Tr}(B)\), where \(|A|_2\) is the operator norm.

Lemma 8 Co-coercivity of smooth convex functions
#

For a convex function \(f\) whose gradient has Lipschitz constant \(L\),

\[ |\nabla f(x) - \nabla f(y)|_2^2 \le L\, \langle x - y,\ \nabla f(x) - \nabla f(y)\rangle . \]
Lemma 9 Expander Mixing Lemma

Let \(G\) be a \(d\)-regular graph on \(n\) vertices with spectral expansion \(\lambda \) (Definition 18). For any sets of vertices \(S\) and \(T\),

\[ |E(S,T)| \ \ge \ d\frac{|S||T|}{n} - (d-\lambda )\sqrt{|S||T|\Big(1-\tfrac {|S|}{n}\Big)\Big(1-\tfrac {|T|}{n}\Big)} , \]

where \(E(S,T)\) denotes the set of edges between \(S\) and \(T\).