- Boxes
- definitions
- Ellipses
- theorems and lemmas
- Blue border
- the statement of this result is ready to be formalized; all prerequisites are done
- Orange border
- the statement of this result is not ready to be formalized; the blueprint needs more work
- Blue background
- the proof of this result is ready to be formalized; all prerequisites are done
- Green border
- the statement of this result is formalized
- Green background
- the proof of this result is formalized
- Dark green background
- the proof of this result and all its ancestors are formalized
- Dark green border
- this is in Mathlib
Let \(C : \Sigma ^n \to \Sigma ^N\) be a code that maps \(x_1, \dots , x_n\) to \(c_1, \dots , c_N\). The code \(C\) is a \(k\)-batch code if, for every multiset of indices \(\{ i_1, \dots , i_k\} \subset [n]\), there exist \(k\) mutually disjoint sets \(R_1, \dots , R_k \subseteq [N]\) and functions \(g_1, \dots , g_k\) such that, for all information symbols \(x_1, \dots , x_n \in \Sigma \) and codewords \((c_1, \dots , c_N) = C(x_1, \dots , x_n) \in \Sigma ^N\), and for all \(j \in [k]\), we have \(g_j(c|_{R_j}) = x_{i_j}\). Here \(c|_{R_j}\) denotes the restriction of the codeword \(c\) to the coordinates indexed by \(R_j\).
We say \(C\) is a linear batch code if \(\Sigma = \mathbb {F}\) is a finite field, the functions \(g_i\) are all linear, and \(C\) is a linear map. We say \(C\) has a systematic encoding if \(c_i = x_i\) for \(i = 1, \dots , n\).
The redundancy of \(C\) is \(N - n\). (For a linear code with a systematic encoding, \(\dim C = n\), so the redundancy equals \(N - \dim C\).)
The dual code \(C^{\perp } \le \mathbb {F}^N\) of a linear code \(C \le \mathbb {F}^N\) is the subspace of all vectors orthogonal (under the standard dot product \(\langle u, v\rangle = \sum _{i=1}^N u_i v_i\)) to every codeword of \(C\). Elements of \(C^{\perp }\) are called dual codewords (of \(C\)).
For a good map \(\pi \) and indices \(i_1, \dots , i_t\), define the simple tensor
Note that this definition does not depend on \(\pi _2\).
A good map is a bijection \(\pi : [2t] \to [t] \times [2]\). We write \(\pi _1 : [2t] \to [t]\) for the first coordinate of \(\pi \) and \(\pi _2 : [2t] \to [2]\) for the second coordinate, so \(\pi (j) = (\pi _1(j), \pi _2(j))\).
Fix an integer \(t \ge 1\) and work in the standard tensor basis of \(\mathbb {F}^{N^{2t}}\). A simple tensor \(e_{i'_1} \otimes \cdots \otimes e_{i'_{2t}}\) is called good if the multiset \(\{ i'_1, \dots , i'_{2t}\} \subset [N]\) contains exactly \(t\) distinct elements, each appearing exactly twice.
We write \([N] = \{ 1, \dots , N\} \). For a vector \(c \in \mathbb {F}^N\), the support is \(\operatorname{supp}(c) := \{ i \in [N] : c_i \neq 0 \} \). For two vector spaces \(V, W\) over \(\mathbb {F}\) we write \(V \le W\) to mean \(V\) is a subspace of \(W\). (Standard; recalled so later nodes can depend on it.)
Definition 1 is the primitive multiset special case of the general notion of batch code of Ishai–Kushilevitz–Ostrovsky–Sahai. In the general definition, \(n\) symbols are encoded into “buckets” of symbols whose total size is \(N\), and each batch of \(k\) symbols can be decoded by reading at most one symbol from each bucket. A batch code is a multiset batch code if (a) the \(k\) requested symbols may form a multiset and (b) the \(k\) symbols can be decoded by querying \(k\) pairwise disjoint sets of buckets. When each bucket stores a single symbol the batch code is called primitive. Throughout this blueprint “batch code” means the primitive multiset notion of Definition 1.
Let \(e_i\) denote the standard basis vector of \(\mathbb {F}^N\), so \((e_i)_i = 1\) and \((e_i)_j = 0\) for \(j \neq i\). For \(v^{(1)}, \dots , v^{(s)} \in \mathbb {F}^N\), the tensor product \(v^{(1)} \otimes \cdots \otimes v^{(s)} \in \mathbb {F}^{N^s}\) is the vector indexed by tuples \((i_1, \dots , i_s) \in [N]^s\) with
A tensor of this form is a simple tensor; a tensor is any linear combination of simple tensors.
For a subspace \(V \le \mathbb {F}^N\) and a positive integer \(s\), let \(V^{\otimes s}\) denote the subspace of \(\mathbb {F}^{N^s}\) spanned by the simple tensors \(v^{(1)} \otimes \cdots \otimes v^{(s)}\) with \(v^{(1)}, \dots , v^{(s)} \in V\).
Let \(\mathcal{D}_{i_1, \dots , i_t} = (c^{(j,\ell )})_{1 \le j \le t,\ \ell =1,2}\) be the dual codewords of Lemma 15. For a good map \(\pi \) and indices \(i_1, \dots , i_t\), define the tensor
Since each \(c^{(\pi _1(j),\pi _2(j))} \in V = C^{\perp }\), we have \(w_{i_1, \dots , i_t, \pi } \in W = V^{\otimes 2t}\).
If \(C\) is a \(k\)-batch code and \(k' \le k\), then \(C\) is also a \(k'\)-batch code.
Each \(w_{i_1, \dots , i_t, \pi }\), written in the standard basis of \(\mathbb {F}^{N^{2t}}\), contains the simple tensor \(e_{i_1, \dots , i_t, \pi }\).
Each \(w_{i_1, \dots , i_t, \pi }\), written in the standard basis of \(\mathbb {F}^{N^{2t}}\), contains at most \(3^t\) good simple tensors.
Let \(C \le \mathbb {F}^N\) be a linear \(k\)-batch code with a systematic encoding and set \(t := \lfloor k/3 \rfloor \). For every \(t\)-tuple \((i_1, \dots , i_t) \in [n]^t\) there exist sets \(R_{j,\ell } \subseteq [N]\) (for \(j \in [t]\), \(\ell \in [3]\)) that are pairwise disjoint over all \((j,\ell )\), together with, for each \(j \in [t]\), two dual codewords \(c^{(j,1)}, c^{(j,2)} \in C^{\perp }\) such that
We write \(\mathcal{D}_{i_1, \dots , i_t} := (c^{(j,\ell )})_{1 \le j \le t,\ \ell =1,2}\).
Suppose \(C \le \mathbb {F}^N\) is a linear code with a systematic encoding, and there exist an index \(i \in [N]\), a set \(R \subset [N] \setminus \{ i\} \), and a linear function \(g\) such that \(g(c|_R) = c_i\) for all codewords \(c \in C\). Then there exists a nonzero dual codeword \(c^{\perp }\) such that \(\operatorname{supp}(c^{\perp }) \subset R \cup \{ i\} \) and \(i \in \operatorname{supp}(c^{\perp })\).
Let \(E\) be the set of all simple tensors \(e_{i_1, \dots , i_t, \pi }\) with \(i_1 {\gt} \cdots {\gt} i_t\) (from \([n]\)) and \(\pi \) a good map. Then
where the multinomial coefficient has \(t\) twos.
For every good map \(\pi \) and indices \(i_1, \dots , i_t\), the simple tensor \(e_{i_1, \dots , i_t, \pi }\) is a good simple tensor.
Let \((e^{(1)}, w^{(1)}), \dots , (e^{(D)}, w^{(D)})\) be pairs of tensors in \(\mathbb {F}^{N^s}\) such that, for all \(i = 1, \dots , D\), the tensor \(e^{(i)}\) is in the standard tensor basis and the standard basis representation of \(w^{(i)}\) contains \(e^{(i)}\) and none of \(e^{(i+1)}, \dots , e^{(D)}\). Then \(w^{(1)}, \dots , w^{(D)}\) are linearly independent.
The set of simple tensors
forms a basis for \(\mathbb {F}^{N^s}\), the standard tensor basis. Consequently every vector in \(\mathbb {F}^{N^s}\) has a unique expression as a linear combination of standard tensor basis elements, the standard basis representation; we say it contains a basis element when the corresponding coefficient is nonzero.
A linear \(k\)-batch code of length \(N\) with a systematic encoding must have redundancy \(\Omega (\sqrt{Nk})\).