27 Appendix D. Basic Algebraic Algorithms
D.2 Groups, Rings, Fields
Given a set
\(\)G\( and a binary operation \)○\( over \)G\(, \)(G,○)\( is a group if \)○\( is associative, has an identity, and every element of \)G\( is invertible. A group \)(G,○)\( is said to be abelian if \)○\( is also commutative. \)
A set
\(\)R\( with two binary operations \)+\( and \)·\( is said to form a ring if (1) \)(R,+)\( forms an abelian group, (2) \)·\( is associative and has an identity, and (3) \)·\( distributes over \)+\(, i.e., for every \)a,b,c R\( we have \)a ·(b+c) = (a ·b) + (a ·c)\( and \)(b+c)·a = (b ·a) + (c ·a)\(. The ring \)(R,+,·)\( is said to be a commutative ring if \)·\( is commutative. \)
A set
\(\)F\( with operations \)+\( and \)·\( forms a field if \)(F,+,·)\( is a commutative ring, and \)(F ∖{0}, ·)\( is a group, where \)0\( denotes the identity for \)+\(. \)
D.3 Polynomials
Let
\(\)(R,+,·)\( be a commutative ring with identity \)0\(. The set of formal polynomials over \)R\( in indeterminate \)X\(, denoted \)R[X]\(, is given by finite formal sums \)R[X] = {∑_i=0^d f_i X^i ∣f_0,…,f_d R, d Z^≥0}\(, under the equivalence \)∑_i=0^d f_i X^i = ∑_i=0^d-1 f_i X^i\( if \)f_d = 0\(. The \)f_i\( are the coefficients of \)f\(, the symbols \)X^i\( are the monomials, and \)f_i X^i\( are the terms of \)f\(. For \)f = ∑_i=0^d f_i X^i\(, its degree, denoted \)_X(f)\( or simply \)(f)\(, is the largest integer \)e\( such that \)f_e ≠0\(. Addition is coefficient-wise, \)f + g = ∑_i=0^d (f_i + g_i) X^i\(, and multiplication is given by convolution, \)f ·g = ∑_i=0^d+e ∑_j=0^e f_i-j ·g_j X^i\(. \)
For every commutative ring
\(\)R\(, \)R[X]\( is a commutative ring under the sum and product of polynomials. \)
Let
\(\)R\( be a commutative ring. An element \)u R\( is a unit if it has a multiplicative inverse in \)R\(. Elements \)a\( and \)b\( are said to be associates if there exists a unit \)u\( such that \)a = b ·u\(. Element \)a R\( is irreducible if \)a = b ·c\( implies either \)b\( or \)c\( is a unit. A factorization of \)a R\( is a sequence \)b_1,…,b_k\( such that \)a = b_1 ·b_2 b_k\( and none of the \)b_i\('s are units. Ring \)R\( is a factorization domain if for every non-zero \)a R\( that is not a unit, there is a finite bound \)k_a\( such that every factorization of \)a\( has at most \)k_a\( factors. A factorization domain \)R\( is a unique factorization domain (UFD) if every non-zero, non-unit element has a unique irreducible factorization up to associates: i.e., if \)a = b_1 b_k = c_1 c_ℓ\( and the \)b_i\('s and \)c_j\('s are irreducible, then \)k = ℓ\( and there exists a bijection \)π: [k] [ℓ]\( such that \)b_i\( and \)c_π(i)\( are associates, for every \)i [k]\(. \)
Every field is a UFD.
If
\(\)R\( is a UFD, then so is \)R[X]\(. \)
Given a monic polynomial
\(\)f\( (i.e., its leading coefficient is a unit), and general polynomial \)p\(, there exists a unique pair of polynomials \)q\( (the quotient) and \)r\( (the remainder) such that \)p = q ·f + r\( and \)(r) < (f)\(. \)
Given
\(\)p = ∑_i=0^d p_i X^i R[X]\( and \)αR\(, let \)p(α) = ∑_i=0^d p_i α^i\(. Then there exists a unique \)q R[X]\( such that \)p = q ·(X-α) + p(α)\(. It follows that \)p(α) = 0\( if and only if \)X - α\( divides \)p(X)\(. \)
Let
\(\)f ≠g F[X]\( be polynomials of degree at most \)d\(. Then there exist at most \)d\( elements \)αF\( such that \)f(α) = g(α)\(. \)
Let
\(\)h = f - g\(. Then \)h\( is non-zero and of degree at most \)d\(. Let \)S = {α∣f(α) = g(α)}\(. By Proposition~ \ref{prop:appD-evaluation-remainder}, \)(X - α)\( divides \)h\( for every \)αS\(. Furthermore, since \)F[X]\( is a UFD (Lemma~ \ref{lem:appD-gauss} applied to the field \)F\(), we have that \)h = ∏_αS (X-α)\( divides \)h\(. But if \)h\( divides \)h\(, then \)(h) ≤(h)\( and \)(h) = |S|\(. We conclude \)|S| ≤d\(. \)
D.4 Vector Spaces
D.4.1 Matrices and Vectors
A vector of length
\(\)n\( over the field \)F\( is a row vector \)x F^n\(. Given \)u,v F^n\(, their inner product is \)u, v = ∑_i=1^n u_i ·v_i\(. A matrix \)M F^k ×n\( is a two-dimensional array with rows \)M_i,·\( and columns \)M_·,j\(. The transpose of \)M F^k ×n\(, denoted \)M^T\(, is the \)n ×k\( matrix with \)M^T_j,i = M_i,j\(. The product of matrices \)A F^k ×n\( and \)B F^n ×m\( is the matrix \)C F^k ×m\( with \)C_i,j = A_i,·, B_·,j \(. \)
D.4.2 Definition and Properties of Vector Spaces
Over a field
\(\)F\(, a vector space is given by a triple \)(V,+,·)\( where \)(V,+)\( is a commutative group and \)·: F ×V V\( distributes over addition, so that \)α·(u+v) = α·u + α·v\( for every \)αF\( and \)u,v V\(. The identity of \)(V,+)\( is denoted \)0\(. The simplest example of an \)F\(-vector space is \)F^n\(, with coordinate-wise sum and scalar product. \)
A sequence of vectors
\(\)v_1,…,v_k V\( is linearly independent if \)∑_i=1^k β_i ·v_i = 0\( implies \)β_1 = = β_k = 0\(; otherwise the sequence is linearly dependent. \)V\( is finite dimensional of dimension \)k\( if every sequence of \)k+1\( vectors from \)V\( is linearly dependent and there exists a linearly independent sequence of length \)k\(. A linearly independent set \)v_1,…,v_k V\( is said to form a basis for \)V\( if \)V\( has dimension \)k\(. \)
If
\(\)v_1,…,v_k\( form a basis for an \)F\(-vector space \)V\(, then \)V = {∑_i=1^k β_i ·v_i ∣β_1,…,β_k F}\( and the map \)(β_1,…,β_k) ↦∑_i=1^k β_i ·v_i\( is an isomorphism from \)F^k\( to \)V\(. \)
A matrix
\(\)G F^k ×n\( is a generator matrix of an \)F\(-vector space \)V ⊆F^n\( if the rows of \)G\( are linearly independent in \)F^n\( and \)V = {x ·G ∣x F^k}\(; the rows of \)G\( form a basis of \)V\(. A matrix \)H F^(n-k)×n\( is a parity check matrix of \)V ⊆F^n\( if the rows of \)H\( are linearly independent and \)V = {y F^n ∣H ·y^T = 0}\(. Given \)V\( with parity check matrix \)H\(, its dual space, denoted \)V^⊥\(, is the vector space generated by \)H\(, i.e. \)V^⊥= {x ·H ∣x F^n-k}\(. \)
If
\(\)V ⊆F^n\( is a \)k\(-dimensional vector space then it has a generator matrix \)G F^k ×n\( and a parity check matrix \)H F^(n-k)×n\(. Furthermore its dual \)V^⊥\( is generated by \)H\(, has dimension \)n-k\(, and has \)G\( as its parity check matrix. Finally \)(V^⊥)^⊥= V\(. \)
Existence of a generator matrix is immediate from the definitions: since
\(\)V\( is \)k\(-dimensional it has a basis \)v_1,…,v_k\(, and the matrix \)G\( with these vectors as its rows satisfies the conditions of a generator matrix. For the parity check matrix, we say that a \)k×k\( matrix \)R\( forms a row operation if either (i) \)R_ii=1\( and \)R_ij=0\( for all but one pair \)i≠j [k]\(, or (ii) \)R\( is a permutation matrix that swaps two rows. We say \)G\( is obtained from \)G\( by row operations, denoted \)G G\(, if \)G = R_m ·R_m-1 R_1 ·G\( where the \)R_i\('s are row operations. If \)G\( is a generator matrix for \)V\( then so is \)G\(, since each row operation replaces the row space by an equal row space. Gaussian elimination allows us to simplify \)G\( until its columns are special, and in particular after permuting the columns \)G\( would look like \)[I_k ∣A]\( where \)I_k\( denotes the \)k×k\( identity matrix. Assume for simplicity that \)G = [I_k ∣A]\( (without permuting columns). Now let \)H\( be given by \)H = [-A^T ∣I_n-k]\(. It can be verified by direct computation that \)G ·H^T = -A + A = 0\(, and so \)G ·H^T = 0\( as well (since \)G\( and \)G\( generate the same row space, and the row operations used are invertible). Furthermore all rows of \)H\( are linearly independent (since the \)I_n-k\( block has full rank), and so \)H\( satisfies the conditions of a parity check matrix of \)V\(. For the claims about \)V^⊥\(: since \)H\( has linearly independent rows, \)H\( is itself a generator matrix for the \)(n-k)\(-dimensional vector space \)V^⊥\( generated by its rows, so \)(V^⊥) = n-k\(. Since \)G ·H^T = 0\(, equivalently \)H ·G^T = 0\(, and \)G\( has \)k = n - (n-k)\( linearly independent rows, \)G\( satisfies the conditions of a parity check matrix for \)V^⊥\(. Finally, applying the same construction to \)V^⊥\( (with parity check matrix \)G\() shows that \)(V^⊥)^⊥\( is generated by \)G\(, i.e. \)(V^⊥)^⊥= V\(. \)
D.5 Finite Fields
D.5.1 Prime Fields
For any prime
\(\)p\(, \)(Z_p, +_p, ·_p)\(, where \)+_p\( and \)·_p\( denote addition and multiplication modulo \)p\( on \)Z_p = {0,…,p-1}\(, forms a field of cardinality \)p\(. \)
Given a finite field
\(\)F\(, its characteristic, denoted \)char(F)\(, is the smallest positive integer \)p\( such that \)p ·1 = 1 + 1 + + 1 = 0\(. \)
For every finite field
\(\)F\(, \)char(F)\( is a prime. Furthermore, \)F\( is a \)Z_p\(-vector space, where \)p = char(F)\(. Thus \)F\( has cardinality \)p^n\( for prime \)p\( and integer \)n\(. \)
For any prime
\(\)p\(, there is a unique field of cardinality \)p\( up to isomorphism. \)
D.5.2 Extension Fields and Subfields
If
\(\)(G,·)\( is a finite group with identity \)1\(, then for every \)a G\(, \)a^|G| = 1\(. \)
Let
\(\)F\( be a field of cardinality \)q\(. Then every element \)αF\( is a root of the polynomial \)X^q - X\( and so \)X^q - X = ∏_αF (X - α)\(. \)
Let
\(\)K\( be a field and \)F ⊆K\( a subfield (closed under addition and multiplication), denoted \)F K\( (also, \)K F\(). Then \)K\( is an \)F\(-vector space and so \)|K| = |F|^n\( where \)n\( is the dimension of \)K\( as an \)F\(-vector space. Furthermore there is a unique copy of \)F\( in \)K\(. \)
D.5.3 Existence of Finite Fields
Let
\(\)F\( be a finite field of cardinality \)q\( and let \)g F[X]\( be an irreducible polynomial of degree \)n\(. Then \)(F[X]/g, +_g, ·_g)\(, where \)+_g\( and \)·_g\( denote addition and multiplication followed by reduction modulo \)g\(, forms a field of cardinality \)q^n\(. \)
Let
\(\)K\( be a field of characteristic \)p\( and let \)A,B K[X,Y]\(. Then for all positive integers \)n\( we have \)(A+B)^p^n = A^p^n + B^p^n\(. \)
If
\(\)F\( is a finite field, then it has characteristic \)p\( for some prime \)p\( and its cardinality is \)p^n\( for positive integer \)n\(. Conversely, for every prime \)p\( and positive integer \)n\(, there is a field of cardinality \)p^n\(. \)
D.5.4 Uniqueness of Finite Fields
Let
\(\)q = |F|\( and \)n = q-1\(. Let \)N^=(G,m)\( (resp. \)N(G,m)\() denote the number of elements of order exactly (resp. dividing) \)m\( in a finite group \)G\(. Then for every \)k\( dividing \)n\(, \)N(F^*,k) = N(Z_n,k)\( and \)N^=(F^*,k) = N^=(Z_n,k)\(. \)
Every finite field
\(\)F\( has a primitive element \)ω\( (i.e. \)ω^i ≠1\( for \)i < |F|-1\( and \)ω^|F|-1=1\(). Consequently the multiplicative group \)F^*\( is cyclic. \)
Let
\(\)K\( be a finite field and let \)ω\( be a primitive element in \)K\(. Then for every subfield \)F K\(, \)ω\( is an \)F\(-generator of \)K\(, i.e. every \)βK\( equals \)p(ω)\( for some \)p F[X]\(. As a consequence, for every \)K F\( there is an \)F\(-generator in \)K\(. \)
Let
\(\)K F\( and let \)α\( be an \)F\(-generator of \)K\(. Let \)p\( be the minimal degree polynomial in \)F[X]\( such that \)p(α) = 0\(. Then \)p\( is irreducible and \)K\( is isomorphic to \)F[X]/p\(. \)
If
\(\)f F_p[X]\( is irreducible of degree \)n\(, then \)f\( divides \)X^p^n - X\(. \)
For every prime
\(\)p\( and integer \)n\(, there is a unique field of cardinality \)p^n\( up to isomorphism. \)
D.5.5 The Trace and Norm Maps
Let
\(\)F = F_q\( and let \)K = F_q^n\(. The Trace function \)Tr = Tr_KF\( is the function obtained by evaluating the polynomial \)Tr(X) = X + X^q + X^q^2 + + X^q^n-1\(. The Norm function is obtained by evaluating \)N(X) = X^1+q+q^2++q^n-1\(. \)
Trace is an
\(\)F\(-linear map, i.e., for every \)αF\( and \)β, γK\(, \)Tr(α·β+ γ) = α·Tr(β) + Tr(γ)\(. \item Norm is multiplicative, i.e., \)N(β·γ) = N(β) N(γ)\(. \item Trace is a \)q^n-1\(-to-one map from \)K\( to \)F\(. \item Norm is a \)(q^n-1)/(q-1)\(-to-one map from \)K^*\( to \)F^*\(. \)
(1) The
\(\)F\(-linearity follows from the facts that \)(αβ+ γ)^q^i = α^q^i β^q^i + γ^q^i\( (using Proposition~ \ref{prop:appD-frobenius-freshman} repeatedly, together with the fact that \)q\( is a power of \)char(K)\() and \)α^q^i = α\( for \)αF\(. (2) The multiplicativity of \)N\( is immediate from the definition, since \)(βγ)^1+q++q^n-1 = β^1+q++q^n-1 γ^1+q++q^n-1\(. (3) For \)βK\( we have \)Tr(β)^q = β^q + + β^q^n = β^q + + β^q^n-1 + β= Tr(β)\( (using \)β^q^n=β\(), and so the range of \)Tr\( is \)F\(. Since \)Tr\( is (the evaluation of) a polynomial of degree \)q^n-1\(, it can take on any value in the range at most \)q^n-1\( times. But it has a domain of size \)q^n\( and range of size \)q\(, so it must take on every value exactly \)q^n-1\( times. (4) Similarly to Part (3): \)N(β)^q = N(β)\( for all \)βK\(, and \)N(β)\( is non-zero if and only if \)β≠0\(. Using the degree of \)N\( (which is \)(q^n-1)/(q-1)\() and counting as in Part (3), \)N\( takes on every value in \)F^*\( exactly \)(q^n-1)/(q-1)\( times. \)
A function
\(\)L : K F\( is \)F\(-linear if and only if there exists \)λK\( such that \)L(β) = Tr(λβ)\( for every \)βK\(. \)
First note that
\(\)f(β) = Tr(λβ)\( is \)F\(-linear, since \[ f(\alpha \beta + \gamma ) = \mathrm{Tr}(\lambda (\alpha \beta +\gamma )) = \mathrm{Tr}(\lambda \alpha \beta ) + \mathrm{Tr}(\lambda \gamma ) = \alpha \mathrm{Tr}(\lambda \beta ) + \mathrm{Tr}(\lambda \gamma ) = \alpha f(\beta ) + f(\gamma ) \] for every \)αF\( and \)β,γK\(, using Proposition~ \ref{prop:appD-trace-norm-properties}, Part (1). For the converse we employ a counting argument. First note that if \)λ≠0\( then the function \)f_λ(β) = Tr(λβ)\( is not identically zero: viewed as a polynomial in \)Z\(, \)f_λ(Z)\( has degree \)|K|/|F|\( and a non-zero coefficient of \)Z\(. By linearity, \)f_λ≠f_τ\( if \)λ≠τ\(, since \)f_λ- f_τ= f_λ-τ ≠0\(. So, including \)λ= 0\(, there are at least \)|K|\( distinct linear functions of the form \)f_λ(·)\(. We now note there are also at most \)|K|\( such linear functions in total. To see this let \)β_1,…,β_n K\( be \)F\(-linearly independent elements (i.e., a basis of \)K\( as an \)F\(-vector space, cf. Definition~ \ref{def:appD-vector-space}). A linear function \)L: K F\( is completely determined by its values at \)β_1,…,β_n\(, since for every \)β= ∑_i α_i β_i K\( with \)α_1,…,α_n F\( we have \)L(β) = ∑_i α_i L(β_i)\(. Thus the number of linear functions is upper bounded by \)|{(L(β_1),…,L(β_n)) F^n}| ≤|F|^n = |K|\(. We conclude that these are exactly \)|K|\( functions that are \)F\(-linear from \)K F\(, and these are exactly the functions \)f_λ\(, \)λK\(. \)
D.6 Algorithmic Aspects of Finite Fields
Let
\(\)p=2\( and \)t = 2 ·3^ℓ\( for any non-negative integer \)ℓ\(. Then the polynomial \)X^t + X^t/2 + 1\( is irreducible in \)F_2[X]\(. \)
D.7 Algorithmic Aspects of Polynomials
D.7.2 Greatest Common Divisor
Given polynomials
\(\)f,g F[X]\(, their greatest common divisor, denoted \)(f,g)\(, is the maximal degree polynomial \)h(X)\( with leading coefficient \)1\( such that \)h\( divides \)f\( and \)g\(. \)
On input
\(\)f,g F[X]\( with \)(g) < (f)\(: if \)g\( divides \)f\(, output \)g\( (normalized to have leading coefficient \)1\(); otherwise, compute \)f = q ·g + r\( with \)(r) < (g)\( via the division algorithm (Proposition~ \ref{prop:appD-division-algorithm}), and recursively output the result of running the algorithm on \)(g,r)\(. \)
Correctness follows from the identity
\(\)(f,g) = (g,r)\( where \)f = q ·g + r\(: any common divisor of \)f\( and \)g\( divides \)r = f - q·g\( and hence is a common divisor of \)g\( and \)r\(; conversely any common divisor of \)g\( and \)r\( divides \)f = q ·g + r\( and hence is a common divisor of \)f\( and \)g\(. Since \)(r) < (g) < (f)\( strictly decreases at each recursive call (and the recursion terminates when one polynomial divides the other), the algorithm terminates after at most \)(g)+1\( recursive calls. With a naive implementation the algorithm runs in time polynomial in \)n = ((f),(g))\(; the individual steps can be combined more cleverly to obtain an implementation running in time \)O(n(n)^c)\( field operations for some constant \)c\( (we do not reproduce this refinement here; see the notes to this appendix). \)
D.7.3 Factoring and Root-Finding
There exists a constant
\(\)c\( and a randomized algorithm running in expected time \)O((n q)^c)\( that factors polynomials of degree \)n\( in \)F_q[X]\(. Furthermore, if \)q = p^t\( for prime \)p\(, then there is a deterministic algorithm with running time \)O((npt)^c)\( for factoring. \)
The input to the root-finding problem is a polynomial
\(\)f F_q[X]\( of degree at most \)n\( (given as a list of coefficients \)f_0,…,f_n F_q\(). The task is to find all \)αF_q\( that are roots of \)f\(, i.e., to output the set \){αF_q ∣f(α) = 0}\(. \)
A polynomial
\(\)f F_q[X]\( has a root in \)F_q\( if and only if \)(f, X^q-X) ≠1\(. \)
If
\(\)f\( has a root \)α\(, then \)X-α\( divides both \)f\( and \)X^q-X\( (Proposition~ \ref{prop:appD-root-Xq-X}), and so their gcd cannot be trivial. Conversely, a factor of \)X^q-X\( is of the form \)∏_αS(X-α)\( for some \)S ⊆F_q\( (by unique factorization, Definition~ \ref{def:appD-ufd}, since \)X^q-X\( splits into distinct linear factors by Proposition~ \ref{prop:appD-root-Xq-X}), and so \)(f, X^q-X)\( must be of this form. If the gcd is non-trivial, then \)S\( must be non-empty, implying that for every \)αS\( we have \)X-α\( divides \)f\( and thus \)f\( has a root in \)S ⊆F_q\(. \)
We say that a polynomial
\(\)h F[X]\( is \)t\(-sparse if at most \)t\( of its coefficients are non-zero. Every polynomial \)h\( is \)((h)+1)\(-sparse. \)
Let
\(\)f F[X]\( be a polynomial of degree \)n\( and let \)h F[X]\( be a \)t\(-sparse polynomial of degree \)D\(. Then \)h f\( and \)(f,h)\( can be computed in time \)poly(n,t,D)\(. \)
It suffices to compute
\(\)h f\( in time \)poly(n,t,D)\(, and then one can use Euclid's algorithm (Algorithm~ \ref{alg:appD-euclid-gcd}) to compute \)(f,h) = (f, h f)\( in time \)poly(n)\(. Write \)h = ∑_i=1^t h_i X^d_i\(. If we can compute \)h_i X^d_i f\( in time \)poly(n,d_i)\( for every \)i\(, then we can add the \)t\( results (each of degree less than \)n\() in time \)poly(n,t)\( to get \)h f\(. Finally, \)X^d f\( can be computed by repeated squaring. Let \)d = ∑_j=0^_2 d d_j 2^j\( be the binary expansion of \)d\(. We first compute the sequence of polynomials \)g_j = X^2^j f = g_j-1^2 f\( by repeatedly squaring (and reducing modulo \)f\() the output of the previous step; this takes \)O(d)\( multiplications and reductions modulo \)f\(, each of which takes time \)poly(n)\(. Then we compute \)X^d f = ∏_j=0^_2 d (g_j)^d_j f\( using \)O(d)\( further multiplications and reductions, yielding the desired result in overall time \)poly(n,d)\(. \)
Let
\(\)F_q\( be a field of odd characteristic (so \)q\( is odd). Then \)X^q - X = X ·(X^(q-1)/2 - 1) ·(X^(q-1)/2+1)\(. In particular \)X^q-X\( factors into three \)2\(-sparse polynomials of degree at most \)q/2\(. \item Let \)q = 2^t\( for integer \)t ≥2\(. Then \)X^q - X = Tr(X) ·(Tr(X) - 1)\(, where \)Tr(X) = Tr_F_q F_2(X) = X + X^2 + X^4 + + X^2^t-1\( is the Trace map from \)F_q\( to \)F_2\(. In particular \)X^q - X\( factors into two \)(2+_2 q)\(-sparse polynomials of degree \)q/2\(. \)
(1) The case of odd
\(\)q\( is verified by direct expansion: \)X ·(X^(q-1)/2-1) ·(X^(q-1)/2+1) = X ·(X^q-1-1) = X^q - X\(. Note that \)(q-1)/2\( is an integer since \)q\( is odd. (2) For even \)q\(, we use the fact that \)Tr\( is a map from \)F_q\( to \)F_2\( (Definition~ \ref{def:appD-trace-norm}, applied with the subfield \)F_2\(). So every \)αF_q\( satisfies \)Tr(α)=0\( or \)Tr(α)=1\(. It follows that \)X-α\( divides \)Tr(X)·(Tr(X)-1)\( for every \)αF_q\(. Consequently, by unique factorization (Definition~ \ref{def:appD-ufd}), \)X^q-X\( (which is squarefree with roots exactly \)F_q\(, Proposition~ \ref{prop:appD-root-Xq-X}) divides \)Tr(X)·(Tr(X)-1)\(. The identity \)X^q-X = Tr(X)·(Tr(X)-1)\( now follows from the fact that both polynomials have the same degree \)q\( and have leading coefficient \)1\( (using the Polynomial Distance Lemma, Lemma~ \ref{lem:appD-polynomial-distance}, or direct degree comparison, to conclude that a divisibility between monic polynomials of equal degree is an equality). \)
Let
\(\)g F_q[X]\( have \)α≠β\( as its roots. Fix \)a F_q^*\( and \)b F_q\( and let \)g_a,b(X) = g((X-b)/a)\(. Then: \begin{enumerate} \item The coefficients of \end{enumerate}\)g_a,b\( can be computed efficiently given \)a\(, \)b\(, and the coefficients of \)g\(. \item \)g_a,b\( has \)aα+ b\( and \)aβ+b\( as its roots. \item If \)a F_q^*\( and \)b F_q\( are chosen uniformly at random independently, then the probability that exactly one of \)aα+b\( and \)aβ+b\( is a root of \)X^(q-1)/2-1\( is at least \)1/2\(. \)
Parts (1) and (2) are straightforward to verify by direct substitution:
\(\)g_a,b(aα+b) = g((aα+b-b)/a) = g(α) = 0\(, and similarly for \)β\(; and computing the coefficients of \)g_a,b\( from those of \)g\( amounts to composing with the linear map \)X ↦(X-b)/a\(, which can be done with a number of field operations polynomial in \)(g)\(. For Part (3), note that for any pair of distinct elements \)γ,δF_q\( there is exactly one pair \)a F_q^*, b F_q\( such that \)aα+b=γ\( and \)aβ+b=δ\( (this is a non-degenerate linear system in \)a,b\( since \)α≠β\(). Since the fraction of distinct pairs \)γ,δF_q\( such that exactly one of them comes from the set of size \)(q-1)/2\( (the set of roots of \)X^(q-1)/2-1\() is at least \)1/2\( (the exact fraction is \)1/2 + 1/(2q)\(), we have that the probability that exactly one of \)aα+b\( and \)aβ+b\( is a root of \)X^(q-1)/2-1\( is at least \)1/2\(. \)
Root-Find
\(\)(F_q, f)\(: On input \)f(X) F_q[X]\(: \begin{enumerate} \item Compute \end{enumerate}\)g ←(f, X^q-X)\(. \item If \)g = 1\(, return \)∅\(. \item Otherwise, return \)Linear-Root-Find(F_q,g) ∪Root-Find(F_q, f/g)\(. \)
Linear-Root-Find
\(\)(F_q, g)\(: On input \)g(X) F_q[X]\( that divides \)X^q-X\( (and hence splits into distinct linear factors): \begin{enumerate} \item If \end{enumerate}\)(g)=1\(, return \){α}\( where \)g = X - α\(. \item Repeat: \begin{enumerate} \item Pick \end{enumerate}\)a F_q^*\( and \)b F_q\( uniformly and independently at random. \item Compute \)g_a,b(X) ←g((X-b)/a)\(. \item Compute \)h_1 ←(g_a,b, X^(q-1)/2-1)\(. \item Compute \)g_1 ←h_1(aX+b)\(. \)
until \(\)0 < (g_1) < (g)\(. \item Return \)Linear-Root-Find(F_q,g_1) ∪Linear-Root-Find(F_q, g/g_1)\(. \)
Root-Find
\(\)(F_q,f)\( outputs the multiset of roots of \)f\( in expected time \)poly(n,q)\(, where \)n = (f)\(. \)
Correctness of Root-Find follows from Lemma 652:
\(\)g = (f,X^q-X)\( is trivial exactly when \)f\( has no roots in \)F_q\(, and otherwise \)g\( captures exactly the (distinct) roots of \)f\( that lie in \)F_q\( (with multiplicity removed, since \)X^q-X\( is squarefree), while \)f/g\( carries any remaining, higher-multiplicity or non-root factors, whose roots (if any) are found recursively; the two calls to \textsc{Root-Find}/\textsc{Linear-Root-Find} do not overlap in the roots they report since \)g\( and \)f/g\( are computed via division. Correctness of \textsc{Linear-Root-Find} follows from Proposition~ \ref{prop:appD-affine-transform-roots}: if \)g\( has two distinct roots \)α≠β\(, then by Part (3) of Proposition~ \ref{prop:appD-affine-transform-roots}, with probability at least \)1/2\( over the choice of \)a,b\(, exactly one of \)aα+b,aβ+b\( is a root of \)X^(q-1)/2-1\(, in which case \)h_1 = (g_a,b, X^(q-1)/2-1)\( is a proper non-trivial factor of \)g_a,b\( (by Part (2) of Proposition~ \ref{prop:appD-affine-transform-roots}, \)g_a,b\( has \)aα+b\( and \)aβ+b\( as roots, and \)h_1\( separates them), and so \)g_1 = h_1(aX+b)\( is a proper non-trivial factor of \)g\( (i.e. \)0 < (g_1) < (g)\(), and \)g/g_1\( is a proper factor as well. The recursion terminates when \)(g)=1\(. For the running time: it is straightforward to see that \textsc{Root-Find} makes at most \)n\( calls to \textsc{Linear-Root-Find} (a weak estimate that ignores possible further optimizations). By Proposition~ \ref{prop:appD-affine-transform-roots}, Part (3), the repeat loop in \textsc{Linear-Root-Find} is executed an expected constant number of times before a non-trivial split is found (since each iteration succeeds independently with probability at least \)1/2\(). Finally, the degrees of the polynomials in the two recursive calls (of both \textsc{Root-Find} and \textsc{Linear-Root-Find}) add up to at most \)(g)\( (resp. \)(f)\(), so this leads to a tree of recursive calls of total size at most \)n\(, with each internal node and leaf performing \)poly(n,q)\( work (to compute the various gcds, using Lemma~ \ref{lem:appD-sparse-mod-gcd} since \)X^q-X\( and \)X^(q-1)/2-1\( are sparse, and the affine changes of variables). Thus the overall expected running time is \)poly(n,q)\(. \)
Let
\(\)q = p^t\( for prime \)p\( and let \)g F_q[X]\( be a polynomial splitting into at least two relatively prime non-trivial factors \)g = g_1 ·g_2\(. Then there is a deterministic algorithm running in time \)poly((g),p,t)\( that finds a non-trivial factorization of \)g\(. \)
First find a polynomial
\(\)f(X)\( with \)0 < (f) < (g)\( such that \)f(X)^p - f(X) ≡0 g(X)\(. The search for such \)f\( can be phrased as a linear system over \)F_p\( (we omit the details of this step). Such a polynomial exists whenever \)g = g_1 ·g_2\( with \)g_1,g_2\( relatively prime: choosing any \)a ≠b F_p\( and letting \)f(X) = a g_1(X)\( and \)f(X) = b g_2(X)\( (which exists and has degree less than that of \)g_1 ·g_2\( by the Chinese Remainder Theorem) yields such a polynomial. Having found such an \)f\(, write \)f(X)^p - f(X) = ∏_a F_p (f(X)-a)\( (this follows since \)f(X)^p-f(X) ≡0 g(X)\( means every root of \)g\(, in a splitting field, satisfies \)Z^p=Z\(, i.e. lies in \)F_p\(, so \)Z^p-Z = ∏_a (Z-a)\( applied to \)f\( gives the claimed factorization). We note that \)(g(X), f(X)-a)\( must be non-trivial for some \)a\(, since \)(g(X), ∏_a (f(X)-a)) = g(X)\(, but for every single \)a\(, \)g(X)\( does not divide \)f(X)-a\( (as \)(f) < (g)\( and \)f\( is non-constant). Enumerating over all \)a F_p\( and computing \)(g,f-a)\( using Algorithm~ \ref{alg:appD-euclid-gcd} thus gives a non-trivial factorization of \)g\( in time \)poly((g),p)\( (there are \)p\( candidate values of \)a\(). \)
Given a bivariate polynomial
\(\)R(X,Y) F_q[X,Y]\(, we say that a polynomial \)P(X)\( is a root of \)R(X,Y)\( if \)Y - P(X)\( divides \)R(X,Y)\(. \)
There exists a randomized algorithm that, given a bivariate polynomial
\(\)R(X,Y) F_q[X,Y]\( of degree at most \)D\( (as a list of coefficients), outputs a list of all its roots in expected time polynomial in \)D\( and \)q\(. If \)q = p^t\(, there also exists a deterministic algorithm to output all roots in time polynomial in \)p,t\( and \)D\(. \)
The algorithm uses a reduction to the univariate case. We find a monic irreducible polynomial
\(\)F(X)\( of degree \)N\( where \)D < N ≤O(D)\( (such an \)F\( can be found in expected time polynomial in \)N\( and \)q\(, or deterministically in time polynomial in \)N,p,t\(, using the algorithms for finding irreducible polynomials discussed in Section D.6.4). We then consider the field \)F_Q = F_q[X] F(X)\( (a field of cardinality \)q^N\(, by Proposition~ \ref{prop:appD-quotient-field-existence}) and view \)R(X,Y)\( as a polynomial \)R_X(Y) F_Q[Y]\(. We find the roots of \)R_X(Y)\( (using the univariate root-finding algorithm of Lemma~ \ref{lem:appD-root-find-correctness}, or the deterministic root-finder implied by Theorem~ \ref{thm:appD-poly-factoring}). Let \)α_1,…,α_s F_Q\( be the roots of \)R_X\(. Using \)F_Q = F_q[X] F(X)\( we interpret \)α_1,…,α_s\( as polynomials \)A_1(X),…,A_s(X) F_q[X]\(. We report all \)A_i(X)\( such that \)Y - A_i(X)\( divides \)R(X,Y)\( (each such divisibility check can be carried out directly). This completes the description of the algorithm. Since the algorithm checks whether \)A_i(X)\( is a root before outputting it, it is clear that it outputs a subset of the true roots. To prove correctness we only need to show that every root is included in the output. In particular we need to show that if \)Y-P(X)\( divides \)R(X,Y)\( and \)α= P(X) F(X)\( represents the corresponding element of \)F_Q\( then \)Y-α\( also divides \)R_X(Y)\(. Let \)h(X,Y) F_q[X,Y]\( be such that \)R(X,Y) = h(X,Y)(Y-P(X))\(. Let \)h_X(Y) = h(X,Y) F(X)\( be the corresponding element of \)F_Q[Y]\(. We then have \)h_X(Y) ·(Y-α) = R(X,Y) F(X) = R_X(Y)\(, as desired. This proves correctness. To argue the running time, note that the algorithm can find \)F(X)\( in expected time polynomial in \)N\( and \)q\(, or deterministically in time polynomial in \)N,p\( and \)t\(. Once \)F\( is found, the univariate root-finding takes a number of steps that is polynomial in \)N\( and \)Q = N q\(, where each step is a field operation in \)F_Q\(, which may take time polynomial in \)N q\(. Using \)N = D+1\(, the overall expected running time remains polynomial in \)D\( and \)q\(. Similarly the deterministic running time is also polynomial in \)D,t\( and \)p\(, using the deterministic root-finder implied by Theorem~ \ref{thm:appD-poly-factoring}. \)