27 Appendix D. Basic Algebraic Algorithms

D.2 Groups, Rings, Fields

Definition 611 Group, Def D.2.1

✓

Given a set

\(\)G\( and a binary operation \)○\( over \)G\(, \)(G,○)\( is a group if \)○\( is associative, has an identity, and every element of \)G\( is invertible. A group \)(G,○)\( is said to be abelian if \)○\( is also commutative. \)

Definition

Definition 612 Ring, Def D.2.2

✓

A set

\(\)R\( with two binary operations \)+\( and \)·\( is said to form a ring if (1) \)(R,+)\( forms an abelian group, (2) \)·\( is associative and has an identity, and (3) \)·\( distributes over \)+\(, i.e., for every \)a,b,c R\( we have \)a ·(b+c) = (a ·b) + (a ·c)\( and \)(b+c)·a = (b ·a) + (c ·a)\(. The ring \)(R,+,·)\( is said to be a commutative ring if \)·\( is commutative. \)

Definition

Definition 613 Field, Def D.2.3

✓

A set

\(\)F\( with operations \)+\( and \)·\( forms a field if \)(F,+,·)\( is a commutative ring, and \)(F ∖{0}, ·)\( is a group, where \)0\( denotes the identity for \)+\(. \)

Definition

D.3 Polynomials

Definition 614 Formal Polynomials, Def D.3.1

✓

Let

\(\)(R,+,·)\( be a commutative ring with identity \)0\(. The set of formal polynomials over \)R\( in indeterminate \)X\(, denoted \)R[X]\(, is given by finite formal sums \)R[X] = {∑_i=0^d f_i X^i ∣f_0,…,f_d R, d Z^≥0}\(, under the equivalence \)∑_i=0^d f_i X^i = ∑_i=0^d-1 f_i X^i\( if \)f_d = 0\(. The \)f_i\( are the coefficients of \)f\(, the symbols \)X^i\( are the monomials, and \)f_i X^i\( are the terms of \)f\(. For \)f = ∑_i=0^d f_i X^i\(, its degree, denoted \)_X(f)\( or simply \)(f)\(, is the largest integer \)e\( such that \)f_e ≠0\(. Addition is coefficient-wise, \)f + g = ∑_i=0^d (f_i + g_i) X^i\(, and multiplication is given by convolution, \)f ·g = ∑_i=0^d+e ∑_j=0^e f_i-j ·g_j X^i\(. \)

Definition

Proposition 615 Prop D.3.2

✓

For every commutative ring

\(\)R\(, \)R[X]\( is a commutative ring under the sum and product of polynomials. \)

Proposition

Definition 616 Unique Factorization Domains, Def D.3.3

✓

Let

\(\)R\( be a commutative ring. An element \)u R\( is a unit if it has a multiplicative inverse in \)R\(. Elements \)a\( and \)b\( are said to be associates if there exists a unit \)u\( such that \)a = b ·u\(. Element \)a R\( is irreducible if \)a = b ·c\( implies either \)b\( or \)c\( is a unit. A factorization of \)a R\( is a sequence \)b_1,…,b_k\( such that \)a = b_1 ·b_2 b_k\( and none of the \)b_i\('s are units. Ring \)R\( is a factorization domain if for every non-zero \)a R\( that is not a unit, there is a finite bound \)k_a\( such that every factorization of \)a\( has at most \)k_a\( factors. A factorization domain \)R\( is a unique factorization domain (UFD) if every non-zero, non-unit element has a unique irreducible factorization up to associates: i.e., if \)a = b_1 b_k = c_1 c_ℓ\( and the \)b_i\('s and \)c_j\('s are irreducible, then \)k = ℓ\( and there exists a bijection \)π: [k] [ℓ]\( such that \)b_i\( and \)c_π(i)\( are associates, for every \)i [k]\(. \)

Definition

Proposition 617 Prop D.3.4

✓

Every field is a UFD.

Lemma 618 Gauss, Lemma D.3.5

✓

\(\)R\( is a UFD, then so is \)R[X]\(. \)

Lemma

Proposition 619 Prop D.3.6, Division Algorithm

✓

Given a monic polynomial

\(\)f\( (i.e., its leading coefficient is a unit), and general polynomial \)p\(, there exists a unique pair of polynomials \)q\( (the quotient) and \)r\( (the remainder) such that \)p = q ·f + r\( and \)(r) < (f)\(. \)

Proposition

Proposition 620 Prop D.3.7, Evaluation and the Remainder Theorem

✓

Given

\(\)p = ∑_i=0^d p_i X^i R[X]\( and \)αR\(, let \)p(α) = ∑_i=0^d p_i α^i\(. Then there exists a unique \)q R[X]\( such that \)p = q ·(X-α) + p(α)\(. It follows that \)p(α) = 0\( if and only if \)X - α\( divides \)p(X)\(. \)

Proposition

Lemma 621 Polynomial Distance Lemma, Lemma D.3.8

Let

\(\)f ≠g F[X]\( be polynomials of degree at most \)d\(. Then there exist at most \)d\( elements \)αF\( such that \)f(α) = g(α)\(. \)

Lemma

Proof ▶

Let

\(\)h = f - g\(. Then \)h\( is non-zero and of degree at most \)d\(. Let \)S = {α∣f(α) = g(α)}\(. By Proposition~ \ref{prop:appD-evaluation-remainder}, \)(X - α)\( divides \)h\( for every \)αS\(. Furthermore, since \)F[X]\( is a UFD (Lemma~ \ref{lem:appD-gauss} applied to the field \)F\(), we have that \)h = ∏_αS (X-α)\( divides \)h\(. But if \)h\( divides \)h\(, then \)(h) ≤(h)\( and \)(h) = |S|\(. We conclude \)|S| ≤d\(. \)

Proof ▶

D.4 Vector Spaces

D.4.1 Matrices and Vectors

Definition 622 Vectors and Matrices

✓

A vector of length

\(\)n\( over the field \)F\( is a row vector \)x F^n\(. Given \)u,v F^n\(, their inner product is \)u, v = ∑_i=1^n u_i ·v_i\(. A matrix \)M F^k ×n\( is a two-dimensional array with rows \)M_i,·\( and columns \)M_·,j\(. The transpose of \)M F^k ×n\(, denoted \)M^T\(, is the \)n ×k\( matrix with \)M^T_j,i = M_i,j\(. The product of matrices \)A F^k ×n\( and \)B F^n ×m\( is the matrix \)C F^k ×m\( with \)C_i,j = A_i,·, B_·,j \(. \)

Definition

D.4.2 Definition and Properties of Vector Spaces

Definition 623 Vector Space, Def D.4.1

✓

Over a field

\(\)F\(, a vector space is given by a triple \)(V,+,·)\( where \)(V,+)\( is a commutative group and \)·: F ×V V\( distributes over addition, so that \)α·(u+v) = α·u + α·v\( for every \)αF\( and \)u,v V\(. The identity of \)(V,+)\( is denoted \)0\(. The simplest example of an \)F\(-vector space is \)F^n\(, with coordinate-wise sum and scalar product. \)

Definition

Definition 624 Dimension of a Vector Space, Def D.4.2

✓

A sequence of vectors

\(\)v_1,…,v_k V\( is linearly independent if \)∑_i=1^k β_i ·v_i = 0\( implies \)β_1 = = β_k = 0\(; otherwise the sequence is linearly dependent. \)V\( is finite dimensional of dimension \)k\( if every sequence of \)k+1\( vectors from \)V\( is linearly dependent and there exists a linearly independent sequence of length \)k\(. A linearly independent set \)v_1,…,v_k V\( is said to form a basis for \)V\( if \)V\( has dimension \)k\(. \)

Definition

Proposition 625 Prop D.4.3

✓

\(\)v_1,…,v_k\( form a basis for an \)F\(-vector space \)V\(, then \)V = {∑_i=1^k β_i ·v_i ∣β_1,…,β_k F}\( and the map \)(β_1,…,β_k) ↦∑_i=1^k β_i ·v_i\( is an isomorphism from \)F^k\( to \)V\(. \)

Proposition

Definition 626 Generator Matrix, Parity Check Matrix, Def D.4.4

✓

A matrix

\(\)G F^k ×n\( is a generator matrix of an \)F\(-vector space \)V ⊆F^n\( if the rows of \)G\( are linearly independent in \)F^n\( and \)V = {x ·G ∣x F^k}\(; the rows of \)G\( form a basis of \)V\(. A matrix \)H F^(n-k)×n\( is a parity check matrix of \)V ⊆F^n\( if the rows of \)H\( are linearly independent and \)V = {y F^n ∣H ·y^T = 0}\(. Given \)V\( with parity check matrix \)H\(, its dual space, denoted \)V^⊥\(, is the vector space generated by \)H\(, i.e. \)V^⊥= {x ·H ∣x F^n-k}\(. \)

Definition

Proposition 627 Prop D.4.5

\(\)V ⊆F^n\( is a \)k\(-dimensional vector space then it has a generator matrix \)G F^k ×n\( and a parity check matrix \)H F^(n-k)×n\(. Furthermore its dual \)V^⊥\( is generated by \)H\(, has dimension \)n-k\(, and has \)G\( as its parity check matrix. Finally \)(V^⊥)^⊥= V\(. \)

Proposition

Proof ▶

Existence of a generator matrix is immediate from the definitions: since

\(\)V\( is \)k\(-dimensional it has a basis \)v_1,…,v_k\(, and the matrix \)G\( with these vectors as its rows satisfies the conditions of a generator matrix. For the parity check matrix, we say that a \)k×k\( matrix \)R\( forms a row operation if either (i) \)R_ii=1\( and \)R_ij=0\( for all but one pair \)i≠j [k]\(, or (ii) \)R\( is a permutation matrix that swaps two rows. We say \)G\( is obtained from \)G\( by row operations, denoted \)G G\(, if \)G = R_m ·R_m-1 R_1 ·G\( where the \)R_i\('s are row operations. If \)G\( is a generator matrix for \)V\( then so is \)G\(, since each row operation replaces the row space by an equal row space. Gaussian elimination allows us to simplify \)G\( until its columns are special, and in particular after permuting the columns \)G\( would look like \)[I_k ∣A]\( where \)I_k\( denotes the \)k×k\( identity matrix. Assume for simplicity that \)G = [I_k ∣A]\( (without permuting columns). Now let \)H\( be given by \)H = [-A^T ∣I_n-k]\(. It can be verified by direct computation that \)G ·H^T = -A + A = 0\(, and so \)G ·H^T = 0\( as well (since \)G\( and \)G\( generate the same row space, and the row operations used are invertible). Furthermore all rows of \)H\( are linearly independent (since the \)I_n-k\( block has full rank), and so \)H\( satisfies the conditions of a parity check matrix of \)V\(. For the claims about \)V^⊥\(: since \)H\( has linearly independent rows, \)H\( is itself a generator matrix for the \)(n-k)\(-dimensional vector space \)V^⊥\( generated by its rows, so \)(V^⊥) = n-k\(. Since \)G ·H^T = 0\(, equivalently \)H ·G^T = 0\(, and \)G\( has \)k = n - (n-k)\( linearly independent rows, \)G\( satisfies the conditions of a parity check matrix for \)V^⊥\(. Finally, applying the same construction to \)V^⊥\( (with parity check matrix \)G\() shows that \)(V^⊥)^⊥\( is generated by \)G\(, i.e. \)(V^⊥)^⊥= V\(. \)

Proof ▶

D.5 Finite Fields

D.5.1 Prime Fields

Proposition 628 Prop D.5.1

✓

For any prime

\(\)p\(, \)(Z_p, +_p, ·_p)\(, where \)+_p\( and \)·_p\( denote addition and multiplication modulo \)p\( on \)Z_p = {0,…,p-1}\(, forms a field of cardinality \)p\(. \)

Proposition

Definition 629 Characteristic

✓

Given a finite field

\(\)F\(, its characteristic, denoted \)char(F)\(, is the smallest positive integer \)p\( such that \)p ·1 = 1 + 1 + + 1 = 0\(. \)

Definition

Proposition 630 Prop D.5.2

✓

For every finite field

\(\)F\(, \)char(F)\( is a prime. Furthermore, \)F\( is a \)Z_p\(-vector space, where \)p = char(F)\(. Thus \)F\( has cardinality \)p^n\( for prime \)p\( and integer \)n\(. \)

Proposition

Proposition 631 Prop D.5.3

✓

For any prime

\(\)p\(, there is a unique field of cardinality \)p\( up to isomorphism. \)

Proposition

D.5.2 Extension Fields and Subfields

Proposition 632 Prop D.5.4

✓

\(\)(G,·)\( is a finite group with identity \)1\(, then for every \)a G\(, \)a^|G| = 1\(. \)

Proposition

Proposition 633 Prop D.5.5

✓

Let

\(\)F\( be a field of cardinality \)q\(. Then every element \)αF\( is a root of the polynomial \)X^q - X\( and so \)X^q - X = ∏_αF (X - α)\(. \)

Proposition

Proposition 634 Prop D.5.6

✓

Let

\(\)K\( be a field and \)F ⊆K\( a subfield (closed under addition and multiplication), denoted \)F K\( (also, \)K F\(). Then \)K\( is an \)F\(-vector space and so \)|K| = |F|^n\( where \)n\( is the dimension of \)K\( as an \)F\(-vector space. Furthermore there is a unique copy of \)F\( in \)K\(. \)

Proposition

D.5.3 Existence of Finite Fields

Proposition 635 Prop D.5.7

✓

Let

\(\)F\( be a finite field of cardinality \)q\( and let \)g F[X]\( be an irreducible polynomial of degree \)n\(. Then \)(F[X]/g, +_g, ·_g)\(, where \)+_g\( and \)·_g\( denote addition and multiplication followed by reduction modulo \)g\(, forms a field of cardinality \)q^n\(. \)

Proposition

Proposition 636 Prop D.5.8

✓

Let

\(\)K\( be a field of characteristic \)p\( and let \)A,B K[X,Y]\(. Then for all positive integers \)n\( we have \)(A+B)^p^n = A^p^n + B^p^n\(. \)

Proposition

Theorem 637 Thm D.5.9

✓

\(\)F\( is a finite field, then it has characteristic \)p\( for some prime \)p\( and its cardinality is \)p^n\( for positive integer \)n\(. Conversely, for every prime \)p\( and positive integer \)n\(, there is a field of cardinality \)p^n\(. \)

Theorem

D.5.4 Uniqueness of Finite Fields

Lemma 638 Lemma D.5.10

✓

Let

\(\)q = |F|\( and \)n = q-1\(. Let \)N^=(G,m)\( (resp. \)N(G,m)\() denote the number of elements of order exactly (resp. dividing) \)m\( in a finite group \)G\(. Then for every \)k\( dividing \)n\(, \)N(F^*,k) = N(Z_n,k)\( and \)N^=(F^*,k) = N^=(Z_n,k)\(. \)

Lemma

Proposition 639 Prop D.5.11

✓

Every finite field

\(\)F\( has a primitive element \)ω\( (i.e. \)ω^i ≠1\( for \)i < |F|-1\( and \)ω^|F|-1=1\(). Consequently the multiplicative group \)F^*\( is cyclic. \)

Proposition

Proposition 640 Prop D.5.12

✓

Let

\(\)K\( be a finite field and let \)ω\( be a primitive element in \)K\(. Then for every subfield \)F K\(, \)ω\( is an \)F\(-generator of \)K\(, i.e. every \)βK\( equals \)p(ω)\( for some \)p F[X]\(. As a consequence, for every \)K F\( there is an \)F\(-generator in \)K\(. \)

Proposition

Proposition 641 Prop D.5.13

✓

Let

\(\)K F\( and let \)α\( be an \)F\(-generator of \)K\(. Let \)p\( be the minimal degree polynomial in \)F[X]\( such that \)p(α) = 0\(. Then \)p\( is irreducible and \)K\( is isomorphic to \)F[X]/p\(. \)

Proposition

Proposition 642 Prop D.5.14

✓

\(\)f F_p[X]\( is irreducible of degree \)n\(, then \)f\( divides \)X^p^n - X\(. \)

Proposition

Theorem 643 Thm D.5.15

✓

For every prime

\(\)p\( and integer \)n\(, there is a unique field of cardinality \)p^n\( up to isomorphism. \)

Theorem

D.5.5 The Trace and Norm Maps

Definition 644 Trace and Norm, Def D.5.16

✓

Let

\(\)F = F_q\( and let \)K = F_q^n\(. The Trace function \)Tr = Tr_KF\( is the function obtained by evaluating the polynomial \)Tr(X) = X + X^q + X^q^2 + + X^q^n-1\(. The Norm function is obtained by evaluating \)N(X) = X^1+q+q^2++q^n-1\(. \)

Definition

Proposition 645 Prop D.5.17

Trace is an

\(\)F\(-linear map, i.e., for every \)αF\( and \)β, γK\(, \)Tr(α·β+ γ) = α·Tr(β) + Tr(γ)\(. \item Norm is multiplicative, i.e., \)N(β·γ) = N(β) N(γ)\(. \item Trace is a \)q^n-1\(-to-one map from \)K\( to \)F\(. \item Norm is a \)(q^n-1)/(q-1)\(-to-one map from \)K^*\( to \)F^*\(. \)

Proposition

Proof ▶

(1) The

\(\)F\(-linearity follows from the facts that \)(αβ+ γ)^q^i = α^q^i β^q^i + γ^q^i\( (using Proposition~ \ref{prop:appD-frobenius-freshman} repeatedly, together with the fact that \)q\( is a power of \)char(K)\() and \)α^q^i = α\( for \)αF\(. (2) The multiplicativity of \)N\( is immediate from the definition, since \)(βγ)^1+q++q^n-1 = β^1+q++q^n-1 γ^1+q++q^n-1\(. (3) For \)βK\( we have \)Tr(β)^q = β^q + + β^q^n = β^q + + β^q^n-1 + β= Tr(β)\( (using \)β^q^n=β\(), and so the range of \)Tr\( is \)F\(. Since \)Tr\( is (the evaluation of) a polynomial of degree \)q^n-1\(, it can take on any value in the range at most \)q^n-1\( times. But it has a domain of size \)q^n\( and range of size \)q\(, so it must take on every value exactly \)q^n-1\( times. (4) Similarly to Part (3): \)N(β)^q = N(β)\( for all \)βK\(, and \)N(β)\( is non-zero if and only if \)β≠0\(. Using the degree of \)N\( (which is \)(q^n-1)/(q-1)\() and counting as in Part (3), \)N\( takes on every value in \)F^*\( exactly \)(q^n-1)/(q-1)\( times. \)

Proof ▶

Proposition 646 Prop D.5.18

A function

\(\)L : K F\( is \)F\(-linear if and only if there exists \)λK\( such that \)L(β) = Tr(λβ)\( for every \)βK\(. \)

Proposition

Proof ▶

First note that

\(\)f(β) = Tr(λβ)\( is \)F\(-linear, since \[ f(\alpha \beta + \gamma ) = \mathrm{Tr}(\lambda (\alpha \beta +\gamma )) = \mathrm{Tr}(\lambda \alpha \beta ) + \mathrm{Tr}(\lambda \gamma ) = \alpha \mathrm{Tr}(\lambda \beta ) + \mathrm{Tr}(\lambda \gamma ) = \alpha f(\beta ) + f(\gamma ) \] for every \)αF\( and \)β,γK\(, using Proposition~ \ref{prop:appD-trace-norm-properties}, Part (1). For the converse we employ a counting argument. First note that if \)λ≠0\( then the function \)f_λ(β) = Tr(λβ)\( is not identically zero: viewed as a polynomial in \)Z\(, \)f_λ(Z)\( has degree \)|K|/|F|\( and a non-zero coefficient of \)Z\(. By linearity, \)f_λ≠f_τ\( if \)λ≠τ\(, since \)f_λ- f_τ= f_λ-τ ≠0\(. So, including \)λ= 0\(, there are at least \)|K|\( distinct linear functions of the form \)f_λ(·)\(. We now note there are also at most \)|K|\( such linear functions in total. To see this let \)β_1,…,β_n K\( be \)F\(-linearly independent elements (i.e., a basis of \)K\( as an \)F\(-vector space, cf. Definition~ \ref{def:appD-vector-space}). A linear function \)L: K F\( is completely determined by its values at \)β_1,…,β_n\(, since for every \)β= ∑_i α_i β_i K\( with \)α_1,…,α_n F\( we have \)L(β) = ∑_i α_i L(β_i)\(. Thus the number of linear functions is upper bounded by \)|{(L(β_1),…,L(β_n)) F^n}| ≤|F|^n = |K|\(. We conclude that these are exactly \)|K|\( functions that are \)F\(-linear from \)K F\(, and these are exactly the functions \)f_λ\(, \)λK\(. \)

Proof ▶

D.6 Algorithmic Aspects of Finite Fields

Proposition 647 Prop D.6.1

Let

\(\)p=2\( and \)t = 2 ·3^ℓ\( for any non-negative integer \)ℓ\(. Then the polynomial \)X^t + X^t/2 + 1\( is irreducible in \)F_2[X]\(. \)

Proposition

D.7 Algorithmic Aspects of Polynomials

D.7.2 Greatest Common Divisor

Definition 648 Greatest Common Divisor, Def D.7.1

✓

Given polynomials

\(\)f,g F[X]\(, their greatest common divisor, denoted \)(f,g)\(, is the maximal degree polynomial \)h(X)\( with leading coefficient \)1\( such that \)h\( divides \)f\( and \)g\(. \)

Definition

Algorithm 649 Euclid’s Algorithm

On input

\(\)f,g F[X]\( with \)(g) < (f)\(: if \)g\( divides \)f\(, output \)g\( (normalized to have leading coefficient \)1\(); otherwise, compute \)f = q ·g + r\( with \)(r) < (g)\( via the division algorithm (Proposition~ \ref{prop:appD-division-algorithm}), and recursively output the result of running the algorithm on \)(g,r)\(. \)

Algorithm

Proof ▶

Correctness follows from the identity

\(\)(f,g) = (g,r)\( where \)f = q ·g + r\(: any common divisor of \)f\( and \)g\( divides \)r = f - q·g\( and hence is a common divisor of \)g\( and \)r\(; conversely any common divisor of \)g\( and \)r\( divides \)f = q ·g + r\( and hence is a common divisor of \)f\( and \)g\(. Since \)(r) < (g) < (f)\( strictly decreases at each recursive call (and the recursion terminates when one polynomial divides the other), the algorithm terminates after at most \)(g)+1\( recursive calls. With a naive implementation the algorithm runs in time polynomial in \)n = ((f),(g))\(; the individual steps can be combined more cleverly to obtain an implementation running in time \)O(n(n)^c)\( field operations for some constant \)c\( (we do not reproduce this refinement here; see the notes to this appendix). \)

Proof ▶

D.7.3 Factoring and Root-Finding

Theorem 650 Thm D.7.2

There exists a constant

\(\)c\( and a randomized algorithm running in expected time \)O((n q)^c)\( that factors polynomials of degree \)n\( in \)F_q[X]\(. Furthermore, if \)q = p^t\( for prime \)p\(, then there is a deterministic algorithm with running time \)O((npt)^c)\( for factoring. \)

Theorem

Definition 651 Root-Finding Problem, Def D.7.3

The input to the root-finding problem is a polynomial

\(\)f F_q[X]\( of degree at most \)n\( (given as a list of coefficients \)f_0,…,f_n F_q\(). The task is to find all \)αF_q\( that are roots of \)f\(, i.e., to output the set \){αF_q ∣f(α) = 0}\(. \)

Definition

Lemma 652 Lemma D.7.4

A polynomial

\(\)f F_q[X]\( has a root in \)F_q\( if and only if \)(f, X^q-X) ≠1\(. \)

Lemma

Proof ▶

\(\)f\( has a root \)α\(, then \)X-α\( divides both \)f\( and \)X^q-X\( (Proposition~ \ref{prop:appD-root-Xq-X}), and so their gcd cannot be trivial. Conversely, a factor of \)X^q-X\( is of the form \)∏_αS(X-α)\( for some \)S ⊆F_q\( (by unique factorization, Definition~ \ref{def:appD-ufd}, since \)X^q-X\( splits into distinct linear factors by Proposition~ \ref{prop:appD-root-Xq-X}), and so \)(f, X^q-X)\( must be of this form. If the gcd is non-trivial, then \)S\( must be non-empty, implying that for every \)αS\( we have \)X-α\( divides \)f\( and thus \)f\( has a root in \)S ⊆F_q\(. \)

Proof ▶

Definition 653 Sparse Polynomials

We say that a polynomial

\(\)h F[X]\( is \)t\(-sparse if at most \)t\( of its coefficients are non-zero. Every polynomial \)h\( is \)((h)+1)\(-sparse. \)

Definition

Lemma 654 Lemma D.7.5

Let

\(\)f F[X]\( be a polynomial of degree \)n\( and let \)h F[X]\( be a \)t\(-sparse polynomial of degree \)D\(. Then \)h f\( and \)(f,h)\( can be computed in time \)poly(n,t,D)\(. \)

Lemma

Proof ▶

It suffices to compute

\(\)h f\( in time \)poly(n,t,D)\(, and then one can use Euclid's algorithm (Algorithm~ \ref{alg:appD-euclid-gcd}) to compute \)(f,h) = (f, h f)\( in time \)poly(n)\(. Write \)h = ∑_i=1^t h_i X^d_i\(. If we can compute \)h_i X^d_i f\( in time \)poly(n,d_i)\( for every \)i\(, then we can add the \)t\( results (each of degree less than \)n\() in time \)poly(n,t)\( to get \)h f\(. Finally, \)X^d f\( can be computed by repeated squaring. Let \)d = ∑_j=0^_2 d d_j 2^j\( be the binary expansion of \)d\(. We first compute the sequence of polynomials \)g_j = X^2^j f = g_j-1^2 f\( by repeatedly squaring (and reducing modulo \)f\() the output of the previous step; this takes \)O(d)\( multiplications and reductions modulo \)f\(, each of which takes time \)poly(n)\(. Then we compute \)X^d f = ∏_j=0^_2 d (g_j)^d_j f\( using \)O(d)\( further multiplications and reductions, yielding the desired result in overall time \)poly(n,d)\(. \)

Proof ▶

Proposition 655 Prop D.7.6

\(\)F_q\( be a field of odd characteristic (so \)q\( is odd). Then \)X^q - X = X ·(X^(q-1)/2 - 1) ·(X^(q-1)/2+1)\(. In particular \)X^q-X\( factors into three \)2\(-sparse polynomials of degree at most \)q/2\(. \item Let \)q = 2^t\( for integer \)t ≥2\(. Then \)X^q - X = Tr(X) ·(Tr(X) - 1)\(, where \)Tr(X) = Tr_F_q F_2(X) = X + X^2 + X^4 + + X^2^t-1\( is the Trace map from \)F_q\( to \)F_2\(. In particular \)X^q - X\( factors into two \)(2+_2 q)\(-sparse polynomials of degree \)q/2\(. \)

Proposition

Proof ▶

(1) The case of odd

\(\)q\( is verified by direct expansion: \)X ·(X^(q-1)/2-1) ·(X^(q-1)/2+1) = X ·(X^q-1-1) = X^q - X\(. Note that \)(q-1)/2\( is an integer since \)q\( is odd. (2) For even \)q\(, we use the fact that \)Tr\( is a map from \)F_q\( to \)F_2\( (Definition~ \ref{def:appD-trace-norm}, applied with the subfield \)F_2\(). So every \)αF_q\( satisfies \)Tr(α)=0\( or \)Tr(α)=1\(. It follows that \)X-α\( divides \)Tr(X)·(Tr(X)-1)\( for every \)αF_q\(. Consequently, by unique factorization (Definition~ \ref{def:appD-ufd}), \)X^q-X\( (which is squarefree with roots exactly \)F_q\(, Proposition~ \ref{prop:appD-root-Xq-X}) divides \)Tr(X)·(Tr(X)-1)\(. The identity \)X^q-X = Tr(X)·(Tr(X)-1)\( now follows from the fact that both polynomials have the same degree \)q\( and have leading coefficient \)1\( (using the Polynomial Distance Lemma, Lemma~ \ref{lem:appD-polynomial-distance}, or direct degree comparison, to conclude that a divisibility between monic polynomials of equal degree is an equality). \)

Proof ▶

Proposition 656 Prop D.7.7

Let

\(\)g F_q[X]\( have \)α≠β\( as its roots. Fix \)a F_q^*\( and \)b F_q\( and let \)g_a,b(X) = g((X-b)/a)\(. Then: \begin{enumerate} \item The coefficients of \end{enumerate}\)g_a,b\( can be computed efficiently given \)a\(, \)b\(, and the coefficients of \)g\(. \item \)g_a,b\( has \)aα+ b\( and \)aβ+b\( as its roots. \item If \)a F_q^*\( and \)b F_q\( are chosen uniformly at random independently, then the probability that exactly one of \)aα+b\( and \)aβ+b\( is a root of \)X^(q-1)/2-1\( is at least \)1/2\(. \)

Proposition

Proof ▶

Parts (1) and (2) are straightforward to verify by direct substitution:

\(\)g_a,b(aα+b) = g((aα+b-b)/a) = g(α) = 0\(, and similarly for \)β\(; and computing the coefficients of \)g_a,b\( from those of \)g\( amounts to composing with the linear map \)X ↦(X-b)/a\(, which can be done with a number of field operations polynomial in \)(g)\(. For Part (3), note that for any pair of distinct elements \)γ,δF_q\( there is exactly one pair \)a F_q^*, b F_q\( such that \)aα+b=γ\( and \)aβ+b=δ\( (this is a non-degenerate linear system in \)a,b\( since \)α≠β\(). Since the fraction of distinct pairs \)γ,δF_q\( such that exactly one of them comes from the set of size \)(q-1)/2\( (the set of roots of \)X^(q-1)/2-1\() is at least \)1/2\( (the exact fraction is \)1/2 + 1/(2q)\(), we have that the probability that exactly one of \)aα+b\( and \)aβ+b\( is a root of \)X^(q-1)/2-1\( is at least \)1/2\(. \)

Proof ▶

Algorithm 657 Root-Find, Algorithm D.7.1

Root-Find

\(\)(F_q, f)\(: On input \)f(X) F_q[X]\(: \begin{enumerate} \item Compute \end{enumerate}\)g ←(f, X^q-X)\(. \item If \)g = 1\(, return \)∅\(. \item Otherwise, return \)Linear-Root-Find(F_q,g) ∪Root-Find(F_q, f/g)\(. \)

Algorithm

Algorithm 658 Linear-Root-Find, Algorithm D.7.2

Linear-Root-Find

\(\)(F_q, g)\(: On input \)g(X) F_q[X]\( that divides \)X^q-X\( (and hence splits into distinct linear factors): \begin{enumerate} \item If \end{enumerate}\)(g)=1\(, return \){α}\( where \)g = X - α\(. \item Repeat: \begin{enumerate} \item Pick \end{enumerate}\)a F_q^*\( and \)b F_q\( uniformly and independently at random. \item Compute \)g_a,b(X) ←g((X-b)/a)\(. \item Compute \)h_1 ←(g_a,b, X^(q-1)/2-1)\(. \item Compute \)g_1 ←h_1(aX+b)\(. \)

until \(\)0 < (g_1) < (g)\(. \item Return \)Linear-Root-Find(F_q,g_1) ∪Linear-Root-Find(F_q, g/g_1)\(. \)

Algorithm

Lemma 659 Lemma D.7.8

Root-Find

\(\)(F_q,f)\( outputs the multiset of roots of \)f\( in expected time \)poly(n,q)\(, where \)n = (f)\(. \)

Lemma

Proof ▶

Correctness of Root-Find follows from Lemma 652:

\(\)g = (f,X^q-X)\( is trivial exactly when \)f\( has no roots in \)F_q\(, and otherwise \)g\( captures exactly the (distinct) roots of \)f\( that lie in \)F_q\( (with multiplicity removed, since \)X^q-X\( is squarefree), while \)f/g\( carries any remaining, higher-multiplicity or non-root factors, whose roots (if any) are found recursively; the two calls to \textsc{Root-Find}/\textsc{Linear-Root-Find} do not overlap in the roots they report since \)g\( and \)f/g\( are computed via division. Correctness of \textsc{Linear-Root-Find} follows from Proposition~ \ref{prop:appD-affine-transform-roots}: if \)g\( has two distinct roots \)α≠β\(, then by Part (3) of Proposition~ \ref{prop:appD-affine-transform-roots}, with probability at least \)1/2\( over the choice of \)a,b\(, exactly one of \)aα+b,aβ+b\( is a root of \)X^(q-1)/2-1\(, in which case \)h_1 = (g_a,b, X^(q-1)/2-1)\( is a proper non-trivial factor of \)g_a,b\( (by Part (2) of Proposition~ \ref{prop:appD-affine-transform-roots}, \)g_a,b\( has \)aα+b\( and \)aβ+b\( as roots, and \)h_1\( separates them), and so \)g_1 = h_1(aX+b)\( is a proper non-trivial factor of \)g\( (i.e. \)0 < (g_1) < (g)\(), and \)g/g_1\( is a proper factor as well. The recursion terminates when \)(g)=1\(. For the running time: it is straightforward to see that \textsc{Root-Find} makes at most \)n\( calls to \textsc{Linear-Root-Find} (a weak estimate that ignores possible further optimizations). By Proposition~ \ref{prop:appD-affine-transform-roots}, Part (3), the repeat loop in \textsc{Linear-Root-Find} is executed an expected constant number of times before a non-trivial split is found (since each iteration succeeds independently with probability at least \)1/2\(). Finally, the degrees of the polynomials in the two recursive calls (of both \textsc{Root-Find} and \textsc{Linear-Root-Find}) add up to at most \)(g)\( (resp. \)(f)\(), so this leads to a tree of recursive calls of total size at most \)n\(, with each internal node and leaf performing \)poly(n,q)\( work (to compute the various gcds, using Lemma~ \ref{lem:appD-sparse-mod-gcd} since \)X^q-X\( and \)X^(q-1)/2-1\( are sparse, and the affine changes of variables). Thus the overall expected running time is \)poly(n,q)\(. \)

Proof ▶

Lemma 660 Deterministic Root-Finding, Sketch

Let

\(\)q = p^t\( for prime \)p\( and let \)g F_q[X]\( be a polynomial splitting into at least two relatively prime non-trivial factors \)g = g_1 ·g_2\(. Then there is a deterministic algorithm running in time \)poly((g),p,t)\( that finds a non-trivial factorization of \)g\(. \)

Lemma

Proof ▶

First find a polynomial

\(\)f(X)\( with \)0 < (f) < (g)\( such that \)f(X)^p - f(X) ≡0 g(X)\(. The search for such \)f\( can be phrased as a linear system over \)F_p\( (we omit the details of this step). Such a polynomial exists whenever \)g = g_1 ·g_2\( with \)g_1,g_2\( relatively prime: choosing any \)a ≠b F_p\( and letting \)f(X) = a g_1(X)\( and \)f(X) = b g_2(X)\( (which exists and has degree less than that of \)g_1 ·g_2\( by the Chinese Remainder Theorem) yields such a polynomial. Having found such an \)f\(, write \)f(X)^p - f(X) = ∏_a F_p (f(X)-a)\( (this follows since \)f(X)^p-f(X) ≡0 g(X)\( means every root of \)g\(, in a splitting field, satisfies \)Z^p=Z\(, i.e. lies in \)F_p\(, so \)Z^p-Z = ∏_a (Z-a)\( applied to \)f\( gives the claimed factorization). We note that \)(g(X), f(X)-a)\( must be non-trivial for some \)a\(, since \)(g(X), ∏_a (f(X)-a)) = g(X)\(, but for every single \)a\(, \)g(X)\( does not divide \)f(X)-a\( (as \)(f) < (g)\( and \)f\( is non-constant). Enumerating over all \)a F_p\( and computing \)(g,f-a)\( using Algorithm~ \ref{alg:appD-euclid-gcd} thus gives a non-trivial factorization of \)g\( in time \)poly((g),p)\( (there are \)p\( candidate values of \)a\(). \)

Proof ▶

Definition 661 Root of a Bivariate Polynomial

Given a bivariate polynomial

\(\)R(X,Y) F_q[X,Y]\(, we say that a polynomial \)P(X)\( is a root of \)R(X,Y)\( if \)Y - P(X)\( divides \)R(X,Y)\(. \)

Definition

Theorem 662 Thm D.7.9, Bivariate Root-Finding

There exists a randomized algorithm that, given a bivariate polynomial

\(\)R(X,Y) F_q[X,Y]\( of degree at most \)D\( (as a list of coefficients), outputs a list of all its roots in expected time polynomial in \)D\( and \)q\(. If \)q = p^t\(, there also exists a deterministic algorithm to output all roots in time polynomial in \)p,t\( and \)D\(. \)

Theorem

Proof ▶

The algorithm uses a reduction to the univariate case. We find a monic irreducible polynomial

\(\)F(X)\( of degree \)N\( where \)D < N ≤O(D)\( (such an \)F\( can be found in expected time polynomial in \)N\( and \)q\(, or deterministically in time polynomial in \)N,p,t\(, using the algorithms for finding irreducible polynomials discussed in Section D.6.4). We then consider the field \)F_Q = F_q[X] F(X)\( (a field of cardinality \)q^N\(, by Proposition~ \ref{prop:appD-quotient-field-existence}) and view \)R(X,Y)\( as a polynomial \)R_X(Y) F_Q[Y]\(. We find the roots of \)R_X(Y)\( (using the univariate root-finding algorithm of Lemma~ \ref{lem:appD-root-find-correctness}, or the deterministic root-finder implied by Theorem~ \ref{thm:appD-poly-factoring}). Let \)α_1,…,α_s F_Q\( be the roots of \)R_X\(. Using \)F_Q = F_q[X] F(X)\( we interpret \)α_1,…,α_s\( as polynomials \)A_1(X),…,A_s(X) F_q[X]\(. We report all \)A_i(X)\( such that \)Y - A_i(X)\( divides \)R(X,Y)\( (each such divisibility check can be carried out directly). This completes the description of the algorithm. Since the algorithm checks whether \)A_i(X)\( is a root before outputting it, it is clear that it outputs a subset of the true roots. To prove correctness we only need to show that every root is included in the output. In particular we need to show that if \)Y-P(X)\( divides \)R(X,Y)\( and \)α= P(X) F(X)\( represents the corresponding element of \)F_Q\( then \)Y-α\( also divides \)R_X(Y)\(. Let \)h(X,Y) F_q[X,Y]\( be such that \)R(X,Y) = h(X,Y)(Y-P(X))\(. Let \)h_X(Y) = h(X,Y) F(X)\( be the corresponding element of \)F_Q[Y]\(. We then have \)h_X(Y) ·(Y-α) = R(X,Y) F(X) = R_X(Y)\(, as desired. This proves correctness. To argue the running time, note that the algorithm can find \)F(X)\( in expected time polynomial in \)N\( and \)q\(, or deterministically in time polynomial in \)N,p\( and \)t\(. Once \)F\( is found, the univariate root-finding takes a number of steps that is polynomial in \)N\( and \)Q = N q\(, where each step is a field operation in \)F_Q\(, which may take time polynomial in \)N q\(. Using \)N = D+1\(, the overall expected running time remains polynomial in \)D\( and \)q\(. Similarly the deterministic running time is also polynomial in \)D,t\( and \)p\(, using the deterministic root-finder implied by Theorem~ \ref{thm:appD-poly-factoring}. \)

Proof ▶