Symmetric Polynomials

Introduction

Consider the quadratic equation y = (x - 2)(x - 3) = x² - 5x + 6. In this case we know that the roots of the equation are x = 2 and x = 3. Suppose, however, that we did not know what the roots were. Let's see what information we can determine about them. We know that a second-order equation has two roots (counted with multiplicity) and that these roots may be real or complex. Calling the roots r₁ and r₂, we also know that the polynomial can be factored as (x - r₁)(x - r₂) = x² - (r₁ + r₂)x + r₁r₂ = x² - 5x + 6. Equating coefficients, we have:

-(r₁ + r₂) = -5
r₁ + r₂ = 5 (1)

r₁r₂ = 6 (2)

Exercise: Express r₁² + r₂² in terms of (r₁ + r₂) and r₁r₂.

Solution: r₁² + r₂² = (r₁ + r₂)² - 2r₁r₂. Substituting (1) and (2) into this identity, we can determine the value of r₁² + r₂² without having to solve for the roots: r₁² + r₂² = 5² - 2·6 = 13, which you can verify by substituting the actual root values of 2 and 3 into r₁² + r₂².
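The arithmetic in this solution can be checked with a few lines of Python. This is only a sketch for the example x² - 5x + 6; the variable names are chosen for illustration and do not come from the text.

```python
# Coefficients of x^2 + b*x + c for the example x^2 - 5x + 6.
b, c = -5, 6

# Equations (1) and (2): the sum and product of the roots come
# straight from the coefficients, without solving for the roots.
root_sum = -b        # r1 + r2
root_product = c     # r1 * r2

# r1^2 + r2^2 = (r1 + r2)^2 - 2*r1*r2
sum_of_squares = root_sum**2 - 2 * root_product
print(sum_of_squares)  # 13

# Cross-check against the known roots 2 and 3.
assert sum_of_squares == 2**2 + 3**2
```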

With regard to symmetry, there are several things to notice. First, (x - r₁)(x - r₂) is symmetric with respect to r₁ and r₂: swapping the two roots leaves it unchanged. As a result, the expressions for the coefficients, r₁ + r₂ and r₁r₂, are also symmetric with respect to r₁ and r₂. Finally, any polynomial expression in r₁ + r₂ and r₁r₂, like the (r₁ + r₂)² - 2r₁r₂ used to compute r₁² + r₂², must result in a polynomial that is symmetric in r₁ and r₂. In this case the symmetric polynomial computed was r₁² + r₂². If you had been asked to use r₁ + r₂ and r₁r₂ to compute r₁² + 2r₂², you could not do it, because r₁² + 2r₂² is not symmetric in r₁ and r₂.

Exercise: The quadratic formula r = (-b ± sqrt(b² - 4ac))/(2a) can be derived by the method of completing the square.
For the case where a = 1 this simplifies to r = (-b ± sqrt(b² - 4c))/2. Verify the formula by substituting -(r₁ + r₂) for b and r₁r₂ for c.

Solution: We want to show that the two values obtained from the formula are r₁ and r₂.
(-b ± sqrt(b² - 4c))/2 = ((r₁ + r₂) ± sqrt((r₁ + r₂)² - 4r₁r₂))/2
Working with the sqrt portion, sqrt((r₁ + r₂)² - 4r₁r₂) = sqrt(r₁² + 2r₁r₂ + r₂² - 4r₁r₂) = sqrt(r₁² - 2r₁r₂ + r₂²) = sqrt((r₁ - r₂)²) = ±(r₁ - r₂). Because the formula already takes both signs, we can use r₁ - r₂.

Substituting into the equation, we get r = ((r₁ + r₂) ± (r₁ - r₂))/2, which gives the values r₁ and r₂.
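The same verification can be tried numerically. The sketch below assumes a hypothetical helper named monic_quadratic_roots. It builds b and c from two chosen roots and then applies the a = 1 formula, using the standard-library cmath module so that a negative discriminant is handled as well.

```python
import cmath

def monic_quadratic_roots(r1, r2):
    """Recover the roots of (x - r1)(x - r2) from its coefficients alone."""
    b = -(r1 + r2)   # coefficient of x
    c = r1 * r2      # constant term
    disc = cmath.sqrt(b * b - 4 * c)   # complex square root of the discriminant
    return (-b + disc) / 2, (-b - disc) / 2

# For the introductory example this returns the roots 3 and 2
# (as complex numbers with zero imaginary part).
print(monic_quadratic_roots(2, 3))
```

The two returned values may come back in either order, which is what the ± in the formula expresses.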

What We Will Be Doing

To summarize what was done in the first exercise, we were able to compute the value of a symmetric polynomial, r₁² + r₂², in the roots of the original polynomial in terms of the coefficients of the original polynomial. What will be shown is a proof due to Isaac Newton that is a generalization of this result: any symmetric polynomial of the roots of a given polynomial can be expressed as a polynomial of the coefficients of the given polynomial. Stated symbolically, this means that given a polynomial

p(x) = a₀ + a₁x + a₂x² + ... + a_{n-1}x^{n-1} + a_n xⁿ,

the value of any symmetric polynomial of the roots r₁, r₂, ..., r_n can be computed as a polynomial in the coefficients (taking a_n = 1, which the next section shows loses no generality) with no need to compute any of the roots.

General Expressions for the Coefficients of Polynomials

Given any polynomial equation

a₀ + a₁x + a₂x² + ... + a_{n-1}x^{n-1} + a_n xⁿ = 0, where a_n is not 0, we can divide through by a_n and rename the coefficients to get a polynomial equation of the form

p(x) = a₀ + a₁x + a₂x² + ... + a_{n-1}x^{n-1} + xⁿ = 0. We can therefore, without loss of generality, restrict ourselves to polynomials with leading coefficient equal to 1.

The polynomial can be factored in terms of its roots to get

(x - r₁)(x - r₂)...(x - r_n) = 0, so

(x - r₁)(x - r₂)...(x - r_n) = a₀ + a₁x + a₂x² + ... + a_{n-1}x^{n-1} + xⁿ

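Equating the coefficients on both sides, we get:

Sum(r_i) = (r₁ + r₂ + ... + r_n) = -a_{n-1}

Sum(r_i r_j) = a_{n-2}, where the sum runs over all pairs with i < j.
By this is meant that to find the coefficient of x^{n-2} on the left side, find all of the ways of choosing x from n-2 of the factors in parentheses and a root term from the remaining two. Each such choice contributes (-r_i)(-r_j) = r_i r_j. If, for example, n was equal to 3, we would have a₁ = (r₁r₂ + r₁r₃ + r₂r₃).

Similarly, Sum(r_i r_j r_k) = -a_{n-3}. The signs alternate because each root enters the product with a minus sign: the sum of all products of k distinct roots equals (-1)^k a_{n-k}.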

If we continue in this way, the left side terms will include all the possible expressions of the form

Sum(r₁r₂...r_k), where k ≤ n, the roots within each product are distinct, and the sum runs over every choice of k roots. These expressions are polynomials in the r_i and are in fact symmetric polynomials. They are referred to as elementary symmetric polynomials. Each one equals, up to sign, one of the coefficients of p(x). Therefore showing that every symmetric polynomial in the r_i is a polynomial of the coefficients is equivalent to showing that every symmetric polynomial can be expressed as a polynomial of the elementary symmetric polynomials. The exercise showed that r₁² + r₂² = (r₁ + r₂)² - 2r₁r₂, which we now see as a polynomial in the two elementary symmetric polynomials (r₁ + r₂) and r₁r₂.
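The relation between the elementary symmetric polynomials and the coefficients can be checked on a small example. The sketch below uses two hypothetical helpers, elementary_symmetric and monic_coefficients, and the sample roots 2, 3 and 5, none of which come from the text. It expands the product of the factors directly and compares each coefficient with the matching sum of products of roots, allowing for the alternating sign that the factors (x - r_i) introduce.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(roots, k):
    """Sum of the products of every choice of k distinct roots (e_0 = 1)."""
    return sum(prod(combo) for combo in combinations(roots, k))

def monic_coefficients(roots):
    """Coefficients [a_0, ..., a_{n-1}, 1] of (x - r_1)...(x - r_n)."""
    coeffs = [1]                                   # the constant polynomial 1
    for r in roots:
        shifted = [0] + coeffs                     # x * current polynomial
        scaled = [-r * c for c in coeffs] + [0]    # -r * current polynomial
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

roots = [2, 3, 5]
n = len(roots)
a = monic_coefficients(roots)   # x^3 - 10x^2 + 31x - 30, lowest power first
for k in range(1, n + 1):
    # The k-th elementary symmetric polynomial equals (-1)^k * a_{n-k}.
    assert elementary_symmetric(roots, k) == (-1) ** k * a[n - k]
```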

Proof Outline

In computing r₁² + r₂², a first approach is to use (r₁ + r₂)² to produce the squared terms. Subtracting r₁² + r₂² from it, we are left with the symmetric polynomial 2r₁r₂, which we can view as a simpler problem to solve since it does not contain any squared terms. In this case what we are left with is just twice the elementary symmetric polynomial r₁r₂. We can handle the general case in the same way: at each step, eliminate the highest-order term, with any additional terms created being of lower order. In order to do this we need a way of specifying what is meant by higher-order terms. We start by writing the terms such that their exponents are in descending order. Because the polynomials are symmetric, we can always select a term listed with r₁ first, r₂ second, and so on. Consider the term r₁¹⁰r₂⁸r₃⁶. This term will be greater than any term whose leading exponent is less than 10, like r₁⁹r₂⁸r₃⁷r₄⁶. It will also be greater than any term whose leading exponent is 10 but whose second exponent is either less than 8 or is non-existent, like r₁¹⁰r₂⁷r₃⁶r₄⁵ or r₁¹⁰. Ties in the first two exponents are broken by the third exponent, and so on.
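This ordering matches the way Python compares tuples, which makes it easy to experiment with. The sketch below assumes a hypothetical helper, term_key, that stores a term as its exponents in descending order with zero exponents dropped. The comparisons are the ones from the paragraph above.

```python
def term_key(exponents):
    """Exponents in descending order, zeros dropped: r1^10 r2^8 r3^6 -> (10, 8, 6)."""
    return tuple(sorted((e for e in exponents if e > 0), reverse=True))

lead = term_key([10, 8, 6])

# Smaller leading exponent.
assert lead > term_key([9, 8, 7, 6])
# Same leading exponent, smaller second exponent.
assert lead > term_key([10, 7, 6, 5])
# Same leading exponent, no second exponent at all.
assert lead > term_key([10])
```

Tuples are compared position by position, and a tuple that runs out first counts as smaller, which is the rule for the term r₁¹⁰.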

We will now see how to replace the term r₁¹⁰r₂⁸r₃⁶ with lower-order terms. r₁, r₂ and r₃ all have exponents of at least 6, so we start by factoring out (r₁r₂r₃)⁶: r₁¹⁰r₂⁸r₃⁶ = (r₁r₂r₃)⁶ r₁⁴r₂². In the portion of the term remaining after factoring, r₁⁴r₂², r₁ and r₂ both have exponents of at least 2, so we can factor out (r₁r₂)², giving us

r₁¹⁰r₂⁸r₃⁶ = (r₁r₂r₃)⁶ (r₁r₂)²r₁².
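The factoring step follows a simple pattern: the power of the product of the first k roots is the difference between the k-th exponent and the next one. A minimal sketch, assuming a hypothetical helper named elementary_powers that takes the exponents in descending order:

```python
def elementary_powers(exponents):
    """Map k to the power of (r1 r2 ... rk) factored out of the term."""
    padded = list(exponents) + [0]   # treat the exponent after the last as 0
    return {k: padded[k - 1] - padded[k]
            for k in range(1, len(exponents) + 1)
            if padded[k - 1] > padded[k]}

# r1^10 r2^8 r3^6 = (r1)^2 (r1 r2)^2 (r1 r2 r3)^6
print(elementary_powers((10, 8, 6)))  # {1: 2, 2: 2, 3: 6}
```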

How should this be interpreted? Suppose that the original polynomial was third order, so there are only three roots. We would get rid of the terms like r_i¹⁰r_j⁸r_k⁶ by subtracting a suitable multiple of the following polynomial in elementary symmetric polynomials:

(r₁r₂r₃)⁶(r₁r₂ + r₂r₃ + r₁r₃)²(r₁ + r₂ + r₃)². The highest-order terms from (r₁ + r₂ + r₃)² are the terms r_i². The highest-order terms of (r₁r₂ + r₂r₃ + r₁r₃)² are the terms (r_i r_j)². We have therefore succeeded in producing terms like r_i¹⁰r_j⁸r_k⁶ while introducing only lower-order terms. The method can now be applied to the remaining terms. We will eventually come to a halt because at each stage the remaining terms are of lower order.
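The whole procedure can be sketched in Python. This is an illustration of the outline above under stated assumptions, not a reference implementation. A polynomial in n variables is stored as a dictionary from exponent tuples to coefficients, and the helper names poly_mul, elementary and reduce_symmetric are hypothetical. At each step the code picks the highest-order term, builds the matching product of elementary symmetric polynomials, and subtracts it.

```python
from itertools import combinations

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            key = tuple(x + y for x, y in zip(ea, eb))
            out[key] = out.get(key, 0) + ca * cb
    return {k: v for k, v in out.items() if v != 0}

def elementary(n, k):
    """The k-th elementary symmetric polynomial in n variables."""
    return {tuple(1 if i in chosen else 0 for i in range(n)): 1
            for chosen in combinations(range(n), k)}

def reduce_symmetric(poly, n):
    """Return {(d1, ..., dn): coeff}, each entry meaning coeff * e1^d1 * ... * en^dn."""
    poly = {k: v for k, v in poly.items() if v != 0}
    result = {}
    while poly:
        lead = max(poly)        # highest-order term (tuples compare lexicographically)
        coeff = poly[lead]
        # Power of e_k is the k-th exponent minus the next one.
        powers = tuple(lead[i] - (lead[i + 1] if i + 1 < n else 0)
                       for i in range(n))
        if min(powers) < 0:
            raise ValueError("input is not symmetric")
        result[powers] = coeff
        # Build e1^d1 * ... * en^dn by repeated multiplication.
        product = {(0,) * n: 1}
        for k, d in enumerate(powers, start=1):
            for _ in range(d):
                product = poly_mul(product, elementary(n, k))
        # Subtract coeff * product; the leading term cancels.
        for key, c in product.items():
            poly[key] = poly.get(key, 0) - coeff * c
        poly = {k: v for k, v in poly.items() if v != 0}
    return result

# r1^2 + r2^2 = e1^2 - 2*e2, as in the first exercise.
print(reduce_symmetric({(2, 0): 1, (0, 2): 1}, 2))
```

For the first exercise this produces the entries (2, 0) with coefficient 1 and (0, 1) with coefficient -2, that is, (r₁ + r₂)² - 2r₁r₂. A polynomial that is not symmetric, such as r₁² + 2r₂², eventually produces a leading term whose exponents are not in descending order, and the sketch raises an error at that point.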

General Symmetric Polynomials

In order to motivate the discussion of symmetric polynomials, they were introduced as being symmetric polynomials in the roots of some other polynomial. The above discussion, however, shows how any symmetric polynomial in x₁, x₂, ..., x_n can be expressed in terms of the elementary symmetric polynomials Sum(x_i x_j ... x_k).