Symmetric Polynomials


Consider the quadratic equation given by y = (x - 2)(x - 3) = x2 - 5x +6.  In this case we know that the roots of the eqation are x = 2 and x = 3. Suppose, however, that we did not know what the roots were.  Lets'see what information we can determine about the roots.  We know that a second order equation has two roots and that these roots may be real or imaginary.  Calling the roots r1 and r2, we also know that the polynomial can be factored as (x - r1)(x - r2) = x2 - (r1 + r2)x + r1r2 = x2 - 5x +6.  Equating coefficients we havve:

-(r1 + r2 ) = -5
r1 + r=  5 (1)

r1r2 = 6        (2)

Exercise: Express r12  + r22 in terms of (r1 + r2 ) and r1r2.

Solution: r12  + r22 = (r1 + r2 )2 - 2r1r2 . That means that substituting (1) and (2)  into the above equation we can determine the value of  r12 + r22  without having to solve for the roots.   r12  + r22 =  52 - 2*6 = 13, which you can verify by substituting the actual root values of 2 and 3 into r12  + r22.

With regard to symmetry there are several things to notice.  Firstly, (x - r1)(x - r2) is symmetric with respect to r1 and  r2.  As a result, the expressions for the coefficients, r1 + r2 and r1r2, are also symmetric with respect to r1 and  r2.   Finally, any polynomial expression of r1 + r2 and r1r2, like (r1 + r2 )2 - 2r1r2   used to compute r12  + r22  must result in a polynomial that is symmetric in r1 and  r2.  In this case the symmetric polynmial computed was r12  + r22.  If you had been asked to use r1 + r2 and r1r2  to compute r12 + 2r22 you could not do it because r12 + 2r22  is not symmetric in r1 and  r2.

Exercise: The quadratic formula r = (-b +/ sqrt(b2 - 4ac))/ 2a can be derived by the method of completing the square.  
For the case where a = 1 this simplifies to r = (-b +/ sqrt(b2 - 4c))/ 2a.  Verify the formula substiituting -(r1 + r2 ) for b and r1r2 for  c.

Solution: We want to show that the two values obtained from the equation are r1 and r2.
 (-b +/ sqrt(b2 - 4c))/ 2a =( (r1 + r2 ) +/- sqrt((r1 + r2 )2 - 4r1r2))/2
Working with the sqrt portion, sqrt((r1 + r2 )2 - 4r1r2) = sqrt( r12 + 2r1r2 +r22 - 4r1r2) =  sqrt(r12 -2r1r2 + r22) = sqrt((r1 - r2)2 ) = r1 - r2.
Substituting into the equation, we get r = ( (r1 + r2 ) +/- (r1 - r2))/ 2 which gives values of r1 and r2.

What We Will Be Doing

To summarize what was done in the first exercise, we were able to compute the value of  a symmetric polynomial, r12  + r22, in the roots of the orignal polynomial in terms of the coefficients of the original polynomial. What will be shown is a proof due to Isaac Newton that is a generalization of this result.  Any symmetric polynomial of the roots of a given polynomial can be expressed as a polynomial of the coefficients of the the given polynomial.  Stated symbolically this means that given a polynomial
p(x) = a0 + a1X + a2X2 + ... an-1Xn-1 + anXn, the value of any symmetric polynomial of the roots r1, r2, ...rn can be computed as a polynomial in a0, a1, ... , an with no need to compute any of the roots.

General Expressions for the Coefficients of Polynomials

Given any polynomial equation
a0 + a1X + a2X2 + ... an-1Xn-1 + anXn  = 0, where an is not 0, we can divide through by an to get a polynomial in the form
p(x) = a0 + a1X + a2X2 + ... an-1Xn-1 + Xn  = 0.  We can therefore without loss of generality restrict ourselves to polynmials with leading coefficient equal to 1.
The polynomial can be factored in terms of its roots to get
(X - r1)(X - r2)...(X - rn)= 0 so
(X - r1)(X - r2)...(X - rn) =  a0 + a1X + a2X2 + ... an-1Xn-1 + Xn  

Equating the coefficients on both sides we get:
Sum(ri) = (r1 + r2 + ... rn) = an-1
Sum(rirj) = an-2  By this is meant that to find the coefficient of Xn-2 on the left side, find all of the ways of choosing n-2 roots and two X factors from the terms in parentheses.  If, for example, n was equal to 3 we would have a1 = (r1r2 + r1r3 + r2r3).

Similarly, Sum(rirjrk) = an-3.

If we continue in this way, the left side terms will include all the possible expressions of the form
Sum(r1r2...rk) where k ≤ n and each of the ri are distinct.  These expressions are polynomials in the ri and are in fact symmetric polynomials. They are referred to as elementary symmetric polynomials. Each one equates to one of the coefficients of  p(x).  Therefore showing that a every symmetric polynomial in the ri is a polynomial of the coefficients is equivalent to showing that every symmetric polynomial  can be expressed as a polynomial of the elementary symmetic polynomials.  The exercise showed that r12  + r22 = (r1 + r2 )2 - 2r1r2, which we noww see as a polynomial in the two elementary symmetric polynomials (r1 + r2 ) and r1r2.

Proof Outline
In computing r12  + r22,  a first approach is to use (r1 + r2 )2 to compute the squared terms.  We are left with the symmetric polynomial  2r1r2, which we can view as a simpler problem to solve since it does not contain any square terms.  In this case what we are left with is just twice the elementary symmetric polynomial r1r2.  We can handle the general case in the same way - at each step eliminate the highest order term with any additional terms created being of lower order.  In order to do this  we need a way of specifying what is meant by higher order terms.  We start by writing the terms such that their exponents are in descending order.  Because the polynomials are symmetric we can always select a term listed with r1 first, r2 second and so on.  Consider the term  r110r28r36.  This term will be greater than any term whose leading exponent is less than 10 like r19r28r37r46.  It will also be greater than any term whose leading exponent is 10 but whose second order exponent is either less than 8 or is non-existent like r110r27r36r45 or r110.  It should be clear how to continue.

We will now see how to replace the terms r110r28r36 with smaller order terms. r1, r2 and r3 all have exponents of at least 6, so we start by factoring out  (r1r2r3)6.     r110r28r36 =  (r1r2r3)6 r14r22. In the portion of the term remaing after factoring, r14r22, r1 and r2 both have exponents of at least two, so we can factor out  (r1r2)2, giving us
r110r28r36 =  (r1r2r3)6 (r1r2)2r12.  

How should this be interpreted?  Suppose that the original polynomial was third order so there are only three roots.  We would get rid of the terms like ri10rj8rk6 with the following polynomial in elementary symmetric polynomials:
(r1r2r3)6(r1r2 + r2r3 + r1r3)2(r1 + r2 + r3)2.   The highest order terms from (r1 + r2 + r3)2 are the terms  ri2.  The highest order terms of  (r1r2 + r2r3 + r1r3) are the terms (rirj)2 .  We therefore have succeeded in producing terms like ri10rj8rk6 and introducing only lower order terms. The method can now be applied to the remaining terms.  We will eventually come to a halt because at each stage the remaining terms are of lower order.

General Symmetric Polynomials

In order to motivate the discussion of symmetric polynomials, they were introduced as being symmetric polynomials in the roots of some other polynomial.  The above discussion shows how any symmetric polynomial in x1, x2, ..., xn. can be expressed in terms of the elementary symmetric polynomials sum(xixj...xk).