$|\exp A|$. Moreover, by choosing the similarity transformation so that it diagonalises $A$, we have $A' = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_N)$, and so

\[
|\exp A| = |\exp A'| = \left|\exp[\mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_N)]\right| = \left|\mathrm{diag}(\exp\lambda_1, \exp\lambda_2, \dots, \exp\lambda_N)\right| = \prod_{i=1}^{N} \exp\lambda_i.
\]

Rewriting the final product of exponentials of the eigenvalues as the exponential of the sum of the eigenvalues, we find

\[
|\exp A| = \prod_{i=1}^{N} \exp\lambda_i = \exp\left(\sum_{i=1}^{N} \lambda_i\right) = \exp(\mathrm{Tr}\, A),
\]

which gives the trace formula (8.104).

8.17 Quadratic and Hermitian forms

Let us now introduce the concept of quadratic forms (and their complex analogues, Hermitian forms). A quadratic form $Q$ is a scalar function of a real vector $x$ given by

\[
Q(x) = \langle x | \mathcal{A} x \rangle, \tag{8.105}
\]

for some real linear operator $\mathcal{A}$. In any given basis (coordinate system) we can write (8.105) in matrix form as

\[
Q(x) = x^{\mathrm{T}} A x, \tag{8.106}
\]

where $A$ is a real matrix. In fact, as will be explained below, we need only consider the case where $A$ is symmetric, i.e. $A = A^{\mathrm{T}}$. As an example in a three-dimensional space,

\[
Q = x^{\mathrm{T}} A x = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} 1 & 1 & 3 \\ 1 & 1 & -3 \\ 3 & -3 & -3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = x_1^2 + x_2^2 - 3x_3^2 + 2x_1 x_2 + 6x_1 x_3 - 6x_2 x_3. \tag{8.107}
\]

It is reasonable to ask whether a quadratic form $Q = x^{\mathrm{T}} M x$, where $M$ is any (possibly non-symmetric) real square matrix, is a more general definition. That this is not the case may be seen by expressing $M$ in terms of a symmetric matrix $A = \tfrac{1}{2}(M + M^{\mathrm{T}})$ and an antisymmetric matrix $B = \tfrac{1}{2}(M - M^{\mathrm{T}})$ such that $M = A + B$. We then have

\[
Q = x^{\mathrm{T}} M x = x^{\mathrm{T}} A x + x^{\mathrm{T}} B x. \tag{8.108}
\]

However, $Q$ is a scalar quantity and so

\[
Q = Q^{\mathrm{T}} = (x^{\mathrm{T}} A x)^{\mathrm{T}} + (x^{\mathrm{T}} B x)^{\mathrm{T}} = x^{\mathrm{T}} A^{\mathrm{T}} x + x^{\mathrm{T}} B^{\mathrm{T}} x = x^{\mathrm{T}} A x - x^{\mathrm{T}} B x. \tag{8.109}
\]

Comparing (8.108) and (8.109) shows that $x^{\mathrm{T}} B x = 0$, and hence $x^{\mathrm{T}} M x = x^{\mathrm{T}} A x$, i.e. $Q$ is unchanged by considering only the symmetric part of $M$. Hence, with no loss of generality, we may assume $A = A^{\mathrm{T}}$ in (8.106).

From its definition (8.105), $Q$ is clearly a basis- (i.e. coordinate-) independent quantity. Let us therefore consider a new basis related to the old one by an orthogonal transformation matrix $S$, the components in the two bases of any vector $x$ being related (as in (8.91)) by $x = S x'$ or, equivalently, by $x' = S^{-1} x = S^{\mathrm{T}} x$. We then have

\[
Q = x^{\mathrm{T}} A x = (x')^{\mathrm{T}} S^{\mathrm{T}} A S x' = (x')^{\mathrm{T}} A' x',
\]

where (as expected) the matrix describing the linear operator $\mathcal{A}$ in the new basis is given by $A' = S^{\mathrm{T}} A S$ (since $S^{\mathrm{T}} = S^{-1}$). But, from the last section, if we choose as $S$ the matrix whose columns are the normalised eigenvectors of $A$ then $A' = S^{\mathrm{T}} A S$ is diagonal with the eigenvalues of $A$ as the diagonal elements. (Since $A$ is symmetric, its normalised eigenvectors are orthogonal, or can be made so, and hence $S$ is orthogonal with $S^{-1} = S^{\mathrm{T}}$.) In the new basis

\[
Q = x^{\mathrm{T}} A x = (x')^{\mathrm{T}} \Lambda x' = \lambda_1 {x_1'}^2 + \lambda_2 {x_2'}^2 + \cdots + \lambda_N {x_N'}^2, \tag{8.110}
\]

where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_N)$ and the $\lambda_i$ are the eigenvalues of $A$. It should be noted that $Q$ contains no cross-terms of the form $x_1' x_2'$.

Find an orthogonal transformation that takes the quadratic form (8.107) into the form $\lambda_1 {x_1'}^2 + \lambda_2 {x_2'}^2 + \lambda_3 {x_3'}^2$.

The required transformation matrix $S$ has the normalised eigenvectors of $A$ as its columns. We have already found these in section 8.14, and so we can write immediately

\[
S = \frac{1}{\sqrt{6}} \begin{pmatrix} \sqrt{3} & \sqrt{2} & 1 \\ \sqrt{3} & -\sqrt{2} & -1 \\ 0 & \sqrt{2} & -2 \end{pmatrix},
\]

which is easily verified as being orthogonal. Since the eigenvalues of $A$ are $\lambda = 2$, $3$ and $-6$, the general result already proved shows that the transformation $x = S x'$ will carry (8.107) into the form $2{x_1'}^2 + 3{x_2'}^2 - 6{x_3'}^2$.
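The worked example above lends itself to a direct numerical check. The following sketch is an illustration only, assuming NumPy is available; the variable names are ours, not the book's. It builds the matrix $A$ of (8.107), confirms that the matrix $S$ of normalised eigenvectors is orthogonal and diagonalises $A$ with diagonal entries $2$, $3$, $-6$, and verifies the diagonal form (8.110) for an arbitrary vector:

```python
import numpy as np

# Matrix of the quadratic form (8.107)
A = np.array([[1.0,  1.0,  3.0],
              [1.0,  1.0, -3.0],
              [3.0, -3.0, -3.0]])

# Columns of S are the normalised eigenvectors of A,
# ordered so that the eigenvalues come out as 2, 3, -6
S = np.column_stack([
    np.array([1.0,  1.0,  0.0]) / np.sqrt(2),
    np.array([1.0, -1.0,  1.0]) / np.sqrt(3),
    np.array([1.0, -1.0, -2.0]) / np.sqrt(6),
])

# S is orthogonal (S^T S = I) and S^T A S is diagonal
assert np.allclose(S.T @ S, np.eye(3))
assert np.allclose(S.T @ A @ S, np.diag([2.0, 3.0, -6.0]))

# For any x, Q = x^T A x equals sum_i lambda_i x'_i^2 with x' = S^T x
x = np.array([0.3, -1.2, 0.7])   # arbitrary test vector
xp = S.T @ x                     # components in the eigenvector basis
assert np.allclose(x @ A @ x, 2*xp[0]**2 + 3*xp[1]**2 - 6*xp[2]**2)
```

Running the script raises no assertion errors, in agreement with the transformed form $2{x_1'}^2 + 3{x_2'}^2 - 6{x_3'}^2$ derived above.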
This may be verified most easily by writing out the inverse transformation $x' = S^{-1} x = S^{\mathrm{T}} x$ and substituting. The inverse equations are

\[
\begin{aligned}
x_1' &= (x_1 + x_2)/\sqrt{2}, \\
x_2' &= (x_1 - x_2 + x_3)/\sqrt{3}, \\
x_3' &= (x_1 - x_2 - 2x_3)/\sqrt{6}.
\end{aligned} \tag{8.111}
\]

If these are substituted into the form $Q = 2{x_1'}^2 + 3{x_2'}^2 - 6{x_3'}^2$ then the original expression (8.107) is recovered.

In the definition of $Q$ it was assumed that the components $x_1$, $x_2$, $x_3$ and the matrix $A$ were real. It is clear that in this case the quadratic form $Q \equiv x^{\mathrm{T}} A x$ is real also. Another, rather more general, expression that is also real is the Hermitian form

\[
H(x) \equiv x^{\dagger} A x, \tag{8.112}
\]

where $A$ is Hermitian (i.e. $A^{\dagger} = A$) and the components of $x$ may now be complex. It is straightforward to show that $H$ is real, since

\[
H^* = (H^{\mathrm{T}})^* = x^{\dagger} A^{\dagger} x = x^{\dagger} A x = H.
\]

With suitable generalisation, the properties of quadratic forms apply also to Hermitian forms, but to keep the presentation simple we will restrict our discussion to quadratic forms.

A special case of a quadratic (Hermitian) form is one for which $Q = x^{\mathrm{T}} A x$ is greater than zero for all column matrices $x$. By choosing as the basis the eigenvectors of $A$ we have $Q$ in the form

\[
Q = \lambda_1 x_1^2 + \lambda_2 x_2^2 + \lambda_3 x_3^2.
\]

The requirement that $Q > 0$ for all $x$ means that all the eigenvalues $\lambda_i$ of $A$ must be positive. A symmetric (Hermitian) matrix $A$ with this property is called positive definite. If, instead, $Q \geq 0$ for all $x$ then it is possible that some of the eigenvalues are zero, and $A$ is called positive semi-definite.

8.17.1 The stationary properties of the eigenvectors

Consider a quadratic form, such as $Q(x) = \langle x | \mathcal{A} x \rangle$, equation (8.105), in a fixed basis. As the vector $x$ is varied, through changes in its three components $x_1$, $x_2$ and $x_3$, the value of the quantity $Q$ also varies. Because of the homogeneous form of $Q$ we may restrict any investigation of these variations to vectors of unit length (since multiplying any vector $x$ by any scalar $k$ simply multiplies the value of $Q$ by a factor $k^2$).

Of particular interest are any vectors $x$ that make the value of the quadratic form a maximum or minimum. A necessary, but not sufficient, condition for this is that $Q$ is stationary with respect to small variations $\Delta x$ in $x$, whilst $\langle x | x \rangle$ is maintained at a constant value (unity). In the chosen basis the quadratic form is given by $Q = x^{\mathrm{T}} A x$ and, using Lagrange undetermined multipliers to incorporate the variational constraints, we are led to seek solutions of

\[
\Delta\left[x^{\mathrm{T}} A x - \lambda (x^{\mathrm{T}} x - 1)\right] = 0. \tag{8.113}
\]

This may be used directly, together with the fact that $(\Delta x^{\mathrm{T}}) A x = x^{\mathrm{T}} A\, \Delta x$, since $A$ is symmetric, to obtain

\[
A x = \lambda x \tag{8.114}
\]

as the necessary condition that $x$ must satisfy. If (8.114) is satisfied for some eigenvector $x$ then the value of $Q(x)$ is given by

\[
Q = x^{\mathrm{T}} A x = x^{\mathrm{T}} \lambda x = \lambda. \tag{8.115}
\]

However, if $x$ and $y$ are eigenvectors corresponding to different eigenvalues then they are (or can be chosen to be) orthogonal. Consequently the expression $y^{\mathrm{T}} A x$ is necessarily zero, since

\[
y^{\mathrm{T}} A x = y^{\mathrm{T}} \lambda x = \lambda\, y^{\mathrm{T}} x = 0. \tag{8.116}
\]

Summarising, those column matrices $x$ of unit magnitude that make the quadratic form $Q$ stationary are eigenvectors of the matrix $A$, and the stationary value of $Q$ is then equal to the corresponding eigenvalue. It is straightforward to see from the proof of (8.114) that, conversely, any eigenvector of $A$ makes $Q$ stationary.

Instead of maximising or minimising $Q = x^{\mathrm{T}} A x$ subject to the constraint $x^{\mathrm{T}} x = 1$, an equivalent procedure is to extremise the function

\[
\lambda(x) = \frac{x^{\mathrm{T}} A x}{x^{\mathrm{T}} x}.
\]
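Both of the facts established above, that a Hermitian form is always real and that the signs of the eigenvalues decide whether a symmetric matrix is positive definite or positive semi-definite, can be illustrated numerically. The sketch below again assumes NumPy; the example matrices, the helper name classify and its tolerance are our own invented choices, not anything from the text:

```python
import numpy as np

# A Hermitian matrix (A† = A) and a complex vector x
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.allclose(A, A.conj().T)     # A is indeed Hermitian

x = np.array([1 + 2j, -1j])
H = x.conj() @ A @ x                  # Hermitian form (8.112)
assert np.allclose(H.imag, 0.0)       # H is real, as shown in the text

# Classify a real symmetric matrix by the signs of its eigenvalues
def classify(M, tol=1e-12):
    lam = np.linalg.eigvalsh(M)       # eigenvalues of a symmetric matrix
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam > -tol):
        return "positive semi-definite"
    return "neither"

print(classify(np.array([[2.0, 0.0], [0.0, 1.0]])))  # eigenvalues 2, 1
print(classify(np.array([[1.0, 1.0], [1.0, 1.0]])))  # eigenvalues 2, 0
print(classify(np.array([[1.0,  1.0,  3.0],          # the matrix of (8.107):
                         [1.0,  1.0, -3.0],          # eigenvalues 2, 3, -6,
                         [3.0, -3.0, -3.0]])))       # so neither
```

The small tolerance guards against eigenvalues that are zero in exact arithmetic emerging as tiny negative numbers in floating point.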
Show that if $\lambda(x)$ is stationary then $x$ is an eigenvector of $A$ and $\lambda(x)$ is equal to the corresponding eigenvalue.

We require $\Delta\lambda(x) = 0$ with respect to small variations in $x$. Now

\[
\begin{aligned}
\Delta\lambda &= \frac{1}{(x^{\mathrm{T}} x)^2} \left[ (x^{\mathrm{T}} x)\left(\Delta x^{\mathrm{T}} A x + x^{\mathrm{T}} A\, \Delta x\right) - x^{\mathrm{T}} A x \left(\Delta x^{\mathrm{T}} x + x^{\mathrm{T}} \Delta x\right) \right] \\
&= \frac{2\, \Delta x^{\mathrm{T}} A x}{x^{\mathrm{T}} x} - 2\, \frac{x^{\mathrm{T}} A x}{x^{\mathrm{T}} x}\, \frac{\Delta x^{\mathrm{T}} x}{x^{\mathrm{T}} x},
\end{aligned}
\]

since $x^{\mathrm{T}} A\, \Delta x = (\Delta x^{\mathrm{T}}) A x$ and $x^{\mathrm{T}} \Delta x = (\Delta x^{\mathrm{T}}) x$. Thus

\[
\Delta\lambda = \frac{2}{x^{\mathrm{T}} x}\, \Delta x^{\mathrm{T}} \left[ A x - \lambda(x)\, x \right].
\]

Hence, if $\Delta\lambda = 0$ then $A x = \lambda(x)\, x$, i.e. $x$ is an eigenvector of $A$ with eigenvalue $\lambda(x)$.

Thus the eigenvalues of a symmetric matrix $A$ are the values of the function

\[
\lambda(x) = \frac{x^{\mathrm{T}} A x}{x^{\mathrm{T}} x}
\]

at its stationary points. The eigenvectors of $A$ lie along those directions in space for which the quadratic form $Q = x^{\mathrm{T}} A x$ has stationary values, given a fixed magnitude for the vector $x$. Similar results hold for Hermitian matrices.
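The stationary property proved above can also be observed numerically: at an eigenvector the quotient $\lambda(x)$ equals the corresponding eigenvalue, and a perturbation of size $\epsilon$ changes $\lambda$ only at order $\epsilon^2$. A minimal sketch assuming NumPy; the helper name rayleigh, the perturbation direction and the step sizes are arbitrary choices of ours:

```python
import numpy as np

def rayleigh(A, x):
    """The function lambda(x) = x^T A x / x^T x from the text."""
    return (x @ A @ x) / (x @ x)

# The matrix of the quadratic form (8.107)
A = np.array([[1.0,  1.0,  3.0],
              [1.0,  1.0, -3.0],
              [3.0, -3.0, -3.0]])

# At an eigenvector, lambda(x) equals the corresponding eigenvalue ...
v = np.array([1.0, -1.0, 1.0]) / np.sqrt(3)   # eigenvector with eigenvalue 3
assert np.allclose(rayleigh(A, v), 3.0)

# ... and lambda is stationary there: an O(eps) change in x
# produces only an O(eps^2) change in lambda(x).
rng = np.random.default_rng(0)
d = rng.standard_normal(3)                    # random perturbation direction
for eps in (1e-2, 1e-3, 1e-4):
    drift = abs(rayleigh(A, v + eps * d) - 3.0)
    print(f"eps = {eps:.0e}   |lambda - 3| = {drift:.2e}")  # scales like eps^2
```

Shrinking $\epsilon$ by a factor of ten shrinks the printed drift by roughly a factor of one hundred, which is the numerical signature of a stationary point.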