Quadratic and Hermitian forms
MATRICES AND VECTOR SPACES
| exp A|. Moreover, by choosing the similarity transformation so that it diagonalises A, we
have A′ = diag(λ1 , λ2 , . . . , λN ), and so

| exp A| = | exp A′ | = | exp[diag(λ1 , λ2 , . . . , λN )]| = |diag(exp λ1 , exp λ2 , . . . , exp λN )| = ∏_{i=1}^{N} exp λi .
Rewriting the final product of exponentials of the eigenvalues as the exponential of the
sum of the eigenvalues, we find

| exp A| = ∏_{i=1}^{N} exp λi = exp( ∑_{i=1}^{N} λi ) = exp(Tr A),
which gives the trace formula (8.104).

8.17 Quadratic and Hermitian forms
Let us now introduce the concept of quadratic forms (and their complex analogues, Hermitian forms). A quadratic form Q is a scalar function of a real vector
x given by
Q(x) = ⟨x|A x⟩,                    (8.105)
for some real linear operator A . In any given basis (coordinate system) we can
write (8.105) in matrix form as
Q(x) = xT Ax,
(8.106)
where A is a real matrix. In fact, as will be explained below, we need only consider
the case where A is symmetric, i.e. A = AT . As an example in a three-dimensional
space,

Q = xT Ax = ( x1  x2  x3 ) (  1   1   3 ) ( x1 )
                           (  1   1  −3 ) ( x2 )
                           (  3  −3  −3 ) ( x3 )

  = x1² + x2² − 3x3² + 2x1 x2 + 6x1 x3 − 6x2 x3.          (8.107)
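The expansion in (8.107) is easily checked numerically; the following is a short sketch using NumPy (not part of the text itself):

```python
import numpy as np

# A numerical check of (8.107): x^T A x agrees with the expanded
# polynomial for arbitrary x.
A = np.array([[1.0,  1.0,  3.0],
              [1.0,  1.0, -3.0],
              [3.0, -3.0, -3.0]])

def Q(x):
    return x @ A @ x

def Q_expanded(x):
    x1, x2, x3 = x
    return x1**2 + x2**2 - 3*x3**2 + 2*x1*x2 + 6*x1*x3 - 6*x2*x3

rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.normal(size=3)
    assert np.isclose(Q(x), Q_expanded(x))
```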
It is reasonable to ask whether a quadratic form Q = xT Mx, where M is any
(possibly non-symmetric) real square matrix, is a more general definition. That
this is not the case may be seen by expressing M in terms of a symmetric matrix
A = ½(M + MT ) and an antisymmetric matrix B = ½(M − MT ) such that M = A + B.
We then have
Q = xT Mx = xT Ax + xT Bx.
(8.108)
However, Q is a scalar quantity and so
Q = QT = (xT Ax)T + (xT Bx)T = xT AT x + xT BT x = xT Ax − xT Bx.
(8.109)
Comparing (8.108) and (8.109) shows that xT Bx = 0, and hence xT Mx = xT Ax,
i.e. Q is unchanged by considering only the symmetric part of M. Hence, with no
loss of generality, we may assume A = AT in (8.106).
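The argument of (8.108)–(8.109) can be illustrated numerically; here is a brief NumPy sketch:

```python
import numpy as np

# For any real square matrix M, x^T M x depends only on the symmetric part
# A = (M + M^T)/2, because the antisymmetric part B = (M - M^T)/2
# satisfies x^T B x = 0.
rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3))      # a generic, non-symmetric matrix
A = 0.5 * (M + M.T)              # symmetric part
B = 0.5 * (M - M.T)              # antisymmetric part
x = rng.normal(size=3)

assert np.isclose(x @ B @ x, 0.0)        # antisymmetric part contributes nothing
assert np.isclose(x @ M @ x, x @ A @ x)  # Q is unchanged by symmetrising M
```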
From its definition (8.105), Q is clearly a basis- (i.e. coordinate-) independent
quantity. Let us therefore consider a new basis related to the old one by an
orthogonal transformation matrix S, the components in the two bases of any
vector x being related (as in (8.91)) by x = Sx′ or, equivalently, by x′ = S−1 x =
ST x. We then have

Q = xT Ax = (x′ )T ST ASx′ = (x′ )T A′ x′ ,
where (as expected) the matrix describing the linear operator A in the new
basis is given by A′ = ST AS (since ST = S−1 ). But, from the last section, if we
choose as S the matrix whose columns are the normalised eigenvectors of A then
A′ = ST AS is diagonal with the eigenvalues of A as the diagonal elements. (Since
A is symmetric, its normalised eigenvectors are orthogonal, or can be made so,
and hence S is orthogonal with S−1 = ST .)
In the new basis
Q = xT Ax = (x′ )T Λx′ = λ1 x1′² + λ2 x2′² + · · · + λN xN′²,          (8.110)
where Λ = diag(λ1 , λ2 , . . . , λN ) and the λi are the eigenvalues of A. It should be
noted that Q contains no cross-terms of the form x1′ x2′ .
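This diagonalisation can be verified numerically for the matrix of (8.107); the following NumPy sketch uses `np.linalg.eigh`, which returns orthonormal eigenvectors of a symmetric matrix:

```python
import numpy as np

# With S built from the normalised eigenvectors of A, S^T A S is diagonal,
# so Q has no cross-terms in the new basis (cf. (8.110)).
A = np.array([[1.0,  1.0,  3.0],
              [1.0,  1.0, -3.0],
              [3.0, -3.0, -3.0]])
eigvals, S = np.linalg.eigh(A)   # columns of S: orthonormal eigenvectors
Lambda = S.T @ A @ S

assert np.allclose(Lambda, np.diag(eigvals))
assert np.allclose(np.sort(eigvals), [-6.0, 2.0, 3.0])
```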
Find an orthogonal transformation that takes the quadratic form (8.107) into the form

λ1 x1′² + λ2 x2′² + λ3 x3′².
The required transformation matrix S has the normalised eigenvectors of A as its columns.
We have already found these in section 8.14, and so we can write immediately
            ( √3   √2    1 )
S = (1/√6)  ( √3  −√2   −1 ) ,
            (  0   √2   −2 )
which is easily verified as being orthogonal. Since the eigenvalues of A are λ = 2, 3 and
−6, the general result already proved shows that the transformation x = Sx′ will carry
(8.107) into the form 2x1′² + 3x2′² − 6x3′². This may be verified most easily by writing out
the inverse transformation x′ = S−1 x = ST x and substituting. The inverse equations are

x1′ = (x1 + x2 )/√2,
x2′ = (x1 − x2 + x3 )/√3,                    (8.111)
x3′ = (x1 − x2 − 2x3 )/√6.
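As a numerical check of the worked example, one can confirm in NumPy that this S is orthogonal and diagonalises A with the stated eigenvalues:

```python
import numpy as np

# The S of the worked example is orthogonal and S^T A S = diag(2, 3, -6).
A = np.array([[1.0,  1.0,  3.0],
              [1.0,  1.0, -3.0],
              [3.0, -3.0, -3.0]])
s2, s3 = np.sqrt(2.0), np.sqrt(3.0)
S = np.array([[s3,  s2,  1.0],
              [s3, -s2, -1.0],
              [0.0, s2, -2.0]]) / np.sqrt(6.0)

assert np.allclose(S.T @ S, np.eye(3))                      # S is orthogonal
assert np.allclose(S.T @ A @ S, np.diag([2.0, 3.0, -6.0]))  # eigenvalues on diagonal
```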
If these are substituted into the form Q = 2x1′² + 3x2′² − 6x3′² then the original expression
(8.107) is recovered.

In the definition of Q it was assumed that the components x1 , x2 , x3 and the
matrix A were real. It is clear that in this case the quadratic form Q ≡ xT Ax is real
also. Another, rather more general, expression that is also real is the Hermitian
form
H(x) ≡ x† Ax,
(8.112)
where A is Hermitian (i.e. A† = A) and the components of x may now be complex.
It is straightforward to show that H is real, since
H ∗ = (H T )∗ = x† A† x = x† Ax = H.
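The reality of H can also be seen numerically; a small sketch with an arbitrary Hermitian matrix (chosen here for illustration):

```python
import numpy as np

# A Hermitian form H = x† A x is real even for complex x, provided A† = A.
A = np.array([[2.0,     1 - 1j],
              [1 + 1j,  3.0   ]])
assert np.allclose(A, A.conj().T)        # A is Hermitian

rng = np.random.default_rng(2)
x = rng.normal(size=2) + 1j * rng.normal(size=2)
H = x.conj() @ A @ x
assert np.isclose(H.imag, 0.0)           # H is real
```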
With suitable generalisation, the properties of quadratic forms apply also to Hermitian forms, but to keep the presentation simple we will restrict our discussion
to quadratic forms.
A special case of a quadratic (Hermitian) form is one for which Q = xT Ax
is greater than zero for all column matrices x. By choosing as the basis the
eigenvectors of A we have Q in the form
Q = λ1 x1² + λ2 x2² + λ3 x3².
The requirement that Q > 0 for all x means that all the eigenvalues λi of A must
be positive. A symmetric (Hermitian) matrix A with this property is called positive
definite. If, instead, Q ≥ 0 for all x then it is possible that some of the eigenvalues
are zero, and A is called positive semi-definite.
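This eigenvalue criterion translates directly into a practical test; a NumPy sketch (the helper name below is hypothetical):

```python
import numpy as np

# All eigenvalues positive <=> positive definite.  np.linalg.eigvalsh
# returns the (real) eigenvalues of a symmetric or Hermitian matrix.
def is_positive_definite(A, tol=1e-12):
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

P = np.array([[2.0, -1.0], [-1.0, 2.0]])   # eigenvalues 1 and 3
N = np.array([[1.0,  2.0], [ 2.0, 1.0]])   # eigenvalues -1 and 3
assert is_positive_definite(P)
assert not is_positive_definite(N)
```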
8.17.1 The stationary properties of the eigenvectors
Consider a quadratic form, such as Q(x) = ⟨x|A x⟩, equation (8.105), in a fixed
basis. As the vector x is varied, through changes in its three components x1 , x2
and x3 , the value of the quantity Q also varies. Because of the homogeneous
form of Q we may restrict any investigation of these variations to vectors of unit
length (since multiplying any vector x by any scalar k simply multiplies the value
of Q by a factor k 2 ).
Of particular interest are any vectors x that make the value of the quadratic
form a maximum or minimum. A necessary, but not sufficient, condition for this
is that Q is stationary with respect to small variations ∆x in x, whilst x|x is
maintained at a constant value (unity).
In the chosen basis the quadratic form is given by Q = xT Ax and, using
Lagrange undetermined multipliers to incorporate the variational constraints, we
are led to seek solutions of
∆[xT Ax − λ(xT x − 1)] = 0.
(8.113)
This may be used directly, together with the fact that (∆xT )Ax = xT A ∆x, since A
is symmetric, to obtain
Ax = λx                    (8.114)
as the necessary condition that x must satisfy. If (8.114) is satisfied for some
eigenvector x then the value of Q(x) is given by
Q = xT Ax = xT λx = λ.
(8.115)
However, if x and y are eigenvectors corresponding to different eigenvalues then
they are (or can be chosen to be) orthogonal. Consequently the expression yT Ax
is necessarily zero, since
yT Ax = yT λx = λyT x = 0.
(8.116)
Summarising, those column matrices x of unit magnitude that make the
quadratic form Q stationary are eigenvectors of the matrix A, and the stationary
value of Q is then equal to the corresponding eigenvalue. It is straightforward
to see from the proof of (8.114) that, conversely, any eigenvector of A makes Q
stationary.
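The summary above can be illustrated numerically for the example matrix: on the unit sphere, Q is bounded by the extreme eigenvalues and attains each eigenvalue at the corresponding unit eigenvector. A NumPy sketch:

```python
import numpy as np

# On the unit sphere, Q(x) = x^T A x lies between the smallest and largest
# eigenvalues of A, and every unit eigenvector gives Q equal to its own
# eigenvalue -- the stationary values found above.
A = np.array([[1.0,  1.0,  3.0],
              [1.0,  1.0, -3.0],
              [3.0, -3.0, -3.0]])
eigvals, eigvecs = np.linalg.eigh(A)

for lam, v in zip(eigvals, eigvecs.T):   # each v has unit length
    assert np.isclose(v @ A @ v, lam)

rng = np.random.default_rng(3)
for _ in range(200):
    x = rng.normal(size=3)
    x /= np.linalg.norm(x)
    assert eigvals[0] - 1e-9 <= x @ A @ x <= eigvals[-1] + 1e-9
```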
Instead of maximising or minimising Q = xT Ax subject to the constraint
xT x = 1, an equivalent procedure is to extremise the function

λ(x) = (xT Ax)/(xT x).
Show that if λ(x) is stationary then x is an eigenvector of A and λ(x) is equal to the
corresponding eigenvalue.
We require ∆λ(x) = 0 with respect to small variations in x. Now

∆λ = (1/(xT x)²) [ (xT x)(∆xT Ax + xT A ∆x) − (xT Ax)(∆xT x + xT ∆x) ]
   = 2∆xT Ax/(xT x) − 2(xT Ax)(∆xT x)/(xT x)²,

since xT A ∆x = (∆xT )Ax and xT ∆x = (∆xT )x. Thus

∆λ = [2/(xT x)] ∆xT [Ax − λ(x)x].

Hence, if ∆λ = 0 then Ax = λ(x)x, i.e. x is an eigenvector of A with eigenvalue λ(x).

Thus the eigenvalues of a symmetric matrix A are the values of the function

λ(x) = (xT Ax)/(xT x)
at its stationary points. The eigenvectors of A lie along those directions in space
for which the quadratic form Q = xT Ax has stationary values, given a fixed
magnitude for the vector x. Similar results hold for Hermitian matrices.
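These properties of the Rayleigh quotient λ(x) can be checked directly; a final NumPy sketch (the function names are illustrative only):

```python
import numpy as np

# λ(x) = x^T A x / x^T x is unchanged by rescaling x, and its gradient,
# (2 / x^T x)[A x - λ(x) x] from the derivation above, vanishes exactly
# at eigenvectors, where λ(x) equals the eigenvalue.
def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)

def rayleigh_grad(A, x):
    return (2.0 / (x @ x)) * (A @ x - rayleigh(A, x) * x)

A = np.array([[1.0,  1.0,  3.0],
              [1.0,  1.0, -3.0],
              [3.0, -3.0, -3.0]])
eigvals, eigvecs = np.linalg.eigh(A)

for lam, v in zip(eigvals, eigvecs.T):
    assert np.isclose(rayleigh(A, v), lam)       # stationary value = eigenvalue
    assert np.allclose(rayleigh_grad(A, v), 0.0) # gradient vanishes there

assert np.isclose(rayleigh(A, 5.0 * eigvecs[:, 0]), eigvals[0])  # scale-invariance
```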