14 Matrix Equalities and Inequalities

Michael Tsatsomeros
Washington State University

14.1 Eigenvalue Equalities and Inequalities
14.2 Spectrum Localization
14.3 Inequalities for the Singular Values and the Eigenvalues
14.4 Basic Determinantal Relations
14.5 Rank and Nullity Equalities and Inequalities
14.6 Useful Identities for the Inverse
References
In this chapter, we have collected classical equalities and inequalities regarding the eigenvalues, the singular
values, the determinant, and the dimensions of the fundamental subspaces of a matrix. Also included is a
section on identities for matrix inverses. The majority of these results can be found in comprehensive books
on linear algebra and matrix theory, although some are of a more specialized nature. The reader is encouraged to
consult, e.g., [HJ85], [HJ91], [MM92], or [Mey00] for details, proofs, and further bibliography.
14.1 Eigenvalue Equalities and Inequalities
The majority of the facts in this section concern general matrices; however, some classical and frequently
used results on eigenvalues of Hermitian and positive definite matrices are also included. For the latter,
see also Chapter 8 and [HJ85, Chap. 4]. Many of the definitions and some of the facts in this section are
also given in Section 4.3.
Facts:
1. [HJ85, Chap. 1] Let A ∈ F n×n , where F = C or any algebraically closed field. Let p A (x) =
det(x I − A) be the characteristic polynomial of A, and λ1 , λ2 , . . . , λn be the eigenvalues of A. Denote
by Sk (λ1 , . . . , λn )(k = 1, 2, . . . , n) the kth elementary symmetric function of the eigenvalues (here
abbreviated Sk (λ)), and by Sk (A) the sum of all k × k principal minors of A. Then
   • The characteristic polynomial satisfies
     p_A(x) = (x − λ1)(x − λ2) · · · (x − λn)
            = x^n − S1(λ)x^{n−1} + S2(λ)x^{n−2} + · · · + (−1)^{n−1}S_{n−1}(λ)x + (−1)^n Sn(λ)
            = x^n − S1(A)x^{n−1} + S2(A)x^{n−2} + · · · + (−1)^{n−1}S_{n−1}(A)x + (−1)^n Sn(A).
   • Sk(λ) = Sk(λ1, . . . , λn) = Sk(A)   (k = 1, 2, . . . , n).
   • tr A = S1(A) = Σ_{i=1}^n aii = Σ_{i=1}^n λi   and   det A = Sn(A) = Π_{i=1}^n λi .
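The identities in Fact 1 can be spot-checked numerically. The sketch below is only an illustration and assumes Python with NumPy; the random test matrix, the helper functions, and the tolerances are choices made here, not part of the original text.

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)
    n = 4
    A = rng.standard_normal((n, n))
    eigs = np.linalg.eigvals(A)

    def principal_minor_sum(A, k):
        # S_k(A): sum of all k x k principal minors of A
        m = A.shape[0]
        return sum(np.linalg.det(A[np.ix_(idx, idx)])
                   for idx in itertools.combinations(range(m), k))

    def elementary_symmetric(vals, k):
        # S_k(lambda): k-th elementary symmetric function of the eigenvalues
        return sum(np.prod(c) for c in itertools.combinations(vals, k))

    for k in range(1, n + 1):
        assert np.isclose(elementary_symmetric(eigs, k), principal_minor_sum(A, k))

    # tr A = S_1(A) = sum of the eigenvalues; det A = S_n(A) = product of the eigenvalues
    assert np.isclose(np.trace(A), eigs.sum())
    assert np.isclose(np.linalg.det(A), np.prod(eigs))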
2. [HJ85, (1.2.13)] Let A(i ) be obtained from A ∈ Cn×n by deleting row and column i . Then
      (d/dx) p_A(x) = Σ_{i=1}^n p_{A(i)}(x).
Facts 3 to 9 are collected, together with historical commentary, proofs, and further references, in
[MM92, Chap. III].
3. (Hirsch and Bendixson) Let A = [aij ] ∈ Cn×n and λ be an eigenvalue of A. Denote B = [bij ] =
(A + A∗ )/2 and C = [c ij ] = (A − A∗ )/(2i ). Then the following inequalities hold:
      |λ| ≤ n max_{i,j} |aij| ,   |Re λ| ≤ n max_{i,j} |bij| ,   |Im λ| ≤ n max_{i,j} |cij| .
   Moreover, if A + AT ∈ Rn×n , then
      |Im λ| ≤ max_{i,j} |cij| √(n(n − 1)/2) .
4. (Pick’s inequality) Let A = [aij ] ∈ Rn×n and λ be an eigenvalue of A. Denote C = [c ij ] =
(A − AT )/2. Then
      |Im λ| ≤ max_{i,j} |cij| cot(π/(2n)) .
5. Let A = [aij ] ∈ Cn×n and λ be an eigenvalue of A. Denote B = [bij ] = (A + A∗ )/2 and
C = [c ij ] = (A − A∗ )/(2i ). Then the following inequalities hold:
min{µ : µ ∈ σ (B)} ≤ Reλ ≤ max{µ : µ ∈ σ (B)},
min{ν : ν ∈ σ (C )} ≤ Imλ ≤ max{ν : ν ∈ σ (C )}.
6. (Schur’s inequality) Let A = [aij ] ∈ Cn×n have eigenvalues λ j ( j = 1, 2, . . . , n). Then
      Σ_{j=1}^n |λj|^2 ≤ Σ_{i,j=1}^n |aij|^2 ,
with equality holding if and only if A is a normal matrix (i.e., A∗ A = AA∗ ). (See Section 7.2 for
more information on normal matrices.)
7. (Browne’s Theorem) Let A = [aij ] ∈ Cn×n and λ j ( j = 1, 2, . . . , n) be the eigenvalues of A ordered
so that |λ1 | ≥ |λ2 | ≥ · · · ≥ |λn |. Let also σ1 ≥ σ2 ≥ · · · ≥ σn be the singular values of A, which
are real and nonnegative. (See Section 5.6 for the definition.) Then
      σn ≤ |λj| ≤ σ1   ( j = 1, 2, . . . , n).
   In fact, the following more general statement holds:
      Π_{i=1}^k σ_{n−i+1} ≤ Π_{j=1}^k |λ_{t_j}| ≤ Π_{i=1}^k σi ,
   for every k ∈ {1, 2, . . . , n} and every k-tuple (t1 , t2 , . . . , tk ) of strictly increasing elements chosen
   from {1, 2, . . . , n}.
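A quick numerical spot-check of Fact 7 (a sketch assuming NumPy; the random matrix, the seed, and the particular k-tuple are arbitrary choices made here for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 5
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

    mods = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]   # |lambda_1| >= ... >= |lambda_n|
    sigma = np.linalg.svd(A, compute_uv=False)           # sigma_1 >= ... >= sigma_n

    # sigma_n <= |lambda_j| <= sigma_1 for every j
    assert np.all(sigma[-1] <= mods + 1e-12) and np.all(mods <= sigma[0] + 1e-12)

    # product version for a strictly increasing k-tuple (t_1, ..., t_k)
    k = 3
    t = np.sort(rng.choice(n, size=k, replace=False))
    lower = np.prod(sigma[::-1][:k])     # product of the k smallest singular values
    upper = np.prod(sigma[:k])           # product of the k largest singular values
    assert lower <= np.prod(mods[t]) + 1e-12 <= upper + 1e-12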
8. Let A ∈ Cn×n and Ri , C i (i = 1, 2, . . . , n) denote the sums of the absolute values of the entries of
A in row i and column i , respectively. Also denote
      R = max_i {Ri}   and   C = max_i {Ci} .
   Let λ be an eigenvalue of A. Then the following inequalities hold:
      |λ| ≤ max_i (Ri + Ci)/2 ≤ (R + C)/2 ,
      |λ| ≤ max_i √(Ri Ci) ≤ √(RC) ,
      |λ| ≤ min{R, C}.
9. (Schneider’s Theorem) Let A = [aij ] ∈ Cn×n and λ j ( j = 1, 2, . . . , n) be the eigenvalues of A
ordered so that |λ1 | ≥ |λ2 | ≥ · · · ≥ |λn |. Let x = [xi ] be any vector in Rn with positive entries and
define the quantities
      ri = (1/xi) Σ_{j=1}^n |aij| xj   (i = 1, 2, . . . , n).
   Then
      Π_{j=1}^k |λj| ≤ Π_{j=1}^k r_{i_j}   (k = 1, 2, . . . , n)
   for all n-tuples (i1 , i2 , . . . , in) of elements from {1, 2, . . . , n} such that r_{i_1} ≥ r_{i_2} ≥ · · · ≥ r_{i_n} .
10. [HJ85, Theorem 8.1.18] For A = [aij ] ∈ Cn×n , let its entrywise absolute value be denoted by
|A| = [|aij |]. Let B ∈ Cn×n and assume that |A| ≤ B (entrywise). Then
ρ(A) ≤ ρ(|A|) ≤ ρ(B).
11. [HJ85, Chap. 5, Sec. 6] Let A ∈ Cn×n and ‖·‖ denote any matrix norm on Cn×n . (See Chapter 37.)
    Then
      ρ(A) ≤ ‖A‖   and   lim_{k→∞} ‖A^k‖^{1/k} = ρ(A).
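Both statements of Fact 11 are easy to check numerically; the following sketch (NumPy assumed; the random matrix and the chosen norms are purely illustrative) also prints the convergence of ‖A^k‖^{1/k} to ρ(A):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((4, 4))
    rho = max(abs(np.linalg.eigvals(A)))

    # rho(A) <= ||A|| for any matrix norm, e.g., the spectral and Frobenius norms
    assert rho <= np.linalg.norm(A, 2) + 1e-12
    assert rho <= np.linalg.norm(A, 'fro') + 1e-12

    # Gelfand's formula: ||A^k||^(1/k) -> rho(A) as k -> infinity
    for k in (1, 10, 50, 200):
        print(k, np.linalg.norm(np.linalg.matrix_power(A, k), 2) ** (1.0 / k))
    print("rho(A) =", rho)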
12. [HJ91, Corollary 1.5.5] Let A = [aij ] ∈ Cn×n . The numerical range of A ∈ Cn×n is W(A) =
{v ∗ Av ∈ C : v ∈ Cn with v ∗ v = 1} and the numerical radius of A ∈ Cn×n is r (A) = max{|z| : z ∈
W(A)}. (See Chapter 18 for more information about the numerical range and numerical radius.)
Then the following inequalities hold:
      r(A^m) ≤ [r(A)]^m   (m = 1, 2, . . . ),
      ρ(A) ≤ r(A) ≤ (‖A‖_1 + ‖A‖_∞)/2 ,
      ‖A‖_2 / 2 ≤ r(A) ≤ ‖A‖_2 ,
      r(A) ≤ r(|A|) = r((|A| + |A|^T)/2)   (where |A| = [|aij|]).
Moreover, the following statements are equivalent:
   (a) r(A) = ‖A‖_2 .
   (b) ρ(A) = ‖A‖_2 .
   (c) ‖A^n‖_2 = ‖A‖_2^n .
   (d) ‖A^k‖_2 = ‖A‖_2^k   (k = 1, 2, . . . ).
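The bounds in Fact 12 can be checked numerically. The sketch below approximates the numerical radius through the identity r(A) = max_θ λ_max((e^{iθ}A + e^{−iθ}A*)/2), evaluated on a finite grid of angles, so the computed value slightly underestimates r(A); NumPy, the random matrix, and the tolerances are assumptions made here for illustration.

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

    def numerical_radius(A, grid=4000):
        # r(A) = max over theta of the largest eigenvalue of the Hermitian part of e^{i*theta} A
        best = -np.inf
        for theta in np.linspace(0.0, 2 * np.pi, grid, endpoint=False):
            M = np.exp(1j * theta) * A
            best = max(best, np.linalg.eigvalsh((M + M.conj().T) / 2)[-1])
        return best

    r = numerical_radius(A)
    rho = max(abs(np.linalg.eigvals(A)))
    spec = np.linalg.norm(A, 2)
    bound = (np.linalg.norm(A, 1) + np.linalg.norm(A, np.inf)) / 2

    assert r <= bound + 1e-9 and r <= spec + 1e-9      # upper bounds on r(A)
    assert rho <= r + 1e-3 and spec / 2 <= r + 1e-3    # lower bounds (slack for the finite grid)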
Facts 13 to 15 below, along with proofs, can be found in [HJ85, Chap. 4].
13. (Rayleigh–Ritz) Let A ∈ Cn×n be Hermitian (i.e., A = A∗ ) with eigenvalues λ1 ≥ λ2 ≥ · · · ≥ λn .
Then
   (a) λn x*x ≤ x*Ax ≤ λ1 x*x for all x ∈ Cn .
   (b) λ1 = max_{x≠0} (x*Ax)/(x*x) = max_{x*x=1} x*Ax .
   (c) λn = min_{x≠0} (x*Ax)/(x*x) = min_{x*x=1} x*Ax .
14. (Courant–Fischer) Let A ∈ Cn×n be Hermitian with eigenvalues λ1 ≥ λ2 ≥ · · · ≥ λn . Let
k ∈ {1, 2, . . . , n}. Then
      λk = min_{w1,...,w_{k−1} ∈ Cn}   max_{x≠0, x ∈ Cn, x ⊥ w1,...,w_{k−1}}   (x*Ax)/(x*x)
         = max_{w1,...,w_{n−k} ∈ Cn}   min_{x≠0, x ∈ Cn, x ⊥ w1,...,w_{n−k}}   (x*Ax)/(x*x) .
15. (Weyl) Let A, B ∈ Cn×n be Hermitian. Consider the eigenvalues of A, B, and A + B, denoted by λi (A), λi (B), λi (A + B), respectively, arranged in decreasing order. Then the following
hold:
(a) For each k ∈ {1, 2, . . . , n},
      λk(A) + λn(B) ≤ λk(A + B) ≤ λk(A) + λ1(B).
(b) For every pair j, k ∈ {1, 2, . . . , n} such that j + k ≥ n + 1,
λ j +k−n (A + B) ≥ λ j (A) + λk (B).
(c) For every pair j, k ∈ {1, 2, . . . , n} such that j + k ≤ n + 1,
λ j (A) + λk (B) ≥ λ j +k−1 (A + B).
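The inequalities of Facts 13 to 15 are straightforward to verify numerically for random Hermitian matrices; a sketch (NumPy assumed; the seed, size, and tolerances are arbitrary):

    import numpy as np

    rng = np.random.default_rng(4)
    n = 6

    def random_hermitian(n):
        M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
        return (M + M.conj().T) / 2

    A, B = random_hermitian(n), random_hermitian(n)

    # np.linalg.eigvalsh returns ascending order; flip to get lambda_1 >= ... >= lambda_n
    lam = lambda M: np.linalg.eigvalsh(M)[::-1]
    a, b, ab = lam(A), lam(B), lam(A + B)

    # Fact 13(a): Rayleigh quotients of A lie between lambda_n and lambda_1
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    x /= np.linalg.norm(x)
    q = (x.conj() @ A @ x).real
    assert a[-1] - 1e-10 <= q <= a[0] + 1e-10

    # Fact 15(a), with 0-based indices
    for k in range(n):
        assert a[k] + b[-1] - 1e-10 <= ab[k] <= a[k] + b[0] + 1e-10

    # Fact 15(b): lambda_{j+k-n}(A+B) >= lambda_j(A) + lambda_k(B) whenever j + k >= n + 1
    for j in range(1, n + 1):
        for k in range(1, n + 1):
            if j + k >= n + 1:
                assert ab[j + k - n - 1] >= a[j - 1] + b[k - 1] - 1e-10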
Examples:
1. To illustrate several of the facts in this section, consider
          ⎡ 1  −1   0   2⎤
      A = ⎢ 3   1  −2   1⎥ ,
          ⎢ 1   0   0  −1⎥
          ⎣−1   2   1   0⎦
whose spectrum, σ (A), consists of
λ1 = −0.7112 + 2.6718i, λ2 = −0.7112 − 2.6718i, λ3 = 2.5506, λ4 = 0.8719.
Note that the eigenvalues are ordered decreasingly with respect to their moduli (absolute values):
|λ1 | = |λ2 | = 2.7649 > |λ3 | = 2.5506 > |λ4 | = 0.8719.
The maximum and minimum eigenvalues of ( A + A∗ )/2 are 2.8484 and −1.495. Note that, as
required by Fact 5, for every λ ∈ σ (A),
      −1.495 ≤ Re λ ≤ 2.8484.
To illustrate Fact 7, let (t1 , t2 ) = (1, 3) and compute the singular values of A:
σ1 = 4.2418, σ2 = 2.5334, σ3 = 1.9890, σ4 = 0.7954.
Then, indeed,
σ4 σ3 = 1.5821 ≤ |λ1 ||λ3 | = 7.0522 ≤ σ1 σ2 = 10.7462.
Referring to the notation in Fact 8, we have C = 6 and R = 7. The spectral radius of A is
ρ(A) = 2.7649 and, thus, the modulus of every eigenvalue of A is indeed bounded above by the
quantities
      (R + C)/2 = 13/2 = 6.5 ,   √(RC) = 6.4807 ,   min{R, C} = 6.
Letting B denote the entrywise absolute value of A, Facts 10 and 11 state that
      ρ(A) = 2.7649 ≤ ρ(B) = 4.4005   and   ρ(A) = 2.7649 ≤ ‖A‖_2 = 4.2418.
Examples related to Fact 12 and the numerical range are found in Chapter 18. See also Example 2
that associates the numerical range with the location of the eigenvalues.
2. Consider the matrix
          ⎡1  0  0⎤
      A = ⎢0  0  1⎥
          ⎣0  0  0⎦
and note that for every integer m ≥ 2, Am consists of zero entries, except for its (1, 1) entry that is
equal to 1. One may easily verify that
      ρ(A) = 1 ,   ‖A‖_∞ = ‖A‖_1 = ‖A‖_2 = 1.
By Fact 12, it follows that r (A) = 1 and all of the equivalent conditions (a) to (d) in that fact hold,
despite A not being a normal matrix.
14.2 Spectrum Localization
This section presents results on classical inclusion regions for the eigenvalues of a matrix. The following
facts, proofs, and details, as well as additional references, can be found in [MM92, Chap. III, Sec. 2], [HJ85,
Chap. 6], and [Bru82].
Facts:
1. (Geršgorin) Let A = [aij ] ∈ Cn×n and define the quantities
      Ri = Σ_{j=1, j≠i}^n |aij|   (i = 1, 2, . . . , n).
Consider the Geršgorin discs (centered at aii with radii Ri ),
Di = {z ∈ C : |z − aii | ≤ Ri }
(i = 1, 2, . . . , n).
14-6
Handbook of Linear Algebra
Then all the eigenvalues of A lie in the union of the Geršgorin discs; that is,
      σ(A) ⊂ ∪_{i=1}^n Di .
Moreover, if the union of k Geršgorin discs, G , forms a connected region disjoint from the remaining
n − k discs, then G contains exactly k eigenvalues of A (counting algebraic multiplicities).
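Geršgorin's theorem is easy to test numerically: every eigenvalue must fall in at least one disc. A minimal sketch (NumPy assumed; the random test matrix is illustrative only):

    import numpy as np

    def gershgorin_check(A):
        centers = np.diag(A)
        radii = np.sum(np.abs(A), axis=1) - np.abs(centers)   # off-diagonal absolute row sums
        for lam in np.linalg.eigvals(A):
            # lam must lie in the union of the discs D_i = {z : |z - a_ii| <= R_i}
            assert np.any(np.abs(lam - centers) <= radii + 1e-12)

    rng = np.random.default_rng(5)
    gershgorin_check(rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6)))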
2. (Lévy–Desplanques) Let A = [aij ] ∈ Cn×n be a strictly diagonally dominant matrix, namely,
      |aii| > Σ_{j=1, j≠i}^n |aij|   (i = 1, 2, . . . , n).
Then A is an invertible matrix.
3. (Brauer) Let A = [aij ] ∈ Cn×n and define the quantities
      Ri = Σ_{j=1, j≠i}^n |aij|   (i = 1, 2, . . . , n).
Consider the ovals of Cassini, which are defined by
      Vi,j = {z ∈ C : |z − aii| |z − ajj| ≤ Ri Rj}   (i, j = 1, 2, . . . , n, i ≠ j).
Then all the eigenvalues of A lie in the union of the ovals of Cassini; that is,
      σ(A) ⊂ ∪_{i≠j} Vi,j .
4. [VK99, Eq. 3.1] Denoting the union of the Geršgorin discs of A ∈ Cn×n by Γ(A) (see Fact 1) and
   the union of the ovals of Cassini of A by K(A) (see Fact 3), we have that
      σ(A) ⊂ K(A) ⊆ Γ(A).
   That is, the ovals of Cassini provide at least as good a localization for the eigenvalues of A as do
   the Geršgorin discs.
5. Let A = [aij ] ∈ Cn×n such that
      |aii| |akk| > (Σ_{j=1, j≠i}^n |aij|) (Σ_{j=1, j≠k}^n |akj|)   (i, k = 1, 2, . . . , n, i ≠ k).
Then A is an invertible matrix.
6. Facts 1 to 5 can also be stated in terms of column sums instead of row sums.
7. (Ostrowski) Let A = [aij ] ∈ Cn×n and α ∈ [0, 1]. Define the quantities
      Ri = Σ_{j=1, j≠i}^n |aij| ,   Ci = Σ_{j=1, j≠i}^n |aji|   (i = 1, 2, . . . , n).
   Then all the eigenvalues of A lie in the union of the discs
      Di(α) = {z ∈ C : |z − aii| ≤ Ri^α Ci^{1−α}}   (i = 1, 2, . . . , n);
   that is,
      σ(A) ⊂ ∪_{i=1}^n Di(α).
8. Let A ∈ Cn×n and consider the spectrum of A, σ (A), as well as its numerical range, W(A). Then
σ (A) ⊂ W(A).
In particular, if A is a normal matrix (i.e., A∗ A = AA∗ ), then W(A) is exactly equal to the convex
hull of the eigenvalues of A.
Examples:
1. To illustrate Fact 1 (see also Facts 3 and 4) let
          ⎡ 3i   1   0.5  −1   0⎤
          ⎢−1   2i   1.5   0   0⎥
      A = ⎢ 1    2   −7    0   1⎥
          ⎢ 0   −1    0   10   i⎥
          ⎣ 1    0    1   −1   1⎦
and consider the Geršgorin discs of A displayed in Figure 14.1. Note that there are three connected
regions of discs that are disjoint from each other. Each region contains as many eigenvalues (marked
with +'s) as the number of discs it comprises. The ovals of Cassini are contained in the union of the
Geršgorin discs. Although it is easy to verify whether a complex number belongs to an oval of Cassini
or not, these ovals are generally difficult to draw. An interactive supplement to [VK99] (accessible at
www.emis.math.ca/EMIS/journals/ETNA/vol.8.1999/pp15-20.dir/gershini.html) allows one to draw and
compare the Geršgorin discs and ovals of Cassini of 3 × 3 matrices.
2. To illustrate Fact 8, consider the matrices
          ⎡ 1  −1  2⎤              ⎡2 + 2i   −2 − i   −1 − 2i⎤
      A = ⎢ 2  −1  0⎥    and   B = ⎢1 + 2i   −1 − i   −1 − 2i⎥ .
          ⎣−1   0  1⎦              ⎣2 + i    −2 − i   −1 − i ⎦
FIGURE 14.1 The Geršgorin disks of A.
FIGURE 14.2 The numerical range of A and of the normal matrix B.
Note that B is a normal matrix with spectrum {1, i, −1 − i }. As indicated in Figure 14.2, the
numerical ranges of A and B contain the eigenvalues of A and B, respectively, marked with +’s.
The numerical range of B is indeed the convex hull of the eigenvalues.
14.3 Inequalities for the Singular Values and the Eigenvalues
The material in this section is a selection of classical inequalities about the singular values. Extensive details
and proofs, as well as a host of additional results on singular values, can be found in [HJ91, Chap. 3].
Definitions of many of the terms in this section are given in Section 5.6, Chapter 17, and Chapter 45;
additional facts and examples are also given there.
Facts:
1. Let A ∈ Cm×n and σ1 be its largest singular value. Then
      σ1 = ‖A‖_2 .
2. Let A ∈ Cm×n , q = min{m, n}. Denote the singular values of A by σ1 ≥ σ2 ≥ · · · ≥ σq and let
k ∈ {1, 2, . . . , q }. Then
      σk = min_{w1,...,w_{k−1} ∈ Cn}   max_{‖x‖_2=1, x ∈ Cn, x ⊥ w1,...,w_{k−1}}   ‖Ax‖_2
         = max_{w1,...,w_{n−k} ∈ Cn}   min_{‖x‖_2=1, x ∈ Cn, x ⊥ w1,...,w_{n−k}}   ‖Ax‖_2
         = max_{W ⊆ Cn, dim W=k}   min_{x ∈ W, ‖x‖_2=1}   ‖Ax‖_2
         = min_{W ⊆ Cn, dim W=n−k+1}   max_{x ∈ W, ‖x‖_2=1}   ‖Ax‖_2 ,
   where the optimizations take place over all subspaces W ⊆ Cn of the indicated dimensions.
3. (Weyl) Let A ∈ Cn×n have singular values σ1 ≥ σ2 ≥ · · · ≥ σn and eigenvalues λ j ( j = 1, 2, . . . , n)
be ordered so that |λ1 | ≥ |λ2 | ≥ · · · ≥ |λn |. Then
      |λ1 λ2 · · · λk| ≤ σ1 σ2 · · · σk   (k = 1, 2, . . . , n).
   Equality holds when k = n.
4. (A. Horn) Let A ∈ Cm× p and B ∈ C p×n . Let also r = min{m, p}, s = min{ p, n}, and q =
min{r, s }. Denote the singular values of A, B, and AB, respectively, by σ1 ≥ σ2 ≥ · · · ≥ σr ,
τ1 ≥ τ2 ≥ · · · ≥ τs , and χ1 ≥ χ2 ≥ · · · ≥ χq . Then
      Π_{i=1}^k χi ≤ Π_{i=1}^k σi τi   (k = 1, 2, . . . , q).
   Equality holds if k = n = p = m. Also, for any t > 0,
      Σ_{i=1}^k χi^t ≤ Σ_{i=1}^k (σi τi)^t   (k = 1, 2, . . . , q).
5. Let A ∈ Cn×n have singular values σ1 ≥ σ2 ≥ · · · ≥ σn and eigenvalues λ j ( j = 1, 2, . . . , n)
ordered so that |λ1 | ≥ |λ2 | ≥ · · · ≥ |λn |. Then for any t > 0,
      Σ_{i=1}^k |λi|^t ≤ Σ_{i=1}^k σi^t   (k = 1, 2, . . . , n).
   In particular, for t = 1 and k = n we obtain from the inequality above that
      |tr A| ≤ Σ_{i=1}^n σi .
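Facts 3 and 5 can be spot-checked together; a sketch assuming NumPy (the random matrix, seed, and the exponent t = 1.5 are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(6)
    n = 5
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

    mods = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]   # |lambda_1| >= ... >= |lambda_n|
    sigma = np.linalg.svd(A, compute_uv=False)           # sigma_1 >= ... >= sigma_n

    # Fact 3 (Weyl): |lambda_1 ... lambda_k| <= sigma_1 ... sigma_k, with equality for k = n
    for k in range(1, n + 1):
        assert np.prod(mods[:k]) <= np.prod(sigma[:k]) + 1e-10
    assert np.isclose(np.prod(mods), np.prod(sigma))

    # Fact 5: power sums for t > 0, and |tr A| <= sigma_1 + ... + sigma_n
    t = 1.5
    for k in range(1, n + 1):
        assert np.sum(mods[:k] ** t) <= np.sum(sigma[:k] ** t) + 1e-10
    assert abs(np.trace(A)) <= sigma.sum() + 1e-10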
6. Let A, B ∈ Cm×n and q = min{m, n}. Denote the singular values of A, B, and A + B, respectively,
by σ1 ≥ σ2 ≥ · · · ≥ σq , τ1 ≥ τ2 ≥ · · · ≥ τq , and ψ1 ≥ ψ2 ≥ · · · ≥ ψq . Then the following
inequalities hold:
   (a) ψ_{i+j−1} ≤ σi + τj   (1 ≤ i, j ≤ q, i + j ≤ q + 1).
   (b) |ψi − σi| ≤ τ1   (i = 1, 2, . . . , q).
   (c) Σ_{i=1}^k ψi ≤ Σ_{i=1}^k σi + Σ_{i=1}^k τi   (k = 1, 2, . . . , q).
7. Let A ∈ Cn×n have eigenvalues λ j ( j = 1, 2, . . . , n) ordered so that |λ1 | ≥ |λ2 | ≥ · · · ≥ |λn |.
Denote the singular values of Ak by σ1 (Ak ) ≥ σ2 (Ak ) ≥ · · · ≥ σn (Ak ). Then
      lim_{k→∞} [σi(A^k)]^{1/k} = |λi|   (i = 1, 2, . . . , n).
Examples:
1. To illustrate Facts 1, 3, and 5, as well as gauge the bounds they provide, let
          ⎡ i    2    −1  0⎤
      A = ⎢ 2   1+i    1  0⎥ ,
          ⎢2i    1     1  0⎥
          ⎣ 0   1−i    1  0⎦
whose eigenvalues and singular values ordered as required in Fact 3 are, respectively,
λ1 = 2.6775 + 1.0227i, λ2 = −2.0773 + 1.4685i, λ3 = 1.3998 − 0.4912i, λ4 = 0,
and
σ1 = 3.5278, σ2 = 2.5360, σ3 = 1.7673, σ4 = 0.
According to Fact 1, ‖A‖_2 = σ1 = 3.5278. The following inequalities hold according to Fact 3:
7.2914 = |λ1 λ2 | ≤ σ1 σ2 = 8.9465.
10.8167 = |λ1 λ2 λ3 | ≤ σ1 σ2 σ3 = 15.8114.
Finally, applying Fact 5 with t = 3/2 and k = 2, we obtain the inequality
      8.9099 = |λ1|^{3/2} + |λ2|^{3/2} ≤ σ1^{3/2} + σ2^{3/2} = 10.6646.
   For t = 1 and k = n, we get
      2.8284 = |2 + 2i| = |tr(A)| ≤ Σ_{j=1}^4 σj = 7.8311.
14.4 Basic Determinantal Relations
The purpose of this section is to review some basic equalities and inequalities regarding the determinant
of a matrix. For most of the facts mentioned here, see [Mey00, Chap. 6] and [HJ85, Chap. 0]. Definitions
of many of the terms in this section are given in Sections 4.1 and 4.2; additional facts and examples are
given there as well. Note that this section concludes with a couple of classical determinantal inequalities
for positive semidefinite matrices; see Section 8.4 or [HJ85, Chap. 7] for more on this subject.
Following are some of the properties of determinants of n × n matrices, as well as classical formulas for
the determinant of A and its submatrices.
Facts:
1. Let A ∈ F n×n . The following are basic facts about the determinant. (See also Chapter 4.1.)
   • det A = det AT ; if F = C, then det A* equals the complex conjugate of det A.
   • If B is obtained from A by multiplying one row (or column) by a scalar c , then det B = c det A.
   • det(cA) = c^n det A for any scalar c .
   • det(AB) = det A det B. If A is invertible, then det A^{−1} = (det A)^{−1} .
   • If B is obtained from A by adding nonzero multiples of one row (respectively, column) to other
     rows (respectively, columns), then det B = det A.
   • det A = Σ_{σ ∈ Sn} sgn(σ) a_{1σ(1)} a_{2σ(2)} · · · a_{nσ(n)} , where the summation is taken over all
     permutations σ of n letters, and where sgn(σ) denotes the sign of the permutation σ .
   • Let Aij denote the (n−1)×(n−1) matrix obtained from A ∈ F n×n (n ≥ 2) by deleting row i and
     column j . The following formula is known as the Laplace expansion of det A along column j :
        det A = Σ_{i=1}^n (−1)^{i+j} aij det Aij   ( j = 1, 2, . . . , n).
2. (Cauchy–Binet) Let A ∈ F m×k , B ∈ F k×n and consider the matrix C = AB ∈ F m×n . Let also
   α ⊆ {1, 2, . . . , m} and β ⊆ {1, 2, . . . , n} have cardinality r , where 1 ≤ r ≤ min{m, k, n}. Then the
   submatrix of C whose rows are indexed by α and columns indexed by β satisfies
      det C[α, β] = Σ_{γ ⊆ {1,2,...,k}, |γ|=r} det A[α, γ] det B[γ, β].
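A direct numerical check of the Cauchy–Binet formula (a sketch assuming NumPy; the sizes and the index sets α, β are arbitrary choices for illustration):

    import itertools
    import numpy as np

    rng = np.random.default_rng(7)
    m, k, n, r = 4, 5, 4, 3
    A = rng.standard_normal((m, k))
    B = rng.standard_normal((k, n))
    C = A @ B

    alpha = (0, 1, 3)    # row indices of C, |alpha| = r
    beta = (0, 2, 3)     # column indices of C, |beta| = r

    lhs = np.linalg.det(C[np.ix_(alpha, beta)])
    rhs = sum(np.linalg.det(A[np.ix_(alpha, gamma)]) * np.linalg.det(B[np.ix_(gamma, beta)])
              for gamma in itertools.combinations(range(k), r))
    assert np.isclose(lhs, rhs)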
3. [Mey00, Sec. 6.1, p. 471] Let A = [aij (x)] be an n × n matrix whose entries are complex differentiable functions of x. Let Di (i = 1, 2, . . . , n) denote the n × n matrix obtained from A when the
entries in its i th row are replaced by their derivatives with respect to x. Then
      (d/dx)(det A) = Σ_{i=1}^n det Di .
4. Let A = [aij ] be an n × n matrix and consider its entries as independent variables. Then
      ∂(det A)/∂aij = (−1)^{i+j} det A({i}, {j})   (i, j = 1, 2, . . . , n),
   where A({i}, {j}) denotes the submatrix of A obtained from A by deleting row i and column j .
5. [Mey00, Sec. 6.2] Let A ∈ F n×n and α ⊆ {1, 2, . . . , n}. If the submatrix of A whose rows and
columns are indexed by α, A[α], is invertible, then
det A = det A[α] det( A/A[α]).
   In particular, if A is partitioned in blocks as
      A = ⎡A11  A12⎤
          ⎣A21  A22⎦ ,
   where A11 and A22 are square matrices, then
      det A = det A11 det(A22 − A21(A11)^{−1}A12)   if A11 is invertible,
      det A = det A22 det(A11 − A12(A22)^{−1}A21)   if A22 is invertible.
The following two facts for F = C can be found in [Mey00, Sec. 6.2, pp. 475, 483] and [Mey00,
Exer. 6.2.15, p. 485], respectively. The proofs are valid for arbitrary fields.
6. Let A ∈ F n×n be invertible and c , d ∈ F n . Then
det(A + c d T ) = det(A)(1 + d T A−1 c ).
7. Let A ∈ F n×n be invertible, x, y ∈ F n . Then
          ⎡ A    x⎤
      det ⎣ yT  −1⎦ = − det(A + xyT).
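Facts 6 and 7 are easy to verify numerically; a sketch over F = R assuming NumPy (random data, seed, and tolerances are illustrative only):

    import numpy as np

    rng = np.random.default_rng(8)
    n = 4
    A = rng.standard_normal((n, n))
    c, d = rng.standard_normal(n), rng.standard_normal(n)

    # Fact 6: det(A + c d^T) = det(A) (1 + d^T A^{-1} c)
    lhs = np.linalg.det(A + np.outer(c, d))
    rhs = np.linalg.det(A) * (1 + d @ np.linalg.solve(A, c))
    assert np.isclose(lhs, rhs)

    # Fact 7: det [[A, x], [y^T, -1]] = -det(A + x y^T)
    x, y = rng.standard_normal(n), rng.standard_normal(n)
    M = np.block([[A, x[:, None]], [y[None, :], np.array([[-1.0]])]])
    assert np.isclose(np.linalg.det(M), -np.linalg.det(A + np.outer(x, y)))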
8. [HJ85, Theorem 7.8.1 and Corollary 7.8.2] (Hadamard’s inequalities) Let A = [aij ] ∈ Cn×n be a
positive semidefinite matrix. Then
      det A ≤ Π_{i=1}^n aii .
If A is positive definite, equality holds if and only if A is a diagonal matrix.
For a general matrix B = [bij ] ∈ Cn×n , applying the above inequality to B ∗ B and B B ∗ , respectively,
one obtains
      |det B| ≤ Π_{i=1}^n (Σ_{j=1}^n |bij|^2)^{1/2}   and   |det B| ≤ Π_{j=1}^n (Σ_{i=1}^n |bij|^2)^{1/2} .
If B is nonsingular, equalities hold, respectively, if and only if the rows or the columns of B are
orthogonal.
9. [HJ85, Theorem 7.8.3] (Fischer’s inequality) Consider a positive definite matrix
      A = ⎡X   Y⎤
          ⎣Y*  Z⎦ ,
partitioned so that X, Z are square and nonvacuous. Then
det A ≤ det X det Z.
Examples:
For examples relating to Facts 1, 2, and 5, see Chapter 4.
1. Let
          ⎡ 1  3  −1⎤
      A = ⎢ 0  1   1⎥   and   x = [1, 1, 1]T ,   y = [2, 1, −1]T .
          ⎣−1  2   2⎦
   Then, as noted by Fact 7,
          ⎡ 1  3  −1   1⎤
      det ⎢ 0  1   1   1⎥ = − det(A + xyT) = 10,
          ⎢−1  2   2   1⎥
          ⎣ 2  1  −1  −1⎦
   where
                ⎡3  4  −2⎤
      A + xyT = ⎢2  2   0⎥ .
                ⎣1  3   1⎦
   Next, letting c = [1, 2, 1]T and d = [0, −1, −1]T , by Fact 6, we have
det(A + c d T ) = det(A)(1 + d T A−1 c ) = (−4) · (−1) = 4.
2. To illustrate Facts 8 and 9, let
          ⎡3  1  1⎤   ⎡X   Y⎤
      A = ⎢1  5  1⎥ = ⎣Y*  Z⎦ .
          ⎣1  1  3⎦
   Note that A is positive definite and so Hadamard's inequality says that
      det A ≤ 3 · 5 · 3 = 45;
   in fact, det A = 36. Fischer's inequality gives a smaller upper bound for the determinant:
      det A ≤ det X det Z = 14 · 3 = 42.
3. Consider the matrix
          ⎡1   2  −2⎤
      B = ⎢4  −1   1⎥ .
          ⎣0   1   1⎦
   The first inequality about general matrices in Fact 8 applied to B gives
      |det B| ≤ √(9 · 18 · 2) = 18.
As the rows of B are mutually orthogonal, we have that | det B| = 18; in fact, det B = −18.
14.5 Rank and Nullity Equalities and Inequalities
Let A be a matrix over a field F . Here we present relations among the fundamental subspaces of A and their
dimensions. As general references consult, e.g., [HJ85] and [Mey00, Sec. 4.2, 4.4, 4.5] (even though the
matrices discussed there are complex, most of the proofs remain valid for any field). Additional material
on rank and nullity can also be found in Section 2.4.
Facts:
1. Let A ∈ F m×n . Then rank(A) = dim rangeA = dim rangeAT .
   If F = C, then rank(A) = dim rangeA* = dim rangeĀ, where Ā denotes the entrywise complex conjugate of A.
2. If A ∈ Cm×n , then rangeA = (kerA∗ )⊥ and rangeA∗ = (kerA)⊥ .
3. If A ∈ F m×n and rank(A) = k, then there exist X ∈ F m×k and Y ∈ F k×n such that A = XY.
4. Let A, B ∈ F m×n . Then rank(A) = rank(B) if and only if there exist invertible matrices X ∈ F m×m
and Y ∈ F n×n such that B = XAY.
5. (Dimension Theorem) Let A ∈ F m×n . Then
      rank(A) + null(A) = n   and   rank(A) + null(AT) = m.
If F = C, then rank(A) + null(A∗ ) = m.
6. Let A, B ∈ F m×n . Then
rank(A) − rank(B) ≤ rank(A + B) ≤ rank(A) + rank(B).
7. Let A ∈ F m×n , B ∈ F m×k , and C = [A|B] ∈ F m×(n+k) . Then
   • rank(C) = rank(A) + rank(B) − dim(rangeA ∩ rangeB).
   • null(C) = null(A) + null(B) + dim(rangeA ∩ rangeB).
8. Let A ∈ F m×n and B ∈ F n×k . Then
   • rank(AB) = rank(B) − dim(kerA ∩ rangeB).
   • If F = C, then rank(AB) = rank(A) − dim(kerB* ∩ rangeA*).
   • Multiplication of a matrix from the left or right by an invertible matrix leaves the rank unchanged.
   • null(AB) = null(B) + dim(kerA ∩ rangeB).
   • rank(AB) ≤ min{rank(A), rank(B)}.
   • rank(AB) ≥ rank(A) + rank(B) − n.
9. (Sylvester’s law of nullity) Let A, B ∈ Cn×n . Then
max{null(A), null(B)} ≤ null(AB)
≤ null(A) + null(B).
The above fact is valid only for square matrices.
10. (Frobenius inequality) Let A ∈ F m×n , B ∈ F n×k , and C ∈ F k× p . Then
rank(AB) + rank(BC) ≤ rank(B) + rank(ABC).
11. Let A ∈ Cm×n . Then
rank(A∗ A) = rank(A) = rank(AA∗ ).
In fact,
range(A∗ A) = rangeA∗
and rangeA = range(AA∗ ),
as well as
ker(A∗ A) = kerA and
ker(AA∗ ) = kerA∗ .
12. Let A ∈ F m×n and B ∈ F k× p . The rank of their direct sum is
      rank(A ⊕ B) = rank ⎡A  0⎤ = rank(A) + rank(B).
                         ⎣0  B⎦
13. Let A = [aij ] ∈ F m×n and B ∈ F k× p . The rank of the Kronecker product A⊗ B = [aij B] ∈ F mk×np
is
rank(A ⊗ B) = rank(A)rank(B).
14. Let A = [aij ] ∈ F m×n and B = [bij ] ∈ F m×n . The rank of the Hadamard product A ◦ B =
[aij bij ] ∈ F m×n satisfies
rank(A ◦ B) ≤ rank(A)rank(B).
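Several of the rank relations above can be confirmed numerically; the sketch below (NumPy assumed; the sizes and ranks of the random test matrices are arbitrary) checks Facts 8, 11, 12, and 13:

    import numpy as np

    rng = np.random.default_rng(9)
    A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 7))   # 5 x 7, rank 3 (generically)
    B = rng.standard_normal((7, 4))                                  # 7 x 4, rank 4 (generically)
    rank = np.linalg.matrix_rank

    # Fact 8: rank(AB) <= min{rank(A), rank(B)} and rank(AB) >= rank(A) + rank(B) - n (here n = 7)
    assert rank(A @ B) <= min(rank(A), rank(B))
    assert rank(A @ B) >= rank(A) + rank(B) - 7

    # Fact 11: rank(A* A) = rank(A) = rank(A A*)
    assert rank(A.T @ A) == rank(A) == rank(A @ A.T)

    # Fact 12: rank of a direct sum
    direct_sum = np.block([[A, np.zeros((5, 4))], [np.zeros((7, 7)), B]])
    assert rank(direct_sum) == rank(A) + rank(B)

    # Fact 13: rank of a Kronecker product
    assert rank(np.kron(A, B)) == rank(A) * rank(B)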
Examples:
1. Consider the matrices
          ⎡1  −1  1⎤        ⎡2  3   4⎤             ⎡1  2  −1  1⎤
      A = ⎢2  −1  0⎥ ,  B = ⎢0  0  −1⎥ ,  and  C = ⎢1  2  −1  1⎥ .
          ⎣3  −2  1⎦        ⎣2  1   2⎦             ⎣2  4  −2  2⎦
We have that
rank(A) = 2,
rank(B) = 3,
rank(AB) = 2,
rank(C ) = 1,
rank(BC) = 1,
rank(A + B) = 3,
rank(ABC) = 1.
   • As a consequence of Fact 5, we have
null(A) = 3 − 2 = 1,
null(B) = 3 − 3 = 0,
null(C ) = 4 − 1 = 3,
null(A + B) = 3 − 3 = 0,
null(AB) = 3 − 2 = 1,
null(BC) = 3 − 1 = 2,
null(ABC) = 4 − 1 = 3.
   • Fact 6 states that
      −1 = 2 − 3 = rank(A) − rank(B) ≤ rank(A + B) = 3 ≤ rank(A) + rank(B) = 5.
   • Since rangeA ∩ rangeB = rangeA, Fact 7 states that
rank([A|B]) = rank(A) + rank(B) − dim(rangeA ∩ rangeB) = 2 + 3 − 2 = 3,
null([A|B]) = null(A) + null(B) + dim(rangeA ∩ rangeB) = 1 + 0 + 2 = 3.
   • Since kerA ∩ rangeB = kerA, Fact 8 states that
2 = rank(AB) = rank(B) − dim(kerA ∩ rangeB) = 3 − 1 = 2.
2 = rank(AB) ≤ min{rank(A), rank(B)} = 2.
2 = rank(AB) ≥ rank(A) + rank(B) − n = 2 + 3 − 3 = 2.
   • Fact 9 states that
      1 = max{null(A), null(B)} ≤ null(AB) = 1 ≤ null(A) + null(B) = 1.
   Fact 9 can fail for nonsquare matrices. For example, if D = [1 1], then
      1 = max{null(D), null(DT)} > null(DDT) = 0.
   • Fact 10 states that
      3 = rank(AB) + rank(BC) ≤ rank(B) + rank(ABC) = 4.
14.6 Useful Identities for the Inverse
This section presents facts and formulas related to inversion of matrices.
Facts:
1. [Oue81, (1.9)], [HJ85, p. 18] Recall that A/A[α] denotes the Schur complement of the principal
submatrix A[α] in A. (See Section 4.2 and Section 10.3.) If A ∈ F n×n is partitioned in blocks as
      A = ⎡A11  A12⎤
          ⎣A21  A22⎦ ,
   where A11 and A22 are square matrices, then, provided that A, A11 , and A22 are invertible, we have
   that the Schur complements A/A11 and A/A22 are invertible and
      A^{−1} = ⎡ (A/A22)^{−1}              −A11^{−1} A12 (A/A11)^{−1} ⎤
               ⎣−(A/A11)^{−1} A21 A11^{−1}   (A/A11)^{−1}             ⎦ .
More generally, given an invertible A ∈ F n×n and α ⊆ {1, 2, . . . , n} such that A[α] and A(α) are
invertible, A−1 is obtained from A by replacing
   • A[α] by (A/A(α))^{−1} ,
   • A[α, α^c] by −A[α]^{−1} A[α, α^c] (A/A[α])^{−1} ,
   • A[α^c, α] by −(A/A[α])^{−1} A[α^c, α] A[α]^{−1} , and
   • A(α) by (A/A[α])^{−1} .
2. [HJ85, pp. 18–19] Let A ∈ F n×n , X ∈ F n×r , R ∈ F r ×r , and Y ∈ F r ×n . Let B = A + X RY .
Suppose that A, B, and R are invertible. Then
B −1 = (A + X RY )−1 = A−1 − A−1 X(R −1 + Y A−1 X)−1 Y A−1 .
3. (Sherman–Morrison) Let A ∈ F n×n , x, y ∈ F n . Let B = A + xyT . Suppose that A and B are
   invertible. Then, if yT A^{−1}x ≠ −1,
      B^{−1} = (A + xyT)^{−1} = A^{−1} − (1/(1 + yT A^{−1}x)) A^{−1}xyT A^{−1} .
   In particular, if yT x ≠ −1, then
      (I + xyT)^{−1} = I − (1/(1 + yT x)) xyT .
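A minimal numerical check of the Sherman–Morrison formula over F = R (NumPy assumed; data random and illustrative):

    import numpy as np

    rng = np.random.default_rng(10)
    n = 4
    A = rng.standard_normal((n, n))
    x, y = rng.standard_normal(n), rng.standard_normal(n)

    Ainv = np.linalg.inv(A)
    denom = 1 + y @ Ainv @ x
    assert abs(denom) > 1e-12                 # the formula requires y^T A^{-1} x != -1

    B_inv = Ainv - np.outer(Ainv @ x, y @ Ainv) / denom
    assert np.allclose(B_inv, np.linalg.inv(A + np.outer(x, y)))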
4. Let A ∈ F n×n . Then the adjugate of A (see Section 4.2) satisfies
(adjA)A = A(adjA) = (det A)I.
If A is invertible, then
      A^{−1} = (1/det A) adjA.
5. Let A ∈ F n×n be invertible and let its characteristic polynomial be p_A(x) = x^n + a_{n−1}x^{n−1} +
   a_{n−2}x^{n−2} + · · · + a1 x + a0 . Then
      A^{−1} = ((−1)^{n+1}/det A) (A^{n−1} + a_{n−1}A^{n−2} + a_{n−2}A^{n−3} + · · · + a2 A + a1 I).
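This expression follows from the Cayley–Hamilton theorem, since a0 = (−1)^n det A. A numerical sanity check of the formula (NumPy assumed; np.poly returns the coefficients of p_A with leading coefficient 1; the random test matrix is illustrative):

    import numpy as np

    rng = np.random.default_rng(11)
    n = 4
    A = rng.standard_normal((n, n))

    coeffs = np.poly(A)                 # [1, a_{n-1}, ..., a_1, a_0] for p_A(x) = det(xI - A)
    powers = [np.linalg.matrix_power(A, k) for k in range(n)]   # I, A, ..., A^{n-1}

    # A^{-1} = (-1)^{n+1}/det(A) * (A^{n-1} + a_{n-1} A^{n-2} + ... + a_2 A + a_1 I)
    S = sum(coeffs[j] * powers[n - 1 - j] for j in range(n))
    A_inv = (-1) ** (n + 1) / np.linalg.det(A) * S
    assert np.allclose(A_inv, np.linalg.inv(A))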
6. [Mey00, Sec. 7.10, p. 618] Let A ∈ Cn×n . The following statements are equivalent.
   • The Neumann series, I + A + A^2 + · · · , converges.
   • (I − A)^{−1} exists and (I − A)^{−1} = Σ_{k=0}^∞ A^k .
   • ρ(A) < 1.
   • lim_{k→∞} A^k = 0.
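A small numerical illustration of the equivalence (NumPy assumed; the test matrix is rescaled so that ρ(A) < 1, and the number of terms kept in the partial sum is an arbitrary choice):

    import numpy as np

    rng = np.random.default_rng(12)
    n = 4
    A = rng.standard_normal((n, n))
    A *= 0.9 / max(abs(np.linalg.eigvals(A)))    # rescale so that rho(A) = 0.9 < 1

    partial, term = np.eye(n), np.eye(n)
    for _ in range(500):                          # partial sums of I + A + A^2 + ...
        term = term @ A
        partial += term

    assert np.allclose(partial, np.linalg.inv(np.eye(n) - A))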
Examples:
1. Consider the partitioned matrix
      A = ⎡A11  A12⎤ = ⎡ 1   3  −1⎤
          ⎣A21  A22⎦   ⎢ 0   2   1⎥ .
                       ⎣−1  −1   1⎦
   Since
      (A/A22)^{−1} = ⎡0  2⎤^{−1} = ⎡−1.5  1⎤
                     ⎣1  3⎦        ⎣ 0.5  0⎦
   and
      (A/A11)^{−1} = (−1)^{−1} = −1,
   by Fact 1, we have
      A^{−1} = ⎡ (A/A22)^{−1}              −A11^{−1} A12 (A/A11)^{−1} ⎤
               ⎣−(A/A11)^{−1} A21 A11^{−1}   (A/A11)^{−1}             ⎦
             = ⎡−1.5  1  −2.5⎤
               ⎢ 0.5  0   0.5⎥ .
               ⎣−1    1  −1  ⎦
2. To illustrate Fact 3, consider the invertible matrix
          ⎡  1  i  −1⎤
      A = ⎢  1  0   1⎥
          ⎣−2i  1  −2⎦
   and the vectors x = y = [1 1 1]T . We have that
      A^{−1} = ⎡ 0.5i    1 + 0.5i    0.5⎤
               ⎢−1 − i   −1 + i       i ⎥ .
               ⎣−0.5i    −0.5i      −0.5⎦
   Adding xyT to A amounts to adding 1 to each entry of A; since
      1 + yT A^{−1}x = i ≠ 0,
   the resulting matrix is invertible and its inverse is given by
      (A + xyT)^{−1} = A^{−1} − (1/i) A^{−1}xyT A^{−1}
                     = ⎡ 2.5       −0.5 − 0.5i   −1 − i⎤
                       ⎢−2 + 2i     1             2    ⎥ .
                       ⎣−1.5 − i    0.5 + 0.5i    i    ⎦
3. Consider the matrix
          ⎡−1   1  −1⎤
      A = ⎢ 1  −1   3⎥ .
          ⎣ 1  −1   2⎦
   Since A^3 = 0, A is a nilpotent matrix and, thus, all its eigenvalues equal 0. That is, ρ(A) = 0 < 1.
   As a consequence of Fact 6, I − A is invertible and
      (I − A)^{−1} = I + A + A^2 = ⎡1   0  1⎤
                                   ⎢2  −1  5⎥ .
                                   ⎣1  −1  3⎦
References
[Bru82] R.A. Brualdi. Matrices, eigenvalues, and directed graphs. Lin. Multilin. Alg., 11:143–165, 1982.
[HJ85] R.A. Horn and C.R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, 1985.
[HJ91] R.A. Horn and C.R. Johnson. Topics in Matrix Analysis. Cambridge University Press, Cambridge,
1991.
[MM92] M. Marcus and H. Minc. A Survey of Matrix Theory and Matrix Inequalities. Dover Publications,
New York, 1992.
[Mey00] C. D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, Philadelphia, 2000.
[Oue81] D. Ouellette. Schur complements and statistics. Lin. Alg. Appl., 36:187–295, 1981.
[VK99] R.S. Varga and A. Krautstengl. On Geršgorin-type problems and ovals of Cassini. Electron. Trans.
Numer. Anal., 8:15–20, 1999.