Chapter 14
Matrix Equalities and Inequalities
Michael Tsatsomeros, Washington State University

14.1 Eigenvalue Equalities and Inequalities
14.2 Spectrum Localization
14.3 Inequalities for the Singular Values and the Eigenvalues
14.4 Basic Determinantal Relations
14.5 Rank and Nullity Equalities and Inequalities
14.6 Useful Identities for the Inverse
References

In this chapter, we have collected classical equalities and inequalities regarding the eigenvalues, the singular values, the determinant, and the dimensions of the fundamental subspaces of a matrix. Also included is a section on identities for matrix inverses. The majority of these results can be found in comprehensive books on linear algebra and matrix theory, although some are of a specialized nature. The reader is encouraged to consult, e.g., [HJ85], [HJ91], [MM92], or [Mey00] for details, proofs, and further bibliography.

14.1 Eigenvalue Equalities and Inequalities

The majority of the facts in this section concern general matrices; however, some classical and frequently used results on eigenvalues of Hermitian and positive definite matrices are also included. For the latter, see also Chapter 8 and [HJ85, Chap. 4]. Many of the definitions and some of the facts in this section are also given in Section 4.3.

Facts:

1. [HJ85, Chap. 1] Let A ∈ F^{n×n}, where F = C or any algebraically closed field. Let p_A(x) = det(xI − A) be the characteristic polynomial of A, and λ_1, λ_2, ..., λ_n be the eigenvalues of A. Denote by S_k(λ_1, ..., λ_n) (k = 1, 2, ..., n) the kth elementary symmetric function of the eigenvalues (here abbreviated S_k(λ)), and by S_k(A) the sum of all k × k principal minors of A. Then
   • The characteristic polynomial satisfies
     p_A(x) = (x − λ_1)(x − λ_2) ··· (x − λ_n)
            = x^n − S_1(λ)x^{n−1} + S_2(λ)x^{n−2} − ··· + (−1)^{n−1}S_{n−1}(λ)x + (−1)^n S_n(λ)
            = x^n − S_1(A)x^{n−1} + S_2(A)x^{n−2} − ··· + (−1)^{n−1}S_{n−1}(A)x + (−1)^n S_n(A).
   • S_k(λ) = S_k(λ_1, ..., λ_n) = S_k(A) (k = 1, 2, ..., n).
   • tr A = S_1(A) = Σ_{i=1}^n a_{ii} = Σ_{i=1}^n λ_i and det A = S_n(A) = Π_{i=1}^n λ_i.

2. [HJ85, (1.2.13)] Let A(i) be obtained from A ∈ C^{n×n} by deleting row and column i. Then
   (d/dx) p_A(x) = Σ_{i=1}^n p_{A(i)}(x).

Facts 3 to 9 are collected, together with historical commentary, proofs, and further references, in [MM92, Chap. III].

3. (Hirsch and Bendixson) Let A = [a_{ij}] ∈ C^{n×n} and λ be an eigenvalue of A. Denote B = [b_{ij}] = (A + A*)/2 and C = [c_{ij}] = (A − A*)/(2i). Then the following inequalities hold:
   |λ| ≤ n max_{i,j} |a_{ij}|,  |Re λ| ≤ n max_{i,j} |b_{ij}|,  |Im λ| ≤ n max_{i,j} |c_{ij}|.
   Moreover, if A + A^T ∈ R^{n×n}, then
   |Im λ| ≤ max_{i,j} |c_{ij}| √(n(n − 1)/2).

4. (Pick's inequality) Let A = [a_{ij}] ∈ R^{n×n} and λ be an eigenvalue of A. Denote C = [c_{ij}] = (A − A^T)/2. Then
   |Im λ| ≤ max_{i,j} |c_{ij}| cot(π/(2n)).

5. Let A = [a_{ij}] ∈ C^{n×n} and λ be an eigenvalue of A. Denote B = [b_{ij}] = (A + A*)/2 and C = [c_{ij}] = (A − A*)/(2i). Then the following inequalities hold:
   min{μ : μ ∈ σ(B)} ≤ Re λ ≤ max{μ : μ ∈ σ(B)},
   min{ν : ν ∈ σ(C)} ≤ Im λ ≤ max{ν : ν ∈ σ(C)}.
   (A numerical sketch of this fact follows.)
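The handbook itself contains no code, but Fact 5 is easy to check numerically. The following NumPy sketch (not part of the chapter; the matrix and the random seed are arbitrary illustrations) verifies that the real and imaginary parts of every eigenvalue of A are bracketed by the extreme eigenvalues of B = (A + A*)/2 and C = (A − A*)/(2i).

import numpy as np

# Numerical check of Fact 5: for B = (A + A*)/2 and C = (A - A*)/(2i),
# every eigenvalue lambda of A satisfies
#   min sigma(B) <= Re(lambda) <= max sigma(B),
#   min sigma(C) <= Im(lambda) <= max sigma(C).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

B = (A + A.conj().T) / 2        # Hermitian part of A
C = (A - A.conj().T) / 2j       # (A - A*)/(2i); also Hermitian

mu = np.linalg.eigvalsh(B)      # real eigenvalues, ascending order
nu = np.linalg.eigvalsh(C)
lam = np.linalg.eigvals(A)

assert np.all((mu[0] - 1e-12 <= lam.real) & (lam.real <= mu[-1] + 1e-12))
assert np.all((nu[0] - 1e-12 <= lam.imag) & (lam.imag <= nu[-1] + 1e-12))
print("Fact 5 bounds hold for this sample matrix")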
6. (Schur's inequality) Let A = [a_{ij}] ∈ C^{n×n} have eigenvalues λ_j (j = 1, 2, ..., n). Then
   Σ_{j=1}^n |λ_j|² ≤ Σ_{i,j=1}^n |a_{ij}|²,
   with equality holding if and only if A is a normal matrix (i.e., A*A = AA*). (See Section 7.2 for more information on normal matrices.)

7. (Browne's Theorem) Let A = [a_{ij}] ∈ C^{n×n} and λ_j (j = 1, 2, ..., n) be the eigenvalues of A ordered so that |λ_1| ≥ |λ_2| ≥ ··· ≥ |λ_n|. Let also σ_1 ≥ σ_2 ≥ ··· ≥ σ_n be the singular values of A, which are real and nonnegative. (See Section 5.6 for the definition.) Then
   σ_n ≤ |λ_j| ≤ σ_1 (j = 1, 2, ..., n).
   In fact, the following more general statement holds:
   Π_{i=1}^k σ_{n−i+1} ≤ Π_{j=1}^k |λ_{t_j}| ≤ Π_{i=1}^k σ_i
   for every k ∈ {1, 2, ..., n} and every k-tuple (t_1, t_2, ..., t_k) of strictly increasing elements chosen from {1, 2, ..., n}.

8. Let A ∈ C^{n×n} and let R_i, C_i (i = 1, 2, ..., n) denote the sums of the absolute values of the entries of A in row i and column i, respectively. Also denote R = max_i R_i and C = max_i C_i. Let λ be an eigenvalue of A. Then the following inequalities hold:
   |λ| ≤ max_i (R_i + C_i)/2 ≤ (R + C)/2,
   |λ| ≤ max_i √(R_i C_i) ≤ √(RC),
   |λ| ≤ min{R, C}.

9. (Schneider's Theorem) Let A = [a_{ij}] ∈ C^{n×n} and λ_j (j = 1, 2, ..., n) be the eigenvalues of A ordered so that |λ_1| ≥ |λ_2| ≥ ··· ≥ |λ_n|. Let x = [x_i] be any vector in R^n with positive entries and define the quantities
   r_i = (Σ_{j=1}^n |a_{ij}| x_j)/x_i (i = 1, 2, ..., n).
   Then
   Π_{j=1}^k |λ_j| ≤ Π_{j=1}^k r_{i_j} (k = 1, 2, ..., n)
   for all n-tuples (i_1, i_2, ..., i_n) of elements from {1, 2, ..., n} such that r_{i_1} ≥ r_{i_2} ≥ ··· ≥ r_{i_n}.

10. [HJ85, Theorem 8.1.18] For A = [a_{ij}] ∈ C^{n×n}, let its entrywise absolute value be denoted by |A| = [|a_{ij}|]. Let B ∈ C^{n×n} and assume that |A| ≤ B (entrywise). Then
   ρ(A) ≤ ρ(|A|) ≤ ρ(B).

11. [HJ85, Chap. 5, Sec. 6] Let A ∈ C^{n×n} and let ‖·‖ denote any matrix norm on C^{n×n}. (See Chapter 37.) Then
   ρ(A) ≤ ‖A‖ and lim_{k→∞} ‖A^k‖^{1/k} = ρ(A).

12. [HJ91, Corollary 1.5.5] Let A = [a_{ij}] ∈ C^{n×n}. The numerical range of A is
   W(A) = {v*Av ∈ C : v ∈ C^n with v*v = 1}
   and the numerical radius of A is r(A) = max{|z| : z ∈ W(A)}. (See Chapter 18 for more information about the numerical range and numerical radius.) Then the following inequalities hold:
   r(A^m) ≤ [r(A)]^m (m = 1, 2, ...),
   ρ(A) ≤ r(A) ≤ (‖A‖_1 + ‖A‖_∞)/2,
   ‖A‖_2/2 ≤ r(A) ≤ ‖A‖_2,
   r(A) ≤ r(|A|) = r((|A| + |A|^T)/2)  (where |A| = [|a_{ij}|]).
   Moreover, the following statements are equivalent:
   (a) r(A) = ‖A‖_2.
   (b) ρ(A) = ‖A‖_2.
   (c) ‖A^n‖_2 = ‖A‖_2^n.
   (d) ‖A^k‖_2 = ‖A‖_2^k (k = 1, 2, ...).

Facts 13 to 15 below, along with proofs, can be found in [HJ85, Chap. 4].

13. (Rayleigh–Ritz) Let A ∈ C^{n×n} be Hermitian (i.e., A = A*) with eigenvalues λ_1 ≥ λ_2 ≥ ··· ≥ λ_n. Then
   (a) λ_n x*x ≤ x*Ax ≤ λ_1 x*x for all x ∈ C^n.
   (b) λ_1 = max_{x≠0} x*Ax/x*x = max_{x*x=1} x*Ax.
   (c) λ_n = min_{x≠0} x*Ax/x*x = min_{x*x=1} x*Ax.
   (A numerical sketch of this fact follows Fact 14.)

14. (Courant–Fischer) Let A ∈ C^{n×n} be Hermitian with eigenvalues λ_1 ≥ λ_2 ≥ ··· ≥ λ_n, and let k ∈ {1, 2, ..., n}. Then
   λ_k = min_{w_1,...,w_{k−1} ∈ C^n}  max_{x≠0, x ⊥ w_1,...,w_{k−1}}  x*Ax/x*x
       = max_{w_1,...,w_{n−k} ∈ C^n}  min_{x≠0, x ⊥ w_1,...,w_{n−k}}  x*Ax/x*x.
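As a quick numerical illustration of Fact 13, the following NumPy sketch (not part of the chapter; the Hermitian test matrix is randomly generated) samples vectors and confirms that every Rayleigh quotient lies between the smallest and largest eigenvalues.

import numpy as np

# Numerical check of Fact 13 (Rayleigh-Ritz).
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2                     # a Hermitian test matrix

lam = np.linalg.eigvalsh(A)                  # ascending: lam[0] = lambda_n, lam[-1] = lambda_1

for _ in range(1000):
    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    q = (x.conj() @ A @ x).real / (x.conj() @ x).real
    assert lam[0] - 1e-12 <= q <= lam[-1] + 1e-12   # lambda_n <= x*Ax/x*x <= lambda_1
print("all sampled Rayleigh quotients lie in [lambda_n, lambda_1]")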
15. (Weyl) Let A, B ∈ C^{n×n} be Hermitian. Consider the eigenvalues of A, B, and A + B, denoted by λ_i(A), λ_i(B), λ_i(A + B), respectively, arranged in decreasing order. Then the following hold:
   (a) For each k ∈ {1, 2, ..., n},
       λ_k(A) + λ_n(B) ≤ λ_k(A + B) ≤ λ_k(A) + λ_1(B).
   (b) For every pair j, k ∈ {1, 2, ..., n} such that j + k ≥ n + 1,
       λ_{j+k−n}(A + B) ≥ λ_j(A) + λ_k(B).
   (c) For every pair j, k ∈ {1, 2, ..., n} such that j + k ≤ n + 1,
       λ_j(A) + λ_k(B) ≥ λ_{j+k−1}(A + B).

Examples:

1. To illustrate several of the facts in this section, consider

   A = [  1  −1   0   2
          3   1  −2   1
          1   0   0  −1
         −1   2   1   0 ],

   whose spectrum, σ(A), consists of
   λ_1 = −0.7112 + 2.6718i, λ_2 = −0.7112 − 2.6718i, λ_3 = 2.5506, λ_4 = 0.8719.
   Note that the eigenvalues are ordered decreasingly with respect to their moduli (absolute values):
   |λ_1| = |λ_2| = 2.7649 > |λ_3| = 2.5506 > |λ_4| = 0.8719.
   The maximum and minimum eigenvalues of (A + A*)/2 are 2.8484 and −1.495. Note that, as required by Fact 5, for every λ ∈ σ(A), −1.495 ≤ Re λ ≤ 2.8484.
   To illustrate Fact 7, let (t_1, t_2) = (1, 3) and compute the singular values of A:
   σ_1 = 4.2418, σ_2 = 2.5334, σ_3 = 1.9890, σ_4 = 0.7954.
   Then, indeed,
   σ_4 σ_3 = 1.5821 ≤ |λ_1||λ_3| = 7.0522 ≤ σ_1 σ_2 = 10.7462.
   Referring to the notation in Fact 8, we have C = 6 and R = 7. The spectral radius of A is ρ(A) = 2.7649 and, thus, the modulus of every eigenvalue of A is indeed bounded above by the quantities
   (R + C)/2 = 13/2 = 6.5,  √(RC) = 6.4807,  min{R, C} = 6.
   Letting B denote the entrywise absolute value of A, Facts 10 and 11 state that
   ρ(A) = 2.7649 ≤ ρ(B) = 4.4005 and ρ(A) = 2.7649 ≤ ‖A‖_2 = 4.2418.
   Examples related to Fact 12 and the numerical range are found in Chapter 18. See also Example 2, which associates the numerical range with the location of the eigenvalues.

2. Consider the matrix

   A = [ 1  0  0
         0  0  1
         0  0  0 ]

   and note that for every integer m ≥ 2, A^m consists of zero entries, except for its (1,1) entry, which is equal to 1. One may easily verify that
   ρ(A) = 1,  ‖A‖_∞ = ‖A‖_1 = ‖A‖_2 = 1.
   By Fact 12, it follows that r(A) = 1 and all of the equivalent conditions (a) to (d) in that fact hold, despite A not being a normal matrix.

14.2 Spectrum Localization

This section presents results on classical inclusion regions for the eigenvalues of a matrix. The following facts, proofs, and details, as well as additional references, can be found in [MM92, Chap. III, Sec. 2], [HJ85, Chap. 6], and [Bru82].

Facts:

1. (Geršgorin) Let A = [a_{ij}] ∈ C^{n×n} and define the quantities
   R_i = Σ_{j≠i} |a_{ij}| (i = 1, 2, ..., n).
   Consider the Geršgorin discs (centered at a_{ii} with radii R_i),
   D_i = {z ∈ C : |z − a_{ii}| ≤ R_i} (i = 1, 2, ..., n).
   Then all the eigenvalues of A lie in the union of the Geršgorin discs; that is,
   σ(A) ⊂ ∪_{i=1}^n D_i.
   Moreover, if the union of k Geršgorin discs, G, forms a connected region disjoint from the remaining n − k discs, then G contains exactly k eigenvalues of A (counting algebraic multiplicities).

2. (Lévy–Desplanques) Let A = [a_{ij}] ∈ C^{n×n} be a strictly diagonally dominant matrix, namely,
   |a_{ii}| > Σ_{j≠i} |a_{ij}| (i = 1, 2, ..., n).
   Then A is an invertible matrix. (A numerical sketch of Facts 1 and 2 follows.)
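Facts 1 and 2 are easy to check computationally. The sketch below (NumPy, not from the handbook; the matrix, seed, and the helper name gershgorin_discs are arbitrary choices for illustration) computes the disc centers and radii, verifies that every eigenvalue lies in some disc, and notes that strict diagonal dominance keeps 0 outside every disc, so the matrix is invertible.

import numpy as np

# Gersgorin discs (Fact 1) and strict diagonal dominance (Fact 2).
def gershgorin_discs(A):
    centers = np.diag(A)
    radii = np.abs(A).sum(axis=1) - np.abs(centers)   # off-diagonal row sums R_i
    return centers, radii

A = np.array([[4.0,  1.0, 0.5],
              [1.0, -5.0, 2.0],
              [0.0,  1.0, 3.0]])

centers, radii = gershgorin_discs(A)
for lam in np.linalg.eigvals(A):
    # every eigenvalue lies in at least one disc
    assert np.any(np.abs(lam - centers) <= radii + 1e-12)

# strict diagonal dominance: |a_ii| > R_i for all i, so 0 lies outside every disc
# and A is invertible (Levy-Desplanques)
assert np.all(np.abs(centers) > radii)
print("eigenvalues:", np.linalg.eigvals(A))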
3. (Brauer) Let A = [a_{ij}] ∈ C^{n×n} and define the quantities
   R_i = Σ_{j≠i} |a_{ij}| (i = 1, 2, ..., n).
   Consider the ovals of Cassini, which are defined by
   V_{i,j} = {z ∈ C : |z − a_{ii}||z − a_{jj}| ≤ R_i R_j} (i, j = 1, 2, ..., n, i ≠ j).
   Then all the eigenvalues of A lie in the union of the ovals of Cassini; that is,
   σ(A) ⊂ ∪_{i≠j} V_{i,j}.

4. [VK99, Eq. 3.1] Denoting the union of the Geršgorin discs of A ∈ C^{n×n} by Γ(A) (see Fact 1) and the union of the ovals of Cassini of A by K(A) (see Fact 3), we have that
   σ(A) ⊂ K(A) ⊆ Γ(A).
   That is, the ovals of Cassini provide at least as good a localization for the eigenvalues of A as do the Geršgorin discs.

5. Let A = [a_{ij}] ∈ C^{n×n} be such that
   |a_{ii}||a_{kk}| > (Σ_{j≠i} |a_{ij}|)(Σ_{j≠k} |a_{kj}|) (i, k = 1, 2, ..., n, i ≠ k).
   Then A is an invertible matrix.

6. Facts 1 to 5 can also be stated in terms of column sums instead of row sums.

7. (Ostrowski) Let A = [a_{ij}] ∈ C^{n×n} and α ∈ [0, 1]. Define the quantities
   R_i = Σ_{j≠i} |a_{ij}|,  C_i = Σ_{j≠i} |a_{ji}| (i = 1, 2, ..., n).
   Then all the eigenvalues of A lie in the union of the discs
   D_i(α) = {z ∈ C : |z − a_{ii}| ≤ R_i^α C_i^{1−α}} (i = 1, 2, ..., n);
   that is, σ(A) ⊂ ∪_{i=1}^n D_i(α).

8. Let A ∈ C^{n×n} and consider the spectrum of A, σ(A), as well as its numerical range, W(A). Then σ(A) ⊂ W(A). In particular, if A is a normal matrix (i.e., A*A = AA*), then W(A) is exactly equal to the convex hull of the eigenvalues of A.

Examples:

1. To illustrate Fact 1 (see also Facts 3 and 4), let

   A = [  3i    1   0.5  −1   0
         −1    2i   1.5   0   0
          1    2   −7     0   1
          0   −1    0    10   i
          1    0    1    −1   1 ]

   and consider the Geršgorin discs of A displayed in Figure 14.1. Note that there are three connected regions of discs that are disjoint from each other. Each region contains as many eigenvalues (marked with +'s) as the number of discs it comprises. The ovals of Cassini are contained in the union of the Geršgorin discs. In general, although it is easy to verify whether a complex number belongs to an oval of Cassini or not, these ovals are generally difficult to draw. An interactive supplement to [VK99] (accessible at www.emis.math.ca/EMIS/journals/ETNA/vol.8.1999/pp15-20.dir/gershini.html) allows one to draw and compare the Geršgorin discs and ovals of Cassini of 3 × 3 matrices.

   [FIGURE 14.1  The Geršgorin discs of A.]

2. To illustrate Fact 8, consider the matrices

   A = [  1  −1   2
          2  −1   0
         −1   0   1 ]
   and
   B = [ 2 + 2i   −2 − i   −1 − 2i
         1 + 2i   −1 − i   −1 − 2i
         2 + i    −2 − i   −1 − i  ].

   Note that B is a normal matrix with spectrum {1, i, −1 − i}. As indicated in Figure 14.2, the numerical ranges of A and B contain the eigenvalues of A and B, respectively, marked with +'s. The numerical range of B is indeed the convex hull of its eigenvalues.

   [FIGURE 14.2  The numerical range of A and of the normal matrix B.]

14.3 Inequalities for the Singular Values and the Eigenvalues

The material in this section is a selection of classical inequalities about the singular values. Extensive details and proofs, as well as a host of additional results on singular values, can be found in [HJ91, Chap. 3]. Definitions of many of the terms in this section are given in Section 5.6, Chapter 17, and Chapter 45; additional facts and examples are also given there.

Facts:

1. Let A ∈ C^{m×n} and let σ_1 be its largest singular value. Then σ_1 = ‖A‖_2.

2. Let A ∈ C^{m×n} and q = min{m, n}. Denote the singular values of A by σ_1 ≥ σ_2 ≥ ··· ≥ σ_q and let k ∈ {1, 2, ..., q}. Then
   σ_k = min_{w_1,...,w_{k−1} ∈ C^n}  max_{‖x‖_2=1, x ⊥ w_1,...,w_{k−1}}  ‖Ax‖_2
       = min_{W ⊆ C^n, dim W = n−k+1}  max_{x ∈ W, ‖x‖_2=1}  ‖Ax‖_2
       = max_{w_1,...,w_{n−k} ∈ C^n}  min_{‖x‖_2=1, x ⊥ w_1,...,w_{n−k}}  ‖Ax‖_2
       = max_{W ⊆ C^n, dim W = k}  min_{x ∈ W, ‖x‖_2=1}  ‖Ax‖_2,
   where the optimizations take place over all subspaces W ⊆ C^n of the indicated dimensions. (A numerical sketch of Facts 1 and 2 follows.)
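A short NumPy sketch (not part of the handbook; random data) illustrating Fact 1 and the k = 1 case of Fact 2: the largest singular value coincides with the spectral norm and dominates ‖Ax‖_2 over unit vectors x.

import numpy as np

# Fact 1: the largest singular value equals the spectral norm ||A||_2.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))

sigma = np.linalg.svd(A, compute_uv=False)          # decreasing order
assert np.isclose(sigma[0], np.linalg.norm(A, 2))

# Fact 2 with k = 1: ||Ax||_2 <= sigma_1 for every unit vector x
for _ in range(2000):
    x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    x /= np.linalg.norm(x)
    assert np.linalg.norm(A @ x) <= sigma[0] + 1e-12
print("sigma_1 =", sigma[0])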
3. (Weyl) Let A ∈ C^{n×n} have singular values σ_1 ≥ σ_2 ≥ ··· ≥ σ_n and eigenvalues λ_j (j = 1, 2, ..., n) ordered so that |λ_1| ≥ |λ_2| ≥ ··· ≥ |λ_n|. Then
   |λ_1 λ_2 ··· λ_k| ≤ σ_1 σ_2 ··· σ_k (k = 1, 2, ..., n),
   with equality holding when k = n.

4. (A. Horn) Let A ∈ C^{m×p} and B ∈ C^{p×n}. Let also r = min{m, p}, s = min{p, n}, and q = min{r, s}. Denote the singular values of A, B, and AB, respectively, by
   σ_1 ≥ σ_2 ≥ ··· ≥ σ_r,  τ_1 ≥ τ_2 ≥ ··· ≥ τ_s,  and  χ_1 ≥ χ_2 ≥ ··· ≥ χ_q.
   Then
   Π_{i=1}^k χ_i ≤ Π_{i=1}^k σ_i τ_i (k = 1, 2, ..., q).
   Equality holds if k = n = p = m. Also, for any t > 0,
   Σ_{i=1}^k χ_i^t ≤ Σ_{i=1}^k (σ_i τ_i)^t (k = 1, 2, ..., q).

5. Let A ∈ C^{n×n} have singular values σ_1 ≥ σ_2 ≥ ··· ≥ σ_n and eigenvalues λ_j (j = 1, 2, ..., n) ordered so that |λ_1| ≥ |λ_2| ≥ ··· ≥ |λ_n|. Then for any t > 0,
   Σ_{i=1}^k |λ_i|^t ≤ Σ_{i=1}^k σ_i^t (k = 1, 2, ..., n).
   In particular, for t = 1 and k = n we obtain from the inequality above that
   |tr A| ≤ Σ_{i=1}^n σ_i.

6. Let A, B ∈ C^{m×n} and q = min{m, n}. Denote the singular values of A, B, and A + B, respectively, by σ_1 ≥ σ_2 ≥ ··· ≥ σ_q, τ_1 ≥ τ_2 ≥ ··· ≥ τ_q, and ψ_1 ≥ ψ_2 ≥ ··· ≥ ψ_q. Then the following inequalities hold:
   (a) ψ_{i+j−1} ≤ σ_i + τ_j (1 ≤ i, j ≤ q, i + j ≤ q + 1).
   (b) |ψ_i − σ_i| ≤ τ_1 (i = 1, 2, ..., q).
   (c) Σ_{i=1}^k ψ_i ≤ Σ_{i=1}^k σ_i + Σ_{i=1}^k τ_i (k = 1, 2, ..., q).

7. Let A ∈ C^{n×n} have eigenvalues λ_j (j = 1, 2, ..., n) ordered so that |λ_1| ≥ |λ_2| ≥ ··· ≥ |λ_n|. Denote the singular values of A^k by σ_1(A^k) ≥ σ_2(A^k) ≥ ··· ≥ σ_n(A^k). Then
   lim_{k→∞} [σ_i(A^k)]^{1/k} = |λ_i| (i = 1, 2, ..., n).

Examples:

1. To illustrate Facts 1, 3, and 5, as well as gauge the bounds they provide, let

   A = [  i     2     −1   0
          2    1+i     1   0
          2i    1      1   0
         1−i    1      0   0 ],

   whose eigenvalues and singular values, ordered as required in Fact 3, are, respectively,
   λ_1 = 2.6775 + 1.0227i, λ_2 = −2.0773 + 1.4685i, λ_3 = 1.3998 − 0.4912i, λ_4 = 0,
   and
   σ_1 = 3.5278, σ_2 = 2.5360, σ_3 = 1.7673, σ_4 = 0.
   According to Fact 1, ‖A‖_2 = σ_1 = 3.5278. The following inequalities hold according to Fact 3:
   7.2914 = |λ_1 λ_2| ≤ σ_1 σ_2 = 8.9465,
   10.8167 = |λ_1 λ_2 λ_3| ≤ σ_1 σ_2 σ_3 = 15.8114.
   Finally, applying Fact 5 with t = 3/2 and k = 2, we obtain the inequality
   8.9099 = |λ_1|^{3/2} + |λ_2|^{3/2} ≤ σ_1^{3/2} + σ_2^{3/2} = 10.6646.
   For t = 1 and k = n, we get
   2.8284 = |2 + 2i| = |tr(A)| ≤ Σ_{j=1}^4 σ_j = 7.8311.
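The computations in the example above are easily reproduced in software. The following NumPy sketch (not from the handbook; it uses a random matrix rather than the example's) checks Weyl's product inequalities of Fact 3 and the trace bound of Fact 5.

import numpy as np

# Checks of Fact 3 (Weyl's product inequalities) and the trace bound in Fact 5.
rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))

lam = np.linalg.eigvals(A)
lam = lam[np.argsort(-np.abs(lam))]             # |lambda_1| >= ... >= |lambda_n|
sigma = np.linalg.svd(A, compute_uv=False)      # sigma_1 >= ... >= sigma_n

for k in range(1, 7):
    assert np.prod(np.abs(lam[:k])) <= np.prod(sigma[:k]) + 1e-9    # Fact 3

assert abs(np.trace(A)) <= sigma.sum() + 1e-9                       # Fact 5, t = 1, k = n
print("|tr A| =", abs(np.trace(A)), "<= sum of singular values =", sigma.sum())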
14.4 Basic Determinantal Relations

The purpose of this section is to review some basic equalities and inequalities regarding the determinant of a matrix. For most of the facts mentioned here, see [Mey00, Chap. 6] and [HJ85, Chap. 0]. Definitions of many of the terms in this section are given in Sections 4.1 and 4.2; additional facts and examples are given there as well. Note that this section concludes with a couple of classical determinantal inequalities for positive semidefinite matrices; see Section 8.4 or [HJ85, Chap. 7] for more on this subject. Following are some of the properties of determinants of n × n matrices, as well as classical formulas for the determinant of A and its submatrices.

Facts:

1. Let A ∈ F^{n×n}. The following are basic facts about the determinant. (See also Section 4.1.)
   • det A = det A^T; if F = C, then det A* is the complex conjugate of det A.
   • If B is obtained from A by multiplying one row (or column) by a scalar c, then det B = c det A.
   • det(cA) = c^n det A for any scalar c.
   • det(AB) = det A det B. If A is invertible, then det A^{−1} = (det A)^{−1}.
   • If B is obtained from A by adding multiples of one row (respectively, column) to other rows (respectively, columns), then det B = det A.
   • det A = Σ_{σ ∈ S_n} sgn(σ) a_{1σ(1)} a_{2σ(2)} ··· a_{nσ(n)}, where the summation is taken over all permutations σ of n letters, and where sgn(σ) denotes the sign of the permutation σ.
   • Let A_{ij} denote the (n − 1) × (n − 1) matrix obtained from A ∈ F^{n×n} (n ≥ 2) by deleting row i and column j. The following formula is known as the Laplace expansion of det A along column j:
     det A = Σ_{i=1}^n (−1)^{i+j} a_{ij} det A_{ij} (j = 1, 2, ..., n).

2. (Cauchy–Binet) Let A ∈ F^{m×k}, B ∈ F^{k×n}, and consider the matrix C = AB ∈ F^{m×n}. Let also α ⊆ {1, 2, ..., m} and β ⊆ {1, 2, ..., n} have cardinality r, where 1 ≤ r ≤ min{m, k, n}. Then the submatrix of C whose rows are indexed by α and columns indexed by β satisfies
   det C[α, β] = Σ_{γ ⊆ {1,2,...,k}, |γ| = r} det A[α, γ] det B[γ, β].

3. [Mey00, Sec. 6.1, p. 471] Let A = [a_{ij}(x)] be an n × n matrix whose entries are complex differentiable functions of x. Let D_i (i = 1, 2, ..., n) denote the n × n matrix obtained from A when the entries in its ith row are replaced by their derivatives with respect to x. Then
   (d/dx)(det A) = Σ_{i=1}^n det D_i.

4. Let A = [a_{ij}] be an n × n matrix and consider its entries as independent variables. Then
   ∂(det A)/∂a_{ij} = (−1)^{i+j} det A({i}, {j}) (i, j = 1, 2, ..., n),
   where A({i}, {j}) denotes the submatrix of A obtained from A by deleting row i and column j.

5. [Mey00, Sec. 6.2] Let A ∈ F^{n×n} and α ⊆ {1, 2, ..., n}. If the submatrix of A whose rows and columns are indexed by α, A[α], is invertible, then
   det A = det A[α] det(A/A[α]).
   In particular, if A is partitioned in blocks as
   A = [ A_{11}  A_{12}
         A_{21}  A_{22} ],
   where A_{11} and A_{22} are square matrices, then
   det A = det A_{11} det(A_{22} − A_{21}A_{11}^{−1}A_{12}) if A_{11} is invertible,
   det A = det A_{22} det(A_{11} − A_{12}A_{22}^{−1}A_{21}) if A_{22} is invertible.

The following two facts for F = C can be found in [Mey00, Sec. 6.2, pp. 475, 483] and [Mey00, Exer. 6.2.15, p. 485], respectively. The proofs are valid for arbitrary fields.

6. Let A ∈ F^{n×n} be invertible and c, d ∈ F^n. Then
   det(A + cd^T) = det(A)(1 + d^T A^{−1} c).

7. Let A ∈ F^{n×n} be invertible and x, y ∈ F^n. Then
   det [ A    x
         y^T  −1 ] = −det(A + xy^T).

8. [HJ85, Theorem 7.8.1 and Corollary 7.8.2] (Hadamard's inequalities) Let A = [a_{ij}] ∈ C^{n×n} be a positive semidefinite matrix. Then
   det A ≤ Π_{i=1}^n a_{ii}.
   If A is positive definite, equality holds if and only if A is a diagonal matrix. For a general matrix B = [b_{ij}] ∈ C^{n×n}, applying the above inequality to B*B and BB*, respectively, one obtains
   |det B| ≤ Π_{i=1}^n (Σ_{j=1}^n |b_{ij}|²)^{1/2}  and  |det B| ≤ Π_{j=1}^n (Σ_{i=1}^n |b_{ij}|²)^{1/2}.
   If B is nonsingular, equalities hold, respectively, if and only if the rows or the columns of B are orthogonal.

9. [HJ85, Theorem 7.8.3] (Fischer's inequality) Consider a positive definite matrix
   A = [ X   Y
         Y*  Z ],
   partitioned so that X, Z are square and nonvacuous. Then
   det A ≤ det X det Z.

Examples:

For examples relating to Facts 1, 2, and 5, see Chapter 4.

1. Let
   A = [  1  3  −1
          0  1   1
         −1  2   2 ],   x = [1 2 1]^T,   y = [1 1 −1]^T.
   Then, as noted by Fact 7,
   det [ A    x
         y^T  −1 ] = det [  1  3  −1   1
                            0  1   1   2
                           −1  2   2   1
                            1  1  −1  −1 ]
                   = −det(A + xy^T) = −det [ 2  4  −2
                                              2  3  −1
                                              0  3   1 ] = 8.
   Next, letting c = [1 2 1]^T and d = [0 −1 −1]^T, by Fact 6, we have
   det(A + cd^T) = det(A)(1 + d^T A^{−1} c) = (−4) · (−1) = 4.
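Both identities in this example can be verified mechanically. The NumPy sketch below (not part of the handbook; it uses random data rather than the example's matrices) checks Facts 6 and 7 against direct determinant evaluation.

import numpy as np

# Fact 6: det(A + c d^T) = det(A)(1 + d^T A^{-1} c)
# Fact 7: det([[A, x],[y^T, -1]]) = -det(A + x y^T)
rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
c, d, x, y = (rng.standard_normal(4) for _ in range(4))

lhs = np.linalg.det(A + np.outer(c, d))
rhs = np.linalg.det(A) * (1 + d @ np.linalg.solve(A, c))
assert np.isclose(lhs, rhs)                                               # Fact 6

M = np.block([[A, x[:, None]], [y[None, :], -np.ones((1, 1))]])
assert np.isclose(np.linalg.det(M), -np.linalg.det(A + np.outer(x, y)))   # Fact 7
print("rank-one determinant identities verified")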
2. To illustrate Facts 8 and 9, let
   A = [ 3  1  1
         1  5  1
         1  1  3 ] = [ X   Y
                       Y*  Z ].
   Note that A is positive definite, and so Hadamard's inequality says that det A ≤ 3 · 5 · 3 = 45; in fact, det A = 36. Fischer's inequality gives a smaller upper bound for the determinant:
   det A ≤ det X det Z = 14 · 3 = 42.

3. Consider the matrix
   B = [ 1   2  −2
         4  −1   1
         0   1   1 ].
   The first inequality about general matrices in Fact 8 applied to B gives
   |det B| ≤ √(9 · 18 · 2) = 18.
   As the rows of B are mutually orthogonal, we have that |det B| = 18; in fact, det B = −18.

14.5 Rank and Nullity Equalities and Inequalities

Let A be a matrix over a field F. Here we present relations among the fundamental subspaces of A and their dimensions. As general references consult, e.g., [HJ85] and [Mey00, Sec. 4.2, 4.4, 4.5] (even though the matrices discussed there are complex, most of the proofs remain valid for any field). Additional material on rank and nullity can also be found in Section 2.4.

Facts:

1. Let A ∈ F^{m×n}. Then rank(A) = dim range A = dim range A^T. If F = C, then rank(A) = dim range A* = dim range Ā.

2. If A ∈ C^{m×n}, then range A = (ker A*)^⊥ and range A* = (ker A)^⊥.

3. If A ∈ F^{m×n} and rank(A) = k, then there exist X ∈ F^{m×k} and Y ∈ F^{k×n} such that A = XY.

4. Let A, B ∈ F^{m×n}. Then rank(A) = rank(B) if and only if there exist invertible matrices X ∈ F^{m×m} and Y ∈ F^{n×n} such that B = XAY.

5. (Dimension Theorem) Let A ∈ F^{m×n}. Then
   rank(A) + null(A) = n  and  rank(A) + null(A^T) = m.
   If F = C, then rank(A) + null(A*) = m.

6. Let A, B ∈ F^{m×n}. Then
   rank(A) − rank(B) ≤ rank(A + B) ≤ rank(A) + rank(B).

7. Let A ∈ F^{m×n}, B ∈ F^{m×k}, and C = [A|B] ∈ F^{m×(n+k)}. Then
   • rank(C) = rank(A) + rank(B) − dim(range A ∩ range B).
   • null(C) = null(A) + null(B) + dim(range A ∩ range B).

8. Let A ∈ F^{m×n} and B ∈ F^{n×k}. Then
   • rank(AB) = rank(B) − dim(ker A ∩ range B).
   • If F = C, then rank(AB) = rank(A) − dim(ker B* ∩ range A*).
   • Multiplication of a matrix from the left or right by an invertible matrix leaves the rank unchanged.
   • null(AB) = null(B) + dim(ker A ∩ range B).
   • rank(AB) ≤ min{rank(A), rank(B)}.
   • rank(AB) ≥ rank(A) + rank(B) − n.

9. (Sylvester's law of nullity) Let A, B ∈ C^{n×n}. Then
   max{null(A), null(B)} ≤ null(AB) ≤ null(A) + null(B).
   This fact is valid only for square matrices.

10. (Frobenius inequality) Let A ∈ F^{m×n}, B ∈ F^{n×k}, and C ∈ F^{k×p}. Then
   rank(AB) + rank(BC) ≤ rank(B) + rank(ABC).

11. Let A ∈ C^{m×n}. Then rank(A*A) = rank(A) = rank(AA*). In fact, range(A*A) = range A* and range A = range(AA*), as well as ker(A*A) = ker A and ker(AA*) = ker A*.

12. Let A ∈ F^{m×n} and B ∈ F^{k×p}. The rank of their direct sum is
   rank(A ⊕ B) = rank [ A  0
                        0  B ] = rank(A) + rank(B).

13. Let A = [a_{ij}] ∈ F^{m×n} and B ∈ F^{k×p}. The rank of the Kronecker product A ⊗ B = [a_{ij}B] ∈ F^{mk×np} is
   rank(A ⊗ B) = rank(A) rank(B).

14. Let A = [a_{ij}] ∈ F^{m×n} and B = [b_{ij}] ∈ F^{m×n}. The rank of the Hadamard product A ∘ B = [a_{ij}b_{ij}] ∈ F^{m×n} satisfies
   rank(A ∘ B) ≤ rank(A) rank(B).
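Several of the facts above are convenient to sanity-check numerically. The following NumPy sketch (not from the handbook; the matrices are random products chosen to have prescribed rank, and the helper nullity is an ad hoc illustration) exercises Facts 5, 8, and 11.

import numpy as np

# Checks of Facts 5, 8, and 11 on random real matrices of prescribed rank.
rng = np.random.default_rng(5)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 6))   # rank 3, size 5 x 6
B = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # rank 2, size 6 x 4
rank = np.linalg.matrix_rank

def nullity(M, tol=1e-10):
    # dimension of ker M, found from the singular values
    s = np.linalg.svd(M, compute_uv=False)
    return M.shape[1] - np.count_nonzero(s > tol)

n = A.shape[1]
assert rank(A) + nullity(A) == n                                        # Fact 5
assert rank(A) + rank(B) - n <= rank(A @ B) <= min(rank(A), rank(B))    # Fact 8
assert rank(A.T @ A) == rank(A) == rank(A @ A.T)                        # Fact 11 (real case)
print("rank(A) =", rank(A), " rank(B) =", rank(B), " rank(AB) =", rank(A @ B))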
Examples:

1. Consider the matrices
   A = [ 1  −1  1
         2  −1  0
         3  −2  1 ],   B = [ 2  3   4
                             0  0  −1
                             2  1   2 ],   and   C = [ 1  2  −1  1
                                                       1  2  −1  1
                                                       2  4  −2  2 ].
   We have that
   rank(A) = 2, rank(B) = 3, rank(C) = 1, rank(A + B) = 3, rank(AB) = 2, rank(BC) = 1, rank(ABC) = 1.
   • As a consequence of Fact 5, we have
     null(A) = 3 − 2 = 1, null(B) = 3 − 3 = 0, null(C) = 4 − 1 = 3, null(A + B) = 3 − 3 = 0,
     null(AB) = 3 − 2 = 1, null(BC) = 3 − 1 = 2, null(ABC) = 4 − 1 = 3.
   • Fact 6 states that
     −1 = 2 − 3 = rank(A) − rank(B) ≤ rank(A + B) = 3 ≤ rank(A) + rank(B) = 5.
   • Since range A ∩ range B = range A, Fact 7 states that
     rank([A|B]) = rank(A) + rank(B) − dim(range A ∩ range B) = 2 + 3 − 2 = 3,
     null([A|B]) = null(A) + null(B) + dim(range A ∩ range B) = 1 + 0 + 2 = 3.
   • Since ker A ∩ range B = ker A, Fact 8 states that
     2 = rank(AB) = rank(B) − dim(ker A ∩ range B) = 3 − 1 = 2,
     2 = rank(AB) ≤ min{rank(A), rank(B)} = 2,
     2 = rank(AB) ≥ rank(A) + rank(B) − n = 2 + 3 − 3 = 2.
   • Fact 9 states that
     1 = max{null(A), null(B)} ≤ null(AB) = 1 ≤ null(A) + null(B) = 1.
     Fact 9 can fail for nonsquare matrices. For example, if D = [1 1], then
     max{null(D), null(D^T)} = 1, while null(DD^T) = 0, so the first inequality fails.
   • Fact 10 states that
     3 = rank(AB) + rank(BC) ≤ rank(B) + rank(ABC) = 4.

14.6 Useful Identities for the Inverse

This section presents facts and formulas related to inversion of matrices.

Facts:

1. [Oue81, (1.9)], [HJ85, p. 18] Recall that A/A[α] denotes the Schur complement of the principal submatrix A[α] in A. (See Section 4.2 and Section 10.3.) If A ∈ F^{n×n} is partitioned in blocks as
   A = [ A_{11}  A_{12}
         A_{21}  A_{22} ],
   where A_{11} and A_{22} are square matrices, then, provided that A, A_{11}, and A_{22} are invertible, the Schur complements A/A_{11} and A/A_{22} are invertible and
   A^{−1} = [ (A/A_{22})^{−1}                      −A_{11}^{−1}A_{12}(A/A_{11})^{−1}
              −(A/A_{11})^{−1}A_{21}A_{11}^{−1}     (A/A_{11})^{−1}                 ].
   More generally, given an invertible A ∈ F^{n×n} and α ⊆ {1, 2, ..., n} such that A[α] and A(α) are invertible, A^{−1} is obtained from A by replacing
   • A[α] by (A/A(α))^{−1},
   • A[α, α^c] by −A[α]^{−1}A[α, α^c](A/A[α])^{−1},
   • A[α^c, α] by −(A/A[α])^{−1}A[α^c, α]A[α]^{−1}, and
   • A(α) by (A/A[α])^{−1}.

2. [HJ85, pp. 18–19] Let A ∈ F^{n×n}, X ∈ F^{n×r}, R ∈ F^{r×r}, and Y ∈ F^{r×n}. Let B = A + XRY. Suppose that A, B, and R are invertible. Then
   B^{−1} = (A + XRY)^{−1} = A^{−1} − A^{−1}X(R^{−1} + YA^{−1}X)^{−1}YA^{−1}.

3. (Sherman–Morrison) Let A ∈ F^{n×n} and x, y ∈ F^n. Let B = A + xy^T. Suppose that A and B are invertible. Then, if y^T A^{−1} x ≠ −1,
   B^{−1} = (A + xy^T)^{−1} = A^{−1} − (1/(1 + y^T A^{−1} x)) A^{−1}xy^T A^{−1}.
   In particular, if y^T x ≠ −1, then
   (I + xy^T)^{−1} = I − (1/(1 + y^T x)) xy^T.

4. Let A ∈ F^{n×n}. Then the adjugate of A (see Section 4.2) satisfies
   (adj A)A = A(adj A) = (det A)I.
   If A is invertible, then
   A^{−1} = (1/det A) adj A.

5. Let A ∈ F^{n×n} be invertible and let its characteristic polynomial be
   p_A(x) = x^n + a_{n−1}x^{n−1} + a_{n−2}x^{n−2} + ··· + a_1 x + a_0.
   Then
   A^{−1} = ((−1)^{n+1}/det A)(A^{n−1} + a_{n−1}A^{n−2} + a_{n−2}A^{n−3} + ··· + a_2 A + a_1 I).

6. [Mey00, Sec. 7.10, p. 618] Let A ∈ C^{n×n}. The following statements are equivalent.
   • The Neumann series I + A + A² + ··· converges.
   • (I − A)^{−1} exists and (I − A)^{−1} = Σ_{k=0}^∞ A^k.
   • ρ(A) < 1.
   • lim_{k→∞} A^k = 0.

Examples:

1. Consider the partitioned matrix
   A = [ A_{11}  A_{12}
         A_{21}  A_{22} ] = [  1   3  −1
                               0   2   1
                              −1  −1   1 ].
   Since
   (A/A_{22})^{−1} = [ 0  2
                       1  3 ]^{−1} = [ −1.5  1
                                        0.5  0 ]
   and (A/A_{11})^{−1} = (−1)^{−1} = −1, by Fact 1, we have
   A^{−1} = [ (A/A_{22})^{−1}                      −A_{11}^{−1}A_{12}(A/A_{11})^{−1}
              −(A/A_{11})^{−1}A_{21}A_{11}^{−1}     (A/A_{11})^{−1}                 ]
          = [ −1.5   1  −2.5
               0.5   0   0.5
              −1     1  −1   ].

2. To illustrate Fact 3, consider the invertible matrix
   A = [  1    i  −1
          1    0   1
         −2i   1  −2 ]
   and the vectors x = y = [1 1 1]^T. We have that
   A^{−1} = [  0.5i     1 + 0.5i    0.5
              −1 − i    −1 + i      i
              −0.5i     −0.5i      −0.5 ].
   Adding xy^T to A amounts to adding 1 to each entry of A; since
   1 + y^T A^{−1} x = i ≠ 0,
   the resulting matrix is invertible and its inverse is given by
   (A + xy^T)^{−1} = A^{−1} − (1/i) A^{−1}xy^T A^{−1} = [  2.5        −0.5 − 0.5i   −1 − i
                                                           −2 + 2i     1             2
                                                           −1.5 − i    0.5 + 0.5i    i     ].
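The update formula of Fact 3 is the standard way to refresh an inverse after a rank-one change. The NumPy sketch below (not part of the handbook; random complex data rather than the example's matrix) compares it with direct inversion.

import numpy as np

# Check of Fact 3 (Sherman-Morrison) against direct inversion.
rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

Ainv = np.linalg.inv(A)
denom = 1 + y @ Ainv @ x                       # must be nonzero for the formula to apply
B_inv = Ainv - np.outer(Ainv @ x, y @ Ainv) / denom

assert np.allclose(B_inv, np.linalg.inv(A + np.outer(x, y)))
print("Sherman-Morrison update matches direct inversion")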
3. Consider the matrix
   A = [ −1   1  −1
          1  −1   3
          1  −1   2 ].
   Since A³ = 0, A is a nilpotent matrix and, thus, all its eigenvalues equal 0. That is, ρ(A) = 0 < 1. As a consequence of Fact 6, I − A is invertible and
   (I − A)^{−1} = I + A + A² = [ 1   0  1
                                 2  −1  5
                                 1  −1  3 ].

References

[Bru82] R.A. Brualdi. Matrices, eigenvalues, and directed graphs. Lin. Multilin. Alg., 11:143–165, 1982.
[HJ85] R.A. Horn and C.R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, 1985.
[HJ91] R.A. Horn and C.R. Johnson. Topics in Matrix Analysis. Cambridge University Press, Cambridge, 1991.
[MM92] M. Marcus and H. Minc. A Survey of Matrix Theory and Matrix Inequalities. Dover Publications, New York, 1992.
[Mey00] C.D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, Philadelphia, 2000.
[Oue81] D. Ouellette. Schur complements and statistics. Lin. Alg. Appl., 36:187–295, 1981.
[VK99] R.S. Varga and A. Krautstengl. On Geršgorin-type problems and ovals of Cassini. Electron. Trans. Numer. Anal., 8:15–20, 1999.