Chapter 38. Matrix Factorizations and Direct Solution of Linear Systems
Christopher Beattie
Virginia Polytechnic Institute and State University

38.1 Perturbations of Linear Systems
38.2 Triangular Linear Systems
38.3 Gauss Elimination and LU Decomposition
38.4 Orthogonalization and QR Decomposition
38.5 Symmetric Factorizations
References

The need to solve systems of linear equations arises often within diverse disciplines of science, engineering, and finance. The expression "direct solution of linear systems" refers generally to computational strategies that can produce solutions to linear systems after a predetermined number of arithmetic operations that depends only on the structure and dimension of the coefficient matrix. The evolution of computers has influenced, and continues to influence, the development of these strategies, as well as fostering particular styles of perturbation analysis suited to illuminating their behavior. As a result, some general themes have become dominant; others have been pushed aside. For example, Cramer's rule may properly be thought of as a direct solution strategy for solving linear systems; however, it requires a much larger number of arithmetic operations than Gauss elimination and is generally much more susceptible to the deleterious effects of rounding.

Most current approaches for the direct solution of a linear system, $Ax = b$, are patterned after Gauss elimination and favor systematically decoupling the system of equations. Zeros are introduced systematically into the coefficient matrix, transforming it into triangular form; the resulting triangular system is easily solved. The entire process can be viewed in this way:

1. Find invertible matrices $\{S_i\}_{i=1}^{\rho}$ such that $S_\rho \cdots S_2 S_1 A = U$ is triangular; then
2. Calculate the modified right-hand side $y = S_\rho \cdots S_2 S_1 b$; and then
3. Determine the solution set of the triangular system $Ux = y$.

The matrices $S_1, S_2, \dots, S_\rho$ are typically either row permutations of lower triangular matrices (Gauss transformations) or unitary matrices, and so have readily available inverses. Evidently, $A$ can be written as $A = NU$, where $N = (S_\rho \cdots S_2 S_1)^{-1}$. A solution framework may be built around the availability of decompositions such as this:

1. Find a decomposition $A = NU$ such that $U$ is triangular and $Ny = b$ is easily solved;
2. Solve $Ny = b$; then
3. Determine the solution set of the triangular system $Ux = y$.

38.1 Perturbations of Linear Systems

In the computational environment afforded by current computers, the finite representation of real numbers creates a small but persistent source of errors that may on occasion severely degrade the overall accuracy of a calculation. This effect is of fundamental concern in assessing strategies for solving linear systems. Rounding errors can be introduced into the solution process for linear systems often before any calculations are performed, as soon as data are stored within the computer and represented within its internal floating point number system. Further errors that may be introduced in the course of computation often may be viewed in aggregate as an effective additional contribution to this initial representation error.
Inevitably, then, the linear system for which a solution is computed deviates slightly from the "true" linear system, and it becomes of critical interest to determine whether such deviations will have a significant effect on the accuracy of the final computed result.

Definitions:

Let $A \in \mathbb{C}^{n\times n}$ be a nonsingular matrix, $b \in \mathbb{C}^n$, and denote by $\hat x = A^{-1}b$ the unique solution of the linear system $Ax = b$.

Given data perturbations $\delta A \in \mathbb{C}^{n\times n}$ and $\delta b \in \mathbb{C}^n$ to $A$ and $b$, respectively, the solution perturbation $\delta x \in \mathbb{C}^n$ satisfies the associated perturbed linear system $(A + \delta A)(\hat x + \delta x) = b + \delta b$ (presuming then that the perturbed system is consistent).

For any $\tilde x \in \mathbb{C}^n$, the residual vector associated with the linear system $Ax = b$ is defined as $r(\tilde x) = b - A\tilde x$.

For any $\tilde x \in \mathbb{C}^n$, the associated (norm-wise) relative backward error of the linear system $Ax = b$ (with respect to the $p$-norm) is
$$\eta_p(A,b;\tilde x) = \min\left\{\varepsilon \;\middle|\; \text{there exist } \delta A,\ \delta b \text{ such that } (A+\delta A)\tilde x = b + \delta b \text{ with } \|\delta A\|_p \le \varepsilon\|A\|_p,\ \|\delta b\|_p \le \varepsilon\|b\|_p\right\}$$
for $1 \le p \le \infty$.

For any $\tilde x \in \mathbb{C}^n$, the associated component-wise relative backward error of the linear system $Ax = b$ is
$$\omega(A,b;\tilde x) = \min\left\{\varepsilon \;\middle|\; \text{there exist } \delta A,\ \delta b \text{ such that } (A+\delta A)\tilde x = b + \delta b \text{ with } |\delta A| \le \varepsilon|A|,\ |\delta b| \le \varepsilon|b|\right\},$$
where the absolute values and inequalities applied to vectors and matrices are interpreted component-wise: $|B| \le |A|$ means $|b_{ij}| \le |a_{ij}|$ for all index pairs $i,j$.

The (norm-wise) condition number of the linear system $Ax = b$ (relative to the $p$-norm) is
$$\kappa_p(A,\hat x) = \frac{\|A^{-1}\|_p\,\|b\|_p}{\|\hat x\|_p}$$
for $1 \le p \le \infty$.

The matrix condition number relative to the $p$-norm of $A$ is $\kappa_p(A) = \|A\|_p\|A^{-1}\|_p$ for $1 \le p \le \infty$.

The Skeel condition number of the linear system $Ax = b$ is
$$\mathrm{cond}(A,\hat x) = \frac{\bigl\|\,|A^{-1}|\,|A|\,|\hat x|\,\bigr\|_\infty}{\|\hat x\|_\infty}.$$

The Skeel matrix condition number is $\mathrm{cond}(A) = \bigl\|\,|A^{-1}|\,|A|\,\bigr\|_\infty$.

Facts: [Hig96], [StS90]

1. For any $\tilde x \in \mathbb{C}^n$, $\tilde x$ is the exact solution to any one of the family of perturbed linear systems $(A + \delta A_\theta)\tilde x = b + \delta b_\theta$, where $\theta \in \mathbb{C}$, $\delta b_\theta = (\theta - 1)\,r(\tilde x)$, $\delta A_\theta = \theta\, r(\tilde x)\tilde y^*$, and $\tilde y \in \mathbb{C}^n$ is any vector such that $\tilde y^*\tilde x = 1$. In particular, for $\theta = 0$, $\delta A = 0$ and $\delta b = -r(\tilde x)$; for $\theta = 1$, $\delta A = r(\tilde x)\tilde y^*$ and $\delta b = 0$.
2. (Rigal–Gaches theorem) For any $\tilde x \in \mathbb{C}^n$,
$$\eta_p(A,b;\tilde x) = \frac{\|r(\tilde x)\|_p}{\|A\|_p\|\tilde x\|_p + \|b\|_p}.$$
If $\tilde y$ is the dual vector to $\tilde x$ with respect to the $p$-norm ($\tilde y^*\tilde x = \|\tilde y\|_q\|\tilde x\|_p = 1$ with $\frac1p + \frac1q = 1$), then $\tilde x$ is an exact solution to the perturbed linear system $(A + \delta A_{\tilde\theta})\tilde x = b + \delta b_{\tilde\theta}$ with data perturbations as in (1) and $\tilde\theta = \frac{\|A\|_p\|\tilde x\|_p}{\|A\|_p\|\tilde x\|_p + \|b\|_p}$, and as a result
$$\frac{\|\delta A_{\tilde\theta}\|_p}{\|A\|_p} = \frac{\|\delta b_{\tilde\theta}\|_p}{\|b\|_p} = \eta_p(A,b;\tilde x).$$
3. (Oettli–Prager theorem) For any $\tilde x \in \mathbb{C}^n$,
$$\omega(A,b;\tilde x) = \max_i \frac{|r_i|}{\bigl(|A|\,|\tilde x| + |b|\bigr)_i}.$$
If $D_1 = \mathrm{diag}\!\left(\frac{r_i}{(|A||\tilde x|+|b|)_i}\right)$ and $D_2 = \mathrm{diag}(\mathrm{sign}(\tilde x_i))$, then $\tilde x$ is an exact solution to the perturbed linear system $(A+\delta A)\tilde x = b + \delta b$ with $\delta A = D_1|A|D_2$ and $\delta b = -D_1|b|$,
$$|\delta A| \le \omega(A,b;\tilde x)\,|A| \quad\text{and}\quad |\delta b| \le \omega(A,b;\tilde x)\,|b|,$$
and no smaller constant can be used in place of $\omega(A,b;\tilde x)$.
4. The reciprocal of $\kappa_p(A)$ is the smallest norm-wise relative distance of $A$ to a singular matrix, i.e.,
$$\frac{1}{\kappa_p(A)} = \min\left\{\frac{\|\delta A\|_p}{\|A\|_p} \;\middle|\; A + \delta A \text{ is singular}\right\}.$$
In particular, the perturbed coefficient matrix $A + \delta A$ is nonsingular if $\frac{\|\delta A\|_p}{\|A\|_p} < \frac{1}{\kappa_p(A)}$.
5. $1 \le \kappa_p(A,\hat x) \le \kappa_p(A)$ and $1 \le \mathrm{cond}(A,\hat x) \le \mathrm{cond}(A) \le \kappa_\infty(A)$.
6. $\mathrm{cond}(A) = \min\{\kappa_\infty(DA) \mid D \text{ diagonal}\}$.
7. If $\delta A = 0$, then $\dfrac{\|\delta x\|_p}{\|\hat x\|_p} \le \kappa_p(A,\hat x)\,\dfrac{\|\delta b\|_p}{\|b\|_p}$.
8. If $\delta b = 0$ and $A + \delta A$ is nonsingular, then $\dfrac{\|\delta x\|_p}{\|\hat x + \delta x\|_p} \le \kappa_p(A)\,\dfrac{\|\delta A\|_p}{\|A\|_p}$.
9. If $\|\delta A\|_p \le \epsilon\|A\|_p$, $\|\delta b\|_p \le \epsilon\|b\|_p$, and $\epsilon < \frac{1}{\kappa_p(A)}$, then
$$\frac{\|\delta x\|_p}{\|\hat x\|_p} \le \frac{2\epsilon\,\kappa_p(A)}{1 - \epsilon\,\kappa_p(A)}.$$
10. If $|\delta A| \le \epsilon|A|$, $|\delta b| \le \epsilon|b|$, and $\epsilon < \frac{1}{\mathrm{cond}(A)}$, then
$$\frac{\|\delta x\|_\infty}{\|\hat x\|_\infty} \le \frac{2\epsilon\,\mathrm{cond}(A,\hat x)}{1 - \epsilon\,\mathrm{cond}(A)}.$$
Examples:

1. Let $A = \begin{bmatrix}1000 & 999\\ 999 & 998\end{bmatrix}$, so $A^{-1} = \begin{bmatrix}-998 & 999\\ 999 & -1000\end{bmatrix}$. Then $\|A\|_1 = \|A^{-1}\|_1 = 1999$, so that $\kappa_1(A) \approx 3.996\times10^6$. Consider $b = \begin{bmatrix}1999\\ 1997\end{bmatrix}$, associated with the solution $\hat x = \begin{bmatrix}1\\ 1\end{bmatrix}$. A perturbation of the right-hand side $\delta b = \begin{bmatrix}-0.01\\ 0.01\end{bmatrix}$ constitutes a relative change in the right-hand side of $\frac{\|\delta b\|_1}{\|b\|_1} \approx 5.005\times10^{-6}$, yet it produces a perturbed solution $\hat x + \delta x = \begin{bmatrix}20.97\\ -18.99\end{bmatrix}$, constituting a relative change $\frac{\|\delta x\|_1}{\|\hat x\|_1} = 19.98 \le 20 = \kappa_1(A)\frac{\|\delta b\|_1}{\|b\|_1}$. The bound determined by the condition number is very nearly achieved. Note that the same perturbed solution $\hat x + \delta x$ could be produced by a change in the coefficient matrix $A$ alone,
$$\delta A = \tilde r\,\tilde y^* = \begin{bmatrix}0.01\\ -0.01\end{bmatrix}\frac{1}{39.96}\begin{bmatrix}1 & -1\end{bmatrix} = \frac{1}{3996}\begin{bmatrix}1 & -1\\ -1 & 1\end{bmatrix},$$
constituting a relative change $\frac{\|\delta A\|_1}{\|A\|_1} \approx 2.5\times10^{-7}$. Then $(A + \delta A)(\hat x + \delta x) = b$.

2. Let $n = 100$ and let $A$ be tridiagonal with diagonal entries equal to $-2$ and all superdiagonal and subdiagonal entries equal to $1$ (associated with a centered difference approximation to the second derivative). Let $b$ be a vector with a quadratic variation in entries, $b_k = (k-1)(100-k)/10{,}000$. Then $\kappa_2(A,\hat x) \approx 1$, but $\kappa_2(A) \approx 4.1336\times10^3$. Since the elements of $b$ do not have an exact binary representation, the linear system that is presented to any computational algorithm will be $Ax = b + \delta b$ with $\|\delta b\|_2 \le \epsilon\|b\|_2$, where $\epsilon$ is the unit roundoff error. For example, if the linear system data is stored in IEEE single precision format, $\epsilon \approx 6\times10^{-8}$. The matrix condition number, $\kappa_2(A)$, would yield a bound of $(6\times10^{-8})(4.1336\times10^3) \approx 2.5\times10^{-4}$, anticipating the loss of more than 4 significant digits in solution components even if all computations were done on the stored data with no further error. However, the condition number of the linear system, $\kappa_2(A,\hat x)$, is substantially smaller, and the predicted error for the system is roughly the same as the initial representation error $\approx 6\times10^{-8}$, indicating that the solution will be fairly insensitive to the consequences of rounding of the right-hand side data, assuming no further errors occur. But, in fact, this conclusion remains true even if further errors occur, provided the computational algorithm that is used produces a small backward error, as might be asserted if, say, a final residual satisfies $\|r\|_2 \le O(\epsilon)\|b\|_2$. This situation changes substantially if the right-hand side is changed to $b_k = (-1)^k(k-1)(100-k)/10{,}000$, which only introduces a sign variation in $b$. In this case, $\kappa_2(A,\hat x) \approx \kappa_2(A)$, and the components of the computed solution can be expected to lose about 4 significant digits purely on the basis of errors that are made in the initial representation. Additional errors made in the course of the computation can hardly be expected to improve this situation.
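The quantities defined above are straightforward to evaluate numerically. The following is a minimal NumPy sketch (not part of the handbook text; the function names are illustrative only) of the norm-wise backward error of Fact 2, the component-wise backward error of Fact 3, and the two condition numbers; rows with zero denominator are simply skipped in the Oettli–Prager ratio, which is one of several possible conventions.

```python
import numpy as np

def normwise_backward_error(A, b, x_tilde, p=2):
    # Rigal-Gaches (Fact 2): eta_p = ||r||_p / (||A||_p ||x||_p + ||b||_p)
    r = b - A @ x_tilde
    return np.linalg.norm(r, p) / (np.linalg.norm(A, p) * np.linalg.norm(x_tilde, p)
                                   + np.linalg.norm(b, p))

def componentwise_backward_error(A, b, x_tilde):
    # Oettli-Prager (Fact 3): omega = max_i |r_i| / (|A||x| + |b|)_i
    r = np.abs(b - A @ x_tilde)
    denom = np.abs(A) @ np.abs(x_tilde) + np.abs(b)
    mask = denom > 0                      # skip rows with a zero denominator in this sketch
    return (r[mask] / denom[mask]).max() if mask.any() else 0.0

def condition_numbers(A, b, x_hat, p=2):
    # kappa_p(A, x_hat) = ||A^{-1}||_p ||b||_p / ||x_hat||_p  and  kappa_p(A) = ||A||_p ||A^{-1}||_p
    Ainv = np.linalg.inv(A)
    k_system = np.linalg.norm(Ainv, p) * np.linalg.norm(b, p) / np.linalg.norm(x_hat, p)
    k_matrix = np.linalg.norm(A, p) * np.linalg.norm(Ainv, p)
    return k_system, k_matrix

# The data of Example 1: a tiny perturbation of b produces a large change in the solution.
A = np.array([[1000.0, 999.0], [999.0, 998.0]])
b = np.array([1999.0, 1997.0])
x = np.linalg.solve(A, b + np.array([-0.01, 0.01]))
print(condition_numbers(A, b, np.array([1.0, 1.0]), p=1))     # both roughly 4e6
print(normwise_backward_error(A, b, x), componentwise_backward_error(A, b, x))
```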
38.2 Triangular Linear Systems

Systems of linear equations for which the unknowns may be solved for one at a time in sequence may be reordered to produce linear systems with triangular coefficient matrices. Such systems can be solved with both remarkable accuracy and remarkable efficiency. Triangular systems are the archetype for easily solvable systems of linear equations; as such, they often constitute an intermediate goal for strategies of solving linear systems.

Definitions:

A linear system of equations $Tx = b$ with $T \in \mathbb{C}^{n\times n}$ (representing $n$ equations in $n$ unknowns) is a triangular system if $T = [t_{ij}]$ is either an upper triangular matrix ($t_{ij} = 0$ for $i > j$) or a lower triangular matrix ($t_{ij} = 0$ for $i < j$).

Facts: [Hig96], [GV96]

1. [GV96, pp. 88–90]
Algorithm 1: Row-oriented forward substitution for solving a lower triangular system
Input: $L \in \mathbb{R}^{n\times n}$ with $\ell_{ij} = 0$ for $i < j$; $b \in \mathbb{R}^n$
Output: solution vector $x \in \mathbb{R}^n$ that satisfies $Lx = b$
  $x_1 \leftarrow b_1/\ell_{1,1}$
  For $k = 2:n$
    $x_k \leftarrow (b_k - L_{k,1:k-1}\cdot x_{1:k-1})/\ell_{k,k}$
  end

2. [GV96, pp. 88–90]
Algorithm 2: Column-oriented back substitution for solving an upper triangular system
Input: $U \in \mathbb{R}^{n\times n}$ with $u_{ij} = 0$ for $i > j$; $b \in \mathbb{R}^n$
Output: solution vector $x \in \mathbb{R}^n$ that satisfies $Ux = b$
  For $k = n$ down to $2$ by steps of $-1$
    $x_k \leftarrow b_k/u_{k,k}$
    $b_{1:k-1} \leftarrow b_{1:k-1} - x_k U_{1:k-1,k}$
  end
  $x_1 \leftarrow b_1/u_{1,1}$

3. Algorithm 1 involves as a core calculation dot products of portions of coefficient matrix rows with corresponding portions of the emerging solution vector. This can incur a performance penalty for large $n$ from accumulation of dot products using a scalar recurrence. A "column-oriented" reformulation may have better performance for large $n$; Algorithm 2 is a "column-oriented" formulation for solving upper triangular systems. (A short code sketch of both appears at the end of this section.)

4. The solution of triangular systems using either Algorithm 1 or 2 is componentwise backward stable. In particular, the computed result $\tilde x$ produced by either Algorithm 1 or 2 in solving a triangular system $Tx = b$ will be the exact result of a perturbed system $(T + \delta T)\tilde x = b$, where $|\delta T| \le \frac{n\epsilon}{1-n\epsilon}|T|$ and $\epsilon$ is the unit roundoff error.

5. The error in the solution of a triangular system $Tx = b$ using either Algorithm 1 or 2 satisfies
$$\frac{\|\tilde x - \hat x\|_\infty}{\|\hat x\|_\infty} \le \frac{n\epsilon\,\mathrm{cond}(T,\hat x)}{1 - n\epsilon\,(\mathrm{cond}(T)+1)}.$$

6. If $T = [t_{ij}]$ is a lower triangular matrix satisfying $|t_{ii}| \ge |t_{ij}|$ for $j \le i$, then the computed solution to the linear system $Tx = b$ produced by either Algorithm 1 or the variant of Algorithm 2 for lower triangular systems satisfies
$$|\hat x_i - \tilde x_i| \le \frac{2^i\,n\epsilon}{1 - n\epsilon}\,\max_{j\le i}|\tilde x_j|,$$
where $\tilde x_i$ are the components of the computed solution $\tilde x$, and $\hat x_i$ are the components of the exact solution $\hat x$. Although this bound degrades exponentially with $i$, it shows that early solution components will be computed to high accuracy relative to those components already computed.

Examples:

1. Use Algorithm 2 to solve the triangular system
$$\begin{bmatrix}1 & 2 & -3\\ 0 & 2 & -6\\ 0 & 0 & 3\end{bmatrix}\begin{bmatrix}x_1\\ x_2\\ x_3\end{bmatrix} = \begin{bmatrix}1\\ 1\\ 1\end{bmatrix}.$$
$k = 3$ step: solve for $x_3 = 1/3$. Update the right-hand side:
$$\begin{bmatrix}1 & 2\\ 0 & 2\\ 0 & 0\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} = \begin{bmatrix}1\\ 1\\ 1\end{bmatrix} - (1/3)\begin{bmatrix}-3\\ -6\\ 3\end{bmatrix} = \begin{bmatrix}2\\ 3\\ 0\end{bmatrix}.$$
$k = 2$ step: solve for $x_2 = 3/2$. Update the right-hand side:
$$\begin{bmatrix}1\\ 0\\ 0\end{bmatrix}x_1 = \begin{bmatrix}2\\ 3\\ 0\end{bmatrix} - (3/2)\begin{bmatrix}2\\ 2\\ 0\end{bmatrix} = \begin{bmatrix}-1\\ 0\\ 0\end{bmatrix}.$$
$k = 1$ step: solve for $x_1 = -1$.

2. ([Hig96, p. 156]) For $\varepsilon > 0$, consider
$$T = \begin{bmatrix}1 & 0 & 0\\ \varepsilon & \varepsilon & 0\\ 0 & 1 & 1\end{bmatrix}. \quad\text{Then}\quad T^{-1} = \begin{bmatrix}1 & 0 & 0\\ -1 & \frac{1}{\varepsilon} & 0\\ 1 & -\frac{1}{\varepsilon} & 1\end{bmatrix},$$
and so $\mathrm{cond}(T) = 5$, even though
$$\kappa_\infty(T) = 2\left(2 + \frac{1}{\varepsilon}\right) \approx \frac{2}{\varepsilon} + O(1).$$
Thus, linear systems having $T$ as a coefficient matrix will be solved to high relative accuracy, independent of both the right-hand side and the size of $\varepsilon$, despite the poor conditioning of $T$ (as measured by $\kappa_\infty$) as $\varepsilon$ becomes small. However, note that
$$\mathrm{cond}(T^T) = 1 + \frac{2}{\varepsilon} \quad\text{and}\quad \kappa_\infty(T^T) = (1+\varepsilon)\,\frac{2}{\varepsilon} \approx \frac{2}{\varepsilon} + O(1).$$
So, linear systems having $T^T$ as a coefficient matrix may have solutions that are sensitive to perturbations; indeed, $\mathrm{cond}(T^T,\hat x)$ can be comparable to $\mathrm{cond}(T^T)$ for suitable right-hand sides $b$, yielding solutions that are sensitive to perturbations for small $\varepsilon$.
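As referenced in Fact 3, the following is a minimal NumPy sketch (not part of the handbook text) of the row-oriented forward substitution of Algorithm 1 and the column-oriented back substitution of Algorithm 2.

```python
import numpy as np

def forward_substitution(L, b):
    """Row-oriented forward substitution (Algorithm 1): solve L x = b, L lower triangular."""
    n = L.shape[0]
    x = np.zeros(n)
    x[0] = b[0] / L[0, 0]
    for k in range(1, n):
        # dot product of the leading part of row k with the solution computed so far
        x[k] = (b[k] - L[k, :k] @ x[:k]) / L[k, k]
    return x

def back_substitution(U, b):
    """Column-oriented back substitution (Algorithm 2): solve U x = b, U upper triangular."""
    n = U.shape[0]
    b = b.astype(float).copy()      # the right-hand side is overwritten with updates
    x = np.zeros(n)
    for k in range(n - 1, 0, -1):
        x[k] = b[k] / U[k, k]
        # subtract the contribution of x[k] from the remaining right-hand side
        b[:k] -= x[k] * U[:k, k]
    x[0] = b[0] / U[0, 0]
    return x

# Example 1 of this section
U = np.array([[1.0, 2.0, -3.0], [0.0, 2.0, -6.0], [0.0, 0.0, 3.0]])
print(back_substitution(U, np.array([1.0, 1.0, 1.0])))   # approx [-1, 1.5, 0.3333]
```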
ε ε Matrix Factorizations and Direct Solution of Linear Systems 38-7 So, linear systems having T T as a coefficient matrix may have solutions that are sensitive to perturbations and indeed, cond(T T , x̂) ≈ cond(T T ) for any right-hand side b with b3 = 0 yielding solutions that are sensitive to perturbations for small . 38.3 Gauss Elimination and LU Decomposition Gauss elimination is an elementary approach to solving systems of linear equations, yet it still constitutes the core of the most sophisticated of solution strategies. In the k th step, a transformation matrix, Mk , (a “Gauss transformation”) is designed so as to introduce zeros into A — typically into a portion of the k th column — without harming zeros that have been introduced in earlier steps. Typically, successive applications of Gauss transformations are interleaved with row interchanges. Remarkably, this reduction process can be viewed as producing a decomposition of the coefficient matrix A = NU , where U is a triangular matrix and N is a row permutation of a lower triangular matrix. Definitions: For each index k, a Gauss vector is a vector in Cn with the leading k entries equal to zero: k = [0, . . . , 0, k+1 , . . . , n ]T . The entries k+1 , . . . , n are Gauss multipliers and the associated matrix k Mk = I − k ekT is called a Gauss transformation. For the pair of indices (i, j ), with i ≤ j the associated permutation matrix, i, j is an n × n identity matrix with the ith row and jth row interchanged. Note that i,i is the identity matrix. A matrix U ∈ Cm×n is in row-echelon form if (1) the first nonzero entry of each row has a strictly smaller column index than all nonzero entries having a strictly larger row index and (2) zero rows occur at the bottom. The first nonzero entry in each row of U is called a pivot. Thus, the determining feature of row echelon form is that pivots occur to the left of all nonzero entries in lower rows. A matrix A ∈ Cm×n has an LU decomposition if there exists a unit lower triangular matrix L ∈ Cm×m (L i, j = 0 for i < j and L i,i = 1 for all i ) and an upper triangular matrix U ∈ Cm×n (Ui, j = 0 for i > j ) such that A = LU . Facts: [GV96] 1. Let a ∈ Cn be a vector with a nonzero component in the r th entry, ar = 0. Define the Gauss vector, , . . . , aanr ]T . The associated Gauss transformation Mr = I − r erT introduces r = [0, . . . , 0, aar +1 r r zeros into the last n − r entries of a: Mr a = [a1 , . . . , ar , 0, . . . , 0]T . 2. If A ∈ Cm×n with rank(A) = ρ ≥ 1 has ρ leading principal submatrices nonsingular, A1:r,1:r , r = 1, . . . , ρ, then there exist Gauss transformations M1 , M2 , . . . , Mρ so that Mρ Mρ−1 · · · M1 A = U with U upper triangular. Each Gauss transformation Mr introduces zeros into the r th column. 3. Gauss transformations are unit lower triangular matrices. They are invertible, and for the Gauss transformation, Mr = I − r erT , Mr−1 = I + r erT . 38-8 Handbook of Linear Algebra 4. If Gauss vectors 1 , 2 , . . . , n−1 are given with ⎧ ⎫ 0⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 21 ⎪ ⎪ ⎨ ⎪ ⎬ 1 = 31 , ⎪ ⎪ ⎪ .. ⎪ ⎪ ⎪ ⎪ . ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ⎪ ⎭ n1 ⎧ ⎫ 0⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 0 ⎪ ⎪ ⎪ ⎨ ⎪ ⎬ 2 = 32 , . . . , ⎪ ⎪ ⎪ .. ⎪ ⎪ ⎪ ⎪ . ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ⎪ ⎭ n2 ⎧ ⎫ 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 0 ⎪ ⎪ ⎪ ⎪ ⎨ ⎬ . .. = , ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ⎭ n,n−1 n−1 then the product of Gauss transformations Mn−1 Mn−2 · · · M2 M1 is invertible and has an explicit inverse (Mn−1 Mn−2 . . . M2 M1 )−1 ⎡ 1 ⎢ ⎢ 21 ⎢ n−1 ! ⎢ T =I+ k ek = ⎢ ⎢31 ⎢. k=1 ⎢. ⎣. n1 0 ... 0 .. . 1 n2 ⎤ ⎥ ⎥ ⎥ 0⎥ ⎥. ⎥ ⎥ 0⎦ 1 0⎥ 1 32 0 . . . n,n−1 5. 
5. If $A \in \mathbb{C}^{m\times n}$ with $\mathrm{rank}(A) = \rho$ has its $\rho$ leading principal submatrices $A_{1:r,1:r}$, $r = 1,\dots,\rho$, nonsingular, then $A$ has an LU decomposition, $A = LU$, with $L$ unit lower triangular and $U$ upper triangular. The $(i,j)$ entry of $L$, $L_{i,j}$ with $i > j$, is the Gauss multiplier used to introduce a zero into the corresponding $(i,j)$ entry of $A$. If, additionally, $\rho = m$, then the LU decomposition is unique.
6. Let $a$ be an arbitrary vector in $\mathbb{C}^n$. For any index $r$, there is an index $\mu \ge r$, a permutation matrix $\Pi_{r,\mu}$, and a Gauss transformation $M_r$ so that
$$M_r\Pi_{r,\mu}\,a = [a_1,\dots,a_{r-1},a_\mu,\underbrace{0,\dots,0}_{n-r}]^T.$$
The index $\mu$ is chosen so that $a_\mu \ne 0$ out of the set $\{a_r, a_{r+1},\dots,a_n\}$. If $a_r \ne 0$, then $\mu = r$ and $\Pi_{r,\mu} = I$ is a possible choice; if each element is zero, $a_r = a_{r+1} = \cdots = a_n = 0$, then $\mu = r$, $\Pi_{r,\mu} = I$, and $M_r = I$ is a possible choice.
7. For every matrix $A \in \mathbb{C}^{m\times n}$ with $\mathrm{rank}(A) = \rho$, there exist a sequence of $\rho$ indices $\mu_1,\mu_2,\dots,\mu_\rho$ with $i \le \mu_i \le m$ for $i = 1,\dots,\rho$ and Gauss transformations $M_1, M_2,\dots,M_\rho$ so that
$$M_\rho\Pi_{\rho,\mu_\rho}\,M_{\rho-1}\Pi_{\rho-1,\mu_{\rho-1}}\cdots M_1\Pi_{1,\mu_1}A = U$$
with $U$ upper triangular and in row echelon form. Each pair of transformations $M_r\Pi_{r,\mu_r}$ introduces zeros below the $r$th pivot.
8. For $r < i < j$, $\Pi_{i,j}M_r = \widetilde M_r\,\Pi_{i,j}$, where $\widetilde M_r = I - \tilde{\boldsymbol{\ell}}_r e_r^T$ and $\tilde{\boldsymbol{\ell}}_r = \Pi_{i,j}\boldsymbol{\ell}_r$ (i.e., the $i$ and $j$ entries of $\boldsymbol{\ell}_r$ are interchanged to form $\tilde{\boldsymbol{\ell}}_r$).
9. For every matrix $A \in \mathbb{C}^{m\times n}$ with $\mathrm{rank}(A) = \rho$, there is a row permutation of $A$ that has an LU decomposition: $PA = LU$, with a permutation matrix $P$, a unit lower triangular matrix $L$, and an upper triangular matrix $U$ that is in row echelon form. $P$ can be chosen as $P = \Pi_{\rho,\mu_\rho}\Pi_{\rho-1,\mu_{\rho-1}}\cdots\Pi_{1,\mu_1}$ from (7), though in general there can be many other possibilities as well.
10. Reduction of $A$ with Gauss transformations (or, equivalently, calculation of an LU factorization) must generally incorporate row interchanges. As a practical matter, these row interchanges commonly are chosen so as to bring the largest magnitude entry within the column being reduced up into the pivot location. This strategy is called "partial pivoting." In particular, if zeros are to be introduced into the $k$th column below the $r$th row (with $r \le k$), then one seeks an index $\mu_r$ such that $r \le \mu_r \le m$ and $|A_{\mu_r,k}| = \max_{r\le i\le m}|A_{i,k}|$. When $\mu_1,\mu_2,\dots,\mu_\rho$ in (7) are chosen in this way, the reduction process is called "Gaussian Elimination with Partial Pivoting" (GEPP) or, within the context of factorization, the permuted LU factorization (PLU).
11. [GV96, p. 115]
Algorithm 1: GEPP/PLU decomposition of a rectangular matrix (outer product)
Input: $A \in \mathbb{R}^{m\times n}$
Output: $L \in \mathbb{R}^{m\times m}$ (unit lower triangular matrix), $U \in \mathbb{R}^{m\times n}$ (upper triangular matrix in row echelon form), and $P \in \mathbb{R}^{m\times m}$ (permutation matrix) so that $PA = LU$ ($P$ is represented with an index vector $p$ such that $y = Pz \Leftrightarrow y_j = z_{p_j}$)
  $L \leftarrow I_m$; $U \leftarrow 0 \in \mathbb{R}^{m\times n}$; $p = [1,2,3,\dots,m]$; $r \leftarrow 1$
  For $k = 1$ to $n$
    Find $\mu$ such that $r \le \mu \le m$ and $|A_{\mu,k}| = \max_{r\le i\le m}|A_{i,k}|$
    If $A_{\mu,k} \ne 0$, then
      Exchange $A_{\mu,k:n} \leftrightarrow A_{r,k:n}$, $L_{\mu,1:r-1} \leftrightarrow L_{r,1:r-1}$, and $p_\mu \leftrightarrow p_r$
      $L_{r+1:m,r} \leftarrow A_{r+1:m,k}/A_{r,k}$
      $U_{r,k:n} \leftarrow A_{r,k:n}$
      For $i = r+1$ to $m$
        For $j = k+1$ to $n$
          $A_{i,j} \leftarrow A_{i,j} - L_{i,r}U_{r,j}$
      $r \leftarrow r+1$
12. [GV96, p. 115]
Algorithm 2: GEPP/PLU decomposition of a rectangular matrix (gaxpy)
Input: $A \in \mathbb{R}^{m\times n}$
Output: $L \in \mathbb{R}^{m\times m}$ (unit lower triangular matrix), $U \in \mathbb{R}^{m\times n}$ (upper triangular matrix in row echelon form), and $P \in \mathbb{R}^{m\times m}$ (permutation matrix) so that $PA = LU$ ($P$ is represented with an index vector $\pi$ that records row interchanges: $\pi_r = \mu$ means row $r$ and row $\mu > r$ were interchanged)
  $L \leftarrow I_m \in \mathbb{R}^{m\times m}$; $U \leftarrow 0 \in \mathbb{R}^{m\times n}$; $r \leftarrow 1$
  For $j = 1$ to $n$
    $v \leftarrow A_{1:m,j}$
    If $r > 1$, then
      For $i = 1$ to $r-1$, exchange $v_i \leftrightarrow v_{\pi_i}$
      Solve the triangular system $L_{1:r-1,1:r-1}\cdot z = v_{1:r-1}$; $U_{1:r-1,j} \leftarrow z$
      Update $v_{r:m} \leftarrow v_{r:m} - L_{r:m,1:r-1}\cdot z$
    Find $\mu$ such that $|v_\mu| = \max_{r\le i\le m}|v_i|$
    If $v_\mu \ne 0$, then
      $\pi_r \leftarrow \mu$
      Exchange $v_\mu \leftrightarrow v_r$
      For $i = 1$ to $r-1$, exchange $L_{\mu,i} \leftrightarrow L_{r,i}$
      $L_{r+1:m,r} \leftarrow v_{r+1:m}/v_r$
      $U_{r,j} \leftarrow v_r$
      $r \leftarrow r+1$
13. The condition for skipping reduction steps ($A_{\mu,k} = 0$ in Algorithm 1 and $v_\mu = 0$ in Algorithm 2) indicates deficiency of column rank and the potential for an infinite number of solutions. These conditions are sensitive to rounding errors that may occur in the calculation of those columns and, as such, GEPP/PLU is applied for the most part in full column rank settings ($\mathrm{rank}(A) = n$), guaranteeing that no zero pivots are encountered and that no reduction steps are skipped.
14. Both Algorithms 1 and 2 require approximately $\frac{2}{3}\rho^3 + \rho m(n-\rho) + \rho n(m-\rho)$ arithmetic operations. Algorithm 1 involves as a core calculation the updating of a submatrix having ever-diminishing size. For large matrix dimension, the contents of this submatrix, $A_{r+1:m,k+1:n}$, may be widely scattered through computer memory, and a performance penalty can occur in gathering the data for computation (which can be costly relative to the number of arithmetic operations that must be performed). Algorithm 2 is a reorganization that avoids excess data motion by delaying updates to columns until the step within which they have zeros introduced. This forces modifications to the matrix entries to be made just one column at a time, and the necessary data motion can be more efficient. (A compact code sketch of partial pivoting appears at the end of this section.)
15. Other strategies for avoiding the adverse effects of small pivots exist. Some are more aggressive than partial pivoting in producing the largest possible pivot; others are more restrained. "Complete pivoting" uses both row and column permutations to bring in the largest possible pivot: if zeros are to be introduced into the $k$th column in row entries $r+1$ to $m$, then one seeks indices $\mu$ and $\nu$ such that $r \le \mu \le m$ and $k < \nu \le n$ and $|A_{\mu,\nu}| = \max_{r\le i\le m,\ k<j\le n}|A_{i,j}|$. Gauss elimination with complete pivoting produces a unit lower triangular matrix $L \in \mathbb{R}^{m\times m}$, an upper triangular matrix $U \in \mathbb{R}^{m\times n}$, and two permutation matrices, $P$ and $Q$, so that $PAQ = LU$. "Threshold pivoting" identifies pivot candidates in each step that achieve a significant (predetermined) fraction of the magnitude of the pivot that would have been used in that step for partial pivoting: consider all $\hat\mu$ such that $r \le \hat\mu \le m$ and $|A_{\hat\mu,k}| \ge \tau\cdot\max_{r\le i\le m}|A_{i,k}|$, where $\tau \in (0,1)$ is a given threshold. This allows pivots to be chosen on the basis of other criteria, such as influence on sparsity, while still providing some protection from instability. $\tau$ can often be chosen quite small ($\tau = 0.1$ or $\tau = 0.025$ are typical values).
16.
If P̂ ∈ Rm×m , L̂ ∈ Rm×m , and Û ∈ Rm×n are the computed permutation matrix and LU factors from either Algorithm 1 or 2 on A ∈ Rm×n , then L̂ Û = P̂ (A + δ A) with 2n | L̂ | |Û | 1−n |δ A| ≤ and for the particular case that m = n and A is nonsingular, if an approximate solution, x̂, to Ax = b is computed by solving the two triangular linear systems, L̂ y = P̂ b and Û x̂ = y, then x̂ is the exact solution to a perturbed linear system: (A + δ A)x̂ = b with |δ A| ≤ 2n P̂ T | L̂ | |Û |. 1−n Furthermore, |L i, j | ≤ 1 and |Ui, j | ≤ 2i −1 maxk≤i |Ak, j |, so δ A∞ ≤ 2n n2 A∞ . 1−n Examples: 1. Using Algorithm 1, find a permuted LU factorization of ⎡ 1 ⎢ 2 ⎢ A=⎣ −1 1 1 2 −1 1 2 4 −1 3 ⎤ 2 6⎥ ⎥. 1⎦ 1 38-11 Matrix Factorizations and Direct Solution of Linear Systems p = [1 2 3 4], r ← 1 µ ← 2 p = [2⎡1 3 4] 2 Setup: k = 1 step: Permuted A: ⎡ L= LU snapshot: 2 4 6 ⎤ ⎢ 1 1 2 3⎥ ⎢ ⎥ ⎣−1 −1 −2 1⎦ 1 ⎤1 1 0 0 0 3 1 ⎡ 2 2 4 6 ⎢ 12 1 0 0⎥ ⎢0 0 0 0⎥ ⎢ ⎥ ⎢ ⎥ ⎣− 12 0 1 0⎦ and U = ⎣0 0 0 0⎦. 1 2 0 0 1 ⎡ 0 0 −1 0 1 −2 Updated A2:4,2:4 : ⎣0 1 ⎤ 0 0 0 0 4⎦ r ←2 µ ← 2, |A2,2 | = max2≤i ≤4 |Ai,2 | = 0 µ ← 3, p = [2 3 1 ⎡ 4], |A3,3 |⎤ = max2≤i ≤4 |Ai,3 | = 1 1 4 k = 2 step: k = 3 step: ⎣0 −1⎦ Permuted A2:4,3:4 : 1 −2 ⎡ ⎤ ⎡ 1 0 0 0⎤ 2 2 4 6 1 ⎢0 0 1 4⎥ ⎢− 2 1 0 0⎥ ⎥ L= ⎣ 1 ⎦ and U = ⎢ ⎣0 0 0 0⎦. 0 1 0 2 LU snapshot: 1 2 1 Updated A3:4,4 : 0 1 0 0 0 0 −1 −6 r ←3 µ ← 4, p = [2 3 41], |A 4,4 | = max3≤i ≤4 |Ai,4 | = 6 −6 Permuted A2:4,3:4 : −1 ⎡ ⎤ ⎡ 1 0 0 0 2 2 4 k = 4 step: L= LU snapshot: 0 1 6 0 1 0 0 The permutation matrix associated with p = [2 3 4 1] is ⎡ ⎤ 0 1 0 0 ⎡ ⎤ ⎢0 ⎢ P = ⎢ ⎣0 0 1 0 0 1⎦ 1 0 0 0 ⎡ 1 ⎢ ⎥ ⎢ 1 ⎢−1 −1 − 1⎥ ⎢− 2 ⎥=⎢ PA= ⎢ ⎢ 1 1 1 3 1⎥ ⎣ ⎦ ⎣ 2 1 1 1 2 2 2 2 2 4 6 0 1 1 0 6 ⎤ ⎢− 12 1 0 0⎥ ⎢0 0 1 4⎥ ⎢ ⎥ ⎢ ⎥ ⎣ 12 1 1 0⎦ and U = ⎣0 0 0 −6⎦. 1 2 and ⎤ 0⎥ ⎥ ⎥ ⎤ ⎡ 2 ⎥ ⎢ 0 0⎥ ⎢0 ⎥ ⎢ 1 0⎦ ⎣0 1 0 1 6 0 0 2. Using Algorithm 2, solve the system of linear equations ⎡ ⎤⎡ ⎤ ⎡ ⎤ 1 3 1 x1 1 ⎢ ⎥⎢ ⎥ ⎢ ⎥ 2 −1⎦ ⎣ x2 ⎦ = ⎣−3⎦ . ⎣2 2 −1 0 3 x3 6 ⎤ 2 4 0 1 0 0 −6⎦ 0 0 0 4⎥ ⎥ ⎥ = L · U. 0 38-12 Handbook of Linear Algebra Phase 1: Find permuted LU decomposition. r ←1 ⎡ ⎤ ⎡ ⎤ 1 j = 1 step: 2 v ← ⎣2⎦. π 1 ← µ = 2. Permuted v: ⎣1⎦ 2 LU snapshot: ⎡ 1 0 0 L = ⎣ 12 π = [2] 2 ⎤ 2 0 0 0⎦ 1 ⎡ j = 2 step: 3 ⎤ ⎤ and U = ⎣0 0 0⎦ . 0 0 0 1 0 1 r ←2 ⎡ ⎡ 2 ⎤ v ← ⎣ 2⎦. Permuted v: ⎣ 3⎦. −1 −1 Solve 1 · z = 2. U1,2 ← z = [2]. v2 ← v3 2 −3 = 3 1 2 − −1 [2] 1 π2 ← µ = 3. L 3,2 ⎡ ← − 23 and U⎤ 2,2 ← −3 1 0 0 LU snapshot: π = [2, 3] L = ⎣1 1 0⎦ − 23 1 2 r ←3 j = 3 step: v← U1,3 U2,3 ←z= π = [2, 3, 3] 2 0 0 0 0 0 1 0 L = ⎣1 1 1 2 − 23 ⎤ 0 0⎦ 1 −1 ·z= , 0 −1 . v 3 ← 2 16 = 1 − [ 12 , − 23 ] · 1 0⎦ . −1 π 3 → 3 and U3,3 ←⎡ 2 16 1 LU snapshot: ⎤ −1 1 1 ⎣ ⎦ 0 . Solve −1 Permuted v: 1 0 1 ⎤ 2 and U = ⎣0 −3 1 ⎡ ⎡ 1 ⎡ 2 2 and U = ⎣0 −3 0 0 ⎡ ⎤ 0 1 0 ⎢ ⎥ The permutation matrix associated with π is P = ⎣0 0 1⎦ and 1 0 0 ⎡ ⎤ ⎡ ⎤⎡ ⎤ 2 2 −1 1 0 0 2 2 −1 ⎢ ⎥ ⎢ ⎥⎢ ⎥ P A = ⎣2 −1 0⎦ = ⎣ 1 1 0⎦ ⎣0 −3 1⎦ = L · U. 1 1 3 1 0 0 2 16 − 23 1 2 Phase 2: Solve the lower triangular system L y = P b. ⎡ ⎤⎡ ⎤ ⎡ ⎤ y1 −3 ⎢ ⎥⎢ ⎥ ⎢ ⎥ 1 0⎦ ⎣ y 2 ⎦ = ⎣ 3 ⎦ ⎣1 1 − 23 1 y3 1 2 1 0 0 ⇒ 1 y1 = −3, y2 = 6, y3 = 6 . 2 Phase 3: Solve the upper triangular system U x = y. ⎡ ⎤⎡ ⎤ ⎡ ⎤ 2 2 −1 x1 −3 ⎢ ⎥⎢ ⎥ ⎢ ⎥ 1⎦ ⎣ x 2 ⎦ = ⎣ 6 ⎦ ⎣0 −3 0 0 2 16 6 12 x3 ⇒ x1 = 1, x2 = −1, x3 = 3. ⎤ −1 1⎦ . 
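As referenced in Fact 14, here is a minimal NumPy sketch (not part of the handbook text) of GEPP/PLU in the common square, nonsingular setting, followed by the two triangular solves; it follows the outer-product organization of Algorithm 1, stores the permutation as an index vector, and uses SciPy's `solve_triangular` for brevity. The function names are illustrative only.

```python
import numpy as np
from scipy.linalg import solve_triangular

def plu(A):
    """GEPP/PLU in the square, nonsingular case: returns p, L, U with A[p] = L @ U."""
    U = A.astype(float).copy()
    n = U.shape[0]
    L = np.eye(n)
    p = np.arange(n)                              # row permutation as an index vector
    for k in range(n - 1):
        mu = k + np.argmax(np.abs(U[k:, k]))      # partial pivoting: largest entry in column k
        if U[mu, k] == 0.0:
            continue                              # (rank-deficient column; simply skipped here)
        U[[k, mu], k:] = U[[mu, k], k:]           # interchange rows of the active submatrix
        L[[k, mu], :k] = L[[mu, k], :k]           # ...and of the multipliers already stored
        p[[k, mu]] = p[[mu, k]]
        L[k+1:, k] = U[k+1:, k] / U[k, k]         # Gauss multipliers
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])   # outer-product update
    return p, L, np.triu(U)

def solve(A, b):
    """Solve Ax = b via PLU and two triangular solves (cf. Algorithms 1 and 2 of Section 38.2)."""
    p, L, U = plu(A)
    y = solve_triangular(L, b[p], lower=True)
    return solve_triangular(U, y, lower=False)

# The data of Example 2 above
A = np.array([[1.0, 3.0, 1.0], [2.0, 2.0, -1.0], [2.0, -1.0, 0.0]])
print(solve(A, np.array([1.0, -3.0, 3.0])))       # approx [1, -1, 3]
```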
38.4 Orthogonalization and QR Decomposition

The process of transforming an arbitrary linear system into a triangular system may also be approached by systematically introducing zeros into the coefficient matrix with unitary transformations: given a system $Ax = b$, find unitary matrices $V_1, V_2,\dots,V_\nu$ such that $V_\nu\cdots V_2V_1A = T$ is triangular; calculate $y = V_\nu\cdots V_2V_1 b$; solve the triangular system $Tx = y$. Two different types of rudimentary unitary transformations are described here: Householder transformations and Givens transformations.

Definitions:

Let $v \in \mathbb{C}^n$ be a nonzero vector. The matrix $H = I - \frac{2}{\|v\|_2^2}vv^*$ is called a Householder transformation (or Householder reflector). In this context, $v$ is called a Householder vector.

For $\theta, \vartheta \in [0,2\pi)$, let $G(i,j,\theta,\vartheta)$ be an $n\times n$ identity matrix modified so that the $(i,i)$ and $(j,j)$ entries are replaced by $c = \cos(\theta)$, the $(i,j)$ entry is replaced by $s = e^{\imath\vartheta}\sin(\theta)$, and the $(j,i)$ entry is replaced by $-\bar s = -e^{-\imath\vartheta}\sin(\theta)$:
$$G(i,j,\theta,\vartheta) = \begin{bmatrix}1 & \cdots & 0 & \cdots & 0 & \cdots & 0\\ \vdots & \ddots & \vdots & & \vdots & & \vdots\\ 0 & \cdots & c & \cdots & s & \cdots & 0\\ \vdots & & \vdots & \ddots & \vdots & & \vdots\\ 0 & \cdots & -\bar s & \cdots & c & \cdots & 0\\ \vdots & & \vdots & & \vdots & \ddots & \vdots\\ 0 & \cdots & 0 & \cdots & 0 & \cdots & 1\end{bmatrix}.$$
$G(i,j,\theta,\vartheta)$ is called a Givens transformation (or Givens rotation).

Facts: [GV96]

1. Householder transformations are unitary matrices.
2. Let $a \in \mathbb{C}^n$ be a nonzero vector. Define $v = \mathrm{sign}(a_1)\|a\|e_1 + a$ with $e_1 = [1,0,\dots,0]^T \in \mathbb{C}^n$. Then the Householder transformation $H = I - \frac{2}{\|v\|_2^2}vv^*$ satisfies $Ha = \alpha e_1$ with $\alpha = -\mathrm{sign}(a_1)\|a\|$.
3. [GV96, pp. 210–213]
Algorithm 3: Householder QR factorization
Input: matrix $A \in \mathbb{C}^{m\times n}$ with $m \ge n$
Output: the QR factorization $A = QR$, where the upper triangular part of $R$ is stored in the upper triangular part of $A$
  $Q = I_m$
  For $k = 1:n$
    $x = A_{k:m,k}$
    $v_k = \mathrm{sign}(x_1)\|x\|e_1 + x$, where $e_1 \in \mathbb{C}^{m-k+1}$
    $v_k = v_k/\|v_k\|$
    $A_{k:m,k:n} = (I_{m-k+1} - 2v_kv_k^*)A_{k:m,k:n}$
    $Q_{1:k-1,k:m} = Q_{1:k-1,k:m}(I_{m-k+1} - 2v_kv_k^*)$
    $Q_{k:m,k:m} = Q_{k:m,k:m}(I_{m-k+1} - 2v_kv_k^*)$
4. [GV96, p. 215] A Givens rotation is a unitary matrix.
5. [GV96, pp. 216–221] For any scalars $x, y \in \mathbb{C}$, there exists a Givens rotation $G = \begin{bmatrix}c & s\\ -\bar s & c\end{bmatrix} \in \mathbb{C}^{2\times2}$ such that
$$G\begin{bmatrix}x\\ y\end{bmatrix} = \begin{bmatrix}r\\ 0\end{bmatrix},$$
where $c$, $s$, and $r$ can be computed via
  - If $y = 0$ (includes the case $x = y = 0$), then $c = 1$, $s = 0$, $r = x$.
  - If $x = 0$ ($y$ must be nonzero), then $c = 0$, $s = \mathrm{sign}(\bar y)$, $r = |y|$.
  - If both $x$ and $y$ are nonzero, then $c = |x|/\sqrt{|x|^2+|y|^2}$, $s = \mathrm{sign}(x)\,\bar y/\sqrt{|x|^2+|y|^2}$, $r = \mathrm{sign}(x)\sqrt{|x|^2+|y|^2}$.
6. [GV96, pp. 226–227]
Algorithm 4: Givens QR factorization
Input: matrix $A \in \mathbb{C}^{m\times n}$ with $m \ge n$
Output: the QR factorization $A = QR$, where the upper triangular part of $R$ is stored in the upper triangular part of $A$
  $Q = I_m$
  For $k = 1:n$
    For $i = k+1:m$
      $[x, y] = [A_{kk}, A_{ik}]$
      Compute $G = \begin{bmatrix}c & s\\ -\bar s & c\end{bmatrix}$ via Fact 5
      $\begin{bmatrix}A_{k,k:n}\\ A_{i,k:n}\end{bmatrix} = G\begin{bmatrix}A_{k,k:n}\\ A_{i,k:n}\end{bmatrix}$
      $[Q_{1:m,k},\ Q_{1:m,i}] = [Q_{1:m,k},\ Q_{1:m,i}]\,G^*$
7. [GV96, p. 212] In many applications, it is not necessary to compute $Q$ explicitly in Algorithm 3 and Algorithm 4. See also [TB97, p. 74] for details.
8. [GV96, pp. 225–227] If $A \in \mathbb{R}^{m\times n}$ with $m \ge n$, then the cost of Algorithm 3 without explicitly computing $Q$ is $2n^2(m - n/3)$ flops, and the cost of Algorithm 4 without explicitly computing $Q$ is $3n^2(m - n/3)$ flops.
9. [Mey00, p. 349] Algorithm 3 and Algorithm 4 are numerically stable for computing the QR factorization.
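Before the worked examples, here is a minimal NumPy sketch (not part of the handbook text) of the Householder QR factorization of Algorithm 3 for real matrices; it accumulates $Q$ explicitly for clarity, which Fact 7 notes is often unnecessary in practice.

```python
import numpy as np

def householder_qr(A):
    """Householder QR (Algorithm 3, real case): returns Q, R with A = Q @ R."""
    R = A.astype(float).copy()
    m, n = R.shape
    Q = np.eye(m)
    for k in range(n):
        x = R[k:, k]
        # Householder vector v = sign(x1) * ||x|| * e1 + x, normalized (Fact 2)
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        norm_v = np.linalg.norm(v)
        if norm_v == 0.0:
            continue                     # column is already zero below the diagonal
        v /= norm_v
        # apply the reflector I - 2 v v^T to the trailing columns and accumulate Q
        R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, np.triu(R)

A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
Q, R = householder_qr(A)
print(R)                        # approx [[-1.7321, -3.4641], [0, -1.4142], [0, 0]]
print(np.allclose(Q @ R, A))    # True
```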
Examples:

1. We shall use Givens rotations to transform $A = \begin{bmatrix}1 & 1\\ 1 & 2\\ 1 & 3\end{bmatrix}$ to upper triangular form, as in Algorithm 4. First, to annihilate the element in position $(2,1)$, we use Fact 5 with $(x,y) = (1,1)$ and obtain $c = s = 1/\sqrt2$; hence,
$$A^{(1)} = G_1A = \begin{bmatrix}0.7071 & 0.7071 & 0\\ -0.7071 & 0.7071 & 0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}1 & 1\\ 1 & 2\\ 1 & 3\end{bmatrix} = \begin{bmatrix}1.4142 & 2.1213\\ 0 & 0.7071\\ 1 & 3\end{bmatrix}.$$
Next, to annihilate the element in position $(3,1)$, we use $(x,y) = (1.4142, 1)$ in Fact 5 and get
$$A^{(2)} = G_2A^{(1)} = \begin{bmatrix}0.8165 & 0 & 0.5774\\ 0 & 1 & 0\\ -0.5774 & 0 & 0.8165\end{bmatrix}A^{(1)} = \begin{bmatrix}1.7321 & 3.4641\\ 0 & 0.7071\\ 0 & 1.2247\end{bmatrix}.$$
Finally, we annihilate the element in position $(3,2)$ using $(x,y) = (0.7071, 1.2247)$:
$$A^{(3)} = G_3A^{(2)} = \begin{bmatrix}1 & 0 & 0\\ 0 & 0.5000 & 0.8660\\ 0 & -0.8660 & 0.5000\end{bmatrix}A^{(2)} = \begin{bmatrix}1.7321 & 3.4641\\ 0 & 1.4142\\ 0 & 0\end{bmatrix}.$$
As a result, $R = A^{(3)}$ and $\widehat R$ consists of the first two rows of $A^{(3)}$. The matrix $Q$ can be computed as the product $G_1^TG_2^TG_3^T$.

2. We shall use Householder reflections to transform $A$ from Example 1 to upper triangular form, as in Algorithm 3. First, let $a = A_{:,1} = \begin{bmatrix}1 & 1 & 1\end{bmatrix}^T$, $\gamma_1 = -\sqrt3$, $\tilde a = \begin{bmatrix}-\sqrt3 & 0 & 0\end{bmatrix}^T$, and $u_1 = \begin{bmatrix}0.8881 & 0.3251 & 0.3251\end{bmatrix}^T$; then
$$A^{(1)} = \bigl(I - 2u_1u_1^T\bigr)A = A - u_1\bigl(2u_1^TA\bigr) = A - u_1\begin{bmatrix}3.0764 & 5.0267\end{bmatrix} = \begin{bmatrix}-1.7321 & -3.4641\\ 0 & 0.3660\\ 0 & 1.3660\end{bmatrix}.$$
Next, $\gamma_2 = -\bigl\|A^{(1)}_{2:3,2}\bigr\|_2$, $u_2 = \begin{bmatrix}0 & 0.7934 & 0.6088\end{bmatrix}^T$, and
$$A^{(2)} = \bigl(I - 2u_2u_2^T\bigr)A^{(1)} = A^{(1)} - u_2\begin{bmatrix}0 & 2.2439\end{bmatrix} = \begin{bmatrix}-1.7321 & -3.4641\\ 0 & -1.4142\\ 0 & 0\end{bmatrix}.$$
Note that $R = A^{(2)}$ has changed sign as compared with Example 1. The matrix $Q$ can be computed as $\bigl(I - 2u_1u_1^T\bigr)\bigl(I - 2u_2u_2^T\bigr)$. Therefore, we have full information about the transformation if we store the vectors $u_1$ and $u_2$.

38.5 Symmetric Factorizations

Real symmetric matrices ($A = A^T$) and their complex analogs, Hermitian matrices (Chapter 8), are specified by roughly half the number of parameters as general $n\times n$ matrices, so one could anticipate benefits from methods that take advantage of this structure.

Definitions:

An $n\times n$ matrix $A$ is Hermitian if $A = A^* = \bar A^T$.

$A \in \mathbb{C}^{n\times n}$ is positive-definite if $x^*Ax > 0$ for all $x \in \mathbb{C}^n$ with $x \ne 0$.

The Cholesky decomposition (or Cholesky factorization) of a positive-definite matrix $A$ is $A = GG^*$ with $G \in \mathbb{C}^{n\times n}$ lower triangular and having positive diagonal entries.

Facts: [Hig96], [GV96]

1. A positive-definite matrix is Hermitian. Note that the similar but weaker assertion for a matrix $A \in \mathbb{R}^{n\times n}$ that "$x^TAx > 0$ for all $x \in \mathbb{R}^n$ with $x \ne 0$" does not imply that $A = A^T$.
2. If $A \in \mathbb{C}^{n\times n}$ is positive-definite, then $A$ has an LU decomposition, $A = LU$, and the diagonal of $U$, $\{u_{11}, u_{22},\dots,u_{nn}\}$, has strictly positive entries.
3. If $A \in \mathbb{C}^{n\times n}$ is positive-definite, then the LU decomposition of $A$ satisfies $A = LU$ with $U = DL^*$ and $D = \mathrm{diag}(U)$. Thus, $A$ can be written as $A = LDL^*$ with $L$ unit lower triangular and $D$ diagonal with positive diagonal entries. Furthermore, $A$ has a Cholesky decomposition $A = GG^*$ with $G \in \mathbb{C}^{n\times n}$ lower triangular. Indeed, if $\widehat D = \mathrm{diag}(\{\sqrt{u_{11}}, \sqrt{u_{22}},\dots,\sqrt{u_{nn}}\})$, then $\widehat D\widehat D = D$ and $G = L\widehat D$.
4. [GV96, p. 144] The Cholesky decomposition of a positive-definite matrix $A$ can be computed directly:
Algorithm 1: Cholesky decomposition of a positive-definite matrix
Input: $A \in \mathbb{C}^{n\times n}$ positive definite
Output: $G \in \mathbb{C}^{n\times n}$ (lower triangular matrix so that $A = GG^*$)
  $G \leftarrow 0 \in \mathbb{C}^{n\times n}$
  For $j = 1$ to $n$
    $v \leftarrow A_{j:n,j}$
    For $k = 1$ to $j-1$, $v \leftarrow v - G_{j,k}G_{j:n,k}$
    $G_{j:n,j} \leftarrow \frac{1}{\sqrt{v_1}}\,v$
5. Algorithm 1 requires approximately $n^3/3$ floating point arithmetic operations and $n$ floating point square roots to complete (roughly half of what is required for an LU decomposition).
6. If $A \in \mathbb{R}^{n\times n}$ is symmetric and positive-definite and Algorithm 1 runs to completion, producing a computed Cholesky factor $\hat G \in \mathbb{R}^{n\times n}$, then $\hat G\hat G^T = A + \delta A$ with
$$|\delta A| \le \frac{(n+1)\epsilon}{1 - (n+1)\epsilon}\,|\hat G|\,|\hat G^T|.$$
Furthermore, if an approximate solution $\hat x$ to $Ax = b$ is computed by solving the two triangular linear systems $\hat Gy = b$ and $\hat G^T\hat x = y$, and a scaling matrix is defined as $D = \mathrm{diag}(\sqrt{a_{ii}})$, then the scaled error $D(x - \hat x)$ satisfies
$$\frac{\|D(x - \hat x)\|_2}{\|Dx\|_2} \le \frac{\epsilon\,\kappa_2(H)}{1 - \epsilon\,\kappa_2(H)},$$
where $A = DHD$. If $\kappa_2(H) \ll \kappa_2(A)$, then it is quite likely that the entries of $\hat x$ will have mostly the same magnitude, and so the error bound suggests that all entries of the solution will be computed to high relative accuracy.
7. If $A \in \mathbb{C}^{n\times n}$ is Hermitian and has all leading principal submatrices nonsingular, then $A$ has an LU decomposition that can be written as $A = LU = LDL^*$ with $L$ unit lower triangular and $D$ diagonal with real diagonal entries. Furthermore, the number of positive and negative entries of $D$ is equal to the number of positive and negative eigenvalues of $A$, respectively (the Sylvester law of inertia).
8. Note that it may not be prudent to compute the LU (or $LDL^T$) decomposition of a Hermitian indefinite matrix $A$ without pivoting, yet the use of pivoting will likely eliminate the advantages symmetry might offer. An alternative is a block $LDL^T$ decomposition that incorporates a diagonal pivoting strategy (see [GV96] for details).
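Here is a minimal NumPy sketch (not part of the handbook text) of Algorithm 1 for real symmetric positive-definite matrices, illustrated on the 3×3 Hilbert matrix of the example that follows; the function name is illustrative only.

```python
import numpy as np

def cholesky(A):
    """Cholesky decomposition (Algorithm 1, real case): lower triangular G with A = G @ G.T."""
    n = A.shape[0]
    G = np.zeros((n, n))
    for j in range(n):
        v = A[j:, j].astype(float).copy()
        for k in range(j):
            v -= G[j, k] * G[j:, k]        # subtract contributions of earlier columns
        if v[0] <= 0.0:
            raise ValueError("matrix is not positive definite")
        G[j:, j] = v / np.sqrt(v[0])       # scale so that G[j, j] = sqrt(v[0])
    return G

H = np.array([[1, 1/2, 1/3], [1/2, 1/3, 1/4], [1/3, 1/4, 1/5]])   # 3x3 Hilbert matrix
G = cholesky(H)
print(G)                          # rows approx [1, 0, 0], [1/2, 1/(2*sqrt(3)), 0], [1/3, 1/(2*sqrt(3)), 1/(6*sqrt(5))]
print(np.allclose(G @ G.T, H))    # True
```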
Examples:

1. Calculate the Cholesky decomposition of the $3\times3$ Hilbert matrix,
$$A = \begin{bmatrix}1 & \frac12 & \frac13\\[2pt] \frac12 & \frac13 & \frac14\\[2pt] \frac13 & \frac14 & \frac15\end{bmatrix}.$$
Setup: $G \leftarrow \begin{bmatrix}0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0\end{bmatrix}$.
$j = 1$ step: $v \leftarrow [1, \tfrac12, \tfrac13]^T$. $G$ snapshot: $G = \begin{bmatrix}1 & 0 & 0\\[2pt] \frac12 & 0 & 0\\[2pt] \frac13 & 0 & 0\end{bmatrix}$.
$j = 2$ step: $v \leftarrow \begin{bmatrix}\frac13\\[2pt] \frac14\end{bmatrix} - \frac12\begin{bmatrix}\frac12\\[2pt] \frac13\end{bmatrix} = \begin{bmatrix}\frac1{12}\\[2pt] \frac1{12}\end{bmatrix}$. $G$ snapshot: $G = \begin{bmatrix}1 & 0 & 0\\[2pt] \frac12 & \frac1{2\sqrt3} & 0\\[2pt] \frac13 & \frac1{2\sqrt3} & 0\end{bmatrix}$.
$j = 3$ step: $v \leftarrow \frac15 - \left(\frac13\right)^2 - \left(\frac1{2\sqrt3}\right)^2 = \frac1{180} = \left(\frac1{6\sqrt5}\right)^2$. $G$ snapshot: $G = \begin{bmatrix}1 & 0 & 0\\[2pt] \frac12 & \frac1{2\sqrt3} & 0\\[2pt] \frac13 & \frac1{2\sqrt3} & \frac1{6\sqrt5}\end{bmatrix}$.

References

[Dem97] J. Demmel. Applied Numerical Linear Algebra. SIAM, Philadelphia, 1997.
[GV96] G.H. Golub and C.F. Van Loan. Matrix Computations. 3rd ed., Johns Hopkins University Press, Baltimore, MD, 1996.
[Hig96] N.J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, 1996.
[Mey00] C. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, Philadelphia, 2000.
[StS90] G.W. Stewart and J.-G. Sun. Matrix Perturbation Theory. Academic Press, San Diego, CA, 1990.
[TB97] L.N. Trefethen and D. Bau. Numerical Linear Algebra. SIAM, Philadelphia, 1997.