Stationary values of many-variable functions

by taratuta

PARTIAL DIFFERENTIATION
theorem then becomes
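$$ f(\mathbf{x}) = f(\mathbf{x}_0) + \sum_i \frac{\partial f}{\partial x_i}\,\Delta x_i + \frac{1}{2!}\sum_i \sum_j \frac{\partial^2 f}{\partial x_i\,\partial x_j}\,\Delta x_i\,\Delta x_j + \cdots, \tag{5.20} $$
where $\Delta x_i = x_i - x_{i0}$ and the partial derivatives are evaluated at $(x_{10}, x_{20}, \ldots, x_{n0})$.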
For completeness, we note that in this case the full Taylor series can be written
in the form
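$$ f(\mathbf{x}) = \sum_{n=0}^{\infty} \frac{1}{n!}\left[(\Delta\mathbf{x}\cdot\nabla)^n f(\mathbf{x})\right]_{\mathbf{x}=\mathbf{x}_0}, $$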
where ∇ is the vector differential operator del, to be discussed in chapter 10.
5.8 Stationary values of many-variable functions
The idea of the stationary points of a function of just one variable has already
been discussed in subsection 2.1.8. We recall that the function f(x) has a stationary
point at x = x0 if its gradient df/dx is zero at that point. A function may have
any number of stationary points, and their nature, i.e. whether they are maxima,
minima or stationary points of inflection, is determined by the value of the second
derivative at the point. A stationary point is
(i) a minimum if $d^2f/dx^2 > 0$;
(ii) a maximum if $d^2f/dx^2 < 0$;
(iii) a stationary point of inflection if $d^2f/dx^2 = 0$ and changes sign through
the point.
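As a minimal sketch of the one-variable test above, the following Python snippet classifies a stationary point from its second derivative. The helper name `classify_1d`, the tolerance `tol`, the probe step `h` and the sample functions are illustrative assumptions and not part of the text; the caller is assumed to have checked already that df/dx vanishes at the point.

```python
def classify_1d(d2f, x0, tol=1e-12, h=1e-3):
    """Classify a stationary point x0 of f(x) from its second derivative d2f.

    Hypothetical helper: assumes df/dx = 0 at x0 has already been verified.
    """
    value = d2f(x0)
    if value > tol:
        return "minimum"
    if value < -tol:
        return "maximum"
    # Second derivative vanishes: check whether it changes sign through x0,
    # which is criterion (iii) for a stationary point of inflection.
    if d2f(x0 - h) * d2f(x0 + h) < 0:
        return "stationary point of inflection"
    return "undetermined"


# f = x^2 has f'' = 2; f = -x^2 has f'' = -2; f = x^3 has f'' = 6x.
print(classify_1d(lambda x: 2.0, 0.0))
print(classify_1d(lambda x: -2.0, 0.0))
print(classify_1d(lambda x: 6.0 * x, 0.0))
```

A case such as f(x) = x^4, where the second derivative vanishes at the origin without changing sign, is reported as "undetermined" by this sketch, since the three criteria alone do not settle it.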
We now consider the stationary points of functions of more than one variable;
we will see that partial differential analysis is ideally suited to the determination
of the position and nature of such points. It is helpful to consider first the case
of a function of just two variables but, even in this case, the general situation
is more complex than that for a function of one variable, as can be seen from
figure 5.2.
This figure shows part of a three-dimensional model of a function f(x, y). At
positions P and B there are a peak and a bowl respectively or, more mathematically, a local maximum and a local minimum. At position S the gradient in any
direction is zero but the situation is complicated, since a section parallel to the
plane x = 0 would show a maximum, but one parallel to the plane y = 0 would
show a minimum. A point such as S is known as a saddle point. The orientation
of the ‘saddle’ in the xy-plane is irrelevant; it is as shown in the figure solely for
ease of discussion. For any saddle point the function increases in some directions
away from the point but decreases in other directions.
Figure 5.2 Stationary points of a function of two variables, drawn over the xy-plane. A minimum
occurs at B, a maximum at P and a saddle point at S.
For functions of two variables, such as the one shown, it should be clear that a
necessary condition for a stationary point (maximum, minimum or saddle point)
to occur is that
$$ \frac{\partial f}{\partial x} = 0 \quad\text{and}\quad \frac{\partial f}{\partial y} = 0. \tag{5.21} $$
The vanishing of the partial derivatives in directions parallel to the axes is enough
to ensure that the partial derivative in any arbitrary direction is also zero. The
latter can be considered as the superposition of two contributions, one along
each axis; since both contributions are zero, so is the partial derivative in the
arbitrary direction. This may be made more precise by considering the total
differential
$$ df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy. $$
Using (5.21) we see that, although the infinitesimal changes dx and dy can be
chosen independently, the resulting infinitesimal change df in the value of the
function is always zero at a stationary point.
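The statement that vanishing partial derivatives along the axes force the rate of change to vanish in every direction can be illustrated numerically. The sketch below is only a check under stated assumptions: the saddle-shaped function `f`, the helper `directional_derivative` and the step size `h` are all hypothetical choices, and the derivative is estimated by a central difference rather than computed exactly.

```python
import math


def f(x, y):
    # Illustrative function with a stationary point at the origin
    # (an assumed example, not one taken from the text).
    return x**2 - y**2


def directional_derivative(func, x0, y0, theta, h=1e-6):
    """Central-difference estimate of the rate of change of func at (x0, y0)
    along the unit vector (cos theta, sin theta)."""
    dx, dy = h * math.cos(theta), h * math.sin(theta)
    return (func(x0 + dx, y0 + dy) - func(x0 - dx, y0 - dy)) / (2.0 * h)


# Sample twelve directions around the stationary point at the origin.
rates = [directional_derivative(f, 0.0, 0.0, k * math.pi / 6) for k in range(12)]
print(max(abs(r) for r in rates))

# For contrast, at the non-stationary point (1, 0) the rate along x is about 2.
print(directional_derivative(f, 1.0, 0.0, 0.0))
```

Every sampled direction at the origin gives a vanishing rate of change, whereas the same estimate at a point where the partial derivatives are non-zero does not.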
We now turn our attention to determining the nature of a stationary point of
a function of two variables, i.e. whether it is a maximum, a minimum or a saddle
point. By analogy with the one-variable case we see that $\partial^2 f/\partial x^2$ and $\partial^2 f/\partial y^2$
must both be positive for a minimum and both be negative for a maximum.
However, these are not sufficient conditions since they could also be obeyed at
complicated saddle points. What is important for a minimum (or maximum) is
that the second partial derivative must be positive (or negative) in all directions,
not just in the x- and y-directions.
To establish just what constitutes sufficient conditions we first note that, since
f is a function of two variables and ∂f/∂x = ∂f/∂y = 0, a Taylor expansion of
the type (5.18) about the stationary point yields
$$ f(x, y) - f(x_0, y_0) \approx \frac{1}{2!}\left[(\Delta x)^2 f_{xx} + 2\,\Delta x\,\Delta y\, f_{xy} + (\Delta y)^2 f_{yy}\right], $$
where ∆x = x − x0 and ∆y = y − y0 and where the partial derivatives have been
written in more compact notation. Rearranging the contents of the bracket as
the weighted sum of two squares, we find
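$$ f(x, y) - f(x_0, y_0) \approx \frac{1}{2}\left[ f_{xx}\left(\Delta x + \frac{f_{xy}\,\Delta y}{f_{xx}}\right)^2 + (\Delta y)^2\left(f_{yy} - \frac{f_{xy}^2}{f_{xx}}\right)\right]. \tag{5.22} $$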
For a minimum, we require (5.22) to be positive for all $\Delta x$ and $\Delta y$, and hence
$f_{xx} > 0$ and $f_{yy} - (f_{xy}^2/f_{xx}) > 0$. Given the first constraint, the second can be
written $f_{xx} f_{yy} > f_{xy}^2$. Similarly for a maximum we require (5.22) to be negative,
and hence $f_{xx} < 0$ and $f_{xx} f_{yy} > f_{xy}^2$. For minima and maxima, symmetry requires
that $f_{yy}$ obeys the same criteria as $f_{xx}$. When (5.22) is negative (or zero) for some
values of $\Delta x$ and $\Delta y$ but positive (or zero) for others, we have a saddle point. In
this case $f_{xx} f_{yy} < f_{xy}^2$. In summary, all stationary points have $f_x = f_y = 0$ and
they may be classified further as
(i) minima if both $f_{xx}$ and $f_{yy}$ are positive and $f_{xy}^2 < f_{xx} f_{yy}$,
(ii) maxima if both $f_{xx}$ and $f_{yy}$ are negative and $f_{xy}^2 < f_{xx} f_{yy}$,
(iii) saddle points if $f_{xx}$ and $f_{yy}$ have opposite signs or $f_{xy}^2 > f_{xx} f_{yy}$.
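Criteria (i) to (iii) translate directly into a short routine. The following is a sketch only: the name `classify_2d` and the sample values are assumptions made for illustration, the three second derivatives are taken to be already evaluated at a point where both first derivatives vanish, and exact comparisons are used for clarity (floating-point data would call for a tolerance).

```python
def classify_2d(fxx, fyy, fxy):
    """Apply criteria (i)-(iii) to second partial derivatives evaluated at a
    stationary point (fx = fy = 0 is assumed to have been checked already)."""
    disc = fxx * fyy - fxy**2
    if disc > 0:
        # fxx*fyy > fxy^2 >= 0, so fxx and fyy necessarily share a sign here.
        return "minimum" if fxx > 0 else "maximum"
    if disc < 0:
        # Covers opposite signs of fxx and fyy as well as fxy^2 > fxx*fyy.
        return "saddle point"
    # fxy^2 == fxx*fyy: the second-order test is inconclusive.
    return "undetermined"


# Second derivatives at the origin for four illustrative functions.
print(classify_2d(2.0, 2.0, 0.0))    # x^2 + y^2
print(classify_2d(-2.0, -2.0, 0.0))  # -x^2 - y^2
print(classify_2d(2.0, -2.0, 0.0))   # x^2 - y^2
print(classify_2d(0.0, 0.0, 1.0))    # x*y
```

Testing the sign of $f_{xx} f_{yy} - f_{xy}^2$ first keeps the routine short, because opposite signs of $f_{xx}$ and $f_{yy}$ automatically make that quantity negative.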
Note, however, that if $f_{xy}^2 = f_{xx} f_{yy}$ then $f(x, y) - f(x_0, y_0)$ can be written in one
of the four forms
$$ \pm\frac{1}{2}\left(\Delta x\,|f_{xx}|^{1/2} \pm \Delta y\,|f_{yy}|^{1/2}\right)^2. $$
For some choice of the ratio ∆y/∆x this expression has zero value, showing
that, for a displacement from the stationary point in this particular direction,
f(x0 + ∆x, y0 + ∆y) does not differ from f(x0 , y0 ) to second order in ∆x and
∆y; in such situations further investigation is required. In particular, if fxx , fyy
and fxy are all zero then the Taylor expansion has to be taken to a higher
order. As examples, such extended investigations would show that the function
$f(x, y) = x^4 + y^4$ has a minimum at the origin but that $g(x, y) = x^4 + y^3$ has a
saddle point there.
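When the second-order test fails, one informal way to see what the extended investigation would find is to sample the function on a small circle around the stationary point and record the signs of the change in value. The sketch below does this for the two functions just quoted; the helper `signs_around_origin`, the radius and the number of samples are assumptions, and sampling of this kind is suggestive rather than a proof.

```python
import math


def signs_around_origin(func, radius=1e-2, samples=360):
    """Return (has_positive, has_negative) for func(x, y) - func(0, 0),
    sampled at equally spaced points on a small circle about the origin."""
    base = func(0.0, 0.0)
    diffs = [
        func(radius * math.cos(2 * math.pi * k / samples),
             radius * math.sin(2 * math.pi * k / samples)) - base
        for k in range(samples)
    ]
    return any(d > 0 for d in diffs), any(d < 0 for d in diffs)


def f(x, y):
    return x**4 + y**4


def g(x, y):
    return x**4 + y**3


print(signs_around_origin(f))  # (True, False): f only increases, a minimum
print(signs_around_origin(g))  # (True, True): g rises and falls, a saddle
```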
Show that the function $f(x, y) = x^3 \exp(-x^2 - y^2)$ has a maximum at the point $(\sqrt{3/2}, 0)$,
a minimum at $(-\sqrt{3/2}, 0)$ and a stationary point at the origin whose nature cannot be
determined by the above procedures.
Setting the first two partial derivatives to zero to locate the stationary points, we find
$$ \frac{\partial f}{\partial x} = (3x^2 - 2x^4)\exp(-x^2 - y^2) = 0, \tag{5.23} $$
$$ \frac{\partial f}{\partial y} = -2yx^3 \exp(-x^2 - y^2) = 0. \tag{5.24} $$
For (5.24) to be satisfied we require x = 0 or y = 0 and for (5.23) to be satisfied we
require x = 0 or $x = \pm\sqrt{3/2}$. Hence the stationary points are at (0, 0), $(\sqrt{3/2}, 0)$ and $(-\sqrt{3/2}, 0)$.
We now find the second partial derivatives:
$$ f_{xx} = (4x^5 - 14x^3 + 6x)\exp(-x^2 - y^2), $$
$$ f_{yy} = x^3(4y^2 - 2)\exp(-x^2 - y^2), $$
$$ f_{xy} = 2x^2 y(2x^2 - 3)\exp(-x^2 - y^2). $$
We then substitute the pairs of values of x and y for each stationary point and find that
at (0, 0)
$$ f_{xx} = 0, \qquad f_{yy} = 0, \qquad f_{xy} = 0 $$
and at $(\pm\sqrt{3/2}, 0)$
$$ f_{xx} = \mp 6\sqrt{3/2}\,\exp(-3/2), \qquad f_{yy} = \mp 3\sqrt{3/2}\,\exp(-3/2), \qquad f_{xy} = 0. $$
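The substitutions in this worked example are easy to check numerically. The sketch below evaluates the first and second partial derivatives of $x^3 \exp(-x^2 - y^2)$, using the closed forms derived in the example, at the three stationary points; the function names `gradient` and `second_derivatives` are illustrative assumptions.

```python
import math


def gradient(x, y):
    """First partial derivatives of x^3 exp(-x^2 - y^2), from (5.23), (5.24)."""
    e = math.exp(-x**2 - y**2)
    return (3 * x**2 - 2 * x**4) * e, -2 * y * x**3 * e


def second_derivatives(x, y):
    """Second partial derivatives (fxx, fyy, fxy) of x^3 exp(-x^2 - y^2),
    using the closed forms from the worked example."""
    e = math.exp(-x**2 - y**2)
    fxx = (4 * x**5 - 14 * x**3 + 6 * x) * e
    fyy = x**3 * (4 * y**2 - 2) * e
    fxy = 2 * x**2 * y * (2 * x**2 - 3) * e
    return fxx, fyy, fxy


a = math.sqrt(1.5)  # the value sqrt(3/2) appearing in the example
for point in [(0.0, 0.0), (a, 0.0), (-a, 0.0)]:
    print(point, gradient(*point), second_derivatives(*point))
```

At $(\sqrt{3/2}, 0)$ the printed second derivatives should agree, to rounding error, with $-6\sqrt{3/2}\exp(-3/2)$ and $-3\sqrt{3/2}\exp(-3/2)$, with signs reversed at $(-\sqrt{3/2}, 0)$ and all three values zero at the origin.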
Hence, applying criteria (i)–(iii) above, we find that (0, 0) is an undetermined stationary
point, $(\sqrt{3/2}, 0)$ is a maximum and $(-\sqrt{3/2}, 0)$ is a minimum. The function is shown in
figure 5.3.

Determining the nature of stationary points for functions of a general number
of variables is considerably more difficult and requires a knowledge of the
eigenvectors and eigenvalues of matrices. Although these are not discussed until
chapter 8, we present the analysis here for completeness. The remainder of this
section can therefore be omitted on a first reading.
For a function of n real variables, $f(x_1, x_2, \ldots, x_n)$, we require that, at all
stationary points,
$$ \frac{\partial f}{\partial x_i} = 0 \quad\text{for all } x_i. $$
In order to determine the nature of a stationary point, we must expand the
function as a Taylor series about the point. Recalling the Taylor expansion (5.20)
for a function of n variables, we see that
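$$ \Delta f = f(\mathbf{x}) - f(\mathbf{x}_0) \approx \frac{1}{2}\sum_i \sum_j \frac{\partial^2 f}{\partial x_i\,\partial x_j}\,\Delta x_i\,\Delta x_j. \tag{5.25} $$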
Figure 5.3 The function $f(x, y) = x^3 \exp(-x^2 - y^2)$, plotted against x and y with its maximum and minimum marked.
If we define the matrix M to have elements given by
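$$ M_{ij} = \frac{\partial^2 f}{\partial x_i\,\partial x_j}, $$
then we can rewrite (5.25) as
$$ \Delta f = \tfrac{1}{2}\,\Delta\mathbf{x}^{\mathrm T}\,\mathsf{M}\,\Delta\mathbf{x}, \tag{5.26} $$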
where $\Delta\mathbf{x}$ is the column vector with the $\Delta x_i$ as its components and $\Delta\mathbf{x}^{\mathrm T}$ is its
transpose. Since $\mathsf{M}$ is real and symmetric it has n real eigenvalues $\lambda_r$ and n
orthogonal eigenvectors $\mathbf{e}_r$, which after suitable normalisation satisfy
$$ \mathsf{M}\mathbf{e}_r = \lambda_r \mathbf{e}_r, \qquad \mathbf{e}_r^{\mathrm T}\mathbf{e}_s = \delta_{rs}, $$
where the Kronecker delta, written $\delta_{rs}$, equals unity for r = s and equals zero
otherwise. These eigenvectors form a basis set for the n-dimensional space and
we can therefore expand $\Delta\mathbf{x}$ in terms of them, obtaining
$$ \Delta\mathbf{x} = \sum_r a_r \mathbf{e}_r, $$