Stationary values under constraints
PARTIAL DIFFERENTIATION

where the $a_r$ are coefficients dependent upon $\Delta\mathbf{x}$. Substituting this into (5.26), we find
\[ \Delta f = \tfrac{1}{2}\,\Delta\mathbf{x}^{\mathrm{T}}\mathsf{M}\,\Delta\mathbf{x} = \tfrac{1}{2}\sum_r \lambda_r a_r^2 . \]
Now, for the stationary point to be a minimum, we require $\Delta f = \tfrac{1}{2}\sum_r \lambda_r a_r^2 > 0$ for all sets of values of the $a_r$, and therefore all the eigenvalues of $\mathsf{M}$ to be greater than zero. Conversely, for a maximum we require $\Delta f = \tfrac{1}{2}\sum_r \lambda_r a_r^2 < 0$, and therefore all the eigenvalues of $\mathsf{M}$ to be less than zero. If the eigenvalues have mixed signs, then we have a saddle point. Note that the test may fail if some or all of the eigenvalues are equal to zero and all the non-zero ones have the same sign.

Derive the conditions for maxima, minima and saddle points for a function of two real variables, using the above analysis.

For a two-variable function the matrix $\mathsf{M}$ is given by
\[ \mathsf{M} = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}. \]
Therefore its eigenvalues satisfy the equation
\[ \begin{vmatrix} f_{xx} - \lambda & f_{xy} \\ f_{xy} & f_{yy} - \lambda \end{vmatrix} = 0. \]
Hence
\[ (f_{xx} - \lambda)(f_{yy} - \lambda) - f_{xy}^2 = 0 \]
\[ \Rightarrow\quad f_{xx} f_{yy} - (f_{xx} + f_{yy})\lambda + \lambda^2 - f_{xy}^2 = 0 \]
\[ \Rightarrow\quad 2\lambda = (f_{xx} + f_{yy}) \pm \sqrt{(f_{xx} + f_{yy})^2 - 4(f_{xx} f_{yy} - f_{xy}^2)}, \]
which by rearrangement of the terms under the square root gives
\[ 2\lambda = (f_{xx} + f_{yy}) \pm \sqrt{(f_{xx} - f_{yy})^2 + 4 f_{xy}^2}. \]
Now, that $\mathsf{M}$ is real and symmetric implies that its eigenvalues are real, and so for both eigenvalues to be positive (corresponding to a minimum), we require $f_{xx}$ and $f_{yy}$ positive and also
\[ f_{xx} + f_{yy} > \sqrt{(f_{xx} + f_{yy})^2 - 4(f_{xx} f_{yy} - f_{xy}^2)} \]
\[ \Rightarrow\quad f_{xx} f_{yy} - f_{xy}^2 > 0. \]
A similar procedure will find the criteria for maxima and saddle points.

5.9 Stationary values under constraints

In the previous section we looked at the problem of finding stationary values of a function of two or more variables when all the variables may be independently varied. However, it is often the case in physical problems that not all the variables used to describe a situation are in fact independent, i.e. some relationship between the variables must be satisfied.
For example, if we walk through a hilly landscape and we are constrained to walk along a path, we will never reach the highest peak on the landscape unless the path happens to take us to it. Nevertheless, we can still find the highest point that we have reached during our journey.

We first discuss the case of a function of just two variables. Let us consider finding the maximum value of the differentiable function $f(x, y)$ subject to the constraint $g(x, y) = c$, where $c$ is a constant. In the above analogy, $f(x, y)$ might represent the height of the land above sea-level in some hilly region, whilst $g(x, y) = c$ is the equation of the path along which we walk.

We could, of course, use the constraint $g(x, y) = c$ to substitute for $x$ or $y$ in $f(x, y)$, thereby obtaining a new function of only one variable whose stationary points could be found using the methods discussed in subsection 2.1.8. However, such a procedure can involve a lot of algebra and becomes very tedious for functions of more than two variables. A more direct method for solving such problems is the method of Lagrange undetermined multipliers, which we now discuss.

To maximise $f$ we require
\[ df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = 0. \]
If $dx$ and $dy$ were independent, we could conclude $f_x = 0 = f_y$. However, here they are not independent, but constrained because $g$ is constant:
\[ dg = \frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy = 0. \]
Multiplying $dg$ by an as yet unknown number $\lambda$ and adding it to $df$ we obtain
\[ d(f + \lambda g) = \left(\frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y}\right)dy = 0, \]
where $\lambda$ is called a Lagrange undetermined multiplier. In this equation $dx$ and $dy$ are to be independent and arbitrary; we must therefore choose $\lambda$ such that
\[ \frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x} = 0, \tag{5.27} \]
\[ \frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y} = 0. \tag{5.28} \]
These equations, together with the constraint $g(x, y) = c$, are sufficient to find the three unknowns, i.e. $\lambda$ and the values of $x$ and $y$ at the stationary point.

The temperature of a point $(x, y)$ on a unit circle is given by $T(x, y) = 1 + xy$.
Find the temperature of the two hottest points on the circle.

We need to maximise $T(x, y)$ subject to the constraint $x^2 + y^2 = 1$. Applying (5.27) and (5.28), we obtain
\[ y + 2\lambda x = 0, \tag{5.29} \]
\[ x + 2\lambda y = 0. \tag{5.30} \]
These results, together with the original constraint $x^2 + y^2 = 1$, provide three simultaneous equations that may be solved for $\lambda$, $x$ and $y$. From (5.29) and (5.30) we find $\lambda = \pm 1/2$, which in turn implies that $y = \mp x$. Remembering that $x^2 + y^2 = 1$, we find that
\[ y = x \;\Rightarrow\; x = \pm\frac{1}{\sqrt{2}}, \quad y = \pm\frac{1}{\sqrt{2}}, \]
\[ y = -x \;\Rightarrow\; x = \mp\frac{1}{\sqrt{2}}, \quad y = \pm\frac{1}{\sqrt{2}}. \]
We have not yet determined which of these stationary points are maxima and which are minima. In this simple case, we need only substitute the four pairs of $x$- and $y$-values into $T(x, y) = 1 + xy$ to find that the maximum temperature on the unit circle is $T_{\max} = 3/2$ at the points $y = x = \pm 1/\sqrt{2}$.

The method of Lagrange multipliers can be used to find the stationary points of functions of more than two variables, subject to several constraints, provided that the number of constraints is smaller than the number of variables. For example, if we wish to find the stationary points of $f(x, y, z)$ subject to the constraints $g(x, y, z) = c_1$ and $h(x, y, z) = c_2$, where $c_1$ and $c_2$ are constants, then we proceed as above, obtaining
\[ \frac{\partial}{\partial x}(f + \lambda g + \mu h) = \frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x} + \mu\frac{\partial h}{\partial x} = 0, \]
\[ \frac{\partial}{\partial y}(f + \lambda g + \mu h) = \frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y} + \mu\frac{\partial h}{\partial y} = 0, \tag{5.31} \]
\[ \frac{\partial}{\partial z}(f + \lambda g + \mu h) = \frac{\partial f}{\partial z} + \lambda\frac{\partial g}{\partial z} + \mu\frac{\partial h}{\partial z} = 0. \]
We may now solve these three equations, together with the two constraints, to give $\lambda$, $\mu$, $x$, $y$ and $z$.

Find the stationary points of $f(x, y, z) = x^3 + y^3 + z^3$ subject to the following constraints:
(i) $g(x, y, z) = x^2 + y^2 + z^2 = 1$;
(ii) $g(x, y, z) = x^2 + y^2 + z^2 = 1$ and $h(x, y, z) = x + y + z = 0$.

Case (i). Since there is only one constraint in this case, we need only introduce a single Lagrange multiplier to obtain
\[ \frac{\partial}{\partial x}(f + \lambda g) = 3x^2 + 2\lambda x = 0, \]
\[ \frac{\partial}{\partial y}(f + \lambda g) = 3y^2 + 2\lambda y = 0, \tag{5.32} \]
\[ \frac{\partial}{\partial z}(f + \lambda g) = 3z^2 + 2\lambda z = 0. \]
These equations are highly symmetrical and clearly have the solution $x = y = z = -2\lambda/3$. Using the constraint $x^2 + y^2 + z^2 = 1$ we find $\lambda = \pm\sqrt{3}/2$ and so stationary points occur at
\[ x = y = z = \pm\frac{1}{\sqrt{3}}. \tag{5.33} \]
In solving the three equations (5.32) in this way, however, we have implicitly assumed that $x$, $y$ and $z$ are non-zero. It is clear from (5.32) that any of these values can equal zero, with the exception of the case $x = y = z = 0$, since this is prohibited by the constraint $x^2 + y^2 + z^2 = 1$. We must consider the other cases separately.

If $x = 0$, for example, we require
\[ 3y^2 + 2\lambda y = 0, \qquad 3z^2 + 2\lambda z = 0, \qquad y^2 + z^2 = 1. \]
Clearly, we require $\lambda \neq 0$, otherwise these equations are inconsistent. If neither $y$ nor $z$ is zero we find $y = -2\lambda/3 = z$ and from the third equation we require $y = z = \pm 1/\sqrt{2}$. If $y = 0$, however, then $z = \pm 1$ and, similarly, if $z = 0$ then $y = \pm 1$. Thus the stationary points having $x = 0$ are $(0, 0, \pm 1)$, $(0, \pm 1, 0)$ and $(0, \pm 1/\sqrt{2}, \pm 1/\sqrt{2})$. A similar procedure can be followed for the cases $y = 0$ and $z = 0$ respectively and, in addition to those already obtained, we find the stationary points $(\pm 1, 0, 0)$, $(\pm 1/\sqrt{2}, 0, \pm 1/\sqrt{2})$ and $(\pm 1/\sqrt{2}, \pm 1/\sqrt{2}, 0)$. In each point with two non-zero coordinates, those two coordinates are equal, so the same sign is taken in both.

Case (ii). We now have two constraints and must therefore introduce two Lagrange multipliers to obtain (cf. (5.31))
\[ \frac{\partial}{\partial x}(f + \lambda g + \mu h) = 3x^2 + 2\lambda x + \mu = 0, \tag{5.34} \]
\[ \frac{\partial}{\partial y}(f + \lambda g + \mu h) = 3y^2 + 2\lambda y + \mu = 0, \tag{5.35} \]
\[ \frac{\partial}{\partial z}(f + \lambda g + \mu h) = 3z^2 + 2\lambda z + \mu = 0. \tag{5.36} \]
These equations are again highly symmetrical and the simplest way to proceed is to subtract (5.35) from (5.34) to obtain
\[ 3(x^2 - y^2) + 2\lambda(x - y) = 0 \]
\[ \Rightarrow\quad 3(x + y)(x - y) + 2\lambda(x - y) = 0. \tag{5.37} \]
This equation is clearly satisfied if $x = y$; then, from the second constraint, $x + y + z = 0$, we find $z = -2x$. Substituting these values into the first constraint, $x^2 + y^2 + z^2 = 1$, we obtain
\[ x = \pm\frac{1}{\sqrt{6}}, \quad y = \pm\frac{1}{\sqrt{6}}, \quad z = \mp\frac{2}{\sqrt{6}}. \tag{5.38} \]
Because of the high degree of symmetry amongst the equations (5.34)–(5.36), we may obtain by inspection two further relations analogous to (5.37), one containing the variables $y$, $z$ and the other the variables $x$, $z$. Assuming $y = z$ in the first relation and $x = z$ in the second, we find the stationary points
\[ x = \pm\frac{1}{\sqrt{6}}, \quad y = \mp\frac{2}{\sqrt{6}}, \quad z = \pm\frac{1}{\sqrt{6}} \tag{5.39} \]
and
\[ x = \mp\frac{2}{\sqrt{6}}, \quad y = \pm\frac{1}{\sqrt{6}}, \quad z = \pm\frac{1}{\sqrt{6}}. \tag{5.40} \]
We note that in finding the stationary points (5.38)–(5.40) we did not need to evaluate the Lagrange multipliers $\lambda$ and $\mu$ explicitly. This is not always the case, however, and in some problems it may be simpler to begin by finding the values of these multipliers.

Returning to (5.37) we must now consider the case where $x \neq y$; then we find
\[ 3(x + y) + 2\lambda = 0. \tag{5.41} \]
However, in obtaining the stationary points (5.39), (5.40), we did not assume $x = y$ but only required $x = z$ and $y = z$ respectively. It is clear that $x \neq y$ at these stationary points, and it can be shown that they do indeed satisfy (5.41). Similarly, several stationary points for which $x \neq z$ or $y \neq z$ have already been found. Thus we need to consider further only two cases: $x = y = z$, and $x$, $y$ and $z$ all different. The first is clearly prohibited, since the constraint $x + y + z = 0$ would then force $x = y = z = 0$, which violates $x^2 + y^2 + z^2 = 1$. For the second case, (5.41) must be satisfied, together with the analogous equations containing $y$, $z$ and $x$, $z$ respectively, i.e.
\[ 3(x + y) + 2\lambda = 0, \qquad 3(y + z) + 2\lambda = 0, \qquad 3(x + z) + 2\lambda = 0. \]
Adding these three equations together and using the constraint $x + y + z = 0$ we find $\lambda = 0$. However, for $\lambda = 0$ the equations give $x + y = y + z = x + z = 0$, whose only solution is $x = y = z = 0$, so they are inconsistent for non-zero $x$, $y$ and $z$. Therefore all the stationary points have already been found and are given by (5.38)–(5.40).

The method may be extended to functions of any number $n$ of variables subject to any smaller number $m$ of constraints. This means that effectively there are $n - m$ independent variables and, as mentioned above, we could solve by substitution and then by the methods of the previous section.
However, for large $n$ this becomes cumbersome and the use of Lagrange undetermined multipliers is a useful simplification.

A system contains a very large number $N$ of particles, each of which can be in any of $R$ energy levels with a corresponding energy $E_i$, $i = 1, 2, \ldots, R$. The number of particles in the $i$th level is $n_i$ and the total energy of the system is a constant, $E$. Find the distribution of particles amongst the energy levels that maximises the expression
\[ P = \frac{N!}{n_1!\,n_2!\cdots n_R!}, \]
subject to the constraints that both the number of particles and the total energy remain constant, i.e.
\[ g = N - \sum_{i=1}^{R} n_i = 0 \quad\text{and}\quad h = E - \sum_{i=1}^{R} n_i E_i = 0. \]

The way in which we proceed is as follows. In order to maximise $P$, we must minimise its denominator (since the numerator is fixed). Minimising the denominator is the same as minimising the logarithm of the denominator, i.e.
\[ f = \ln(n_1!\,n_2!\cdots n_R!) = \ln(n_1!) + \ln(n_2!) + \cdots + \ln(n_R!). \]
Using Stirling's approximation, $\ln(n!) \approx n\ln n - n$, we find that
\[ f = n_1\ln n_1 + n_2\ln n_2 + \cdots + n_R\ln n_R - (n_1 + n_2 + \cdots + n_R) = \sum_{i=1}^{R} n_i\ln n_i - N. \]
It has been assumed here that, for the desired distribution, all the $n_i$ are large. Thus, we now have a function $f$ subject to two constraints, $g = 0$ and $h = 0$, and we can apply the Lagrange method, obtaining (cf. (5.31))
\[ \frac{\partial f}{\partial n_1} + \lambda\frac{\partial g}{\partial n_1} + \mu\frac{\partial h}{\partial n_1} = 0, \]
\[ \frac{\partial f}{\partial n_2} + \lambda\frac{\partial g}{\partial n_2} + \mu\frac{\partial h}{\partial n_2} = 0, \]
\[ \vdots \]
\[ \frac{\partial f}{\partial n_R} + \lambda\frac{\partial g}{\partial n_R} + \mu\frac{\partial h}{\partial n_R} = 0. \]
Since all these equations are alike, we consider the general case
\[ \frac{\partial f}{\partial n_k} + \lambda\frac{\partial g}{\partial n_k} + \mu\frac{\partial h}{\partial n_k} = 0, \]
for $k = 1, 2, \ldots, R$. Substituting the functions $f$, $g$ and $h$ into this relation we find
\[ \frac{n_k}{n_k} + \ln n_k + \lambda(-1) + \mu(-E_k) = 0, \]
which can be rearranged to give
\[ \ln n_k = \mu E_k + \lambda - 1, \]
and hence
\[ n_k = C\exp \mu E_k , \]
where $C = \exp(\lambda - 1)$.
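The exponential form $n_k = C\exp \mu E_k$ can be given a rough numerical check. The sketch below is not part of the original derivation. It assumes three equally spaced energy levels and the arbitrary values $C = 1000$ and $\mu = -0.5$, and all function and variable names are hypothetical. It displaces the distribution along a direction that keeps both $\sum n_i$ and $\sum n_i E_i$ fixed, then checks that the first-order change in the Stirling form of $f$ vanishes and that $f$ increases on either side, as expected at a constrained minimum.

```python
import math

def stirling_f(n):
    """Stirling form of ln(n_1! n_2! ... n_R!): sum of n_i ln n_i - n_i."""
    return sum(x * math.log(x) - x for x in n)

def exponential_distribution(C, mu, energies):
    """Occupation numbers n_k = C exp(mu E_k) for the given energy levels."""
    return [C * math.exp(mu * E) for E in energies]

# Illustrative values (assumptions, not taken from the text).
energies = [0.0, 1.0, 2.0]
n0 = exponential_distribution(1000.0, -0.5, energies)

# A shift along (1, -2, 1) leaves both sum(n_i) and sum(n_i E_i) unchanged
# for these equally spaced levels, so it respects g = 0 and h = 0.
direction = [1.0, -2.0, 1.0]

def shifted(t):
    """Distribution displaced by t along the constraint-preserving direction."""
    return [x + t * d for x, d in zip(n0, direction)]

# First-order change of f along the direction: sum of d_k ln n_k.
slope = sum(d * math.log(x) for d, x in zip(direction, n0))

# f at displaced distributions minus f at n0.
excess = {t: stirling_f(shifted(t)) - stirling_f(n0)
          for t in (-50.0, -1.0, 1.0, 50.0)}

print("slope along constraint direction:", slope)
print("f(displaced) - f(n0):", excess)
```

With only three levels and two constraints there is a single allowed direction of variation, so this one-parameter scan covers every admissible displacement. For more levels the same check would have to be repeated over a basis of constraint-preserving directions.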