Important continuous distributions
expression for the PDF of this Z does exist, but it is a rather complicated combination of exponentials and a modified Bessel function.§

Two types of e-mail arrive independently and at random: external e-mails at a mean rate of one every five minutes and internal e-mails at a rate of two every five minutes. Calculate the probability of receiving two or more e-mails in any two-minute interval.

Let X = number of external e-mails per two-minute interval and Y = number of internal e-mails per two-minute interval. Since we expect on average one external e-mail and two internal e-mails every five minutes, we have X ∼ Po(0.4) and Y ∼ Po(0.8). Letting Z = X + Y, we have Z ∼ Po(0.4 + 0.8) = Po(1.2). Now

    Pr(Z ≥ 2) = 1 − Pr(Z < 2) = 1 − Pr(Z = 0) − Pr(Z = 1),

and

    Pr(Z = 0) = e^{−1.2} = 0.301,
    Pr(Z = 1) = 1.2 e^{−1.2} = 0.361.

Hence Pr(Z ≥ 2) = 1 − 0.301 − 0.361 = 0.338.

The above result can be extended, of course, to any number of Poisson processes: if X_i ∼ Po(λ_i), i = 1, 2, …, n, then the random variable Z = X_1 + X_2 + ⋯ + X_n is distributed as Z ∼ Po(λ_1 + λ_2 + ⋯ + λ_n).

30.9 Important continuous distributions

Having discussed the most commonly encountered discrete probability distributions, we now consider some of the more important continuous probability distributions. These are summarised for convenience in table 30.2; we refer the reader to the relevant subsection below for an explanation of the symbols used.

30.9.1 The Gaussian distribution

By far the most important continuous probability distribution is the Gaussian or normal distribution. The reason for its importance is that a great many random variables of interest, in all areas of the physical sciences and beyond, are described either exactly or approximately by a Gaussian distribution. Moreover, the Gaussian distribution can be used to approximate other, more complicated, probability distributions.

§ For a derivation see, for example, M. P. Hobson and A. N.
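The e-mail calculation above is easy to check numerically; the following is a minimal sketch in standard-library Python (the helper name `poisson_pmf` is ours, not from the text):

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Pr(X = k) for X ~ Po(lam)."""
    return lam**k * exp(-lam) / factorial(k)

# External e-mails: 1 per 5 min -> 0.4 per 2-min interval;
# internal e-mails: 2 per 5 min -> 0.8 per 2-min interval.
# The sum of independent Poisson variates is Poisson with the summed mean.
lam_z = 0.4 + 0.8
p_ge_2 = 1 - poisson_pmf(0, lam_z) - poisson_pmf(1, lam_z)
print(round(p_ge_2, 4))  # 0.3374
```

(The text's 0.338 comes from rounding Pr(Z = 0) and Pr(Z = 1) to three figures before subtracting; the unrounded result is 0.3374.)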
Lasenby, Monthly Notices of the Royal Astronomical Society, 298, 905 (1998).

Distribution   Probability law f(x)                                                          MGF                                        E[X]           V[X]
Gaussian       \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]     \exp(\mu t + \frac{1}{2}\sigma^2 t^2)      \mu            \sigma^2
exponential    \lambda e^{-\lambda x}                                                        \frac{\lambda}{\lambda-t}                  \frac{1}{\lambda}    \frac{1}{\lambda^2}
gamma          \frac{\lambda(\lambda x)^{r-1} e^{-\lambda x}}{\Gamma(r)}                     \left(\frac{\lambda}{\lambda-t}\right)^r   \frac{r}{\lambda}    \frac{r}{\lambda^2}
chi-squared    \frac{x^{(n/2)-1} e^{-x/2}}{2^{n/2}\,\Gamma(n/2)}                             \left(\frac{1}{1-2t}\right)^{n/2}          n              2n
uniform        \frac{1}{b-a}                                                                 \frac{e^{bt}-e^{at}}{(b-a)t}               \frac{a+b}{2}  \frac{(b-a)^2}{12}

Table 30.2 Some important continuous probability distributions.

The probability density function for a Gaussian distribution of a random variable X, with mean E[X] = µ and variance V[X] = σ², takes the form

    f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x-\mu}{\sigma} \right)^2 \right].    (30.105)

The factor 1/√(2π) arises from the normalisation of the distribution,

    \int_{-\infty}^{\infty} f(x)\,dx = 1;

the evaluation of this integral is discussed in subsection 6.4.2. The Gaussian distribution is symmetric about the point x = µ and has the characteristic ‘bell’ shape shown in figure 30.13. The width of the curve is described by the standard deviation σ: if σ is large then the curve is broad, and if σ is small then the curve is narrow (see the figure). At x = µ ± σ, f(x) falls to e^{−1/2} ≈ 0.61 of its peak value; these points are points of inflection, where d²f/dx² = 0. When a random variable X follows a Gaussian distribution with mean µ and variance σ², we write X ∼ N(µ, σ²).

The effects of changing µ and σ are only to shift the curve along the x-axis or to broaden or narrow it, respectively. Thus all Gaussians are equivalent in that a change of origin and scale can reduce them to a standard form. We therefore consider the random variable Z = (X − µ)/σ, for which the PDF takes the form

    \phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{z^2}{2} \right),    (30.106)

which is called the standard Gaussian distribution and has mean µ = 0 and variance σ² = 1. The random variable Z is called the standard variable.
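A short numerical illustration of (30.105) and the standardisation Z = (X − µ)/σ (standard-library Python; the helper name `gaussian_pdf` is ours):

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mu, sigma):
    """The Gaussian PDF of equation (30.105)."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

mu, sigma = 3.0, 2.0

# At x = mu +/- sigma the PDF falls to exp(-1/2) of its peak value.
ratio = gaussian_pdf(mu + sigma, mu, sigma) / gaussian_pdf(mu, mu, sigma)
print(round(ratio, 3))  # 0.607, i.e. e^(-1/2) ~ 0.61 as stated in the text

# The standard variable Z has PDF phi(z) = gaussian_pdf(z, 0, 1), peaking
# at 1/sqrt(2*pi).
print(round(gaussian_pdf(0.0, 0.0, 1.0), 4))  # 0.3989
```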
From (30.105) we can define the cumulative probability function for a Gaussian distribution as

    F(x) = \Pr(X < x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left[ -\frac{1}{2} \left( \frac{u-\mu}{\sigma} \right)^2 \right] du,    (30.107)

where u is a (dummy) integration variable. Unfortunately, this (indefinite) integral cannot be evaluated analytically. It is therefore standard practice to tabulate values of the cumulative probability function for the standard Gaussian distribution (see figure 30.14), i.e.

    \Phi(z) = \Pr(Z < z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left( -\frac{u^2}{2} \right) du.    (30.108)

[Figure 30.13: the Gaussian or normal distribution for mean µ = 3 and various values of the standard deviation σ.]

[Figure 30.14: on the left, the standard Gaussian distribution φ(z); the shaded area gives Pr(Z < a) = Φ(a). On the right, the cumulative probability function Φ(z) for a standard Gaussian distribution φ(z).]

It is usual only to tabulate Φ(z) for z > 0, since it can be seen easily, from figure 30.14 and the symmetry of the Gaussian distribution, that Φ(−z) = 1 − Φ(z); see table 30.3. Using such a table it is then straightforward to evaluate the probability that Z lies in a given range of z-values. For example, for a and b constant,

    Pr(Z < a) = Φ(a),    Pr(Z > a) = 1 − Φ(a),    Pr(a < Z ≤ b) = Φ(b) − Φ(a).

Remembering that Z = (X − µ)/σ and comparing (30.107) and (30.108), we see that

    F(x) = \Phi\left( \frac{x-\mu}{\sigma} \right),    (30.109)

and so we may also calculate the probability that the original random variable X lies in a given x-range. For example,

    \Pr(a < X \le b) = \frac{1}{\sigma\sqrt{2\pi}} \int_{a}^{b} \exp\left[ -\frac{1}{2} \left( \frac{u-\mu}{\sigma} \right)^2 \right] du = F(b) - F(a)    (30.110)

    = \Phi\left( \frac{b-\mu}{\sigma} \right) - \Phi\left( \frac{a-\mu}{\sigma} \right).    (30.111)

If X is described by a Gaussian distribution of mean µ and variance σ², calculate the probabilities that X lies within 1σ, 2σ and 3σ of the mean.
From (30.111)

    Pr(µ − nσ < X ≤ µ + nσ) = Φ(n) − Φ(−n) = Φ(n) − [1 − Φ(n)] = 2Φ(n) − 1,

and so from table 30.3

    Pr(µ − σ < X ≤ µ + σ) = 2Φ(1) − 1 = 0.6826 ≈ 68.3%,
    Pr(µ − 2σ < X ≤ µ + 2σ) = 2Φ(2) − 1 = 0.9544 ≈ 95.4%,
    Pr(µ − 3σ < X ≤ µ + 3σ) = 2Φ(3) − 1 = 0.9974 ≈ 99.7%.

Thus we expect X to be distributed in such a way that about two thirds of the values will lie between µ − σ and µ + σ, 95% will lie within 2σ of the mean and 99.7% will lie within 3σ of the mean. These limits are called the one-, two- and three-sigma limits respectively; it is particularly important to note that they are independent of the actual values of the mean and variance.

There are many other ways in which the Gaussian distribution may be used. We now illustrate some of the uses in more complicated examples.

Φ(z)   .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
0.3   .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517
0.4   .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879
0.5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
0.6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
0.7   .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
0.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
1.0   .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
1.1   .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
1.2   .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
1.3   .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
1.4   .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
1.8   .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
1.9   .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
2.0   .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981
2.9   .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986
3.0   .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990
3.1   .9990  .9991  .9991  .9991  .9992  .9992  .9992  .9992  .9993  .9993
3.2   .9993  .9993  .9994  .9994  .9994  .9994  .9994  .9995  .9995  .9995
3.3   .9995  .9995  .9995  .9996  .9996  .9996  .9996  .9996  .9996  .9997
3.4   .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9998

Table 30.3 The cumulative probability function Φ(z) for the standard Gaussian distribution, as given by (30.108). The units and the first decimal place of z are specified in the column under Φ(z) and the second decimal place is specified by the column headings. Thus, for example, Φ(1.23) = 0.8907.

Sawmill A produces boards whose lengths are Gaussian distributed with mean 209.4 cm and standard deviation 5.0 cm. A board is accepted if it is longer than 200 cm but is rejected otherwise. Show that 3% of boards are rejected. Sawmill B produces boards of the same standard deviation but of mean length 210.1 cm. Find the proportion of boards rejected if they are drawn at random from the outputs of A and B in the ratio 3 : 1.

Let X = length of boards from A, so that X ∼ N(209.4, (5.0)²) and

    \Pr(X < 200) = \Phi\left( \frac{200 - \mu}{\sigma} \right) = \Phi\left( \frac{200 - 209.4}{5.0} \right) = \Phi(-1.88).
But, since Φ(−z) = 1 − Φ(z), we have, using table 30.3,

    Pr(X < 200) = 1 − Φ(1.88) = 1 − 0.9699 = 0.0301,

i.e. 3.0% of boards are rejected. Now let Y = length of boards from B, so that Y ∼ N(210.1, (5.0)²) and

    Pr(Y < 200) = Φ((200 − 210.1)/5.0) = Φ(−2.02) = 1 − Φ(2.02) = 1 − 0.9783 = 0.0217.

Therefore, when taken alone, only 2.2% of boards from B are rejected. If, however, boards are drawn at random from A and B in the ratio 3 : 1 then the proportion rejected is

    (1/4)(3 × 0.030 + 1 × 0.022) = 0.028 = 2.8%.

We may sometimes work backwards to derive the mean and standard deviation of a population that is known to be Gaussian distributed.

The time taken for a computer ‘packet’ to travel from Cambridge UK to Cambridge MA is Gaussian distributed. 6.8% of the packets take over 200 ms to make the journey, and 3.0% take under 140 ms. Find the mean and standard deviation of the distribution.

Let X = journey time in ms; we are told that X ∼ N(µ, σ²) where µ and σ are unknown. Since 6.8% of journey times are longer than 200 ms,

    Pr(X > 200) = 1 − Φ((200 − µ)/σ) = 0.068,

from which we find Φ((200 − µ)/σ) = 1 − 0.068 = 0.932. Using table 30.3, we have therefore

    (200 − µ)/σ = 1.49.    (30.112)

Also, 3.0% of journey times are under 140 ms, so

    Pr(X < 140) = Φ((140 − µ)/σ) = 0.030.

Now using Φ(−z) = 1 − Φ(z) gives Φ((µ − 140)/σ) = 1 − 0.030 = 0.970. Using table 30.3 again, we find

    (µ − 140)/σ = 1.88.    (30.113)

Solving the simultaneous equations (30.112) and (30.113) gives µ = 173.5, σ = 17.8.

The moment generating function for the Gaussian distribution

Using the definition of the MGF (30.85),

    M_X(t) = E\left[ e^{tX} \right] = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ tx - \frac{(x-\mu)^2}{2\sigma^2} \right] dx = c \exp\left( \mu t + \tfrac{1}{2}\sigma^2 t^2 \right),

where the final equality is established by completing the square in the argument of the exponential and writing

    c = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{[x - (\mu + \sigma^2 t)]^2}{2\sigma^2} \right\} dx.

However, the final integral is simply the normalisation integral for the Gaussian distribution, and so c = 1 and the MGF is given by

    M_X(t) = \exp\left( \mu t + \tfrac{1}{2}\sigma^2 t^2 \right).    (30.114)

We showed in subsection 30.7.2 that this MGF leads to E[X] = µ and V[X] = σ², as required.

Gaussian approximation to the binomial distribution

We may consider the Gaussian distribution as the limit of the binomial distribution when the number of trials n → ∞ but the probability of a success p remains finite, so that np → ∞ also. (This contrasts with the Poisson distribution, which corresponds to the limit n → ∞ and p → 0 with np = λ remaining finite.) In other words, a Gaussian distribution results when an experiment with a finite probability of success is repeated a large number of times. We now show how this Gaussian limit arises.

The binomial probability function gives the probability of x successes in n trials as

    f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}.

Taking the limit as n → ∞ (and x → ∞) we may approximate the factorials by Stirling's approximation

    n! \sim \sqrt{2\pi n} \left( \frac{n}{e} \right)^n

to obtain

    f(x) \approx \frac{1}{\sqrt{2\pi n}} \left( \frac{x}{n} \right)^{-x-1/2} \left( \frac{n-x}{n} \right)^{-n+x-1/2} p^x (1-p)^{n-x}

    = \frac{1}{\sqrt{2\pi n}} \exp\left\{ -\left( x + \tfrac{1}{2} \right) \ln\frac{x}{n} - \left( n - x + \tfrac{1}{2} \right) \ln\frac{n-x}{n} + x \ln p + (n-x) \ln(1-p) \right\}.

By expanding the argument of the exponential in terms of y = x − np, where 1 ≪ y ≪ np, and keeping only the dominant terms, it can be shown that

    f(x) \approx \frac{1}{\sqrt{2\pi n p(1-p)}} \exp\left[ -\frac{1}{2} \frac{(x-np)^2}{np(1-p)} \right],

which is of Gaussian form with µ = np and σ = √(np(1 − p)). Thus we see that the value of the Gaussian probability density function f(x) is a good approximation to the probability of obtaining x successes in n trials.

x      f(x) (binomial)   f(x) (Gaussian)
0      0.0001            0.0001
1      0.0016            0.0014
2      0.0106            0.0092
3      0.0425            0.0395
4      0.1115            0.1119
5      0.2007            0.2091
6      0.2508            0.2575
7      0.2150            0.2091
8      0.1209            0.1119
9      0.0403            0.0395
10     0.0060            0.0092

Table 30.4 Comparison of the binomial distribution for n = 10 and p = 0.6 with its Gaussian approximation.
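The comparison in table 30.4 can be reproduced directly from the exact binomial probabilities and the Gaussian form with µ = np and σ = √(np(1 − p)); a standard-library Python sketch (not the book's own code):

```python
from math import comb, exp, pi, sqrt

n, p = 10, 0.6
mu, sigma = n * p, sqrt(n * p * (1 - p))   # 6 and ~1.549

for x in range(n + 1):
    binom = comb(n, x) * p**x * (1 - p)**(n - x)                    # exact pmf
    gauss = exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))
    print(f"{x:2d}  {binom:.4f}  {gauss:.4f}")
```

The printed columns reproduce table 30.4 row by row.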
This approximation is actually very good even for relatively small n. For example, if n = 10 and p = 0.6 then the Gaussian approximation to the binomial distribution is (30.105) with µ = 10 × 0.6 = 6 and σ = √(10 × 0.6 × (1 − 0.6)) = 1.549. The probability functions f(x) for the binomial and associated Gaussian distributions for these parameters are given in table 30.4, and it can be seen that the Gaussian approximation is a good one.

Strictly speaking, however, since the Gaussian distribution is continuous and the binomial distribution is discrete, we should use the integral of f(x) for the Gaussian distribution in the calculation of approximate binomial probabilities. More specifically, we should apply a continuity correction so that the discrete integer x in the binomial distribution becomes the interval [x − 0.5, x + 0.5] in the Gaussian distribution. Explicitly,

    \Pr(X = x) \approx \frac{1}{\sigma\sqrt{2\pi}} \int_{x-0.5}^{x+0.5} \exp\left[ -\frac{1}{2} \left( \frac{u-\mu}{\sigma} \right)^2 \right] du.

The Gaussian approximation is particularly useful for estimating the binomial probability that X lies between the (integer) values x₁ and x₂ inclusive,

    \Pr(x_1 \le X \le x_2) \approx \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1-0.5}^{x_2+0.5} \exp\left[ -\frac{1}{2} \left( \frac{u-\mu}{\sigma} \right)^2 \right] du.

A manufacturer makes computer chips of which 10% are defective. For a random sample of 200 chips, find the approximate probability that more than 15 are defective.

We first define the random variable X = number of defective chips in the sample, which has a binomial distribution X ∼ Bin(200, 0.1). Therefore, the mean and variance of this distribution are

    E[X] = 200 × 0.1 = 20  and  V[X] = 200 × 0.1 × (1 − 0.1) = 18,

and we may approximate the binomial distribution with a Gaussian distribution such that X ∼ N(20, 18). The standard variable is

    Z = \frac{X - 20}{\sqrt{18}},

and so, using X = 15.5 to allow for the continuity correction,

    Pr(X > 15.5) = Pr(Z > (15.5 − 20)/√18) = Pr(Z > −1.06) = Pr(Z < 1.06) = 0.86.
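The chip calculation can be repeated without tables by expressing Φ through the error function; a minimal standard-library Python sketch (the helper name `Phi` is ours):

```python
from math import erf, sqrt

def Phi(z):
    """Standard Gaussian cumulative probability, equation (30.108)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 200, 0.1
mu, var = n * p, n * p * (1 - p)   # 20 and 18

# Continuity correction: "more than 15 defective" means X >= 16 for the
# discrete variable, which becomes X > 15.5 for the continuous approximation.
z = (15.5 - mu) / sqrt(var)        # ~ -1.06
prob = 1 - Phi(z)                  # equals Phi(-z) by symmetry
print(round(prob, 2))  # 0.86

# The same helper reproduces table 30.3, e.g.
print(round(Phi(1.23), 4))  # 0.8907
```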
Gaussian approximation to the Poisson distribution

We first met the Poisson distribution as the limit of the binomial distribution for n → ∞ and p → 0, taken in such a way that np = λ remains finite. Further, in the previous subsection, we considered the Gaussian distribution as the limit of the binomial distribution when n → ∞ but p remains finite, so that np → ∞ also. It should come as no surprise, therefore, that the Gaussian distribution can also be used to approximate the Poisson distribution when the mean λ becomes large. The probability function for the Poisson distribution is

    f(x) = \frac{e^{-\lambda} \lambda^x}{x!},

which, on taking the logarithm of both sides, gives

    \ln f(x) = -\lambda + x \ln\lambda - \ln x!.    (30.115)

Stirling's approximation for large x gives

    x! \approx \sqrt{2\pi x} \left( \frac{x}{e} \right)^x,

implying that

    \ln x! \approx \ln\sqrt{2\pi x} + x \ln x - x,

which, on substituting into (30.115), yields

    \ln f(x) \approx -\lambda + x \ln\lambda - (x \ln x - x) - \ln\sqrt{2\pi x}.

Since we expect the Poisson distribution to peak around x = λ, we substitute ε = x − λ to obtain

    \ln f(x) \approx -\lambda + (\lambda + \varepsilon) \ln\lambda - (\lambda + \varepsilon) \ln\left[ \lambda\left( 1 + \frac{\varepsilon}{\lambda} \right) \right] + (\lambda + \varepsilon) - \ln\sqrt{2\pi(\lambda + \varepsilon)}.

Using the expansion ln(1 + z) = z − z²/2 + ⋯, we find

    \ln f(x) \approx -(\lambda + \varepsilon)\left( \frac{\varepsilon}{\lambda} - \frac{\varepsilon^2}{2\lambda^2} \right) + \varepsilon - \ln\sqrt{2\pi\lambda} \approx -\frac{\varepsilon^2}{2\lambda} - \ln\sqrt{2\pi\lambda},

when only the dominant terms are retained, after using the fact that ε is of the order of the standard deviation of x, i.e. of order λ^{1/2}. On exponentiating this result we obtain

    f(x) \approx \frac{1}{\sqrt{2\pi\lambda}} \exp\left[ -\frac{(x-\lambda)^2}{2\lambda} \right],

which is the Gaussian distribution with µ = λ and σ² = λ.

The larger the value of λ, the better is the Gaussian approximation to the Poisson distribution; the approximation is reasonable even for λ = 5, but λ ≥ 10 is safer. As in the case of the Gaussian approximation to the binomial distribution, a continuity correction is necessary since the Poisson distribution is discrete.

E-mail messages are received by an author at an average rate of one per hour. Find the probability that in a day the author receives 24 messages or more.
We first define the random variable X = number of messages received in a day. Thus E[X] = 1 × 24 = 24, and so X ∼ Po(24). Since λ > 10 we may approximate the Poisson distribution by X ∼ N(24, 24). Now the standard variable is

    Z = \frac{X - 24}{\sqrt{24}},

and, using the continuity correction, we find

    Pr(X > 23.5) = Pr(Z > (23.5 − 24)/√24) = Pr(Z > −0.102) = Pr(Z < 0.102) = 0.54.

In fact, almost all probability distributions tend towards a Gaussian when the numbers involved become large – that this should happen is required by the central limit theorem, which we discuss in section 30.10.

Multiple Gaussian distributions

Suppose X and Y are independent Gaussian-distributed random variables, so that X ∼ N(µ₁, σ₁²) and Y ∼ N(µ₂, σ₂²). Let us now consider the random variable Z = X + Y. The PDF for this random variable may be found directly using (30.61), but it is easier to use the MGF. From (30.114), the MGFs of X and Y are

    M_X(t) = \exp\left( \mu_1 t + \tfrac{1}{2}\sigma_1^2 t^2 \right),    M_Y(t) = \exp\left( \mu_2 t + \tfrac{1}{2}\sigma_2^2 t^2 \right).

Using (30.89), since X and Y are independent RVs, the MGF of Z = X + Y is simply the product of M_X(t) and M_Y(t). Thus, we have

    M_Z(t) = M_X(t) M_Y(t) = \exp\left( \mu_1 t + \tfrac{1}{2}\sigma_1^2 t^2 \right) \exp\left( \mu_2 t + \tfrac{1}{2}\sigma_2^2 t^2 \right) = \exp\left[ (\mu_1 + \mu_2)t + \tfrac{1}{2}(\sigma_1^2 + \sigma_2^2) t^2 \right],

which we recognise as the MGF for a Gaussian with mean µ₁ + µ₂ and variance σ₁² + σ₂². Thus, Z is also Gaussian distributed: Z ∼ N(µ₁ + µ₂, σ₁² + σ₂²).

A similar calculation may be performed to calculate the PDF of the random variable W = X − Y. If we introduce the variable Ỹ = −Y then W = X + Ỹ, where Ỹ ∼ N(−µ₂, σ₂²). Thus, using the result above, we find W ∼ N(µ₁ − µ₂, σ₁² + σ₂²).

An executive travels home from her office every evening. Her journey consists of a train ride, followed by a bicycle ride.
The time spent on the train is Gaussian distributed with mean 52 minutes and standard deviation 1.8 minutes, while the time for the bicycle journey is Gaussian distributed with mean 8 minutes and standard deviation 2.6 minutes. Assuming these two factors are independent, estimate the percentage of occasions on which the whole journey takes more than 65 minutes.

We first define the random variables X = time spent on the train and Y = time spent on the bicycle, so that X ∼ N(52, (1.8)²) and Y ∼ N(8, (2.6)²). Since X and Y are independent, the total journey time T = X + Y is distributed as

    T ∼ N(52 + 8, (1.8)² + (2.6)²) = N(60, (3.16)²).

The standard variable is thus

    Z = \frac{T - 60}{3.16},

and the required probability is given by

    Pr(T > 65) = Pr(Z > (65 − 60)/3.16) = Pr(Z > 1.58) = 1 − 0.943 = 0.057.

Thus the total journey time exceeds 65 minutes on 5.7% of occasions.

The above results may be extended. For example, if the random variables X_i, i = 1, 2, …, n, are distributed as X_i ∼ N(µ_i, σ_i²) then the random variable Z = Σ_i c_i X_i (where the c_i are constants) is distributed as Z ∼ N(Σ_i c_i µ_i, Σ_i c_i² σ_i²).

30.9.2 The log-normal distribution

If the random variable X follows a Gaussian distribution then the variable Y = e^X is described by a log-normal distribution. Clearly, if X can take values in the range −∞ to ∞, then Y will lie between 0 and ∞. The probability density function for Y is found using the result (30.58). It is

    g(y) = f(x(y)) \left| \frac{dx}{dy} \right| = \frac{1}{\sigma\sqrt{2\pi}} \, \frac{1}{y} \exp\left[ -\frac{(\ln y - \mu)^2}{2\sigma^2} \right].

We note that µ and σ² are not the mean and variance of the log-normal distribution, but rather the parameters of the corresponding Gaussian distribution for X. The mean and variance of Y, however, can be found straightforwardly using the MGF of X, which reads M_X(t) = E[e^{tX}] = exp(µt + σ²t²/2).
Thus, the mean of Y is given by

    E[Y] = E\left[ e^X \right] = M_X(1) = \exp\left( \mu + \tfrac{1}{2}\sigma^2 \right),

and the variance of Y reads

    V[Y] = E[Y^2] - (E[Y])^2 = E\left[ e^{2X} \right] - \left( E\left[ e^X \right] \right)^2 = M_X(2) - [M_X(1)]^2 = \exp(2\mu + \sigma^2)\left[ \exp(\sigma^2) - 1 \right].

In figure 30.15, we plot some examples of the log-normal distribution for various values of the parameters µ and σ².

[Figure 30.15: the PDF g(y) for the log-normal distribution for various values of the parameters µ and σ.]

30.9.3 The exponential and gamma distributions

The exponential distribution with positive parameter λ is given by

    f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x > 0, \\ 0 & \text{for } x \le 0, \end{cases}    (30.116)

and satisfies ∫_{−∞}^{∞} f(x) dx = 1 as required. The exponential distribution occurs naturally if we consider the distribution of the length of intervals between successive events in a Poisson process or, equivalently, the distribution of the interval (i.e. the waiting time) before the first event. If the average number of events per unit interval is λ then on average there are λx events in interval x, so that from the Poisson distribution the probability that there will be no events in this interval is given by

    Pr(no events in interval x) = e^{−λx}.

The probability that an event occurs in the next infinitesimal interval [x, x + dx] is given by λ dx, so that

    Pr(the first event occurs in interval [x, x + dx]) = e^{−λx} λ dx.

Hence the required probability density function is given by f(x) = λe^{−λx}. The expectation and variance of the exponential distribution can be evaluated as 1/λ and (1/λ)² respectively. The MGF is given by

    M(t) = \frac{\lambda}{\lambda - t}.    (30.117)

We may generalise the above discussion to obtain the PDF for the interval between every rth event in a Poisson process or, equivalently, the interval (waiting time) before the rth event. We begin by using the Poisson distribution to give

    \Pr(r-1 \text{ events occur in interval } x) = \frac{e^{-\lambda x} (\lambda x)^{r-1}}{(r-1)!},
from which we obtain

    \Pr(r\text{th event occurs in the interval } [x, x+dx]) = \frac{e^{-\lambda x} (\lambda x)^{r-1}}{(r-1)!} \, \lambda \, dx.

Thus the required PDF is

    f(x) = \frac{\lambda (\lambda x)^{r-1} e^{-\lambda x}}{(r-1)!},    (30.118)

which is known as the gamma distribution of order r with parameter λ. Although our derivation applies only when r is a positive integer, the gamma distribution is defined for all positive r by replacing (r − 1)! by Γ(r) in (30.118); see the appendix for a discussion of the gamma function Γ(x). If a random variable X is described by a gamma distribution of order r with parameter λ, we write X ∼ γ(λ, r); we note that the exponential distribution is the special case γ(λ, 1).

[Figure 30.16: the PDF f(x) for the gamma distributions γ(λ, r) with λ = 1 and r = 1, 2, 5, 10.]

The gamma distribution γ(λ, r) is plotted in figure 30.16 for λ = 1 and r = 1, 2, 5, 10. For large r, the gamma distribution tends to the Gaussian distribution whose mean and variance are specified by (30.120) below.

The MGF for the gamma distribution is obtained from that for the exponential distribution, by noting that we may consider the interval between every rth event in a Poisson process as the sum of r intervals between successive events. Thus the rth-order gamma variate is the sum of r independent exponentially distributed random variables. From (30.117) and (30.90), the MGF of the gamma distribution is therefore given by

    M(t) = \left( \frac{\lambda}{\lambda - t} \right)^r,    (30.119)

from which the mean and variance are found to be

    E[X] = \frac{r}{\lambda},    V[X] = \frac{r}{\lambda^2}.    (30.120)

We may also use the above MGF to prove another useful theorem regarding multiple gamma distributions. If X_i ∼ γ(λ, r_i), i = 1, 2, …, n, are independent gamma variates then the random variable Y = X₁ + X₂ + ⋯ + Xₙ has MGF

    M(t) = \prod_{i=1}^{n} \left( \frac{\lambda}{\lambda - t} \right)^{r_i} = \left( \frac{\lambda}{\lambda - t} \right)^{r_1 + r_2 + \cdots + r_n}.    (30.121)

Thus Y is also a gamma variate, distributed as Y ∼ γ(λ, r₁ + r₂ + ⋯ + rₙ).
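The mean and variance in (30.120) can be checked against a crude numerical integration of the PDF (30.118); a standard-library Python sketch (the helper name `gamma_pdf` is ours, and the Riemann sum is deliberately simple):

```python
from math import exp, factorial

def gamma_pdf(x, lam, r):
    """Gamma PDF (30.118) for integer order r."""
    return lam * (lam * x) ** (r - 1) * exp(-lam * x) / factorial(r - 1)

lam, r = 1.0, 5
dx = 0.001
# Riemann sum out to x = 60, far into the exponential tail.
xs = [i * dx for i in range(1, 60000)]
mean = sum(x * gamma_pdf(x, lam, r) * dx for x in xs)
var = sum((x - mean) ** 2 * gamma_pdf(x, lam, r) * dx for x in xs)
print(round(mean, 2), round(var, 2))  # close to r/lam = 5.0 and r/lam^2 = 5.0
```

Equivalently, a gamma variate of integer order r is the sum of r independent exponential waiting times, which is exactly how the MGF (30.119) was obtained.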
30.9.4 The chi-squared distribution

In subsection 30.6.2, we showed that if X is Gaussian distributed with mean µ and variance σ², such that X ∼ N(µ, σ²), then the random variable Y = (X − µ)²/σ² is distributed as the gamma distribution Y ∼ γ(1/2, 1/2). Let us now consider n independent Gaussian random variables X_i ∼ N(µ_i, σ_i²), i = 1, 2, …, n, and define the new variable

    \chi_n^2 = \sum_{i=1}^{n} \frac{(X_i - \mu_i)^2}{\sigma_i^2}.    (30.122)

Using the result (30.121) for multiple gamma distributions, χₙ² must be distributed as the gamma variate χₙ² ∼ γ(1/2, n/2), which from (30.118) has the PDF

    f(\chi_n^2) = \frac{1}{2} \frac{\left( \frac{1}{2}\chi_n^2 \right)^{(n/2)-1}}{\Gamma(\frac{1}{2}n)} \exp\left( -\tfrac{1}{2}\chi_n^2 \right) = \frac{1}{2^{n/2}\,\Gamma(\frac{1}{2}n)} \left( \chi_n^2 \right)^{(n/2)-1} \exp\left( -\tfrac{1}{2}\chi_n^2 \right).    (30.123)

This is known as the chi-squared distribution of order n and has numerous applications in statistics (see chapter 31). Setting λ = 1/2 and r = n/2 in (30.120), we find that

    E[χₙ²] = n,    V[χₙ²] = 2n.

An important generalisation occurs when the n Gaussian variables X_i are not linearly independent but are instead required to satisfy a linear constraint of the form

    c₁X₁ + c₂X₂ + ⋯ + cₙXₙ = 0,    (30.124)

in which the constants c_i are not all zero. In this case, it may be shown (see exercise 30.40) that the variable χₙ² defined in (30.122) is still described by a chi-squared distribution, but one of order n − 1. Indeed, this result may be trivially extended to show that if the n Gaussian variables X_i satisfy m linear constraints of the form (30.124) then the variable χₙ² defined in (30.122) is described by a chi-squared distribution of order n − m.

30.9.5 The Cauchy and Breit–Wigner distributions

A random variable X (in the range −∞ to ∞) that obeys the Cauchy distribution is described by the PDF

    f(x) = \frac{1}{\pi} \, \frac{1}{1 + x^2}.

[Figure 30.17: the PDF f(x) for the Breit–Wigner distribution for different values of the parameters x₀ and Γ.]
This is a special case of the Breit–Wigner distribution

    f(x) = \frac{1}{\pi} \, \frac{\frac{1}{2}\Gamma}{\frac{1}{4}\Gamma^2 + (x - x_0)^2},

which is encountered in the study of nuclear and particle physics. In figure 30.17, we plot some examples of the Breit–Wigner distribution for several values of the parameters x₀ and Γ. We see from the figure that the peak (or mode) of the distribution occurs at x = x₀. It is also straightforward to show that the parameter Γ is equal to the width of the peak at half the maximum height. Although the Breit–Wigner distribution is symmetric about its peak, it does not formally possess a mean, since the integrals ∫_{−∞}^{0} x f(x) dx and ∫_{0}^{∞} x f(x) dx both diverge. Similar divergences occur for all higher moments of the distribution.

30.9.6 The uniform distribution

Finally we mention the very simple, but common, uniform distribution, which describes a continuous random variable that has a constant PDF over its allowed range of values. If the limits on X are a and b then

    f(x) = \begin{cases} 1/(b-a) & \text{for } a \le x \le b, \\ 0 & \text{otherwise.} \end{cases}

The MGF of the uniform distribution is found to be

    M(t) = \frac{e^{bt} - e^{at}}{(b-a)t},
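The uniform MGF is a one-line integral and is easy to verify numerically; a standard-library Python sketch (all variable names ours), comparing the closed form against the defining integral E[e^{tX}] with f(x) = 1/(b − a):

```python
from math import exp

a, b, t = 1.0, 4.0, 0.7

# Closed form M(t) = (e^{bt} - e^{at}) / ((b - a)t) ...
closed_form = (exp(b * t) - exp(a * t)) / ((b - a) * t)

# ... versus a midpoint-rule evaluation of the defining integral.
n = 100_000
dx = (b - a) / n
numeric = sum(exp(t * (a + (i + 0.5) * dx)) for i in range(n)) * dx / (b - a)

print(abs(closed_form - numeric) < 1e-6)  # True
```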