...

Important continuous distributions

by taratuta

30.9 IMPORTANT CONTINUOUS DISTRIBUTIONS
expression for the PDF of this Z does exist, but it is a rather complicated
combination of exponentials and a modified Bessel function.§
Two types of e-mail arrive independently and at random: external e-mails at a mean rate
of one every five minutes and internal e-mails at a rate of two every five minutes. Calculate
the probability of receiving two or more e-mails in any two-minute interval.
Let
X = number of external e-mails per two-minute interval,
Y = number of internal e-mails per two-minute interval.
Since we expect on average one external e-mail and two internal e-mails every five minutes
we have X ∼ Po(0.4) and Y ∼ Po(0.8). Letting Z = X + Y we have Z ∼ Po(0.4 + 0.8) =
Po(1.2). Now
\[ \Pr(Z \ge 2) = 1 - \Pr(Z < 2) = 1 - \Pr(Z = 0) - \Pr(Z = 1) \]
and
\[ \Pr(Z = 0) = e^{-1.2} = 0.301, \qquad \Pr(Z = 1) = e^{-1.2}\,\frac{1.2}{1} = 0.361. \]
Hence Pr(Z ≥ 2) = 1 − 0.301 − 0.361 = 0.338.
The above result can be extended, of course, to any number of Poisson processes, so that if Xi ∼ Po(λi), i = 1, 2, . . . , n, then the random variable Z = X1 + X2 + · · · + Xn is distributed as Z ∼ Po(λ1 + λ2 + · · · + λn).
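As a quick numerical check of the e-mail example above, the following minimal sketch evaluates Pr(Z ≥ 2) for Z ∼ Po(1.2) directly. The function name poisson_pmf and the variable names are illustrative choices, not notation from the text.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Pr(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Mean numbers of e-mails per two-minute interval, from the rates per five minutes.
lam_external = 1 * 2 / 5   # X ~ Po(0.4)
lam_internal = 2 * 2 / 5   # Y ~ Po(0.8)

# The sum of independent Poisson variables is Poisson with the summed mean.
lam_total = lam_external + lam_internal

p_two_or_more = 1.0 - poisson_pmf(0, lam_total) - poisson_pmf(1, lam_total)
print(f"Pr(Z >= 2) = {p_two_or_more:.4f}")
```

The unrounded result is close to 0.337; the value 0.338 quoted above arises from rounding the two terms to three decimal places before subtracting.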
30.9 Important continuous distributions
Having discussed the most commonly encountered discrete probability distributions, we now consider some of the more important continuous probability
distributions. These are summarised for convenience in table 30.2; we refer the
reader to the relevant subsection below for an explanation of the symbols used.
30.9.1 The Gaussian distribution
By far the most important continuous probability distribution is the Gaussian
or normal distribution. The reason for its importance is that a great many
random variables of interest, in all areas of the physical sciences and beyond, are
described either exactly or approximately by a Gaussian distribution. Moreover,
the Gaussian distribution can be used to approximate other, more complicated,
probability distributions.
§
For a derivation see, for example, M. P. Hobson and A. N. Lasenby, Monthly Notices of the Royal
Astronomical Society, 298, 905 (1998).
PROBABILITY
Distribution   Probability law f(x)                          MGF                             E[X]         V[X]
Gaussian       [1/(σ√(2π))] exp[−(x − µ)²/(2σ²)]             exp(µt + ½σ²t²)                 µ            σ²
exponential    λ e^(−λx)                                     λ/(λ − t)                       1/λ          1/λ²
gamma          [λ/Γ(r)] (λx)^(r−1) e^(−λx)                   [λ/(λ − t)]^r                   r/λ          r/λ²
chi-squared    [1/(2^(n/2) Γ(n/2))] x^((n/2)−1) e^(−x/2)     [1/(1 − 2t)]^(n/2)              n            2n
uniform        1/(b − a)                                     (e^(bt) − e^(at))/[(b − a)t]    (a + b)/2    (b − a)²/12
Table 30.2 Some important continuous probability distributions.
The probability density function for a Gaussian distribution of a random variable X, with mean E[X] = µ and variance V[X] = σ², takes the form
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]. \tag{30.105} \]
The factor 1/√(2π) arises from the normalisation of the distribution,
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1; \]
the evaluation of this integral is discussed in subsection 6.4.2. The Gaussian
distribution is symmetric about the point x = µ and has the characteristic ‘bell’
shape shown in figure 30.13. The width of the curve is described by the standard
deviation σ: if σ is large then the curve is broad, and if σ is small then the curve
is narrow (see the figure). At x = µ ± σ, f(x) falls to e^(−1/2) ≈ 0.61 of its peak
value; these points are points of inflection, where d²f/dx² = 0. When a random
variable X follows a Gaussian distribution with mean µ and variance σ², we write
X ∼ N(µ, σ²).
The effects of changing µ and σ are only to shift the curve along the x-axis or
to broaden or narrow it, respectively. Thus all Gaussians are equivalent in that
a change of origin and scale can reduce them to a standard form. We therefore
consider the random variable Z = (X − µ)/σ, for which the PDF takes the form
\[ \phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right), \tag{30.106} \]
which is called the standard Gaussian distribution and has mean µ = 0 and
variance σ² = 1. The random variable Z is called the standard variable.
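The two forms (30.105) and (30.106) can be compared numerically. The sketch below, with illustrative values of µ and σ chosen only for the demonstration, checks that f(x) drops to e^(−1/2) of its peak at x = µ + σ and that f(x) = φ(z)/σ when z = (x − µ)/σ. The function names are assumptions of this example.

```python
import math


def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """General Gaussian PDF, equation (30.105)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def standard_pdf(z: float) -> float:
    """Standard Gaussian PDF phi(z), equation (30.106)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


mu, sigma = 3.0, 2.0  # illustrative parameters only

# Height one standard deviation from the mean, relative to the peak.
ratio = gaussian_pdf(mu + sigma, mu, sigma) / gaussian_pdf(mu, mu, sigma)
print(f"f(mu + sigma) / f(mu) = {ratio:.4f}")

# Change of origin and scale: f(x) dx = phi(z) dz with dz = dx / sigma.
x = 4.2
lhs = gaussian_pdf(x, mu, sigma)
rhs = standard_pdf((x - mu) / sigma) / sigma
print(f"f(x) = {lhs:.6f}, phi(z)/sigma = {rhs:.6f}")
```

The factor 1/σ in the second check is the Jacobian dz/dx of the change of variable.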
[Figure 30.13 The Gaussian or normal distribution for mean µ = 3 and various values of the standard deviation σ (curves for σ = 1, 2 and 3).]
[Figure 30.14 On the left, the standard Gaussian distribution φ(z); the shaded area gives Pr(Z < a) = Φ(a). On the right, the cumulative probability function Φ(z) for a standard Gaussian distribution φ(z).]
From (30.105) we can define the cumulative probability function for a Gaussian distribution as
\[ F(x) = \Pr(X < x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left[-\frac{1}{2}\left(\frac{u-\mu}{\sigma}\right)^2\right] du, \tag{30.107} \]
where u is a (dummy) integration variable. Unfortunately, this (indefinite) integral
cannot be evaluated analytically. It is therefore standard practice to tabulate values of the cumulative probability function for the standard Gaussian distribution
(see figure 30.14), i.e.
\[ \Phi(z) = \Pr(Z < z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{u^2}{2}\right) du. \tag{30.108} \]
It is usual only to tabulate Φ(z) for z > 0, since it can be seen easily, from
figure 30.14 and the symmetry of the Gaussian distribution, that Φ(−z) = 1−Φ(z);
see table 30.3. Using such a table it is then straightforward to evaluate the
probability that Z lies in a given range of z-values. For example, for a and b
constant,
Pr(Z < a) = Φ(a),
Pr(Z > a) = 1 − Φ(a),
Pr(a < Z ≤ b) = Φ(b) − Φ(a).
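Remembering that Z = (X − µ)/σ and comparing (30.107) and (30.108), we see
that
\[ F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right), \]
and so we may also calculate the probability that the original random variable
X lies in a given x-range. For example,
\begin{align}
\Pr(a < X \le b) &= \frac{1}{\sigma\sqrt{2\pi}} \int_{a}^{b} \exp\left[-\frac{1}{2}\left(\frac{u-\mu}{\sigma}\right)^2\right] du \tag{30.109} \\
&= F(b) - F(a) \tag{30.110} \\
&= \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right). \tag{30.111}
\end{align}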
If X is described by a Gaussian distribution of mean µ and variance σ², calculate the
probabilities that X lies within 1σ, 2σ and 3σ of the mean.
From (30.111)
\[ \Pr(\mu - n\sigma < X \le \mu + n\sigma) = \Phi(n) - \Phi(-n) = \Phi(n) - [1 - \Phi(n)], \]
and so from table 30.3
\begin{align*}
\Pr(\mu - \sigma < X \le \mu + \sigma) &= 2\Phi(1) - 1 = 0.6826 \approx 68.3\%, \\
\Pr(\mu - 2\sigma < X \le \mu + 2\sigma) &= 2\Phi(2) - 1 = 0.9544 \approx 95.4\%, \\
\Pr(\mu - 3\sigma < X \le \mu + 3\sigma) &= 2\Phi(3) - 1 = 0.9974 \approx 99.7\%.
\end{align*}
Thus we expect X to be distributed in such a way that about two thirds of the values will
lie between µ − σ and µ + σ, 95% will lie within 2σ of the mean and 99.7% will lie within
3σ of the mean. These limits are called the one-, two- and three-sigma limits respectively;
it is particularly important to note that they are independent of the actual values of the
mean and variance.
There are many other ways in which the Gaussian distribution may be used.
We now illustrate some of the uses in more complicated examples.
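As a numerical cross-check of the sigma limits quoted above, Φ(z) can be evaluated without a table by using the standard relation Φ(z) = ½[1 + erf(z/√2)]. The sketch below relies on math.erf from the Python standard library; the helper names are illustrative.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Phi(z) for the standard Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_within(n_sigma: float) -> float:
    """Pr(mu - n*sigma < X <= mu + n*sigma) = 2*Phi(n) - 1, for any mu and sigma."""
    return 2.0 * std_normal_cdf(n_sigma) - 1.0


sigma_limits = {n: prob_within(n) for n in (1, 2, 3)}
for n, p in sigma_limits.items():
    print(f"within {n} sigma of the mean: {100 * p:.2f}%")
```

Small differences in the fourth decimal place relative to the values in the text are to be expected, since the tabulated Φ(z) is itself rounded to four places.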
Φ(z)   .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
0.3   .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517
0.4   .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879
0.5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
0.6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
0.7   .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
0.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
1.0   .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
1.1   .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
1.2   .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
1.3   .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
1.4   .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
1.8   .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
1.9   .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
2.0   .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981
2.9   .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986
3.0   .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990
3.1   .9990  .9991  .9991  .9991  .9992  .9992  .9992  .9992  .9993  .9993
3.2   .9993  .9993  .9994  .9994  .9994  .9994  .9994  .9995  .9995  .9995
3.3   .9995  .9995  .9995  .9996  .9996  .9996  .9996  .9996  .9996  .9997
3.4   .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9998
Table 30.3 The cumulative probability function Φ(z) for the standard Gaussian distribution, as given by (30.108). The units and the first decimal place
of z are specified in the column under Φ(z) and the second decimal place is
specified by the column headings. Thus, for example, Φ(1.23) = 0.8907.
Sawmill A produces boards whose lengths are Gaussian distributed with mean 209.4 cm
and standard deviation 5.0 cm. A board is accepted if it is longer than 200 cm but is
rejected otherwise. Show that 3% of boards are rejected.
Sawmill B produces boards of the same standard deviation but of mean length 210.1 cm.
Find the proportion of boards rejected if they are drawn at random from the outputs of A
and B in the ratio 3 : 1.
Let X = length of boards from A, so that X ∼ N(209.4, (5.0)²) and
\[ \Pr(X < 200) = \Phi\left(\frac{200-\mu}{\sigma}\right) = \Phi\left(\frac{200-209.4}{5.0}\right) = \Phi(-1.88). \]
But, since Φ(−z) = 1 − Φ(z) we have, using table 30.3,
Pr(X < 200) = 1 − Φ(1.88) = 1 − 0.9699 = 0.0301,
i.e. 3.0% of boards are rejected.
Now let Y = length of boards from B, so that Y ∼ N(210.1, (5.0)²) and
\begin{align*}
\Pr(Y < 200) = \Phi\left(\frac{200-210.1}{5.0}\right) &= \Phi(-2.02) \\
&= 1 - \Phi(2.02) \\
&= 1 - 0.9783 = 0.0217.
\end{align*}
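The two rejection probabilities, and the proportion rejected when boards are drawn from A and B in the ratio 3 : 1, can be checked with the NormalDist class from the Python standard library (available from Python 3.8). This is only a sketch; the variable names are illustrative.

```python
from statistics import NormalDist

CUTOFF_CM = 200.0  # boards shorter than this are rejected

mill_a = NormalDist(mu=209.4, sigma=5.0)
mill_b = NormalDist(mu=210.1, sigma=5.0)

# Pr(length < 200 cm) for each sawmill, from the Gaussian CDF.
reject_a = mill_a.cdf(CUTOFF_CM)
reject_b = mill_b.cdf(CUTOFF_CM)

# Boards drawn from A and B in the ratio 3 : 1, i.e. weights 3/4 and 1/4.
reject_mixed = (3 * reject_a + 1 * reject_b) / 4

print(f"A: {100 * reject_a:.2f}%  B: {100 * reject_b:.2f}%  mixed: {100 * reject_mixed:.2f}%")
```

The mixed proportion is simply the weighted average of the two individual rejection probabilities, because each board comes from exactly one sawmill.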
Therefore, when taken alone, only 2.2% of boards from B are rejected. If, however, boards
are drawn at random from A and B in the ratio 3 : 1 then the proportion rejected is
\[ \tfrac{1}{4}(3 \times 0.030 + 1 \times 0.022) = 0.028 = 2.8\%. \]
We may sometimes work backwards to derive the mean and standard deviation
of a population that is known to be Gaussian distributed.
The time taken for a computer ‘packet’ to travel from Cambridge UK to Cambridge MA
is Gaussian distributed. 6.8% of the packets take over 200 ms to make the journey, and
3.0% take under 140 ms. Find the mean and standard deviation of the distribution.
Let X = journey time in ms; we are told that X ∼ N(µ, σ²) where µ and σ are unknown.
Since 6.8% of journey times are longer than 200 ms,
\[ \Pr(X > 200) = 1 - \Phi\left(\frac{200-\mu}{\sigma}\right) = 0.068, \]
from which we find
\[ \Phi\left(\frac{200-\mu}{\sigma}\right) = 1 - 0.068 = 0.932. \]
Using table 30.3, we have therefore
\[ \frac{200-\mu}{\sigma} = 1.49. \tag{30.112} \]
Also, 3.0% of journey times are under 140 ms, so
\[ \Pr(X < 140) = \Phi\left(\frac{140-\mu}{\sigma}\right) = 0.030. \]
Now using Φ(−z) = 1 − Φ(z) gives
\[ \Phi\left(\frac{\mu-140}{\sigma}\right) = 1 - 0.030 = 0.970. \]
Using table 30.3 again, we find
\[ \frac{\mu-140}{\sigma} = 1.88. \tag{30.113} \]
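Equations (30.112) and (30.113) are two linear equations in µ and σ. As a hedged sketch, the table look-up can be replaced by the inverse CDF provided by statistics.NormalDist (Python 3.8 or later), and the pair solved by adding the equations to eliminate µ. The variable names are illustrative.

```python
from statistics import NormalDist

std = NormalDist()  # standard Gaussian, mean 0 and standard deviation 1

# Inverse look-ups replacing the use of table 30.3.
z_upper = std.inv_cdf(1.0 - 0.068)  # equals (200 - mu) / sigma
z_lower = std.inv_cdf(1.0 - 0.030)  # equals (mu - 140) / sigma

# Adding the two equations eliminates mu: 200 - 140 = (z_upper + z_lower) * sigma.
sigma = (200.0 - 140.0) / (z_upper + z_lower)
mu = 200.0 - z_upper * sigma

print(f"z_upper = {z_upper:.3f}, z_lower = {z_lower:.3f}")
print(f"mu = {mu:.1f} ms, sigma = {sigma:.1f} ms")
```

Because the inverse CDF is not rounded to two decimal places, the resulting µ and σ may differ slightly from values obtained with the table.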
Solving the simultaneous equations (30.112) and (30.113) gives µ = 173.5, σ = 17.8.
The moment generating function for the Gaussian distribution
Using the definition of the MGF (30.85),
\begin{align*}
M_X(t) = E\left[e^{tX}\right] &= \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[tx - \frac{(x-\mu)^2}{2\sigma^2}\right] dx \\
&= c \exp\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right),
\end{align*}
where the final equality is established by completing the square in the argument
of the exponential and writing
\[ c = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{[x-(\mu+\sigma^2 t)]^2}{2\sigma^2}\right\} dx. \]
However, the final integral is simply the normalisation integral for the Gaussian
distribution, and so c = 1 and the MGF is given by
\[ M_X(t) = \exp\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right). \tag{30.114} \]
We showed in subsection 30.7.2 that this MGF leads to E[X] = µ and V[X] = σ²,
as required.
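The statement that (30.114) yields E[X] = µ and V[X] = σ² can be illustrated numerically by approximating the first two derivatives of the MGF at t = 0 with central finite differences. The parameter values and the step size h below are arbitrary choices for this sketch.

```python
import math


def gaussian_mgf(t: float, mu: float, sigma: float) -> float:
    """M_X(t) = exp(mu*t + sigma^2 * t^2 / 2), equation (30.114)."""
    return math.exp(mu * t + 0.5 * sigma**2 * t**2)


mu, sigma = 3.0, 2.0   # illustrative parameters
h = 1e-3               # finite-difference step (assumed small enough here)

m_plus = gaussian_mgf(+h, mu, sigma)
m_zero = gaussian_mgf(0.0, mu, sigma)
m_minus = gaussian_mgf(-h, mu, sigma)

# E[X] = M'(0) and E[X^2] = M''(0), estimated by central differences.
first_moment = (m_plus - m_minus) / (2.0 * h)
second_moment = (m_plus - 2.0 * m_zero + m_minus) / h**2
variance = second_moment - first_moment**2

print(f"E[X] ~ {first_moment:.4f}, V[X] ~ {variance:.4f}")
```

The estimates agree with µ and σ² only up to the truncation error of the finite differences, which is of order h².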
Gaussian approximation to the binomial distribution
We may consider the Gaussian distribution as the limit of the binomial distribution when the number of trials n → ∞ but the probability of a success p remains
finite, so that np → ∞ also. (This contrasts with the Poisson distribution, which
corresponds to the limit n → ∞ and p → 0 with np = λ remaining finite.) In
other words, a Gaussian distribution results when an experiment with a finite
probability of success is repeated a large number of times. We now show how
this Gaussian limit arises.
The binomial probability function gives the probability of x successes in n trials
as
\[ f(x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}. \]
Taking the limit as n → ∞ (and x → ∞) we may approximate the factorials by
Stirling’s approximation
\[ n! \sim \sqrt{2\pi n} \left(\frac{n}{e}\right)^n \]
to obtain
\begin{align*}
f(x) &\approx \frac{1}{\sqrt{2\pi n}} \left(\frac{x}{n}\right)^{-x-1/2} \left(\frac{n-x}{n}\right)^{-n+x-1/2} p^x (1-p)^{n-x} \\
&= \frac{1}{\sqrt{2\pi n}} \exp\left[-\left(x+\tfrac{1}{2}\right)\ln\frac{x}{n} - \left(n-x+\tfrac{1}{2}\right)\ln\frac{n-x}{n} + x\ln p + (n-x)\ln(1-p)\right].
\end{align*}
By expanding the argument of the exponential in terms of y = x − np, where
1 ≪ y ≪ np, and keeping only the dominant terms, it can be shown that
\[ f(x) \approx \frac{1}{\sqrt{2\pi n}}\, \frac{1}{\sqrt{p(1-p)}} \exp\left[-\frac{1}{2}\, \frac{(x-np)^2}{np(1-p)}\right], \]
which is of Gaussian form with µ = np and σ = √(np(1 − p)).
x     f(x) (binomial)   f(x) (Gaussian)
0     0.0001            0.0001
1     0.0016            0.0014
2     0.0106            0.0092
3     0.0425            0.0395
4     0.1115            0.1119
5     0.2007            0.2091
6     0.2508            0.2575
7     0.2150            0.2091
8     0.1209            0.1119
9     0.0403            0.0395
10    0.0060            0.0092
Table 30.4 Comparison of the binomial distribution for n = 10 and p = 0.6
with its Gaussian approximation.
Thus we see that the value of the Gaussian probability density function f(x) is
a good approximation to the probability of obtaining x successes in n trials. This
approximation is actually very good even for relatively small n. For example, if
n = 10 and p = 0.6 then the Gaussian approximation to the binomial distribution
is (30.105) with µ = 10 × 0.6 = 6 and σ = √(10 × 0.6(1 − 0.6)) = 1.549. The
probability functions f(x) for the binomial and associated Gaussian distributions
for these parameters are given in table 30.4, and it can be seen that the Gaussian
approximation is a good one.
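The comparison for n = 10 and p = 0.6 is easy to regenerate. The following sketch evaluates the binomial probability function and the Gaussian density with µ = np and σ = √(np(1 − p)) at each integer x; it uses math.comb, available from Python 3.8, and the function names are illustrative.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of x successes in n trials."""
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)


def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """Gaussian PDF, equation (30.105)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


n, p = 10, 0.6
mu = n * p
sigma = math.sqrt(n * p * (1.0 - p))

rows = [(x, binomial_pmf(x, n, p), gaussian_pdf(x, mu, sigma)) for x in range(n + 1)]

print(f"mu = {mu:.1f}, sigma = {sigma:.3f}")
print(" x   binomial   Gaussian")
for x, b, g in rows:
    print(f"{x:2d}   {b:.4f}     {g:.4f}")
```

The printed columns should agree closely with the tabulated comparison; note that the Gaussian column is symmetric about x = 6 whereas the binomial column is not.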
Strictly speaking, however, since the Gaussian distribution is continuous and
the binomial distribution is discrete, we should use the integral of f(x) for the
Gaussian distribution in the calculation of approximate binomial probabilities.
More specifically, we should apply a continuity correction so that the discrete
integer x in the binomial distribution becomes the interval [x − 0.5, x + 0.5] in
the Gaussian distribution. Explicitly,
\[ \Pr(X = x) \approx \frac{1}{\sigma\sqrt{2\pi}} \int_{x-0.5}^{x+0.5} \exp\left[-\frac{1}{2}\left(\frac{u-\mu}{\sigma}\right)^2\right] du. \]
The Gaussian approximation is particularly useful for estimating the binomial
probability that X takes one of the integer values from x1 to x2 inclusive,
\[ \Pr(x_1 \le X \le x_2) \approx \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1-0.5}^{x_2+0.5} \exp\left[-\frac{1}{2}\left(\frac{u-\mu}{\sigma}\right)^2\right] du. \]
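To see the continuity correction at work, the sketch below compares exact binomial probabilities with the corresponding Gaussian integrals, taking n = 10 and p = 0.6 as in table 30.4. The integration limits x1 − 0.5 and x2 + 0.5 cover the integer values x1, . . . , x2 inclusive. Function names are illustrative, and Φ is obtained from math.erf.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Phi(z) via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binomial_pmf(x: int, n: int, p: float) -> float:
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)


def gaussian_estimate(x1: int, x2: int, n: int, p: float) -> float:
    """Continuity-corrected estimate that X is one of x1, ..., x2 (inclusive)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    upper = (x2 + 0.5 - mu) / sigma
    lower = (x1 - 0.5 - mu) / sigma
    return std_normal_cdf(upper) - std_normal_cdf(lower)


n, p = 10, 0.6

exact_point = binomial_pmf(6, n, p)
approx_point = gaussian_estimate(6, 6, n, p)

exact_range = sum(binomial_pmf(x, n, p) for x in range(4, 8))  # x = 4, 5, 6, 7
approx_range = gaussian_estimate(4, 7, n, p)

print(f"Pr(X = 6):       exact {exact_point:.4f}, Gaussian {approx_point:.4f}")
print(f"Pr(4 <= X <= 7): exact {exact_range:.4f}, Gaussian {approx_range:.4f}")
```

Even for n as small as 10 the two estimates should lie within about one percentage point of the exact values.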
A manufacturer makes computer chips of which 10% are defective. For a random sample
of 200 chips, find the approximate probability that more than 15 are defective.
We first define the random variable
X = number of defective chips in the sample,
which has a binomial distribution X ∼ Bin(200, 0.1). Therefore, the mean and variance of
this distribution are
\[ E[X] = 200 \times 0.1 = 20 \quad\text{and}\quad V[X] = 200 \times 0.1 \times (1 - 0.1) = 18, \]
and we may approximate the binomial distribution with a Gaussian distribution such that
X ∼ N(20, 18). The standard variable is
\[ Z = \frac{X-20}{\sqrt{18}}, \]
and so, using X = 15.5 to allow for the continuity correction,
\begin{align*}
\Pr(X > 15.5) = \Pr\left(Z > \frac{15.5-20}{\sqrt{18}}\right) &= \Pr(Z > -1.06) \\
&= \Pr(Z < 1.06) = 0.86.
\end{align*}
Gaussian approximation to the Poisson distribution
We first met the Poisson distribution as the limit of the binomial distribution for
n → ∞ and p → 0, taken in such a way that np = λ remains finite. Further, in
the previous subsection, we considered the Gaussian distribution as the limit of
the binomial distribution when n → ∞ but p remains finite, so that np → ∞ also.
It should come as no surprise, therefore, that the Gaussian distribution can also
be used to approximate the Poisson distribution when the mean λ becomes large.
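The probability function for the Poisson distribution is
\[ f(x) = e^{-\lambda}\, \frac{\lambda^x}{x!}, \]
which, on taking the logarithm of both sides, gives
\[ \ln f(x) = -\lambda + x\ln\lambda - \ln x!. \tag{30.115} \]
Stirling’s approximation for large x gives
\[ x! \approx \sqrt{2\pi x} \left(\frac{x}{e}\right)^x, \]
implying that
\[ \ln x! \approx \ln\sqrt{2\pi x} + x\ln x - x, \]
which, on substituting into (30.115), yields
\[ \ln f(x) \approx -\lambda + x\ln\lambda - (x\ln x - x) - \ln\sqrt{2\pi x}. \]
Since we expect the Poisson distribution to peak around x = λ, we substitute
ε = x − λ to obtain
\[ \ln f(x) \approx -\lambda + (\lambda+\epsilon)\left\{\ln\lambda - \ln\left[\lambda\left(1+\frac{\epsilon}{\lambda}\right)\right]\right\} + (\lambda+\epsilon) - \ln\sqrt{2\pi(\lambda+\epsilon)}. \]
Using the expansion ln(1 + z) = z − z²/2 + · · · , we find
\begin{align*}
\ln f(x) &\approx \epsilon - (\lambda+\epsilon)\left(\frac{\epsilon}{\lambda} - \frac{\epsilon^2}{2\lambda^2}\right) - \ln\sqrt{2\pi\lambda} - \frac{1}{2}\left(\frac{\epsilon}{\lambda} - \frac{\epsilon^2}{2\lambda^2}\right) \\
&\approx -\frac{\epsilon^2}{2\lambda} - \ln\sqrt{2\pi\lambda},
\end{align*}
when only the dominant terms are retained, after using the fact that ε is of the
order of the standard deviation of x, i.e. of order λ^(1/2). On exponentiating this
result we obtain
\[ f(x) \approx \frac{1}{\sqrt{2\pi\lambda}} \exp\left[-\frac{(x-\lambda)^2}{2\lambda}\right], \]
which is the Gaussian distribution with µ = λ and σ² = λ.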
The larger the value of λ, the better is the Gaussian approximation to the
Poisson distribution; the approximation is reasonable even for λ = 5, but λ ≥ 10
is safer. As in the case of the Gaussian approximation to the binomial distribution,
a continuity correction is necessary since the Poisson distribution is discrete.
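The improvement of the approximation with increasing λ can be examined directly. As an illustrative sketch (the choice of threshold k = λ and all names are assumptions of this example), the code below compares the exact Poisson probability Pr(X ≥ k) with the continuity-corrected Gaussian estimate 1 − Φ((k − 0.5 − λ)/√λ).

```python
import math


def std_normal_cdf(z: float) -> float:
    """Phi(z) via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_tail_exact(k: int, lam: float) -> float:
    """Pr(X >= k) for X ~ Po(lam), summing Pr(X = 0), ..., Pr(X = k - 1)."""
    term = math.exp(-lam)  # Pr(X = 0)
    below = 0.0
    for i in range(k):
        below += term          # add Pr(X = i)
        term *= lam / (i + 1)  # step to Pr(X = i + 1)
    return 1.0 - below


def poisson_tail_gaussian(k: int, lam: float) -> float:
    """Gaussian estimate of Pr(X >= k) with continuity correction."""
    return 1.0 - std_normal_cdf((k - 0.5 - lam) / math.sqrt(lam))


comparison = {}
for lam in (5, 10, 24):
    k = lam  # probability of observing at least the mean number of events
    comparison[lam] = (poisson_tail_exact(k, lam), poisson_tail_gaussian(k, lam))
    exact, approx = comparison[lam]
    print(f"lambda = {lam:2d}: exact {exact:.4f}, Gaussian {approx:.4f}")
```

The residual discrepancy reflects the skewness of the Poisson distribution, which the symmetric Gaussian cannot reproduce and which decreases as λ grows.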
E-mail messages are received by an author at an average rate of one per hour. Find the
probability that in a day the author receives 24 messages or more.
We first define the random variable
X = number of messages received in a day.
Thus E[X] = 1 × 24 = 24, and so X ∼ Po(24). Since λ > 10 we may approximate the
Poisson distribution by X ∼ N(24, 24). Now the standard variable is
\[ Z = \frac{X-24}{\sqrt{24}}, \]
and, using the continuity correction, we find
\begin{align*}
\Pr(X > 23.5) &= \Pr\left(Z > \frac{23.5-24}{\sqrt{24}}\right) \\
&= \Pr(Z > -0.102) = \Pr(Z < 0.102) = 0.54.
\end{align*}
In fact, almost all probability distributions tend towards a Gaussian when the
numbers involved become large – that this should happen is required by the
central limit theorem, which we discuss in section 30.10.
Multiple Gaussian distributions
Suppose X and Y are independent Gaussian-distributed random variables, so
that X ∼ N(µ₁, σ₁²) and Y ∼ N(µ₂, σ₂²). Let us now consider the random variable
Z = X + Y. The PDF for this random variable may be found directly using
(30.61), but it is easier to use the MGF. From (30.114), the MGFs of X and Y
are
\[ M_X(t) = \exp\left(\mu_1 t + \tfrac{1}{2}\sigma_1^2 t^2\right), \qquad M_Y(t) = \exp\left(\mu_2 t + \tfrac{1}{2}\sigma_2^2 t^2\right). \]
Using (30.89), since X and Y are independent RVs, the MGF of Z = X + Y is
simply the product of M_X(t) and M_Y(t). Thus, we have
\begin{align*}
M_Z(t) = M_X(t) M_Y(t) &= \exp\left(\mu_1 t + \tfrac{1}{2}\sigma_1^2 t^2\right) \exp\left(\mu_2 t + \tfrac{1}{2}\sigma_2^2 t^2\right) \\
&= \exp\left[(\mu_1+\mu_2)t + \tfrac{1}{2}(\sigma_1^2+\sigma_2^2)t^2\right],
\end{align*}
which we recognise as the MGF for a Gaussian with mean µ₁ + µ₂ and variance
σ₁² + σ₂². Thus, Z is also Gaussian distributed: Z ∼ N(µ₁ + µ₂, σ₁² + σ₂²).
A similar calculation may be performed to calculate the PDF of the random
variable W = X − Y. If we introduce the variable Ỹ = −Y then W = X + Ỹ,
where Ỹ ∼ N(−µ₂, σ₂²). Thus, using the result above, we find W ∼ N(µ₁ − µ₂, σ₁² + σ₂²).
An executive travels home from her office every evening. Her journey consists of a train
ride, followed by a bicycle ride. The time spent on the train is Gaussian distributed with
mean 52 minutes and standard deviation 1.8 minutes, while the time for the bicycle journey
is Gaussian distributed with mean 8 minutes and standard deviation 2.6 minutes. Assuming
these two factors are independent, estimate the percentage of occasions on which the whole
journey takes more than 65 minutes.
We first define the random variables
X = time spent on train,
Y = time spent on bicycle,
so that X ∼ N(52, (1.8)²) and Y ∼ N(8, (2.6)²). Since X and Y are independent, the total
journey time T = X + Y is distributed as
\[ T \sim N(52 + 8, (1.8)^2 + (2.6)^2) = N(60, (3.16)^2). \]
The standard variable is thus
\[ Z = \frac{T-60}{3.16}, \]
and the required probability is given by
\[ \Pr(T > 65) = \Pr\left(Z > \frac{65-60}{3.16}\right) = \Pr(Z > 1.58) = 1 - 0.943 = 0.057. \]
Thus the total journey time exceeds 65 minutes on 5.7% of occasions.
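The journey-time calculation can be reproduced in a few lines. This sketch uses statistics.NormalDist (Python 3.8 or later) and forms the distribution of the sum explicitly from the rule that means and variances of independent Gaussian variables add; the variable names are illustrative.

```python
import math
from statistics import NormalDist

train = NormalDist(mu=52.0, sigma=1.8)    # minutes on the train
bicycle = NormalDist(mu=8.0, sigma=2.6)   # minutes on the bicycle

# For independent Gaussian variables the means add and the variances add.
total = NormalDist(
    mu=train.mean + bicycle.mean,
    sigma=math.sqrt(train.variance + bicycle.variance),
)

p_over_65 = 1.0 - total.cdf(65.0)
print(f"T ~ N({total.mean:.0f}, {total.stdev:.2f}^2), Pr(T > 65) = {p_over_65:.3f}")
```

Note that it is the variances, not the standard deviations, that add: the combined standard deviation is √(1.8² + 2.6²), not 1.8 + 2.6.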
The above results may be extended. For example, if the independent random variables
Xi, i = 1, 2, . . . , n, are distributed as Xi ∼ N(µi, σi²) then the random variable
Z = Σi ci Xi (where the ci are constants) is distributed as Z ∼ N(Σi ci µi, Σi ci² σi²).
30.9.2 The log-normal distribution
If the random variable X follows a Gaussian distribution then the variable
Y = e^X is described by a log-normal distribution. Clearly, if X can take values
in the range −∞ to ∞, then Y will lie between 0 and ∞. The probability density
function for Y is found using the result (30.58). It is
\[ g(y) = f(x(y)) \left|\frac{dx}{dy}\right| = \frac{1}{\sigma\sqrt{2\pi}}\, \frac{1}{y} \exp\left[-\frac{(\ln y-\mu)^2}{2\sigma^2}\right]. \]
We note that µ and σ² are not the mean and variance of the log-normal
distribution, but rather the parameters of the corresponding Gaussian distribution
for X. The mean and variance of Y, however, can be found straightforwardly
using the MGF of X, which reads M_X(t) = E[e^(tX)] = exp(µt + ½σ²t²). Thus, the
mean of Y is given by
\[ E[Y] = E[e^X] = M_X(1) = \exp\left(\mu + \tfrac{1}{2}\sigma^2\right), \]
and the variance of Y reads
\begin{align*}
V[Y] = E[Y^2] - (E[Y])^2 &= E[e^{2X}] - (E[e^X])^2 \\
&= M_X(2) - [M_X(1)]^2 = \exp(2\mu + \sigma^2)\,[\exp(\sigma^2) - 1].
\end{align*}
In figure 30.15, we plot some examples of the log-normal distribution for various
values of the parameters µ and σ².
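The closed-form mean and variance of the log-normal distribution can be compared with a direct numerical integration of y g(y) and y² g(y). The sketch below uses a simple midpoint rule; the parameter values, the upper integration limit and the number of steps are arbitrary choices made for this illustration.

```python
import math


def lognormal_pdf(y: float, mu: float, sigma: float) -> float:
    """PDF g(y) of Y = exp(X) with X ~ N(mu, sigma^2), valid for y > 0."""
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi) * y)


def midpoint_integral(func, lower: float, upper: float, steps: int) -> float:
    """Midpoint-rule estimate of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


mu, sigma = 0.0, 0.5  # parameters of the underlying Gaussian (illustrative)

mean_formula = math.exp(mu + 0.5 * sigma**2)
var_formula = math.exp(2.0 * mu + sigma**2) * (math.exp(sigma**2) - 1.0)

# The tail beyond y = 40 is negligible for these parameters.
upper, steps = 40.0, 40_000
mean_numeric = midpoint_integral(lambda y: y * lognormal_pdf(y, mu, sigma), 0.0, upper, steps)
second_numeric = midpoint_integral(lambda y: y * y * lognormal_pdf(y, mu, sigma), 0.0, upper, steps)
var_numeric = second_numeric - mean_numeric**2

print(f"mean: formula {mean_formula:.4f}, numeric {mean_numeric:.4f}")
print(f"variance: formula {var_formula:.4f}, numeric {var_numeric:.4f}")
```

For larger σ the tail of g(y) becomes much heavier, so the upper limit would need to be increased accordingly.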
30.9.3 The exponential and gamma distributions
The exponential distribution with positive parameter λ is given by
\[ f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x > 0, \\ 0 & \text{for } x \le 0 \end{cases} \tag{30.116} \]
and satisfies ∫ f(x) dx = 1, integrated from −∞ to ∞, as required. The exponential distribution occurs naturally if we consider the distribution of the length of intervals between successive
events in a Poisson process or, equivalently, the distribution of the interval (i.e.
the waiting time) before the first event. If the average number of events per unit
interval is λ then on average there are λx events in interval x, so that from the
Poisson distribution the probability that there will be no events in this interval is
given by
\[ \Pr(\text{no events in interval } x) = e^{-\lambda x}. \]
[Figure 30.15 The PDF g(y) for the log-normal distribution for various values of the parameters µ and σ.]
The probability that an event occurs in the next infinitesimal interval [x, x + dx]
is given by λ dx, so that
\[ \Pr(\text{the first event occurs in interval } [x, x+dx]) = e^{-\lambda x}\, \lambda\, dx. \]
Hence the required probability density function is given by
\[ f(x) = \lambda e^{-\lambda x}. \]
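The argument above can be mimicked by dividing the interval x into many small steps of length dx, in each of which an event occurs with probability λ dx. The probability of no event in any step is then (1 − λ dx)^(x/dx), which should approach e^(−λx) as dx → 0. The following sketch, with arbitrary illustrative values of λ and x, shows this convergence.

```python
import math

lam = 2.0  # mean number of events per unit interval (illustrative)
x = 1.5    # length of the waiting interval (illustrative)

limit_value = math.exp(-lam * x)  # Pr(no events in interval x)

errors = []
for dx in (1e-2, 1e-3, 1e-4):
    n_steps = round(x / dx)
    # No event in any of the n_steps small sub-intervals.
    p_no_event = (1.0 - lam * dx) ** n_steps
    # Density of the first event at x: no event so far, then one in [x, x + dx].
    density = p_no_event * lam
    errors.append(abs(p_no_event - limit_value))
    print(f"dx = {dx:g}: Pr(no event) = {p_no_event:.6f}, f(x) ~ {density:.6f}")

print(f"limit: exp(-lam*x) = {limit_value:.6f}, lam*exp(-lam*x) = {lam * limit_value:.6f}")
```

The discrete waiting time used here is geometrically distributed, and the exponential distribution emerges as its continuous limit.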
The expectation and variance of the exponential distribution can be evaluated as
1/λ and (1/λ)² respectively. The MGF is given by
\[ M(t) = \frac{\lambda}{\lambda-t}. \tag{30.117} \]
We may generalise the above discussion to obtain the PDF for the interval
between every rth event in a Poisson process or, equivalently, the interval (waiting
time) before the rth event. We begin by using the Poisson distribution to give
\[ \Pr(r-1 \text{ events occur in interval } x) = e^{-\lambda x}\, \frac{(\lambda x)^{r-1}}{(r-1)!}, \]
from which we obtain
\[ \Pr(r\text{th event occurs in the interval } [x, x+dx]) = e^{-\lambda x}\, \frac{(\lambda x)^{r-1}}{(r-1)!}\, \lambda\, dx. \]
Thus the required PDF is
\[ f(x) = \frac{\lambda}{(r-1)!}\, (\lambda x)^{r-1} e^{-\lambda x}, \tag{30.118} \]
which is known as the gamma distribution of order r with parameter λ. Although
our derivation applies only when r is a positive integer, the gamma distribution is
defined for all positive r by replacing (r − 1)! by Γ(r) in (30.118); see the appendix
for a discussion of the gamma function Γ(x). If a random variable X is described
by a gamma distribution of order r with parameter λ, we write X ∼ γ(λ, r);
we note that the exponential distribution is the special case γ(λ, 1). The gamma
distribution γ(λ, r) is plotted in figure 30.16 for λ = 1 and r = 1, 2, 5, 10. For
large r, the gamma distribution tends to the Gaussian distribution whose mean
and variance are specified by (30.120) below.
[Figure 30.16 The PDF f(x) for the gamma distributions γ(λ, r) with λ = 1 and r = 1, 2, 5, 10.]
The MGF for the gamma distribution is obtained from that for the exponential
distribution, by noting that we may consider the interval between every rth event
in a Poisson process as the sum of r intervals between successive events. Thus the
rth-order gamma variate is the sum of r independent exponentially distributed
random variables. From (30.117) and (30.90), the MGF of the gamma distribution
is therefore given by
\[ M(t) = \left(\frac{\lambda}{\lambda-t}\right)^r, \tag{30.119} \]
from which the mean and variance are found to be
\[ E[X] = \frac{r}{\lambda}, \qquad V[X] = \frac{r}{\lambda^2}. \tag{30.120} \]
We may also use the above MGF to prove another useful theorem regarding
multiple gamma distributions. If Xi ∼ γ(λ, ri), i = 1, 2, . . . , n, are independent
gamma variates then the random variable Y = X1 + X2 + · · · + Xn has MGF
\[ M(t) = \prod_{i=1}^{n} \left(\frac{\lambda}{\lambda-t}\right)^{r_i} = \left(\frac{\lambda}{\lambda-t}\right)^{r_1+r_2+\cdots+r_n}. \tag{30.121} \]
Thus Y is also a gamma variate, distributed as Y ∼ γ(λ, r1 + r2 + · · · + rn).
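The gamma PDF with (r − 1)! replaced by Γ(r), and the mean and variance in (30.120), can be checked numerically. The sketch below uses math.gamma and a simple midpoint rule; the parameter values and integration settings are illustrative assumptions.

```python
import math


def gamma_pdf(x: float, lam: float, r: float) -> float:
    """Gamma PDF of order r with parameter lam; zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    return lam * (lam * x) ** (r - 1.0) * math.exp(-lam * x) / math.gamma(r)


def midpoint_integral(func, lower: float, upper: float, steps: int) -> float:
    """Midpoint-rule estimate of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


lam, r = 1.0, 5.0
upper, steps = 80.0, 80_000  # the tail beyond x = 80 is negligible here

norm = midpoint_integral(lambda x: gamma_pdf(x, lam, r), 0.0, upper, steps)
mean = midpoint_integral(lambda x: x * gamma_pdf(x, lam, r), 0.0, upper, steps)
second = midpoint_integral(lambda x: x * x * gamma_pdf(x, lam, r), 0.0, upper, steps)
variance = second - mean**2

print(f"normalisation {norm:.4f}, mean {mean:.4f} (r/lam = {r / lam}), "
      f"variance {variance:.4f} (r/lam^2 = {r / lam**2})")

# Order r = 1 reduces to the exponential distribution lam * exp(-lam * x).
exponential_check = abs(gamma_pdf(0.7, 2.0, 1.0) - 2.0 * math.exp(-1.4))
```

Because math.gamma accepts non-integer arguments, the same function also covers the half-integer orders that arise for the chi-squared distribution.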
30.9.4 The chi-squared distribution
In subsection 30.6.2, we showed that if X is Gaussian distributed with mean µ and
variance σ², such that X ∼ N(µ, σ²), then the random variable Y = (X − µ)²/σ²
is distributed as the gamma distribution Y ∼ γ(½, ½). Let us now consider n
independent Gaussian random variables Xi ∼ N(µi, σi²), i = 1, 2, . . . , n, and define
the new variable
\[ \chi^2_n = \sum_{i=1}^{n} \frac{(X_i-\mu_i)^2}{\sigma_i^2}. \tag{30.122} \]
Using the result (30.121) for multiple gamma distributions, χ²n must be distributed
as the gamma variate χ²n ∼ γ(½, ½n), which from (30.118) has the PDF
\begin{align}
f(\chi^2_n) &= \frac{\tfrac{1}{2}}{\Gamma(\tfrac{1}{2}n)} \left(\tfrac{1}{2}\chi^2_n\right)^{(n/2)-1} \exp\left(-\tfrac{1}{2}\chi^2_n\right) \nonumber \\
&= \frac{1}{2^{n/2}\,\Gamma(\tfrac{1}{2}n)}\, (\chi^2_n)^{(n/2)-1} \exp\left(-\tfrac{1}{2}\chi^2_n\right). \tag{30.123}
\end{align}
This is known as the chi-squared distribution of order n and has numerous
applications in statistics (see chapter 31). Setting λ = ½ and r = ½n in (30.120),
we find that
\[ E[\chi^2_n] = n, \qquad V[\chi^2_n] = 2n. \]
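These two moments can be illustrated by simulation. The sketch below draws standardised Gaussian variables (Xi − µi)/σi directly as N(0, 1) samples, sums their squares, and compares the sample mean and variance with n and 2n. The seed, the order n and the sample size are arbitrary choices, and the agreement is only statistical.

```python
import random
import statistics


def chi_squared_sample(n: int, rng: random.Random) -> float:
    """One draw of the sum of n squared standard Gaussian variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


rng = random.Random(12345)  # fixed seed so that the run is repeatable
n = 4
n_samples = 20_000

samples = [chi_squared_sample(n, rng) for _ in range(n_samples)]
sample_mean = statistics.fmean(samples)
sample_var = statistics.variance(samples)

print(f"order n = {n}: sample mean {sample_mean:.3f} (expected {n}), "
      f"sample variance {sample_var:.3f} (expected {2 * n})")
```

With 20 000 samples the statistical scatter of the sample mean is of order 0.02 and that of the sample variance of order 0.1, so deviations of that size are not significant.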
An important generalisation occurs when the n Gaussian variables Xi are not
linearly independent but are instead required to satisfy a linear constraint of the
form
\[ c_1 X_1 + c_2 X_2 + \cdots + c_n X_n = 0, \tag{30.124} \]
in which the constants ci are not all zero. In this case, it may be shown (see
exercise 30.40) that the variable χ²n defined in (30.122) is still described by a chi-squared distribution, but one of order n − 1. Indeed, this result may be trivially
extended to show that if the n Gaussian variables Xi satisfy m linear constraints
of the form (30.124) then the variable χ²n defined in (30.122) is described by a
chi-squared distribution of order n − m.
30.9.5 The Cauchy and Breit–Wigner distributions
A random variable X (in the range −∞ to ∞) that obeys the Cauchy distribution
is described by the PDF
\[ f(x) = \frac{1}{\pi}\, \frac{1}{1+x^2}. \]
This is a special case of the Breit–Wigner distribution
\[ f(x) = \frac{1}{\pi}\, \frac{\tfrac{1}{2}\Gamma}{\tfrac{1}{4}\Gamma^2 + (x-x_0)^2}, \]
which is encountered in the study of nuclear and particle physics. In figure 30.17,
we plot some examples of the Breit–Wigner distribution for several values of the
parameters x0 and Γ.
[Figure 30.17 The PDF f(x) for the Breit–Wigner distribution for different values of the parameters x0 and Γ (curves for x0 = 0, Γ = 1; x0 = 2, Γ = 1; and x0 = 0, Γ = 3).]
We see from the figure that the peak (or mode) of the distribution occurs
at x = x0. It is also straightforward to show that the parameter Γ is equal to
the width of the peak at half the maximum height. Although the Breit–Wigner
distribution is symmetric about its peak, it does not formally possess a mean since
the integrals ∫ x f(x) dx taken from −∞ to 0 and from 0 to ∞ both diverge. Similar divergences occur
for all higher moments of the distribution.
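Both the half-height width and the Cauchy special case can be confirmed with a few evaluations of the PDF. In this sketch the parameter values are illustrative; the Cauchy form corresponds to x0 = 0 and Γ = 2 in the Breit–Wigner expression above.

```python
import math


def breit_wigner_pdf(x: float, x0: float, gamma: float) -> float:
    """Breit-Wigner PDF with peak position x0 and width parameter gamma."""
    half_width = 0.5 * gamma
    return (half_width / math.pi) / (half_width**2 + (x - x0) ** 2)


def cauchy_pdf(x: float) -> float:
    """Cauchy PDF, f(x) = 1 / (pi * (1 + x^2))."""
    return 1.0 / (math.pi * (1.0 + x * x))


x0, gamma = 2.0, 3.0  # illustrative parameters

peak = breit_wigner_pdf(x0, x0, gamma)
at_half_width = breit_wigner_pdf(x0 + 0.5 * gamma, x0, gamma)
print(f"peak height {peak:.4f}, height at x0 + gamma/2 = {at_half_width:.4f}")

# The Cauchy distribution is recovered for x0 = 0 and gamma = 2.
cauchy_difference = abs(breit_wigner_pdf(0.7, 0.0, 2.0) - cauchy_pdf(0.7))
print(f"difference from Cauchy form at x = 0.7: {cauchy_difference:.2e}")
```

Since the curve falls to half its peak height at x0 ± Γ/2, the full width at half maximum is Γ.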
30.9.6 The uniform distribution
Finally we mention the very simple, but common, uniform distribution, which
describes a continuous random variable that has a constant PDF over its allowed
range of values. If the limits on X are a and b then
\[ f(x) = \begin{cases} 1/(b-a) & \text{for } a \le x \le b, \\ 0 & \text{otherwise.} \end{cases} \]
The MGF of the uniform distribution is found to be
\[ M(t) = \frac{e^{bt} - e^{at}}{(b-a)t}, \]