Important discrete distributions

by taratuta

on 20 January 2017

Category: Documents

PROBABILITY
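| Distribution | Probability law $f(x)$ | MGF | $E[X]$ | $V[X]$ |
|---|---|---|---|---|
| binomial | ${}^{n}C_x\, p^x q^{n-x}$ | $(pe^t + q)^n$ | $np$ | $npq$ |
| negative binomial | ${}^{r+x-1}C_x\, p^r q^x$ | $\left(\dfrac{p}{1 - qe^t}\right)^{r}$ | $\dfrac{rq}{p}$ | $\dfrac{rq}{p^2}$ |
| geometric | $q^{x-1} p$ | $\dfrac{pe^t}{1 - qe^t}$ | $\dfrac{1}{p}$ | $\dfrac{q}{p^2}$ |
| hypergeometric | $\dfrac{(Np)!\,(Nq)!\,n!\,(N-n)!}{x!\,(Np-x)!\,(n-x)!\,(Nq-n+x)!\,N!}$ | | $np$ | $\dfrac{N-n}{N-1}\,npq$ |
| Poisson | $\dfrac{\lambda^x}{x!}\,e^{-\lambda}$ | $e^{\lambda(e^t - 1)}$ | $\lambda$ | $\lambda$ |

Table 30.1 Some important discrete probability distributions.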
30.8 Important discrete distributions
Having discussed some general properties of distributions, we now consider the
more important discrete distributions encountered in physical applications. These
are discussed in detail below, and summarised for convenience in table 30.1; we
refer the reader to the relevant section below for an explanation of the symbols
used.
30.8.1 The binomial distribution
Perhaps the most important discrete probability distribution is the binomial distribution. This distribution describes processes that consist of a number of independent identical trials with two possible outcomes, A and B = Ā. We may call
these outcomes ‘success’ and ‘failure’ respectively. If the probability of a success
is Pr(A) = p then the probability of a failure is Pr(B) = q = 1 − p. If we perform
n trials then the discrete random variable
X = number of times A occurs
can take the values 0, 1, 2, . . . , n; its distribution amongst these values is described
by the binomial distribution.
We now calculate the probability that in n trials we obtain x successes (and so
n − x failures). One way of obtaining such a result is to have x successes followed
by n−x failures. Since the trials are assumed independent, the probability of this is
$$\underbrace{pp \cdots p}_{x\ \text{times}} \times \underbrace{qq \cdots q}_{n-x\ \text{times}} = p^x q^{n-x}.$$
This is, however, just one permutation of x successes and n − x failures. The total
number of permutations of n objects, of which x are identical and of type 1 and
n − x are identical and of type 2, is given by (30.33) as
$$\frac{n!}{x!\,(n-x)!} \equiv {}^{n}C_x.$$
Therefore, the total probability of obtaining x successes from n trials is
$$f(x) = \Pr(X = x) = {}^{n}C_x\, p^x q^{n-x} = {}^{n}C_x\, p^x (1-p)^{n-x}, \tag{30.94}$$
which is the binomial probability distribution formula. When a random variable
X follows the binomial distribution for n trials, with a probability of success p,
we write X ∼ Bin(n, p). Then the random variable X is often referred to as a
binomial variate. Some typical binomial distributions are shown in figure 30.11.
Figure 30.11 Some typical binomial distributions with various combinations
of parameters n and p (four panels of f(x) against x: n = 5, p = 0.6; n = 5,
p = 0.167; n = 10, p = 0.6; n = 10, p = 0.167).
If a single six-sided die is rolled ﬁve times, what is the probability that a six is thrown
exactly three times?
Here the number of ‘trials’ n = 5, and we are interested in the random variable
X = number of sixes thrown.
Since the probability of a 'success' is $p = \tfrac{1}{6}$, the probability of obtaining exactly three sixes
in five throws is given by (30.94) as
$$\Pr(X = 3) = \frac{5!}{3!\,(5-3)!}\left(\frac{1}{6}\right)^{3}\left(\frac{5}{6}\right)^{5-3} = 0.032.$$
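The die calculation can be checked numerically. The following is a minimal Python sketch of formula (30.94); the function name `binomial_pmf` is an illustrative choice and not part of the text.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Pr(X = x) for X ~ Bin(n, p), following (30.94)."""
    if not 0 <= x <= n:
        return 0.0  # outside the allowed values 0, 1, ..., n
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)

# Die example: five throws, probability of exactly three sixes.
prob_three_sixes = binomial_pmf(3, 5, 1 / 6)
print(round(prob_three_sixes, 3))  # 0.032
```

The exact value is 250/7776, which rounds to the 0.032 quoted in the example.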
For evaluating binomial probabilities a useful result is the binomial recurrence
formula
$$\Pr(X = x+1) = \frac{p}{q}\left(\frac{n-x}{x+1}\right)\Pr(X = x), \tag{30.95}$$
which enables successive probabilities Pr(X = x + k), k = 1, 2, . . . , to be calculated
once Pr(X = x) is known; it is often quicker to use than (30.94).
The random variable X is distributed as $X \sim \mathrm{Bin}(3, \tfrac{1}{2})$. Evaluate the probability function
f(x) using the binomial recurrence formula.
The probability Pr(X = 0) may be calculated using (30.94) and is
$$\Pr(X = 0) = {}^{3}C_0 \left(\tfrac{1}{2}\right)^{0}\left(\tfrac{1}{2}\right)^{3} = \tfrac{1}{8}.$$
The ratio $p/q = \tfrac{1}{2}/\tfrac{1}{2} = 1$ in this case and so, using the binomial recurrence formula
(30.95), we find
$$\Pr(X = 1) = 1 \times \frac{3-0}{0+1} \times \frac{1}{8} = \frac{3}{8},$$
$$\Pr(X = 2) = 1 \times \frac{3-1}{1+1} \times \frac{3}{8} = \frac{3}{8},$$
$$\Pr(X = 3) = 1 \times \frac{3-2}{2+1} \times \frac{3}{8} = \frac{1}{8},$$
results which may be verified by direct application of (30.94).
We note that, as required, the binomial distribution satisfies
$$\sum_{x=0}^{n} f(x) = \sum_{x=0}^{n} {}^{n}C_x\, p^x q^{n-x} = (p+q)^n = 1.$$
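The recurrence (30.95) translates directly into a loop. The sketch below is illustrative only: the function name is hypothetical, and exact rational arithmetic (`fractions.Fraction`) is used so that the results can be compared with the fractions obtained above. It assumes p < 1, so that the ratio p/q is defined.

```python
from fractions import Fraction

def binomial_by_recurrence(n: int, p: Fraction) -> list[Fraction]:
    """All of Pr(X = 0), ..., Pr(X = n) for X ~ Bin(n, p), built with (30.95)."""
    q = 1 - p
    probs = [q**n]  # starting value Pr(X = 0) = q^n, from (30.94)
    for x in range(n):
        # Pr(X = x + 1) = (p/q) * (n - x)/(x + 1) * Pr(X = x)
        probs.append(probs[-1] * (p / q) * Fraction(n - x, x + 1))
    return probs

probs = binomial_by_recurrence(3, Fraction(1, 2))
print([str(pr) for pr in probs])  # ['1/8', '3/8', '3/8', '1/8']
```

Because the arithmetic is exact, the probabilities sum to precisely 1, in line with the normalisation condition above.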
Furthermore, from the deﬁnitions of E[X] and V [X] for a discrete distribution,
we may show that for the binomial distribution E[X] = np and V [X] = npq. The
direct summations involved are, however, rather cumbersome and these results
are obtained much more simply using the moment generating function.
The moment generating function for the binomial distribution
To ﬁnd the MGF for the binomial distribution we consider the binomial random
variable X to be the sum of the random variables Xi , i = 1, 2, . . . , n, which are
deﬁned by
$$X_i = \begin{cases} 1 & \text{if a 'success' occurs on the $i$th trial,} \\ 0 & \text{if a 'failure' occurs on the $i$th trial.} \end{cases}$$
Thus
$$\begin{aligned} M_i(t) = E\left[e^{tX_i}\right] &= e^{0t} \times \Pr(X_i = 0) + e^{1t} \times \Pr(X_i = 1) \\ &= 1 \times q + e^t \times p \\ &= pe^t + q. \end{aligned}$$
From (30.89), it follows that the MGF for the binomial distribution is given by
$$M(t) = \prod_{i=1}^{n} M_i(t) = (pe^t + q)^n. \tag{30.96}$$
We can now use the moment generating function to derive the mean and
variance of the binomial distribution. From (30.96)
$$M'(t) = npe^t (pe^t + q)^{n-1},$$
and from (30.86)
$$E[X] = M'(0) = np(p + q)^{n-1} = np,$$
where the last equality follows from p + q = 1.
Differentiating with respect to t once more gives
$$M''(t) = e^{2t}(n-1)np^2 (pe^t + q)^{n-2} + e^t np(pe^t + q)^{n-1},$$
and from (30.86)
$$E[X^2] = M''(0) = n^2p^2 - np^2 + np.$$
Thus, using (30.87)
$$V[X] = M''(0) - \left[M'(0)\right]^2 = n^2p^2 - np^2 + np - n^2p^2 = np(1-p) = npq.$$
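These results can be spot-checked without any algebra by differentiating the MGF numerically at t = 0. The sketch below uses central finite differences; the step size `h`, the function names and the parameter values n = 10, p = 0.3 are all arbitrary illustrative choices.

```python
from math import exp

def binomial_mgf(t: float, n: int, p: float) -> float:
    """MGF (30.96) of the binomial distribution."""
    return (p * exp(t) + (1.0 - p)) ** n

def moments_from_mgf(mgf, h: float = 1e-4) -> tuple[float, float]:
    """Approximate (mean, variance) from central differences of the MGF at t = 0."""
    first = (mgf(h) - mgf(-h)) / (2 * h)               # approximates M'(0) = E[X]
    second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2  # approximates M''(0) = E[X^2]
    return first, second - first**2

n, p = 10, 0.3
mean, var = moments_from_mgf(lambda t: binomial_mgf(t, n, p))
print(mean, var)  # close to np = 3 and npq = 2.1
```

The agreement is limited by the finite-difference error, so the values match np and npq only to several decimal places.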
Multiple binomial distributions
Suppose X and Y are two independent random variables, both of which are
described by binomial distributions with a common probability of success p, but
with (in general) diﬀerent numbers of trials n1 and n2 , so that X ∼ Bin(n1 , p)
and Y ∼ Bin(n2 , p). Now consider the random variable Z = X + Y . We could
calculate the probability distribution of Z directly using (30.60), but it is much
easier to use the MGF (30.96).
Since X and Y are independent random variables, the MGF MZ (t) of the new
variable Z = X + Y is given simply by the product of the individual MGFs
MX (t) and MY (t). Thus, we obtain
$$M_Z(t) = M_X(t)M_Y(t) = (pe^t + q)^{n_1}(pe^t + q)^{n_2} = (pe^t + q)^{n_1+n_2},$$
which we recognise as the MGF of Z ∼ Bin(n1 + n2 , p). Hence Z is also described
by a binomial distribution.
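The same conclusion can be checked by brute force: summing over all ways of splitting a total z between X and Y (a discrete convolution) should reproduce the Bin(n1 + n2, p) probabilities. The following is a small sketch with arbitrarily chosen parameters; the helper names are hypothetical.

```python
from math import comb

def binomial_pmf_list(n: int, p: float) -> list[float]:
    """Pr(X = x) for x = 0, ..., n with X ~ Bin(n, p)."""
    return [comb(n, x) * p**x * (1.0 - p) ** (n - x) for x in range(n + 1)]

def convolve(f: list[float], g: list[float]) -> list[float]:
    """Distribution of X + Y for independent X and Y with pmfs f and g."""
    h = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] += fi * gj
    return h

n1, n2, p = 4, 7, 0.3
z_direct = convolve(binomial_pmf_list(n1, p), binomial_pmf_list(n2, p))
z_expected = binomial_pmf_list(n1 + n2, p)
max_gap = max(abs(a - b) for a, b in zip(z_direct, z_expected))
print(max_gap < 1e-12)  # True
```

The check relies on both variables sharing the same p; with different success probabilities the product of MGFs no longer collapses to a single binomial form.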
This result may be extended to any number of binomial distributions. If $X_i$,
$i = 1, 2, \ldots, N$, is distributed as $X_i \sim \mathrm{Bin}(n_i, p)$ then $Z = X_1 + X_2 + \cdots + X_N$ is
distributed as $Z \sim \mathrm{Bin}(n_1 + n_2 + \cdots + n_N, p)$, as would be expected since the result
of $\sum_i n_i$ trials cannot depend on how they are split up. A similar proof is also
possible using either the probability or cumulant generating functions.
Unfortunately, no equivalent simple result exists for the probability distribution
of the diﬀerence Z = X − Y of two binomially distributed variables.
30.8.2 The geometric and negative binomial distributions
A special case of the binomial distribution occurs when instead of the number of
successes we consider the discrete random variable
X = number of trials required to obtain the ﬁrst success.
The probability that x trials are required in order to obtain the ﬁrst success, is
simply the probability of obtaining x − 1 failures followed by one success. If the
probability of a success on each trial is p, then for x > 0
$$f(x) = \Pr(X = x) = (1-p)^{x-1} p = q^{x-1} p,$$
where q = 1 − p. This distribution is sometimes called the geometric distribution.
The probability generating function for this distribution is given in (30.78). By
replacing t by et in (30.78) we immediately obtain the MGF of the geometric
distribution
$$M(t) = \frac{pe^t}{1 - qe^t},$$
from which its mean and variance are found to be
$$E[X] = \frac{1}{p}, \qquad V[X] = \frac{q}{p^2}.$$
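As a rough numerical check of these two results, the infinite sums defining E[X] and V[X] can be truncated at a large cut-off. In the sketch below the cut-off `x_max = 2000` and the value p = 0.25 are arbitrary assumptions; the neglected tail is of order q^2000 and is negligible.

```python
def geometric_pmf(x: int, p: float) -> float:
    """Pr(X = x) = q^(x-1) p for x = 1, 2, ... (trials up to the first success)."""
    return (1.0 - p) ** (x - 1) * p if x >= 1 else 0.0

def truncated_moments(p: float, x_max: int = 2000) -> tuple[float, float]:
    """Mean and variance from sums truncated at x_max."""
    mean = sum(x * geometric_pmf(x, p) for x in range(1, x_max + 1))
    second = sum(x * x * geometric_pmf(x, p) for x in range(1, x_max + 1))
    return mean, second - mean**2

p = 0.25
mean, var = truncated_moments(p)
print(mean, var)  # close to 1/p = 4 and q/p**2 = 12
```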
Another distribution closely related to the binomial is the negative binomial
distribution. This describes the probability distribution of the random variable
X = number of failures before the rth success.
One way of obtaining x failures before the rth success is to have r − 1 successes
followed by x failures followed by the rth success, for which the probability is
$$\underbrace{pp \cdots p}_{r-1\ \text{times}} \times \underbrace{qq \cdots q}_{x\ \text{times}} \times\, p = p^r q^x.$$
However, the first r + x − 1 factors constitute just one permutation of r − 1
successes and x failures. The total number of permutations of these r + x − 1
objects, of which r − 1 are identical and of type 1 and x are identical and of type
2, is ${}^{r+x-1}C_x$. Therefore, the total probability of obtaining x failures before the
rth success is
$$f(x) = \Pr(X = x) = {}^{r+x-1}C_x\, p^r q^x,$$
which is called the negative binomial distribution (see the related discussion on
p. 1137). It is straightforward to show that the MGF of this distribution is
$$M(t) = \left(\frac{p}{1 - qe^t}\right)^{r},$$
and that its mean and variance are given by
$$E[X] = \frac{rq}{p} \quad \text{and} \quad V[X] = \frac{rq}{p^2}.$$
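The negative binomial formula and its moments can be checked in the same spirit. This sketch sums the distribution up to an arbitrary cut-off `x_max`; the parameter values r = 3 and p = 0.4 are illustrative assumptions.

```python
from math import comb

def negative_binomial_pmf(x: int, r: int, p: float) -> float:
    """Pr(X = x): probability of x failures before the r-th success."""
    if x < 0:
        return 0.0
    return comb(r + x - 1, x) * p**r * (1.0 - p) ** x

r, p, x_max = 3, 0.4, 500  # x_max truncates the infinite sum over x
probs = [negative_binomial_pmf(x, r, p) for x in range(x_max + 1)]
total = sum(probs)
mean = sum(x * pr for x, pr in enumerate(probs))
var = sum(x * x * pr for x, pr in enumerate(probs)) - mean**2
print(total, mean, var)  # close to 1, rq/p = 4.5 and rq/p**2 = 11.25
```

Note that this variable counts failures, whereas the geometric variable above counts trials; with r = 1 the two differ by exactly one.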
30.8.3 The hypergeometric distribution
In subsection 30.8.1 we saw that the probability of obtaining x successes in n
independent trials was given by the binomial distribution. Suppose that these n
‘trials’ actually consist of drawing at random n balls, from a set of N such balls
of which M are red and the rest white. Let us consider the random variable
X = number of red balls drawn.
On the one hand, if the balls are drawn with replacement then the trials are
independent and the probability of drawing a red ball is p = M/N each time.
Therefore, the probability of drawing x red balls in n trials is given by the
binomial distribution as
$$\Pr(X = x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}.$$
On the other hand, if the balls are drawn without replacement the trials are not
independent and the probability of drawing a red ball depends on how many red
balls have already been drawn. We can, however, still derive a general formula
for the probability of drawing x red balls in n trials, as follows.
The number of ways of drawing x red balls from M is ${}^{M}C_x$, and the number
of ways of drawing n − x white balls from N − M is ${}^{N-M}C_{n-x}$. Therefore, the
total number of ways to obtain x red balls in n trials is ${}^{M}C_x\, {}^{N-M}C_{n-x}$. However,
the total number of ways of drawing n objects from N is simply ${}^{N}C_n$. Hence the
probability of obtaining x red balls in n trials is
$$\Pr(X = x) = \frac{{}^{M}C_x\, {}^{N-M}C_{n-x}}{{}^{N}C_n} = \frac{M!}{x!\,(M-x)!}\, \frac{(N-M)!}{(n-x)!\,(N-M-n+x)!}\, \frac{n!\,(N-n)!}{N!}, \tag{30.97}$$
$$\qquad\qquad = \frac{(Np)!\,(Nq)!\,n!\,(N-n)!}{x!\,(Np-x)!\,(n-x)!\,(Nq-n+x)!\,N!}, \tag{30.98}$$
where in the last line p = M/N and q = 1 − p. This is called the hypergeometric
distribution.
By performing the relevant summations directly, it may be shown that the
hypergeometric distribution has mean
$$E[X] = n\frac{M}{N} = np$$
and variance
$$V[X] = \frac{nM(N-M)(N-n)}{N^2(N-1)} = \frac{N-n}{N-1}\, npq.$$
In the UK National Lottery each participant chooses six diﬀerent numbers between 1
and 49. In each weekly draw six numbered winning balls are subsequently drawn. Find the
probabilities that a participant chooses 0, 1, 2, 3, 4, 5, 6 winning numbers correctly.
The probabilities are given by a hypergeometric distribution with N (the total number of
balls) = 49, M (the number of winning balls drawn) = 6, and n (the number of numbers
chosen by each participant) = 6. Thus, substituting in (30.97), we ﬁnd
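$$\Pr(0) = \frac{{}^{6}C_0\, {}^{43}C_6}{{}^{49}C_6} = \frac{1}{2.29}, \qquad \Pr(1) = \frac{{}^{6}C_1\, {}^{43}C_5}{{}^{49}C_6} = \frac{1}{2.42},$$
$$\Pr(2) = \frac{{}^{6}C_2\, {}^{43}C_4}{{}^{49}C_6} = \frac{1}{7.55}, \qquad \Pr(3) = \frac{{}^{6}C_3\, {}^{43}C_3}{{}^{49}C_6} = \frac{1}{56.6},$$
$$\Pr(4) = \frac{{}^{6}C_4\, {}^{43}C_2}{{}^{49}C_6} = \frac{1}{1032}, \qquad \Pr(5) = \frac{{}^{6}C_5\, {}^{43}C_1}{{}^{49}C_6} = \frac{1}{54\,200},$$
$$\Pr(6) = \frac{{}^{6}C_6\, {}^{43}C_0}{{}^{49}C_6} = \frac{1}{13.98 \times 10^{6}}.$$
It can easily be seen that
$$\sum_{i=0}^{6} \Pr(i) = 0.44 + 0.41 + 0.13 + 0.02 + O(10^{-3}) = 1,$$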
as expected. Note that if the number of trials (balls drawn) is small compared with N, M
and N − M then not replacing the balls is of little consequence, and we may
approximate the hypergeometric distribution by the binomial distribution (with
p = M/N); this is much easier to evaluate.
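The lottery probabilities are easy to reproduce from (30.97). In the sketch below the function name and argument order are illustrative choices; N, M and n have the meanings used in the example.

```python
from math import comb

def hypergeometric_pmf(x: int, N: int, M: int, n: int) -> float:
    """Pr(X = x): x red balls in n draws without replacement, M red among N."""
    if x < 0 or x > n or x > M or n - x > N - M:
        return 0.0  # impossible outcomes
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# UK National Lottery example: N = 49 balls, M = 6 winning balls, n = 6 chosen.
lottery = [hypergeometric_pmf(x, 49, 6, 6) for x in range(7)]
for x, prob in enumerate(lottery):
    print(x, f"1 in {1 / prob:.4g}")
```

The printed odds can be compared with the reciprocals quoted in the example, and the list sums to unity to within rounding error.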
30.8.4 The Poisson distribution
We have seen that the binomial distribution describes the number of successful
outcomes in a certain number of trials n. The Poisson distribution also describes
the probability of obtaining a given number of successes but for situations
in which the number of ‘trials’ cannot be enumerated; rather it describes the
situation in which discrete events occur in a continuum. Typical examples of
discrete random variables X described by a Poisson distribution are the number
of telephone calls received by a switchboard in a given interval, or the number
of stars above a certain brightness in a particular area of the sky. Given a mean
rate of occurrence λ of these events in the relevant interval or area, the Poisson
distribution gives the probability Pr(X = x) that exactly x events will occur.
We may derive the form of the Poisson distribution as the limit of the binomial
distribution when the number of trials n → ∞ and the probability of ‘success’
p → 0, in such a way that np = λ remains ﬁnite. Thus, in our example of a
telephone switchboard, suppose we wish to ﬁnd the probability that exactly x
calls are received during some time interval, given that the mean number of calls
in such an interval is λ. Let us begin by dividing the time interval into a large
number, n, of equal shorter intervals, in each of which the probability of receiving
a call is p. As we let n → ∞ then p → 0, but since we require the mean number
of calls in the interval to equal λ, we must have np = λ. The probability of x
successes in n trials is given by the binomial formula as
$$\Pr(X = x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}. \tag{30.99}$$
Now as n → ∞, with x ﬁnite, the ratio of the n-dependent factorials in (30.99)
behaves asymptotically as a power of n, i.e.
$$\lim_{n \to \infty} \frac{n!}{(n-x)!} = \lim_{n \to \infty} n(n-1)(n-2) \cdots (n-x+1) \sim n^x.$$
Also
$$\lim_{n \to \infty} \lim_{p \to 0} (1-p)^{n-x} = \lim_{p \to 0} \frac{(1-p)^{\lambda/p}}{(1-p)^x} = \frac{e^{-\lambda}}{1}.$$
Thus, using λ = np, (30.99) tends to the Poisson distribution
$$f(x) = \Pr(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \tag{30.100}$$
which gives the probability of obtaining exactly x calls in the given time interval.
As we shall show below, λ is the mean of the distribution. Events following a
Poisson distribution are usually said to occur randomly in time.
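The limiting process can be watched numerically: holding np = λ fixed and increasing n, the binomial probability (30.99) should approach the Poisson value (30.100). The sketch below uses λ = 2 and x = 3 purely as illustrative values.

```python
from math import comb, exp, factorial

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability (30.99)."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability (30.100)."""
    return exp(-lam) * lam**x / factorial(x)

lam, x = 2.0, 3
gaps = []
for n in (10, 100, 1000, 10000):
    # keep the mean fixed by setting p = lam / n
    gap = abs(binomial_pmf(x, n, lam / n) - poisson_pmf(x, lam))
    gaps.append(gap)
    print(f"n = {n:5d}  |binomial - Poisson| = {gap:.2e}")
```

For these values the discrepancy shrinks roughly in proportion to 1/n.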
Alternatively we may derive the Poisson distribution directly, without considering a limit of the binomial distribution. Let us again consider our example
of a telephone switchboard. Suppose that the probability that x calls have been
received in a time interval t is Px (t). If the average number of calls received in a
unit time is λ then in a further small time interval ∆t the probability of receiving
a call is λ∆t, provided ∆t is short enough that the probability of receiving two or
more calls in this small interval is negligible. Similarly the probability of receiving
no call during the same small interval is simply 1 − λ∆t.
Thus, for x > 0, the probability of receiving exactly x calls in the total interval
t + ∆t is given by
$$P_x(t + \Delta t) = P_x(t)(1 - \lambda \Delta t) + P_{x-1}(t)\, \lambda \Delta t.$$
Rearranging the equation, dividing through by ∆t and letting ∆t → 0, we obtain
the diﬀerential recurrence equation
$$\frac{dP_x(t)}{dt} = \lambda P_{x-1}(t) - \lambda P_x(t). \tag{30.101}$$
For x = 0 (i.e. no calls received), however, (30.101) simplifies to
$$\frac{dP_0(t)}{dt} = -\lambda P_0(t),$$
which may be integrated to give $P_0(t) = P_0(0)e^{-\lambda t}$. But since the probability $P_0(0)$
of receiving no calls in a zero time interval must equal unity, we have $P_0(t) = e^{-\lambda t}$.
This expression for $P_0(t)$ may then be substituted back into (30.101) with x = 1
to obtain a differential equation for $P_1(t)$ that has the solution $P_1(t) = \lambda t e^{-\lambda t}$.
We may repeat this process to obtain expressions for $P_2(t), P_3(t), \ldots, P_x(t)$, and we
find
$$P_x(t) = \frac{(\lambda t)^x}{x!}\, e^{-\lambda t}. \tag{30.102}$$
By setting t = 1 in (30.102), we again obtain the Poisson distribution (30.100) for
obtaining exactly x calls in a unit time interval.
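The step from $P_0(t)$ to $P_1(t)$ is only quoted in the derivation. One way to fill it in is the integrating-factor calculation sketched below, which assumes the natural initial condition $P_1(0) = 0$ (no call can have arrived in a zero time interval); this condition is not stated explicitly in the text.

```latex
\begin{align*}
\frac{dP_1}{dt} + \lambda P_1 &= \lambda P_0 = \lambda e^{-\lambda t}
  && \text{(30.101) with } x = 1, \\
\frac{d}{dt}\left(e^{\lambda t} P_1\right) &= \lambda
  && \text{after multiplying through by } e^{\lambda t}, \\
e^{\lambda t} P_1(t) &= \lambda t + c, \quad c = 0
  && \text{since } P_1(0) = 0, \\
P_1(t) &= \lambda t\, e^{-\lambda t}.
\end{align*}
```

The same manipulation carries the general case: if $P_{x-1}(t) = (\lambda t)^{x-1} e^{-\lambda t}/(x-1)!$, then $\frac{d}{dt}\left(e^{\lambda t} P_x\right) = \lambda (\lambda t)^{x-1}/(x-1)!$, which integrates, with $P_x(0) = 0$, to $e^{\lambda t} P_x(t) = (\lambda t)^x / x!$.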
If a discrete random variable is described by a Poisson distribution of mean λ
then we write X ∼ Po(λ). As it must be, the sum of the probabilities is unity:
$$\sum_{x=0}^{\infty} \Pr(X = x) = e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^x}{x!} = e^{-\lambda} e^{\lambda} = 1.$$
From (30.100) we may also derive the Poisson recurrence formula,
$$\Pr(X = x+1) = \frac{\lambda}{x+1} \Pr(X = x) \qquad \text{for } x = 0, 1, 2, \ldots, \tag{30.103}$$
which enables successive probabilities to be calculated easily once one is known.
A person receives on average one e-mail message per half-hour interval. Assuming that
the e-mails are received randomly in time, ﬁnd the probabilities that in any particular hour
0, 1, 2, 3, 4, 5 messages are received.
Let X = number of e-mails received per hour. Clearly the mean number of e-mails per
hour is two, and so X follows a Poisson distribution with λ = 2, i.e.
$$\Pr(X = x) = \frac{2^x}{x!}\, e^{-2}.$$
Thus $\Pr(X = 0) = e^{-2} = 0.135$, $\Pr(X = 1) = 2e^{-2} = 0.271$, $\Pr(X = 2) = 2^2 e^{-2}/2! = 0.271$,
$\Pr(X = 3) = 2^3 e^{-2}/3! = 0.180$, $\Pr(X = 4) = 2^4 e^{-2}/4! = 0.090$, $\Pr(X = 5) = 2^5 e^{-2}/5! = 0.036$.
These results may also be calculated using the recurrence formula (30.103).
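The recurrence route mentioned at the end of the example might look as follows in Python. This is a sketch only; the function name is hypothetical, and the starting value Pr(X = 0) = e^(−λ) comes from (30.100).

```python
from math import exp

def poisson_by_recurrence(lam: float, x_max: int) -> list[float]:
    """Pr(X = 0), ..., Pr(X = x_max) for X ~ Po(lam), built with (30.103)."""
    probs = [exp(-lam)]  # Pr(X = 0) = e^(-lam)
    for x in range(x_max):
        # Pr(X = x + 1) = lam / (x + 1) * Pr(X = x)
        probs.append(probs[-1] * lam / (x + 1))
    return probs

emails = poisson_by_recurrence(2.0, 5)
print([round(pr, 3) for pr in emails])
# [0.135, 0.271, 0.271, 0.18, 0.09, 0.036]
```

Each step costs one multiplication and one division, and no factorials or powers need to be evaluated.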
Figure 30.12 Three Poisson distributions for different values of the parameter λ
(panels of f(x) against x for λ = 1, λ = 2 and λ = 5).
The above example illustrates the point that a Poisson distribution typically
rises and then falls. It either has a maximum when x is equal to the integer part
of λ or, if λ happens to be an integer, has equal maximal values at x = λ − 1 and
x = λ. The Poisson distribution always has a long ‘tail’ towards higher values of X
but the higher the value of the mean the more symmetric the distribution becomes.
Typical Poisson distributions are shown in ﬁgure 30.12. Using the deﬁnitions of
mean and variance, we may show that, for the Poisson distribution, E[X] = λ and
V [X] = λ. Nevertheless, as in the case of the binomial distribution, performing
the relevant summations directly is rather tiresome, and these results are much
more easily proved using the MGF.
The moment generating function for the Poisson distribution
The MGF of the Poisson distribution is given by
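$$M_X(t) = E\left[e^{tX}\right] = \sum_{x=0}^{\infty} \frac{e^{tx} e^{-\lambda} \lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(\lambda e^t)^x}{x!} = e^{-\lambda} e^{\lambda e^t} = e^{\lambda(e^t - 1)}, \tag{30.104}$$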
from which we obtain
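$$M_X'(t) = \lambda e^t e^{\lambda(e^t - 1)},$$
$$M_X''(t) = (\lambda^2 e^{2t} + \lambda e^t)\, e^{\lambda(e^t - 1)}.$$
Thus, the mean and variance of the Poisson distribution are given by
$$E[X] = M_X'(0) = \lambda \quad \text{and} \quad V[X] = M_X''(0) - \left[M_X'(0)\right]^2 = \lambda.$$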
The Poisson approximation to the binomial distribution
Earlier we derived the Poisson distribution as the limit of the binomial distribution
when n → ∞ and p → 0 in such a way that np = λ remains ﬁnite, where λ is the
mean of the Poisson distribution. It is not surprising, therefore, that the Poisson
distribution is a very good approximation to the binomial distribution for large
n (≥ 50, say) and small p (≤ 0.1, say). Moreover, it is easier to calculate as it
involves fewer factorials.
In a large batch of light bulbs, the probability that a bulb is defective is 0.5%. For a
sample of 200 bulbs taken at random, ﬁnd the approximate probabilities that 0, 1 and 2 of
the bulbs respectively are defective.
Let the random variable X = number of defective bulbs in a sample. This is distributed
as X ∼ Bin(200, 0.005), implying that λ = np = 1.0. Since n is large and p small, we may
approximate the distribution as X ∼ Po(1), giving
$$\Pr(X = x) \approx e^{-1}\, \frac{1^x}{x!},$$
from which we find Pr(X = 0) ≈ 0.37, Pr(X = 1) ≈ 0.37, Pr(X = 2) ≈ 0.18. For comparison,
it may be noted that the exact values calculated from the binomial distribution are identical
to those found here to two decimal places.
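The comparison between the exact binomial values and the Poisson approximation for the light-bulb example can be reproduced with a few lines of Python. This is an illustrative sketch; the variable names are not from the text.

```python
from math import comb, exp, factorial

n, p = 200, 0.005  # sample size and probability that a bulb is defective
lam = n * p        # Poisson mean, here 1.0

comparison = []
for x in range(3):
    exact = comb(n, x) * p**x * (1.0 - p) ** (n - x)  # Bin(200, 0.005)
    approx = exp(-lam) * lam**x / factorial(x)        # Po(1)
    comparison.append((x, round(exact, 2), round(approx, 2)))
    print(x, round(exact, 2), round(approx, 2))
```

To two decimal places both columns read 0.37, 0.37 and 0.18; the differences appear only in the third decimal place.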
Multiple Poisson distributions
Mirroring our discussion of multiple binomial distributions in subsection 30.8.1,
let us suppose X and Y are two independent random variables, both of which
are described by Poisson distributions with (in general) diﬀerent means, so that
X ∼ Po(λ1 ) and Y ∼ Po(λ2 ). Now consider the random variable Z = X + Y . We
may calculate the probability distribution of Z directly using (30.60), but we may
derive the result much more easily by using the moment generating function (or
indeed the probability or cumulant generating functions).
Since X and Y are independent RVs, the MGF for Z is simply the product of
the individual MGFs for X and Y . Thus, from (30.104),
$$M_Z(t) = M_X(t)M_Y(t) = e^{\lambda_1(e^t - 1)}\, e^{\lambda_2(e^t - 1)} = e^{(\lambda_1 + \lambda_2)(e^t - 1)},$$
which we recognise as the MGF of Z ∼ Po(λ1 + λ2 ). Hence Z is also Poisson
distributed and has mean λ1 + λ2 . Unfortunately, no such simple result holds for
the diﬀerence Z = X − Y of two independent Poisson variates. A closed-form