PROBABILITY

where $M_{X_i}(t)$ is the MGF of $f_i(x)$. Now
$$M_{X_i}\!\left(\frac{t}{n}\right) = 1 + \frac{t}{n}E[X_i] + \frac{1}{2}\frac{t^2}{n^2}E[X_i^2] + \cdots = 1 + \mu_i\frac{t}{n} + \frac{1}{2}(\sigma_i^2 + \mu_i^2)\frac{t^2}{n^2} + \cdots,$$
and as $n$ becomes large
$$M_{X_i}\!\left(\frac{t}{n}\right) \approx \exp\left(\mu_i\frac{t}{n} + \frac{1}{2}\sigma_i^2\frac{t^2}{n^2}\right),$$
as may be verified by expanding the exponential up to terms including $(t/n)^2$. Therefore
$$M_Z(t) \approx \prod_{i=1}^{n}\exp\left(\mu_i\frac{t}{n} + \frac{1}{2}\sigma_i^2\frac{t^2}{n^2}\right) = \exp\left(\sum_i\frac{\mu_i}{n}\,t + \frac{1}{2}\sum_i\frac{\sigma_i^2}{n^2}\,t^2\right).$$
Comparing this with the form of the MGF for a Gaussian distribution, (30.114), we can see that the probability density function $g(z)$ of $Z$ tends to a Gaussian distribution with mean $\sum_i \mu_i/n$ and variance $\sum_i \sigma_i^2/n^2$. In particular, if we consider $Z$ to be the mean of $n$ independent measurements of the same random variable $X$ (so that $X_i = X$ for $i = 1, 2, \dots, n$) then, as $n \to \infty$, $Z$ has a Gaussian distribution with mean $\mu$ and variance $\sigma^2/n$.

We may use the central limit theorem to derive an analogous result to (iii) above for the product $W = X_1 X_2 \cdots X_n$ of the $n$ independent random variables $X_i$. Provided the $X_i$ only take values between zero and infinity, we may write
$$\ln W = \ln X_1 + \ln X_2 + \cdots + \ln X_n,$$
which is simply the sum of $n$ new random variables $\ln X_i$. Thus, provided these new variables each possess a formal mean and variance, the PDF of $\ln W$ will tend to a Gaussian in the limit $n \to \infty$, and so the product $W$ will be described by a log-normal distribution (see subsection 30.9.2).

30.11 Joint distributions

As mentioned briefly in subsection 30.4.3, it is common in the physical sciences to consider simultaneously two or more random variables that are not, in general, independent and are thus described by joint probability density functions. We will return to the subject of the interdependence of random variables after first presenting some of the general ways of characterising joint distributions. We will concentrate mainly on bivariate distributions, i.e. distributions of only two random variables, though the results may be extended readily to multivariate distributions.
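Before developing the machinery of joint distributions, the central limit theorem result derived above can be checked numerically. The following sketch (Python with NumPy, which is assumed available here and is not part of the text; the variable names are illustrative) draws many realisations of the mean $Z$ of $n$ uniform measurements; the sample mean and variance of $Z$ should approach $\mu$ and $\sigma^2/n$ respectively.

```python
import numpy as np

rng = np.random.default_rng(0)

# n independent measurements of the same random variable X.
# Here X ~ Uniform(0, 1), for which mu = 1/2 and sigma^2 = 1/12.
n = 50            # measurements averaged in each Z
trials = 100_000  # number of independent realisations of Z

samples = rng.uniform(0.0, 1.0, size=(trials, n))
Z = samples.mean(axis=1)          # each entry is one realisation of Z

mu, sigma2 = 0.5, 1.0 / 12.0
print(Z.mean())   # close to mu
print(Z.var())    # close to sigma^2 / n
```

A histogram of the realisations of $Z$ would show the characteristic Gaussian shape even though the underlying uniform distribution is far from Gaussian.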
The subject of multivariate distributions is large and a detailed study is beyond the scope of this book; the interested reader should therefore consult one of the many specialised texts. However, we do discuss the multinomial and multivariate Gaussian distributions in section 30.15.

The first thing to note when dealing with bivariate distributions is that the distinction between discrete and continuous distributions may not be as clear as for the single-variable case; the random variables can both be discrete, or both continuous, or one discrete and the other continuous. In general, for the random variables $X$ and $Y$, the joint distribution will take an infinite number of values unless both $X$ and $Y$ have only a finite number of values. In this chapter we will consider only the cases where $X$ and $Y$ are either both discrete or both continuous random variables.

30.11.1 Discrete bivariate distributions

In direct analogy with the one-variable (univariate) case, if $X$ is a discrete random variable that takes the values $\{x_i\}$ and $Y$ one that takes the values $\{y_j\}$ then the probability function of the joint distribution is defined as
$$f(x, y) = \begin{cases} \Pr(X = x_i, Y = y_j) & \text{for } x = x_i,\ y = y_j, \\ 0 & \text{otherwise.} \end{cases}$$
We may therefore think of $f(x, y)$ as a set of spikes at valid points in the $xy$-plane, whose height at $(x_i, y_j)$ represents the probability of obtaining $X = x_i$ and $Y = y_j$. The normalisation of $f(x, y)$ implies
$$\sum_i \sum_j f(x_i, y_j) = 1, \qquad (30.125)$$
where the sums over $i$ and $j$ take all valid pairs of values. We can also define the cumulative probability function
$$F(x, y) = \sum_{x_i \le x} \sum_{y_j \le y} f(x_i, y_j), \qquad (30.126)$$
from which it follows that the probability that $X$ lies in the range $[a_1, a_2]$ and $Y$ lies in the range $[b_1, b_2]$ is given by
$$\Pr(a_1 < X \le a_2,\ b_1 < Y \le b_2) = F(a_2, b_2) - F(a_1, b_2) - F(a_2, b_1) + F(a_1, b_1).$$
Finally, we define $X$ and $Y$ to be independent if we can write their joint distribution in the form
$$f(x, y) = f_X(x)f_Y(y), \qquad (30.127)$$
i.e. as the product of two univariate distributions.

30.11.2 Continuous bivariate distributions

In the case where both $X$ and $Y$ are continuous random variables, the PDF of the joint distribution is defined by
$$f(x, y)\,dx\,dy = \Pr(x < X \le x + dx,\ y < Y \le y + dy), \qquad (30.128)$$
so $f(x, y)\,dx\,dy$ is the probability that $x$ lies in the range $[x, x + dx]$ and $y$ lies in the range $[y, y + dy]$. It is clear that the two-dimensional function $f(x, y)$ must be everywhere non-negative and that normalisation requires
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1.$$
It follows further that
$$\Pr(a_1 < X \le a_2,\ b_1 < Y \le b_2) = \int_{b_1}^{b_2}\int_{a_1}^{a_2} f(x, y)\,dx\,dy. \qquad (30.129)$$
We can also define the cumulative probability function by
$$F(x, y) = \Pr(X \le x,\ Y \le y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,du\,dv,$$
from which we see that (as for the discrete case)
$$\Pr(a_1 < X \le a_2,\ b_1 < Y \le b_2) = F(a_2, b_2) - F(a_1, b_2) - F(a_2, b_1) + F(a_1, b_1).$$
Finally we note that the definition of independence (30.127) for discrete bivariate distributions also applies to continuous bivariate distributions.

A flat table is ruled with parallel straight lines a distance $D$ apart, and a thin needle of length $l < D$ is tossed onto the table at random. What is the probability that the needle will cross a line?

Let $\theta$ be the angle that the needle makes with the lines, and let $x$ be the distance from the centre of the needle to the nearest line. Since the needle is tossed 'at random' onto the table, the angle $\theta$ is uniformly distributed in the interval $[0, \pi]$, and the distance $x$ is uniformly distributed in the interval $[0, D/2]$. Assuming that $\theta$ and $x$ are independent, their joint distribution is just the product of their individual distributions, and is given by
$$f(\theta, x) = \frac{1}{\pi}\,\frac{1}{D/2} = \frac{2}{\pi D}.$$
The needle will cross a line if the distance $x$ of its centre from that line is less than $\frac{1}{2}l\sin\theta$. Thus the required probability is
$$\frac{2}{\pi D}\int_0^{\pi}\!\int_0^{\frac{1}{2}l\sin\theta} dx\,d\theta = \frac{2}{\pi D}\,\frac{l}{2}\int_0^{\pi}\sin\theta\,d\theta = \frac{2l}{\pi D}.$$
This gives an experimental (but cumbersome) method of determining $\pi$.
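The 'experimental method' mentioned in the closing sentence can be carried out as a Monte Carlo simulation. The sketch below (Python; the function name, parameters, and seed are illustrative choices, not from the text) samples $\theta$ and $x$ from the uniform distributions used in the example, counts crossings, and inverts $\Pr(\text{cross}) = 2l/(\pi D)$ to estimate $\pi$.

```python
import math
import random

def buffon_pi(trials, l=1.0, d=2.0, seed=1):
    """Estimate pi by simulating Buffon's needle (requires l < d).

    theta ~ Uniform[0, pi] and x ~ Uniform[0, d/2] are independent;
    the needle crosses a line when x < (l/2) * sin(theta).
    """
    rng = random.Random(seed)
    crossings = 0
    for _ in range(trials):
        theta = rng.uniform(0.0, math.pi)
        x = rng.uniform(0.0, d / 2.0)
        if x < 0.5 * l * math.sin(theta):
            crossings += 1
    # Pr(cross) = 2l / (pi d), so pi is approximately
    # 2l * trials / (d * crossings).
    return 2.0 * l * trials / (d * crossings)

print(buffon_pi(1_000_000))  # a rough estimate of pi
```

The estimate converges only as $1/\sqrt{\text{trials}}$, which is precisely why the text calls the method cumbersome.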