
Multivariable Distributions



  1. Multivariable Distributions (Ch. 4)

  2. Multivariable Distributions • It is often desirable to take more than one measurement on a random experiment. • The data may then be collected as pairs (xi, yi). • Def. 4.1-1: Let X and Y be two discrete R.V.s defined over the support S. The probability that X = x and Y = y is denoted by f(x,y) = P(X = x, Y = y); f(x,y) is the joint probability mass function (joint p.m.f.) of X and Y: • 0 ≤ f(x,y) ≤ 1; • Σ_{(x,y)∈S} f(x,y) = 1; • P[(X,Y)∈A] = Σ_{(x,y)∈A} f(x,y), A ⊆ S.
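
A minimal Python sketch of these axioms; the joint p.m.f. below is a made-up toy example, not one from the text:

```python
from fractions import Fraction

# A toy joint p.m.f. f(x, y) stored as a dictionary over its support S.
# The values here are illustrative, not taken from the slides.
f = {(1, 1): Fraction(1, 4), (1, 2): Fraction(1, 4),
     (2, 1): Fraction(1, 4), (2, 2): Fraction(1, 4)}

# Axioms: every f(x,y) lies in [0,1] and the values sum to 1 over S.
assert all(0 <= p <= 1 for p in f.values())
assert sum(f.values()) == 1

# P[(X,Y) in A] is the sum of f over the event A (a subset of S).
A = {(x, y) for (x, y) in f if x == y}
print(sum(f[pt] for pt in A))   # P(X = Y) = 1/2
```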

  3. Illustration Example • Ex. 4.1-3: Roll a pair of dice: X is the smaller and Y is the larger outcome. • The outcome (3, 2) or (2, 3) ⇒ X = 2 and Y = 3, with probability 2/36. • The outcome (2, 2) ⇒ X = 2 and Y = 2, with probability 1/36. • Thus the joint p.m.f. of X and Y is f(x,y) = 1/36 if x = y, and 2/36 if x < y, for 1 ≤ x ≤ y ≤ 6; summing the rows and columns of this table gives the marginal p.m.f.s.
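
The same table can be built by brute-force enumeration; a short sketch:

```python
from collections import Counter
from fractions import Fraction

# Enumerate the 36 equally likely outcomes of a pair of dice and tabulate
# the joint p.m.f. of X = min and Y = max, as in Ex. 4.1-3.
counts = Counter((min(a, b), max(a, b))
                 for a in range(1, 7) for b in range(1, 7))
f = {xy: Fraction(n, 36) for xy, n in counts.items()}

print(f[(2, 3)])   # 1/18, i.e. 2/36, from outcomes (2,3) and (3,2)
print(f[(2, 2)])   # 1/36

# Marginal p.m.f. of X by summing over y:
f1 = {x: sum(p for (xx, y), p in f.items() if xx == x) for x in range(1, 7)}
print(f1[1])       # 11/36
```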

  4. Marginal Probability and Independence • Def. 4.1-2: Let X and Y have the joint p.m.f. f(x,y) with space S. • The marginal p.m.f. of X is f1(x) = Σ_y f(x,y) = P(X = x), x ∈ S1. • The marginal p.m.f. of Y is f2(y) = Σ_x f(x,y) = P(Y = y), y ∈ S2. • X and Y are independent iff P(X = x, Y = y) = P(X = x)P(Y = y), namely f(x,y) = f1(x)f2(y) for all x ∈ S1, y ∈ S2. • Otherwise, X and Y are dependent. • X and Y in Ex. 4.1-3 are dependent: 1/36 = f(1,1) ≠ f1(1)f2(1) = (11/36)(1/36). • Ex. 4.1-4: The joint p.m.f. is f(x,y) = (x+y)/21, x = 1,2,3, y = 1,2. • Then f1(x) = Σ_{y=1}^{2} (x+y)/21 = (2x+3)/21, x = 1,2,3. • Likewise, f2(y) = Σ_{x=1}^{3} (x+y)/21 = (6+3y)/21, y = 1,2. • Since f(x,y) ≠ f1(x)f2(y), X and Y are dependent. • Ex. 4.1-6: f(x,y) = xy²/13, (x,y) = (1,1), (1,2), (2,2).
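
A quick independence check for Ex. 4.1-4, using exact rational arithmetic:

```python
from fractions import Fraction

# Joint p.m.f. of Ex. 4.1-4: f(x,y) = (x+y)/21, x = 1,2,3, y = 1,2.
f = {(x, y): Fraction(x + y, 21) for x in (1, 2, 3) for y in (1, 2)}
f1 = {x: sum(f[(x, y)] for y in (1, 2)) for x in (1, 2, 3)}   # (2x+3)/21
f2 = {y: sum(f[(x, y)] for x in (1, 2, 3)) for y in (1, 2)}   # (6+3y)/21

# X and Y are independent iff f(x,y) = f1(x) f2(y) on all of S.
independent = all(f[(x, y)] == f1[x] * f2[y] for (x, y) in f)
print(independent)   # False -> dependent
```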

  5. Quick Dependence Checks • In practice, dependence can often be spotted quickly when: • The support of X and Y is NOT rectangular, i.e., S is not the product set {(x,y): x ∈ S1, y ∈ S2}, as in Ex. 4.1-6. • f(x,y) cannot be factored (separated) into the product of a function of x alone and a function of y alone. • In Ex. 4.1-4, f(x,y) is a sum, not a product, of x-alone and y-alone terms. • Ex. 4.1-7: [probability histogram for a joint p.m.f.; figure omitted]

  6. Mathematical Expectation • If u(X1,X2) is a function of two R.V.s X1 and X2, then E[u(X1,X2)] = Σ_{(x1,x2)∈S} u(x1,x2) f(x1,x2), if it exists, is called the mathematical expectation (or expected value) of u(X1,X2). • The mean of Xi, i = 1,2: μi = E[Xi]. • The variance of Xi: σi² = E[(Xi − μi)²]. • Ex. 4.1-8: A player selects a chip from a bowl of 8 chips: 3 marked (0,0), 2 marked (1,0), 2 marked (0,1), and 1 marked (1,1).
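
A sketch of E[u(X1,X2)] for the chip example; the slide does not show the payoff u, so u(x1,x2) = x1 + x2 below is an assumed choice for illustration:

```python
from fractions import Fraction

# Ex. 4.1-8: a chip is drawn from a bowl of 8 chips marked (0,0) x3,
# (1,0) x2, (0,1) x2, (1,1) x1, so f(x1,x2) = count/8.
f = {(0, 0): Fraction(3, 8), (1, 0): Fraction(2, 8),
     (0, 1): Fraction(2, 8), (1, 1): Fraction(1, 8)}

def expect(u):
    """E[u(X1,X2)] = sum of u(x1,x2) f(x1,x2) over the support."""
    return sum(u(x1, x2) * p for (x1, x2), p in f.items())

# The payoff u(x1,x2) = x1 + x2 is an assumption, not from the slide.
print(expect(lambda x1, x2: x1 + x2))   # E[X1 + X2] = 3/4
print(expect(lambda x1, x2: x1))        # mu_1 = 3/8
```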

  7. Joint Probability Density Function • The joint probability density function (joint p.d.f.) of two continuous-type R.V.s X and Y is an integrable function f(x,y) such that: • f(x,y) ≥ 0; • ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) dx dy = 1; • P[(X,Y)∈A] = ∫∫_A f(x,y) dx dy, for an event A. • Ex. 4.1-9: X and Y have a joint p.d.f., with A = {(x,y): 0 < x < 1, 0 < y < x}. • The respective marginal p.d.f.s multiply back to f(x,y): X and Y are independent!
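
The p.d.f. of Ex. 4.1-9 is not reproduced in this transcript, so the sketch below checks the axioms numerically for an assumed density f(x,y) = 4xy on the unit square:

```python
import random

# Monte Carlo check of the joint p.d.f. axioms for an assumed density
# f(x,y) = 4xy on the unit square (not the p.d.f. of Ex. 4.1-9).
def f(x, y):
    return 4 * x * y

N = 200_000
pts = [(random.random(), random.random()) for _ in range(N)]

# Total mass: the integral of f over [0,1]^2, estimated as an average.
print(sum(f(x, y) for x, y in pts) / N)            # ~ 1.0

# P[(X,Y) in A] with A = {0 < y < x < 1}: integrate f over A only.
print(sum(f(x, y) for x, y in pts if y < x) / N)   # ~ 0.5
```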

  8. Independence of Continuous-Type R.V.s • Two continuous-type R.V.s X and Y are independent iff their joint p.d.f. factors into the product of their marginal p.d.f.s. • Ex. 4.1-10: X and Y have the joint p.d.f. f(x,y) = 2 on the support S = {(x,y): 0 ≤ x ≤ y ≤ 1}, bounded by the lines x = 0, y = 1, and x = y. • The marginal p.d.f.s are f1(x) = ∫_x^1 2 dy = 2(1−x), 0 ≤ x ≤ 1, and f2(y) = ∫_0^y 2 dx = 2y, 0 ≤ y ≤ 1. • Since f(x,y) ≠ f1(x)f2(y), X and Y are dependent! Various expected values can then be computed by integrating against f(x,y).

  9. Multivariate Hypergeometric Distribution • Ex. 4.1-11: Of 200 students, 40 have A's, 60 have B's, and 100 have C's, D's, or F's. • A sample of size 25 is taken at random without replacement. • X1 is the number of A students, X2 the number of B students, and 25 − X1 − X2 the number of other students. • The joint p.m.f. is f(x1,x2) = C(40,x1) C(60,x2) C(100, 25−x1−x2) / C(200,25), over the space S = {(x1,x2): x1, x2 ≥ 0, x1 + x2 ≤ 25}. • The marginal p.m.f. of X1 can also be obtained directly from the model: X1 is hypergeometric, f1(x1) = C(40,x1) C(160, 25−x1) / C(200,25). • X1 and X2 are dependent!
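
A direct computation of the joint p.m.f. and the marginal of X1 (the two agree, as the model predicts):

```python
from math import comb

# Ex. 4.1-11: joint p.m.f. of X1 (A students) and X2 (B students) in a
# sample of 25 drawn without replacement from 40 A's, 60 B's, 100 others.
def f(x1, x2, n=25):
    return comb(40, x1) * comb(60, x2) * comb(100, n - x1 - x2) / comb(200, n)

# Marginal of X1 by summing over x2; it matches the hypergeometric
# p.m.f. C(40,x1) C(160,25-x1) / C(200,25) known "from the model".
x1 = 5
marg = sum(f(x1, x2) for x2 in range(0, 25 - x1 + 1))
print(marg, comb(40, x1) * comb(160, 25 - x1) / comb(200, 25))
```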

  10. Binomial ⇒ Trinomial Distribution • Trinomial distribution: the experiment is repeated n times, with three outcomes per trial. • Probabilities p1: perfect, p2: second, p3: defective, where p3 = 1 − p1 − p2. • X1: the number of perfect items; X2: seconds; X3: defectives. • The joint p.m.f. is f(x1,x2) = n!/[x1! x2! (n−x1−x2)!] · p1^x1 p2^x2 p3^(n−x1−x2). • Marginally, X1 is b(n,p1) and X2 is b(n,p2); X1 and X2 are dependent. • Ex. 4.1-13: In manufacturing a certain item, 95% of the items are good, 4% are "seconds", and 1% defective. • An inspector observes n = 20 items selected at random, counting the number X of seconds and the number Y of defectives. • The probability that at least 2 seconds or at least 2 defectives are found, namely A = {(x,y): x ≥ 2 or y ≥ 2}, is P(A) = 1 − P(X ≤ 1 and Y ≤ 1).
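
Computing P(A) of Ex. 4.1-13 exactly:

```python
from math import factorial

# Ex. 4.1-13: n = 20 items; X = # of seconds (p = 0.04), Y = # of
# defectives (p = 0.01), the rest good (p = 0.95).
def trinomial(x, y, n=20, p1=0.04, p2=0.01):
    p3 = 1 - p1 - p2
    c = factorial(n) // (factorial(x) * factorial(y) * factorial(n - x - y))
    return c * p1**x * p2**y * p3**(n - x - y)

# P(X >= 2 or Y >= 2) = 1 - P(X <= 1 and Y <= 1)
p = 1 - sum(trinomial(x, y) for x in (0, 1) for y in (0, 1))
print(round(p, 4))   # ~ 0.204
```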

  11. Correlation Coefficient • For two R.V.s X1 and X2: • The mean of Xi, i = 1,2: μi = E[Xi]. • The variance of Xi: σi² = E[(Xi − μi)²]. • The covariance of X1 and X2 is Cov(X1,X2) = σ12 = E[(X1 − μ1)(X2 − μ2)] = E[X1X2] − μ1μ2. • The correlation coefficient of X1 and X2 is ρ = Cov(X1,X2)/(σ1σ2). • Ex. 4.2-1: X1 and X2 have a joint p.m.f. that is not a product of x1-alone and x2-alone factors ⇒ dependent!
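
A generic covariance/correlation computation from a joint p.m.f.; the table below is made up (the p.m.f. of Ex. 4.2-1 is not reproduced in this transcript):

```python
from fractions import Fraction

# Covariance and correlation from a joint p.m.f.  All mass sits on the
# line x2 = x1, so rho should come out exactly 1.  Made-up example.
f = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 4), (2, 2): Fraction(1, 4)}

E = lambda u: sum(u(x1, x2) * p for (x1, x2), p in f.items())
m1, m2 = E(lambda a, b: a), E(lambda a, b: b)
var1 = E(lambda a, b: (a - m1) ** 2)
var2 = E(lambda a, b: (b - m2) ** 2)
cov = E(lambda a, b: (a - m1) * (b - m2))
rho = float(cov) / float(var1 * var2) ** 0.5
print(cov, rho)   # 11/16, 1.0
```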

  12. Insights into the Meaning of ρ • Among the points of S, ρ tends to be positive if, with larger probability, points lie simultaneously above or simultaneously below their respective means. • The least-squares regression line is the line through (μX,μY) with the slope b that minimizes K(b) = E{[(Y−μY) − b(X−μX)]²}, the expected squared vertical distance from a point to the line. • The minimizing slope is b = ρσY/σX, giving K(b) = σY²(1−ρ²). • ρ = ±1: K(b) = 0 ⇒ all points lie on the least-squares regression line. • ρ = 0: K(b) = σY² and the line is y = μY; X and Y may be independent, but not necessarily! • ρ measures the amount of linearity in the probability distribution.

  13. Example • Ex. 4.2-2: Roll a pair of 4-sided dice: X is the number of ones, Y is the number of twos and threes. • (X, Y) is trinomial with n = 2, p1 = 1/4, p2 = 1/2, so the joint p.m.f. is f(x,y) = 2!/[x! y! (2−x−y)!] (1/4)^x (1/2)^y (1/4)^(2−x−y), 0 ≤ x + y ≤ 2. • Here μX = 1/2, μY = 1, σX² = 3/8, σY² = 1/2, and ρ = −√(p1p2/[(1−p1)(1−p2)]) = −1/√3. • The line of best fit is y = μY + ρ(σY/σX)(x − μX) = −(2/3)x + 4/3.
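
A simulation check of Ex. 4.2-2 (the exact values are ρ = −1/√3 ≈ −0.577 and slope −2/3):

```python
import random, statistics

# Roll a pair of 4-sided dice; X counts ones, Y counts twos-and-threes,
# so (X, Y) is trinomial with n = 2, p1 = 1/4, p2 = 1/2.
N = 100_000
xs, ys = [], []
for _ in range(N):
    rolls = [random.randint(1, 4) for _ in range(2)]
    xs.append(sum(r == 1 for r in rolls))
    ys.append(sum(r in (2, 3) for r in rolls))

rho = statistics.correlation(xs, ys)
print(rho)                                            # ~ -0.577
print(rho * statistics.stdev(ys) / statistics.stdev(xs))  # slope ~ -2/3
```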

  14. Independence ⇒ ρ = 0 • Independence implies E[X1X2] = E[X1]E[X2], so Cov(X1,X2) = 0 and ρ = 0. • The converse is not necessarily true! • Ex. 4.2-3: The joint p.m.f. of X and Y is f(x,y) = 1/3, (x,y) = (0,1), (1,0), (2,1); here ρ = 0, yet the support is obviously not rectangular, so X and Y are dependent. • Empirical data: from n bivariate observations (xi,yi), i = 1..n, we can compute the sample mean and variance of each variate, as well as the sample correlation coefficient and the sample least-squares regression line, as in the sketch below. (Ref. p. 241)
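
A sketch of those sample statistics; the data below are made up for illustration:

```python
import statistics

# Sample correlation coefficient and least-squares line from n bivariate
# observations (x_i, y_i).  The data are invented, not from the text.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]

r = statistics.correlation(x, y)
b = r * statistics.stdev(y) / statistics.stdev(x)   # slope
a = statistics.mean(y) - b * statistics.mean(x)     # intercept
print(f"r = {r:.3f}; y-hat = {a:.3f} + {b:.3f} x")
```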

  15. Conditional Distributions • Def. 4.3-1: The conditional probability mass function of X, given Y = y, is g(x|y) = f(x,y)/f2(y), provided f2(y) > 0. • Likewise, h(y|x) = f(x,y)/f1(x), provided f1(x) > 0. • Ex. 4.3-1: X and Y have the joint p.m.f. f(x,y) = (x+y)/21, x = 1,2,3; y = 1,2. • f1(x) = (2x+3)/21, x = 1,2,3; f2(y) = (3y+6)/21, y = 1,2. • Thus, given Y = y, the conditional p.m.f. of X is g(x|y) = (x+y)/(3y+6). • When y = 1, g(x|1) = (x+1)/9, x = 1,2,3; g(1|1):g(2|1):g(3|1) = 2:3:4. • When y = 2, g(x|2) = (x+2)/12, x = 1,2,3; g(1|2):g(2|2):g(3|2) = 3:4:5. • Similar relationships hold for h(y|x). Since g(x|y) changes with y, X and Y are dependent!
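
Computing g(x|y) for Ex. 4.3-1 exactly:

```python
from fractions import Fraction

# Conditional p.m.f. g(x|y) = f(x,y)/f2(y) for Ex. 4.3-1.
f = {(x, y): Fraction(x + y, 21) for x in (1, 2, 3) for y in (1, 2)}
f2 = {y: sum(f[(x, y)] for x in (1, 2, 3)) for y in (1, 2)}

def g(x, y):
    return f[(x, y)] / f2[y]

print([g(x, 1) for x in (1, 2, 3)])   # 2/9, 1/3, 4/9   (ratio 2:3:4)
print([g(x, 2) for x in (1, 2, 3)])   # 1/4, 1/3, 5/12  (ratio 3:4:5)
```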

  16. Conditional Mean and Variance • The conditional mean of Y, given X = x, is E(Y|x) = Σ_y y h(y|x). • The conditional variance of Y, given X = x, is Var(Y|x) = E{[Y − E(Y|x)]² | x} = E(Y²|x) − [E(Y|x)]². • Ex. 4.3-2 [from Ex. 4.3-1]: X and Y have the joint p.m.f. f(x,y) = (x+y)/21, x = 1,2,3; y = 1,2; here h(y|x) = (x+y)/(2x+3), so E(Y|x) = [(x+1) + 2(x+2)]/(2x+3) = (3x+5)/(2x+3), and Var(Y|x) follows similarly.

  17. Relationship between Conditional Means • When the conditional means are linear, E(Y|x) = μY + ρ(σY/σX)(x − μX) and E(X|y) = μX + ρ(σX/σY)(y − μY). • The point (μX,μY) lies on both lines; it is their intersection. • The product of the two slopes is ρ². • The ratio of the slopes is σY²/σX². • These relations allow an unknown quantity to be derived from the others when they are known.

  18. Example • Ex. 4.3-3: X1 and X2 have the trinomial p.m.f. with parameters n, p1, p2, and p3 = 1 − p1 − p2. • They have the marginal p.m.f.s b(n, p1) and b(n, p2); moreover the conditional distribution of X2, given X1 = x1, is b(n − x1, p2/(1−p1)), so E(X2|x1) = (n − x1)p2/(1−p1) is linear in x1 with slope −p2/(1−p1), and the slope relation of the previous slide gives ρ = −√(p1p2/[(1−p1)(1−p2)]).

  19. Example for Continuous-Type R.V.s • Ex. 4.3-5 [from Ex. 4.1-10]: f(x,y) = 2 on 0 ≤ x ≤ y ≤ 1, so h(y|x) = f(x,y)/f1(x) = 2/[2(1−x)] = 1/(1−x), x < y < 1. ⇒ The conditional distribution of Y given X = x is U(x,1). [U(a,b) has mean (a+b)/2 and variance (b−a)²/12, so E(Y|x) = (1+x)/2 and Var(Y|x) = (1−x)²/12.]

  20. Bivariate Normal Distribution • The joint p.d.f. of X ~ N(μX,σX²) and Y ~ N(μY,σY²) with correlation ρ is f(x,y) = [1/(2πσXσY√(1−ρ²))] exp{−q(x,y)/2}, where q(x,y) = [((x−μX)/σX)² − 2ρ((x−μX)/σX)((y−μY)/σY) + ((y−μY)/σY)²]/(1−ρ²). • Therefore the conditional distribution of Y, given X = x, is normal with mean μY + ρ(σY/σX)(x − μX), a linear function of x, and variance σY²(1−ρ²), a constant w.r.t. x.
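
A simulation sketch of these two facts; the parameter values are illustrative only:

```python
import random, statistics

# Sample a bivariate normal and check that E(Y | X = x) is linear in x
# with slope rho*sigma_y/sigma_x, while Var(Y | X = x) is constant.
mx, my, sx, sy, rho = 0.0, 0.0, 1.0, 2.0, 0.6   # assumed parameters
pairs = []
for _ in range(200_000):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x = mx + sx * z1
    y = my + rho * sy * z1 + sy * (1 - rho**2) ** 0.5 * z2
    pairs.append((x, y))

for x0 in (-1.0, 0.0, 1.0):
    ys = [y for x, y in pairs if abs(x - x0) < 0.05]
    # mean ~ (rho*sy/sx)*x0 = 1.2*x0; variance ~ sy^2(1-rho^2) = 2.56
    print(x0, statistics.mean(ys), statistics.variance(ys))
```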

  21. Examples • Ex. 5.6-1 • Ex. 5.6-2 [worked computations not transcribed]

  22. Bivariate Normal: ρ = 0 ⇒ Independence • Thm. 5.6-1: If X and Y have a bivariate normal distribution with correlation ρ, then X and Y are independent iff ρ = 0. • The same holds for trivariate and multivariate normal distributions. • When ρ = 0, the joint p.d.f. factors into the product of the two marginal normal p.d.f.s.

  23. Transformations of R.V.s • Continuous type: In Section 3.5, a single variable X with p.d.f. f(x) is transformed to Y = v(X), an increasing or decreasing function, via g(y) = f(v⁻¹(y)) |d v⁻¹(y)/dy|. • Discrete type: g(y) = f(v⁻¹(y)). • Ex. 4.4-1: X ~ b(n,p), Y = X²; if n = 3, p = 1/4, then g(y) = f(√y), y = 0, 1, 4, 9. • What transformation u(X/n) leads to a variance free of p? Taylor's expansion about p gives Var[u(X/n)] ≈ [u′(p)]² p(1−p)/n; when the variance is to be constant, i.e. free of p, we need u′(p) ∝ 1/√(p(1−p)), which yields the arcsine transformation u(X/n) = arcsin(√(X/n)) with variance ≈ 1/(4n). • Ex: X ~ b(100,1/4) or b(100,9/10): the transformed variance is about 1/400 in both cases.
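
A simulation check that the arcsine transformation stabilizes the variance near 1/(4n) for quite different p:

```python
import math, random, statistics

# Variance of u(X/n) = arcsin(sqrt(X/n)) is ~ 1/(4n) regardless of p;
# here n = 100 with p = 1/4 versus p = 9/10.
def sample_var(n, p, reps=20_000):
    vals = []
    for _ in range(reps):
        x = sum(random.random() < p for _ in range(n))  # X ~ b(n, p)
        vals.append(math.asin(math.sqrt(x / n)))
    return statistics.variance(vals)

print(sample_var(100, 0.25), sample_var(100, 0.90), 1 / 400)
```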

  24. Multivariate Transformations • When Y = u(X) does not have a single-valued inverse, the possible inverse functions must be considered individually, each range delimited to match the right inverse. • For the multivariate case, the derivative is replaced by the Jacobian. • Let continuous R.V.s X1 and X2 have the joint p.d.f. f(x1, x2). • If Y1 = u1(X1,X2), Y2 = u2(X1,X2) has the single-valued inverse X1 = v1(Y1,Y2), X2 = v2(Y1,Y2), then the joint p.d.f. of Y1 and Y2 is g(y1,y2) = f(v1(y1,y2), v2(y1,y2)) |J|, where J = det[∂xi/∂yj] is the Jacobian. • [Most difficult] The mapping of the supports must be worked out.

  25. Transformation to Independent Variables • Ex. 4.4-2: X1 and X2 have the joint p.d.f. f(x1, x2) = 2, 0 < x1 < x2 < 1. • Consider Y1 = X1/X2, Y2 = X2, i.e., X1 = Y1Y2, X2 = Y2, with |J| = y2. • The support maps to 0 < y1 < 1, 0 < y2 < 1, so g(y1,y2) = 2y2 there. • The marginal p.d.f.s: g1(y1) = 1, 0 < y1 < 1, and g2(y2) = 2y2, 0 < y2 < 1. • ∵ g(y1,y2) = g1(y1)g2(y2) ∴ Y1, Y2 independent.
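
A Monte Carlo sanity check of Ex. 4.4-2 (rejection sampling handles the triangular support):

```python
import random, statistics

# Sample (X1, X2) uniformly on the triangle 0 < x1 < x2 < 1 (density 2)
# and apply the transformation Y1 = X1/X2, Y2 = X2.
y1s, y2s = [], []
while len(y1s) < 100_000:
    x1, x2 = random.random(), random.random()
    if x1 < x2:                      # rejection step for the triangle
        y1s.append(x1 / x2)
        y2s.append(x2)

# Y1 should be U(0,1) (mean 1/2); Y2 has density 2y (mean 2/3);
# independence implies zero correlation.
print(statistics.mean(y1s), statistics.mean(y2s))
print(statistics.correlation(y1s, y2s))   # ~ 0
```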

  26. Transformation to Dependent Variables • Ex. 4.4-3: X1 and X2 are independent, each with p.d.f. f(x) = e^(−x), 0 < x < ∞. • Their joint p.d.f. is f(x1, x2) = e^(−x1)e^(−x2), 0 < x1 < ∞, 0 < x2 < ∞. • Consider Y1 = X1 − X2, Y2 = X1 + X2, i.e., X1 = (Y2+Y1)/2, X2 = (Y2−Y1)/2, with |J| = 1/2. • The support maps to |y1| < y2 < ∞, so g(y1,y2) = (1/2)e^(−y2) there. • The marginal p.d.f. of Y1: g1(y1) = ∫_{|y1|}^{∞} (1/2)e^(−y2) dy2 = (1/2)e^(−|y1|), −∞ < y1 < ∞ — the double exponential p.d.f. • ∵ g(y1,y2) ≠ g1(y1)g2(y2) (the support is not rectangular) ∴ Y1, Y2 dependent.
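
A simulation check that Y1 = X1 − X2 is double exponential (Laplace):

```python
import math, random

# Ex. 4.4-3: Y1 = X1 - X2 with X1, X2 i.i.d. exponential(1) should have
# the double exponential p.d.f. (1/2)e^{-|y|}.
N = 200_000
y1 = [random.expovariate(1.0) - random.expovariate(1.0) for _ in range(N)]

# Compare an empirical density estimate at 0 with (1/2)e^0 = 0.5:
h = 0.1
emp = sum(abs(v) < h / 2 for v in y1) / (N * h)
print(emp, 0.5 * math.exp(0))   # both ~ 0.5
```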

  27. Beta Distribution • Ex. 4.4-4: X1 and X2 have independent gamma distributions with parameters α, θ and β, θ. Their joint p.d.f. is f(x1,x2) = [1/(Γ(α)Γ(β)θ^(α+β))] x1^(α−1) x2^(β−1) e^(−(x1+x2)/θ), 0 < x1, x2 < ∞. • Consider Y1 = X1/(X1+X2), Y2 = X1+X2, i.e., X1 = Y1Y2, X2 = Y2 − Y1Y2, with |J| = y2. • Then g(y1,y2) splits into the Beta p.d.f. g1(y1) = [Γ(α+β)/(Γ(α)Γ(β))] y1^(α−1)(1−y1)^(β−1), 0 < y1 < 1, times the Gamma(α+β, θ) p.d.f. in y2. • ∵ g(y1,y2) = g1(y1)g2(y2) ∴ Y1, Y2 independent.
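
A simulation check of the beta result; α = 2, β = 3, θ = 1 are assumed values for illustration:

```python
import random, statistics

# If X1 ~ Gamma(alpha, theta) and X2 ~ Gamma(beta, theta) are independent,
# then Y1 = X1/(X1+X2) ~ Beta(alpha, beta), independent of Y2 = X1+X2.
alpha, beta, theta = 2.0, 3.0, 1.0   # assumed parameters
N = 200_000
y1 = []
for _ in range(N):
    x1 = random.gammavariate(alpha, theta)
    x2 = random.gammavariate(beta, theta)
    y1.append(x1 / (x1 + x2))

# Beta(2,3) has mean 2/5 = 0.4 and variance 6/(25*6) = 0.04.
print(statistics.mean(y1), statistics.variance(y1))
```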

  28. Box-Muller Transformation • Ex. 5.3-4: X1 and X2 have independent uniform distributions U(0,1). • Consider Z1 = √(−2 ln X1) cos(2πX2), Z2 = √(−2 ln X1) sin(2πX2). • Two independent U(0,1) ⇒ two independent N(0,1)!
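
A direct implementation of the Box-Muller map (the 1 − U trick merely avoids log(0)):

```python
import math, random, statistics

# Box-Muller: two independent U(0,1) variables mapped to two independent
# N(0,1) variables.
def box_muller():
    x1 = 1.0 - random.random()        # in (0, 1], so log(x1) is finite
    x2 = random.random()
    r = math.sqrt(-2.0 * math.log(x1))
    return r * math.cos(2 * math.pi * x2), r * math.sin(2 * math.pi * x2)

zs = [box_muller() for _ in range(100_000)]
z1 = [a for a, b in zs]
z2 = [b for a, b in zs]
print(statistics.mean(z1), statistics.stdev(z1))   # ~ 0, ~ 1
print(statistics.correlation(z1, z2))              # ~ 0
```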

  29. Distribution Function Technique • Ex. 5.3-5: Z is N(0,1), U is χ²(r), Z and U are independent; then T = Z/√(U/r) has a Student's t distribution with r degrees of freedom. • Starting from the joint p.d.f. of Z and U, the inner integral in the derivation is recognized as a χ²(r+1)-type integrand, which integrates out in closed form.

  30. Another Example • Ex. 4.4-5: U ~ χ²(r1) and V ~ χ²(r2) are independent. • The joint p.d.f. of U and V is the product of the two χ² p.d.f.s; after the change of variables, the inner integral is recognized as a χ²(r1+r2)-type integrand. • Knowledge of known distributions and their associated integration relationships is useful for deriving the distributions of new random variables.

  31. Order Statistics • The order statistics are the observations of the random sample arranged in order of magnitude, from smallest to largest. • Assume there are no ties (no identical observations). • Ex. 6.9-1: n = 5 trials {0.62, 0.98, 0.31, 0.81, 0.53} from the p.d.f. f(x) = 2x, 0 < x < 1. The order statistics are {0.31, 0.53, 0.62, 0.81, 0.98}. • The sample median is 0.62, and the sample range is 0.98 − 0.31 = 0.67. • Ex. 6.9-2: Let Y1 < Y2 < Y3 < Y4 < Y5 be the order statistics of X1, X2, X3, X4, X5, each from the p.d.f. f(x) = 2x, 0 < x < 1. • Consider P(Y4 < 1/2) ≡ at least 4 of the Xi's must be less than 1/2, where each "success" {X < 1/2} has probability F(1/2) = (1/2)² = 1/4.

  32. General Cases • The event that the r-th order statistic Yr is at most y, {Yr ≤ y}, occurs iff at least r of the n observations are no more than y. • The probability of "success" on each trial is F(y), and we must have at least r successes. Thus G_r(y) = P(Yr ≤ y) = Σ_{k=r}^{n} C(n,k) [F(y)]^k [1−F(y)]^(n−k), and differentiating gives g_r(y) = n!/[(r−1)!(n−r)!] [F(y)]^(r−1) [1−F(y)]^(n−r) f(y).
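
Evaluating P(Yr ≤ y) from the binomial sum; the last line reproduces P(Y4 < 1/2) = 1/64 for Ex. 6.9-2:

```python
from math import comb

# P(Y_r <= y) for the r-th of n order statistics: at least r "successes",
# each with probability F(y).
def cdf_order(r, n, Fy):
    return sum(comb(n, k) * Fy**k * (1 - Fy)**(n - k)
               for k in range(r, n + 1))

# Ex. 6.9-2: f(x) = 2x on (0,1), so F(1/2) = 1/4; n = 5, r = 4.
print(cdf_order(4, 5, 0.25))   # 0.015625 = 1/64
```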

  33. Alternative Approach • A heuristic derivation of g_r(y): within a short interval Δy around y, • (r−1) observations fall below y, one falls in (y, y+Δy], and (n−r) fall above y+Δy. • The multinomial probability with n trials is then approximately g_r(y)Δy ≈ n!/[(r−1)! 1! (n−r)!] [F(y)]^(r−1) [f(y)Δy] [1−F(y+Δy)]^(n−r), which recovers g_r(y) as Δy → 0. • Ex. 6.9-3 [from Ex. 6.9-2]: Y1 < … < Y5 are the order statistics of X1, …, X5, each from the p.d.f. f(x) = 2x, 0 < x < 1; on a single trial the "success" probability is F(y) = y².

  34. More Examples • Ex: 4 independent trials (Y1 ~ Y4) from a distribution with f(x) = 1, 0 < x < 1. Find the p.d.f. of Y3: g3(y) = [4!/(2!·1!)] y²(1−y) = 12y²(1−y), 0 < y < 1. • Ex: 7 independent trials (Y1 ~ Y7) from a distribution with f(x) = 3(1−x)², 0 < x < 1. Find the probability that the sample median, i.e. Y4, is less than a given point y0. • Method 1: find g4(y), then integrate it up to y0. • Method 2: note {Y4 < y0} occurs iff at least 4 of the 7 observations fall below y0, a binomial probability, by Table II on p. 647.

  35. Order Statistics of Uniform Distributions • Thm. 3.5-2: if X has a continuous distribution function F(x), then F(X) has a U(0,1) distribution. • Hence W1 < W2 < … < Wn, the ordered values of {F(X1), F(X2), …, F(Xn)}, are the order statistics of n independent observations from U(0,1). • The distribution function of U(0,1) is G(w) = w, 0 < w < 1. • The p.d.f. of the r-th order statistic Wr = F(Yr) is the Beta p.d.f. g_r(w) = n!/[(r−1)!(n−r)!] w^(r−1)(1−w)^(n−r), 0 < w < 1, with mean r/(n+1). ⇒ The Y's partition the support of X into n+1 parts, and thus n+1 areas under f(x) and above the x-axis. • Each area equals 1/(n+1) on average.

  36. Percentiles • The (100p)-th percentile πp is defined s.t. the area under f(x) to the left of πp is p. • Yr is an estimator of πp, where r = (n+1)p. • In case (n+1)p is not an integer, a weighted average of Yr and Yr+1 is used, where r = floor[(n+1)p]. • The sample median is the 50th percentile. • Ex. 6.9-5: X is the weight of a bar of soap; n = 12 observations of X are listed: • 1013, 1019, 1021, 1024, 1026, 1028, 1033, 1035, 1039, 1040, 1043, 1047. • ∵ n = 12, the sample median is (Y6+Y7)/2 = (1028+1033)/2 = 1030.5. • ∵ (n+1)(0.25) = 3.25, the 25th percentile or first quartile is Y3 + 0.25(Y4−Y3) = 1021.75. • ∵ (n+1)(0.75) = 9.75, the 75th percentile or third quartile is Y9 + 0.75(Y10−Y9) = 1039.75. • ∵ (n+1)(0.6) = 7.8, the 60th percentile is Y7 + 0.8(Y8−Y7) = 1034.6.
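
The interpolation rule of this slide in code, applied to the soap data:

```python
import math

# Sample percentiles for the soap-weight data of Ex. 6.9-5, using
# r = floor((n+1)p) and linear interpolation between Y_r and Y_{r+1}.
y = [1013, 1019, 1021, 1024, 1026, 1028,
     1033, 1035, 1039, 1040, 1043, 1047]          # already ordered

def percentile(p):
    t = (len(y) + 1) * p
    r = math.floor(t)
    frac = t - r
    return y[r - 1] + frac * (y[r] - y[r - 1])    # Y_r is y[r-1] (0-based)

print(percentile(0.50))   # 1030.5  (sample median)
print(percentile(0.25))   # 1021.75 (first quartile)
print(percentile(0.75))   # 1039.75 (third quartile)
print(percentile(0.60))   # 1034.6  (60th percentile)
```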

  37. Another Example • Ex. 5.6-7: Y1 < Y2 < … < Y13 are the order statistics of 13 independent trials from a continuous-type distribution with 35th percentile π0.35. • Find P(Y3 < π0.35 < Y7). • The event {Y3 < π0.35 < Y7} happens iff there are at least 3 but fewer than 7 "successes" {Xi < π0.35}, where the success probability is p = 0.35; hence P = Σ_{k=3}^{6} C(13,k)(0.35)^k(0.65)^(13−k), by Table II on pp. 677-681.
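
The binomial sum in code:

```python
from math import comb

# P(Y3 < pi_0.35 < Y7) for n = 13 i.i.d. observations equals the
# probability of 3 to 6 successes in b(13, 0.35).
p = 0.35
prob = sum(comb(13, k) * p**k * (1 - p)**(13 - k) for k in range(3, 7))
print(round(prob, 4))   # ~ 0.757
```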
