This analysis examines the probability of a heart attack among middle-aged men, conditional on whether they are balding. It reports that 28% of middle-aged men are balding, with an 18% risk of a heart attack within 10 years for this group. For non-balding men, the probability drops to 11%. The overall probability of a heart attack is calculated by combining these probabilities using the law of total probability. The discussion then moves on to random variables and Bernoulli trials, outlining their expected values, variances, and relevance in statistical modeling.
Econ 240A Power Four
Last Time • Probability
The Classical Statistical Trail Rates & Proportions Inferential Statistics Application Descriptive Statistics Discrete Random Variables Binomial Probability Discrete Probability Distributions; Moments
Problem 6.61 • A survey of middle aged men reveals that 28% of them are balding at the crown of their head. Moreover, it is known that such men have an 18% probability of suffering a heart attack in the next ten years. Men who are not balding in this way have an 11% probability of a heart attack. Find the probability that a middle aged man will suffer a heart attack in the next ten years.
Diagram: middle-aged men (MA), split into Bald and Not Bald • P(Bald and MA) = 0.28 • P(HA/Bald and MA) = 0.18 • P(HA/Not Bald and MA) = 0.11
Probability of a heart attack in the next ten years • P(HA) = P(HA and Bald and MA) + P(HA and Not Bald and MA) • P(HA) = P(HA/Bald and MA)*P(Bald and MA) + P(HA/Not Bald and MA)*P(Not Bald and MA) • P(HA) = 0.18*0.28 + 0.11*0.72 = 0.0504 + 0.0792 = 0.1296
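The calculation above can be checked with a few lines of Python. This is a minimal sketch using the numbers from Problem 6.61; the function and variable names are illustrative and not part of the original slides.

```python
# Law of total probability for Problem 6.61 (numbers taken from the slide).
p_bald = 0.28            # share of middle-aged men who are balding
p_ha_given_bald = 0.18   # P(HA / Bald)
p_ha_given_not = 0.11    # P(HA / Not Bald)


def total_probability(p_b, p_a_given_b, p_a_given_not_b):
    """Return P(A) = P(A/B)*P(B) + P(A/Not B)*P(Not B)."""
    return p_a_given_b * p_b + p_a_given_not_b * (1.0 - p_b)


p_ha = total_probability(p_bald, p_ha_given_bald, p_ha_given_not)
print(f"P(HA) = {p_ha:.4f}")  # P(HA) = 0.1296
```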
Random Variables • There is a natural transition or easy segue from our discussion of probability and Bernoulli trials last time to random variables • Define k to be the random variable # of heads in 1 flip, 2 flips or n flips of a coin • We can find the probability that k=0, or k=n by brute force using probability trees. We can find the histogram for k, its central tendency and its dispersion
Outline • Random Variables & Bernoulli Trials • example: one flip of a coin • expected value of the number of heads • variance in the number of heads • example: two flips of a coin • a fair coin: frequency distribution of the number of heads • one flip • two flips
Outline (Cont.) • Three flips of a fair coin, the number of combinations of the number of heads • The binomial distribution • frequency distributions for the binomial • The expected value of a discrete random variable • the variance of a discrete random variable
Concept • Bernoulli Trial • two outcomes, e.g. success or failure • successive independent trials • probability of success is the same in each trial • Example: flipping a coin multiple times
Flipping a Coin Once • The random variable k is the number of heads • it is variable because k can equal one or zero • it is random because the value of k depends on probabilities of occurrence, p and 1-p • Heads, k=1, Prob. = p • Tails, k=0, Prob. = 1-p
Flipping a coin once • Expected value of the number of heads is the value of k weighted by the probability that value of k occurs • $E(k) = 1 \cdot p + 0 \cdot (1-p) = p$ • variance of k is the value of k minus its expected value, squared, weighted by the probability that value of k occurs • $\mathrm{VAR}(k) = (1-p)^2 p + (0-p)^2 (1-p) = (1-p)\,p\,[(1-p) + p] = (1-p)\,p$
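As a rough numerical check of the one-flip formulas, the sketch below computes the mean and variance directly from the definitions. The helper name `bernoulli_moments` and the value p = 0.3 are illustrative assumptions.

```python
def bernoulli_moments(p):
    """Mean and variance of k (heads in one flip), straight from the definitions."""
    outcomes = {1: p, 0: 1.0 - p}  # value of k -> probability of that value
    mean = sum(k * prob for k, prob in outcomes.items())
    var = sum((k - mean) ** 2 * prob for k, prob in outcomes.items())
    return mean, var


mean, var = bernoulli_moments(0.3)
# Compare with the closed forms E(k) = p and VAR(k) = p*(1-p).
print(mean, 0.3)
print(var, 0.3 * 0.7)
```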
Flipping a coin twice: 4 elementary outcomes (probability tree; on each flip heads has Prob = p and tails has Prob = 1-p) • h, h; k=2 • h, t; k=1 • t, h; k=1 • t, t; k=0
Flipping a Coin Twice • Expected number of heads • $E(k) = 2 p^2 + 1 \cdot p(1-p) + 1 \cdot (1-p)p + 0 \cdot (1-p)^2$ • $E(k) = 2p^2 + p - p^2 + p - p^2 = 2p$ • so we might expect the expected value of k in n independent flips is n*p • Variance in k • $\mathrm{VAR}(k) = (2-2p)^2 p^2 + 2(1-2p)^2 p(1-p) + (0-2p)^2 (1-p)^2$
Continuing with the variance in k • $\mathrm{VAR}(k) = (2-2p)^2 p^2 + 2(1-2p)^2 p(1-p) + (0-2p)^2 (1-p)^2$ • $\mathrm{VAR}(k) = 4(1-p)^2 p^2 + 2(1 - 4p + 4p^2)\,p(1-p) + 4p^2 (1-p)^2$ • adding the first and last terms, $8p^2 (1-p)^2 + 2(1 - 4p + 4p^2)\,p(1-p)$ • and expanding this last term, $2p(1-p) - 8p^2 (1-p) + 8p^3 (1-p)$ • $\mathrm{VAR}(k) = 8p^2 (1-p)^2 + 2p(1-p) - 8p^2 (1-p)(1-p)$ • the first and last terms cancel, so $\mathrm{VAR}(k) = 2p(1-p)$, or twice VAR(k) for 1 flip
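The algebra above can also be checked by brute force, enumerating every head/tail sequence as in the probability tree. This is a sketch only; `heads_moments` and p = 0.3 are illustrative choices, not something from the slides.

```python
from itertools import product


def heads_moments(n, p):
    """Enumerate all 2**n sequences of n flips; return (E(k), VAR(k))."""
    outcomes = []
    for flips in product((1, 0), repeat=n):    # 1 = heads, 0 = tails
        k = sum(flips)                         # number of heads on this branch
        prob = p ** k * (1.0 - p) ** (n - k)   # independent flips multiply
        outcomes.append((k, prob))
    mean = sum(k * prob for k, prob in outcomes)
    var = sum((k - mean) ** 2 * prob for k, prob in outcomes)
    return mean, var


p = 0.3
mean2, var2 = heads_moments(2, p)
print(mean2, 2 * p)            # both should be close to 2p
print(var2, 2 * p * (1 - p))   # both should be close to 2p(1-p)
```

Calling `heads_moments(n, p)` with a larger n gives a quick way to see the n*p and n*p*(1-p) pattern before it is derived.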
Frequency Distribution for the Number of Heads • A fair coin
One Flip of the Coin (histogram of probability against # of heads) • 0 heads: 1/2 • 1 head: 1/2
Two Flips of a Fair Coin (histogram of probability against # of heads) • 0 heads: 1/4 • 1 head: 1/2 • 2 heads: 1/4
Three Flips of a Fair Coin • It is not so hard to see what the value of the number of heads, k, might be for three flips of a coin: zero, one, two, three • But one head can occur three ways, as can two heads • Hence we need to consider the number of ways k can occur, i.e. the combinations of branching probabilities where order does not count
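Three flips of a coin; 8 elementary outcomes (probability tree) • h, h, h: 3 heads • h, h, t: 2 heads • h, t, h: 2 heads • h, t, t: 1 head • t, h, h: 2 heads • t, h, t: 1 head • t, t, h: 1 head • t, t, t: 0 heads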
Three Flips of a Coin • There is only one way of getting three heads or of getting zero heads • But there are three ways of getting two heads or getting one head • One way of calculating the number of combinations is $C_n(k) = \frac{n!}{k!\,(n-k)!}$ • Another way of calculating the number of combinations is Pascal’s triangle
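Both ways of counting combinations can be sketched in Python. The helper names below are illustrative; `math.comb` (Python 3.8+) is used only as an independent check.

```python
from math import comb, factorial


def combinations_formula(n, k):
    """C_n(k) = n! / (k! * (n-k)!), using integer division (result is exact)."""
    return factorial(n) // (factorial(k) * factorial(n - k))


def pascal_row(n):
    """Row n of Pascal's triangle: each entry is the sum of the two above it."""
    row = [1]
    for _ in range(n):
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row


print([combinations_formula(3, k) for k in range(4)])  # [1, 3, 3, 1]
print(pascal_row(3))                                   # [1, 3, 3, 1]
print([comb(3, k) for k in range(4)])                  # [1, 3, 3, 1]
```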
Three Flips of a Coin (histogram of probability against # of heads, fair coin) • 0 heads: 1/8 • 1 head: 3/8 • 2 heads: 3/8 • 3 heads: 1/8
The Probability of Getting k Heads • The probability of getting k heads (along a given branch) in n trials is: $p^k (1-p)^{n-k}$ • The number of branches with k heads in n trials is given by $C_n(k)$ • So the probability of k heads in n trials is $\mathrm{Prob}(k) = C_n(k)\, p^k (1-p)^{n-k}$ • This is the discrete binomial distribution where k can only take on discrete values of 0, 1, …, n
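A minimal sketch of the binomial probability follows, using the three-flip fair-coin case as the example. The function name `binomial_pmf` is an illustrative assumption.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Prob(k) = C_n(k) * p**k * (1-p)**(n-k) for k = 0, 1, ..., n."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


# Three flips of a fair coin: matches the 1/8, 3/8, 3/8, 1/8 histogram.
probs = [binomial_pmf(k, 3, 0.5) for k in range(4)]
print(probs)       # [0.125, 0.375, 0.375, 0.125]
print(sum(probs))  # 1.0
```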
Expected Value of a discrete random variable • $E(x) = \sum_i x_i \, p(x_i)$ • the expected value of a discrete random variable is the weighted average of the observations where the weight is the frequency of that observation
Expected Value of the sum of random variables • E(x + y) = E(x) + E(y)
Expected Number of Heads After Two Flips • Flip One: $k_i^{I}$ heads • Flip Two: $k_j^{II}$ heads • Because of independence $p(k_i^{I} \text{ and } k_j^{II}) = p(k_i^{I}) \, p(k_j^{II})$ • Expected number of heads after two flips: $E(k_i^{I} + k_j^{II}) = \sum_i \sum_j (k_i^{I} + k_j^{II}) \, p(k_i^{I}) \, p(k_j^{II})$
Cont. • $E(k_i^{I} + k_j^{II}) = \sum_i \sum_j k_i^{I} \, p(k_i^{I}) \, p(k_j^{II}) + \sum_i \sum_j k_j^{II} \, p(k_j^{II}) \, p(k_i^{I})$ • since the probabilities for each flip sum to one, $E(k_i^{I} + k_j^{II}) = E(k_i^{I}) + E(k_j^{II}) = p \cdot 1 + p \cdot 1 = 2p$ • So the mean after n flips is n*p
Variance of a discrete random variable • $\mathrm{VAR}(x_i) = \sum_i [x_i - E(x)]^2 \, p(x_i)$ • the variance of a discrete random variable is the weighted sum of each observation minus its expected value, squared, where the weight is the frequency of that observation
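Cont. • $\mathrm{VAR}(x_i) = \sum_i \left[ x_i^2 - 2 x_i E(x) + [E(x)]^2 \right] p(x_i)$ • $\mathrm{VAR}(x_i) = \sum_i x_i^2 \, p(x_i) - 2 E(x) \sum_i x_i \, p(x_i) + [E(x)]^2 \sum_i p(x_i)$ • $\mathrm{VAR}(x_i) = E(x^2) - [E(x)]^2$ • So the variance equals the second moment minus the first moment squared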
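The statement that the variance equals the second moment minus the first moment squared can be checked numerically. The sketch below uses the three-flip fair-coin distribution as the example; `expected_value` is a hypothetical helper, not from the slides.

```python
def expected_value(dist, f=lambda x: x):
    """E[f(x)] for a discrete distribution given as {value: probability}."""
    return sum(f(x) * prob for x, prob in dist.items())


# Number of heads in three flips of a fair coin.
dist = {0: 1 / 8, 1: 3 / 8, 2: 3 / 8, 3: 1 / 8}

mean = expected_value(dist)                                   # first moment
second_moment = expected_value(dist, lambda x: x ** 2)        # E(x^2)
var_definition = expected_value(dist, lambda x: (x - mean) ** 2)
var_moments = second_moment - mean ** 2

print(mean, var_definition, var_moments)  # 1.5 0.75 0.75
```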
The variance of the sum of discrete random variables • $\mathrm{VAR}[x_i + y_j] = E[x_i + y_j - E(x_i + y_j)]^2$ • $\mathrm{VAR}[x_i + y_j] = E[(x_i - Ex_i) + (y_j - Ey_j)]^2$ • $\mathrm{VAR}[x_i + y_j] = E[(x_i - Ex_i)^2 + 2(x_i - Ex_i)(y_j - Ey_j) + (y_j - Ey_j)^2]$ • $\mathrm{VAR}[x_i + y_j] = \mathrm{VAR}[x_i] + 2\,\mathrm{COV}[x_i * y_j] + \mathrm{VAR}[y_j]$
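The variance of the sum if x and y are independent • $\mathrm{COV}[x_i * y_j] = E(x_i - Ex_i)(y_j - Ey_j)$ • $\mathrm{COV}[x_i * y_j] = \sum_i \sum_j (x_i - Ex_i)(y_j - Ey_j) \, p[x(i), y(j)]$ • by independence the joint probability factors, so $\mathrm{COV}[x_i * y_j] = \sum_i (x_i - Ex_i) \, p[x(i)] \cdot \sum_j (y_j - Ey_j) \, p[y(j)]$ • each sum is zero, so $\mathrm{COV}[x_i * y_j] = 0$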
Variance of the number of heads after two flips • Since we know the variance of the number of heads on the first flip is p*(1-p) • and ditto for the variance in the number of heads for the second flip • then the variance in the number of heads after two flips is the sum, 2p(1-p) • and the variance after n flips is np(1-p)
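A small numerical sketch of this argument: build the joint distribution of two independent flips, then compute the covariance and the variance of the sum directly. The function name and the value p = 0.3 are illustrative assumptions.

```python
from itertools import product


def two_flip_check(p):
    """Return (VAR of one flip, COV between flips, VAR of the sum of two flips)."""
    flip = {1: p, 0: 1.0 - p}  # heads count for one flip -> probability
    mean = sum(k * q for k, q in flip.items())
    var_one = sum((k - mean) ** 2 * q for k, q in flip.items())

    cov = 0.0
    var_sum = 0.0
    for (k1, q1), (k2, q2) in product(flip.items(), repeat=2):
        joint = q1 * q2  # independence: joint probability is the product
        cov += (k1 - mean) * (k2 - mean) * joint
        var_sum += (k1 + k2 - 2 * mean) ** 2 * joint
    return var_one, cov, var_sum


var_one, cov, var_sum = two_flip_check(0.3)
print(var_one, cov, var_sum)  # cov is (numerically) zero; var_sum is about 2*var_one
```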
Application • Rates and Proportions
Field Poll • The estimated proportion, from the sample, that will vote for Giuliani is: $\hat{p} = k/n$ • where $\hat{p}$ is 0.35 or 35% • k is the number of “successes”, the number of likely voters sampled who are for Giuliani, approximately 122 • n is the size of the sample, 348
Field Poll • What is the expected proportion of voters Nov. 7 who will vote for Giuliani? • $E(\hat{p}) = E(k)/n = np/n = p$, where from the binomial distribution, E(k) = np • So if the sample is representative of voters and their preferences, 35% should vote for Giuliani next February
Field Poll • How much dispersion is in this estimate, i.e. as reported by the Field Poll, what is the sampling error? • The sampling error is calculated as twice the standard deviation, or square root of the variance, in $\hat{p}$ • $\mathrm{VAR}(\hat{p}) = \mathrm{VAR}(k)/n^2 = np(1-p)/n^2 = p(1-p)/n$ • and using 0.35 as an estimate of p, • $\mathrm{VAR}(\hat{p}) = 0.35 \cdot 0.65/348 = 0.000654$
Field Poll • The standard deviation is $\sqrt{0.000654} \approx 0.026$, so the sampling error should be 2*0.026 or 5.2%. • The Field Poll reports a 95% confidence interval, or about two standard errors, i.e. 2*2.6% = 5.2% ~ 5.4%
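The sampling-error arithmetic can be reproduced as follows. This is a sketch using the slide's estimate of 0.35 and sample size of 348; the variable names are illustrative. Note that doubling the unrounded standard error gives about 5.1%, while the slide rounds the standard error to 0.026 first and so reports 5.2%.

```python
from math import sqrt

p_hat = 0.35   # estimated proportion from the sample (about 122 / 348)
n = 348        # sample size

variance = p_hat * (1.0 - p_hat) / n   # VAR(p_hat) = p(1-p)/n
std_error = sqrt(variance)             # standard deviation of p_hat
sampling_error = 2.0 * std_error       # "twice the standard deviation"

print(f"variance       = {variance:.6f}")
print(f"standard error = {std_error:.4f}")
print(f"sampling error = {sampling_error:.4f}")
```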
Field Poll • Is it possible that Giuliani might get 50% of the vote or more? Not likely, since the probability of Giuliani receiving more than 40% of the vote is only about 2.5% • Based on a normal approximation to the binomial, the true proportion voting for Giuliani should fall between 29.5% and 40.5% with probability of about 95%, unless sentiments change.
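The normal approximation can be sketched with the standard library's `statistics.NormalDist` (Python 3.8+). The inputs 0.35 and 348 come from the slides; everything else is an illustrative assumption. The two-standard-error interval computed here comes out slightly narrower than the 29.5% to 40.5% quoted on the slide, which appears to use the poll's reported sampling error rather than the unrounded figure.

```python
from math import sqrt
from statistics import NormalDist

p_hat, n = 0.35, 348
std_error = sqrt(p_hat * (1.0 - p_hat) / n)

# Normal approximation to the sampling distribution of p_hat.
approx = NormalDist(mu=p_hat, sigma=std_error)

prob_above_40 = 1.0 - approx.cdf(0.40)   # upper-tail probability beyond 40%
prob_above_50 = 1.0 - approx.cdf(0.50)   # upper-tail probability beyond 50%
low, high = p_hat - 2 * std_error, p_hat + 2 * std_error

print(f"P(share > 40%) = {prob_above_40:.4f}")
print(f"P(share > 50%) = {prob_above_50:.2e}")
print(f"approx. 95% interval: {low:.3f} to {high:.3f}")
```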