Introduction to Probability : Binomial & Normal Distribution Dr. Marvin Reid
Objectives • Define probability and its importance in statistical theory • Describe the addition and multiplication rules for combining probabilities, including joint probability under statistical independence • Describe the properties and uses of two probability distributions: the normal and binomial distributions.
Review • Populations • Samples drawn from these populations • Methods used to summarize the data obtained from these samples • The relation between sample and population is uncertain. Thus, to make inferences from our data, we need mathematical models that capture this uncertainty • The foundation of statistical models is probability theory
Probability • Relative frequency • Degree of belief • Probability reasons from the population to the sample • Probability lies between 0 and 1
Definitions • Trial (Experiment) – any process that generates a set of results • Outcome – the result of carrying out the trial • Event – one or more outcomes • Marginal probability – the probability that a single event occurs
Joint conditions • Mutually exclusive if the events cannot occur simultaneously • Independent if the occurrence of an event does not influence the probability of another event occurring
Flowchart: choosing a probability rule • Start: are the events mutually exclusive? Yes: Addition Rule P(A or B)=P(A)+P(B). No: Addition Rule P(A or B)=P(A)+P(B)-P(AB) • Are the events statistically independent? Yes: Joint Probability P(AB)=P(A) x P(B) • Marginal Probability P(A)
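The addition and multiplication rules can be checked by direct counting. The sketch below is only an illustration: the fair six-sided die and the events `even`, `odd` and `low` are assumptions chosen for the example and do not come from the slides.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (illustrative assumption)
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event = favourable outcomes / all outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

even = {2, 4, 6}
odd = {1, 3, 5}
low = {1, 2}  # a roll of 1 or 2

# Mutually exclusive events: P(A or B) = P(A) + P(B)
p_even_or_odd = prob(even) + prob(odd)

# Not mutually exclusive: P(A or B) = P(A) + P(B) - P(AB)
p_even_or_low = prob(even) + prob(low) - prob(even & low)

# Statistically independent events: P(AB) = P(A) x P(B)
# "even" and "low" happen to be independent for a fair die.
p_even_and_low = prob(even) * prob(low)

print(p_even_or_odd, p_even_or_low, p_even_and_low)
```

Counting the outcomes in `even | low` and `even & low` directly gives the same values, which is a quick way to confirm that each rule was applied to the right kind of event pair.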
Probability distributions • Many statistical methods use probability distributions • A probability distribution is used to calculate the theoretical probability of different values occurring • Normal distribution – continuous data • Binomial distribution – discrete data
Normal Distribution • Extends from –infinity to +infinity • Height=probability density • Area under curve=1 • Unimodal • mean=median=mode
Normal distribution: variation with sd • Completely described by the mean & sd • [Figure: two normal curves, y1 with µ=50, σ=5 and y2 with µ=50, σ=10]
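A minimal sketch of the two curves, using the standard-library `statistics.NormalDist` class. The parameters µ=50 with σ=5 and σ=10 are taken from the slide; the choice of quantities to print is an assumption made for illustration.

```python
from statistics import NormalDist

# The two curves from the slide: same mean, different standard deviation
y1 = NormalDist(mu=50, sigma=5)
y2 = NormalDist(mu=50, sigma=10)

for name, dist in (("y1", y1), ("y2", y2)):
    # Height of the curve at the mean (probability density, not probability)
    peak = dist.pdf(dist.mean)
    # Area under the curve within one sd of the mean
    within_1sd = dist.cdf(dist.mean + dist.stdev) - dist.cdf(dist.mean - dist.stdev)
    print(f"{name}: peak density = {peak:.4f}, P(within 1 sd) = {within_1sd:.4f}")
```

Doubling the sd halves the peak height and spreads the curve out, while the area within one sd of the mean stays at roughly 68% for both curves.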
Standard Normal Distribution • Any normally distributed variable can be related to the standard normal distribution, whose mean is zero and standard deviation is 1. • This is done with the calculation $Z = \frac{x - \mu}{\sigma}$ • Z is the distance of x from the mean along the x axis, in sd units
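The standardization step can be sketched as follows, assuming the usual definition Z = (x − µ) / σ. The value x=60 is a hypothetical observation; µ=50 and σ=5 reuse the parameters of the y1 curve.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mu) / sigma

# Hypothetical observation x = 60 from a normal distribution with mu=50, sigma=5
z = z_score(60, mu=50, sigma=5)

# Probability of a value at or below x, read off the standard normal curve
p_below = NormalDist().cdf(z)

print(z, round(p_below, 4))
```

Looking up `z` on the standard normal distribution gives the same probability as evaluating the original distribution at x, which is why a single table of the standard normal is enough for any mean and sd.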
Some uses of the Normal Distribution • Used to calculate the probability of values falling within a specified range, e.g. 95% CI = m ± (1.96 x se) • Used to test inferences about the difference between a single mean and a hypothesized value, and about the difference between two means
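A sketch of the 95% CI calculation m ± (1.96 x se) for a mean. The `sample` values are hypothetical, and the standard error is assumed to be sd/√n. For a sample this small a t multiplier would normally replace 1.96; the normal multiplier is kept here only to mirror the formula on the slide.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample, for illustration only
sample = [48, 52, 55, 47, 50, 53, 49, 51, 54, 46]

m = mean(sample)
se = stdev(sample) / sqrt(len(sample))  # standard error of the mean (assumed sd / sqrt(n))

# Two-sided 95% point of the standard normal distribution (about 1.96)
z = NormalDist().inv_cdf(0.975)

lower, upper = m - z * se, m + z * se
print(f"mean = {m}, 95% CI = {lower:.2f} to {upper:.2f}")
```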
The Binomial Distribution • Describes discrete data resulting from a series of trials known as a Bernoulli process • Each experiment (trial) has only 2 possible outcomes. • The probability of each outcome remains fixed from trial to trial. • The trials are statistically independent. • Example: tossing a fair coin
Binomial Formula • p=probability of success • q=(1-p)=probability of failure • r=number of successes desired • n=number of trials undertaken • Probability of r successes in n trials: $P(r) = \frac{n!}{r!\,(n-r)!}\, p^{r} q^{n-r}$
Problem • A couple, each with sickle cell trait, have 4 children. What is the probability that exactly two of the children will have sickle cell disease? • p(SS)=0.25, q(non-SS)=0.75, n=4, r=2.
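The problem can be worked through with the standard binomial formula, P(r) = n! / (r!(n−r)!) × p^r × q^(n−r). The function name `binomial_pmf` is a hypothetical helper, and the sketch assumes the question asks for exactly two affected children.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# Values from the problem: p(SS) = 0.25, n = 4 children, r = 2 affected
p_two = binomial_pmf(2, 4, 0.25)

# Probabilities for 0..4 affected children; these should sum to 1
all_probs = [binomial_pmf(r, 4, 0.25) for r in range(5)]

print(round(p_two, 4))
```

Here `comb(4, 2)` is 6, so the result is 6 × 0.25² × 0.75², or about 0.21.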
Characteristics of the Binomial Distribution • When p is small the binomial distribution is skewed to the right • As p increases towards 0.5 the skewness is less noticeable • When p=0.5 the binomial distribution is symmetrical • When p>0.5 the distribution is skewed to the left • As n increases the binomial distribution approximates the normal distribution (rule of thumb: np and nq both >5)
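These shape properties can be checked numerically. The sketch below computes skewness (the third standardized moment) directly from the binomial probabilities. The function name `binomial_skewness` and the choices n=10 and n=100 are assumptions made for illustration.

```python
from math import comb

def binomial_skewness(n, p):
    """Third standardized moment of a binomial distribution, from its probabilities."""
    pmf = [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(pmf))
    var = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))
    third = sum((r - mean) ** 3 * pr for r, pr in enumerate(pmf))
    return third / var**1.5

# Positive = skewed to the right, zero = symmetrical, negative = skewed to the left
for p in (0.1, 0.4, 0.5, 0.7):
    print(p, round(binomial_skewness(10, p), 3))

# With p fixed, a larger n brings the skewness closer to zero
print(round(binomial_skewness(100, 0.1), 3))
```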
Family of Binomial Distributions • [Figure: binomial distributions for p=0.1, p=0.5, p=0.7 and p=0.4]
Binomial statistics • Mean of a Binomial Distribution: $\mu = np$ • Standard Deviation of a Binomial Distribution: $\sigma = \sqrt{npq}$ • Standard error of the proportion: $se(p) = \sqrt{\frac{pq}{n}}$
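A short sketch of these three statistics, using the standard results np, √(npq) and √(pq/n). The function names are hypothetical, and n=4, p=0.25 are reused from the sickle cell problem purely as an illustration.

```python
from math import sqrt

def binomial_mean(n, p):
    """Expected number of successes in n trials."""
    return n * p

def binomial_sd(n, p):
    """Standard deviation of the number of successes."""
    return sqrt(n * p * (1 - p))

def se_proportion(n, p):
    """Standard error of a proportion p estimated from n observations."""
    return sqrt(p * (1 - p) / n)

# Illustration: n = 4 trials with p = 0.25
print(binomial_mean(4, 0.25), round(binomial_sd(4, 0.25), 3), round(se_proportion(4, 0.25), 3))
```

The standard error of the proportion is the standard deviation of the count divided by n, since the proportion is the count rescaled by the number of trials.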
Some uses of the Binomial distribution • Used to calculate the probability of values falling within a specified range, e.g. a confidence interval (CI) • Used to test inferences about the difference between a single proportion and a hypothesized value, and about the difference between two proportions
Calculating interval estimates of the proportion from large samples • Interval estimate: $p \pm Z \sqrt{\frac{pq}{n}}$ • Z is the appropriate percentage point of the normal distribution
Example • Dr. McGaw-Binns surveyed 150 medical students and found that 42% of them had a sedentary lifestyle • A) Estimate the standard error of the proportion • B) Construct a 95% confidence interval for the true proportion of students who had a sedentary lifestyle
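Solution to example • n=150, p=0.42, q=0.58 • A) $se = \sqrt{pq/n} = \sqrt{(0.42 \times 0.58)/150} \approx 0.0403$ • B) 95% CI $= 0.42 \pm (1.96 \times 0.0403)$, i.e. approximately 0.341 to 0.499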
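The calculation for this example can be sketched in Python, assuming the large-sample (normal approximation) interval p ± Z × √(pq/n). Only n=150 and p=0.42 come from the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 150, 0.42  # values from the example
q = 1 - p

# A) standard error of the proportion
se = sqrt(p * q / n)

# B) 95% confidence interval using the normal approximation
z = NormalDist().inv_cdf(0.975)  # about 1.96
lower, upper = p - z * se, p + z * se

print(f"se = {se:.4f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

This large-sample interval is symmetric about 0.42. An exact binomial interval, such as the one statistical packages can report, need not be symmetric and may differ slightly at the limits.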
Solution with Stata • The Stata command: cii number probability
    cii 150 0.42
                                 -- Binomial Exact --
        Variable |   Obs    Mean    Std. Err.    [95% Conf. Interval]
    -------------+----------------------------------------------------
                 |   150     .42    .0402989     .3399811    .503244
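Comparing two proportions • $z = \frac{p_1 - p_2}{se(p_1 - p_2)}$, where $se(p_1 - p_2) = \sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ • p1 and p2 are the sample proportions • n1 and n2 are the two sample sizes • se=standard error of the difference • p=overall proportion based on the two sample proportions • Compare the calculated z with the appropriate percentage point Zα of the normal distribution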
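A minimal sketch of the comparison, assuming the usual pooled standard error se = √(p(1−p)(1/n1 + 1/n2)), with p the overall proportion across both samples. The function name `two_proportion_z` and the counts (63 of 150 versus 45 of 150) are hypothetical and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(r1, n1, r2, n2):
    """z statistic and two-sided p-value for the difference between two proportions.

    r1, r2 are the numbers of "successes"; n1, n2 are the sample sizes.
    Uses a pooled standard error (an assumption of this sketch).
    """
    p1, p2 = r1 / n1, r2 / n2
    p = (r1 + r2) / (n1 + n2)  # overall proportion from both samples
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 63 of 150 in group 1, 45 of 150 in group 2
z, p_value = two_proportion_z(63, 150, 45, 150)
print(round(z, 3), round(p_value, 4))
```

For a two-sided test at the 5% level, the calculated z is compared with 1.96, the same percentage point used for the 95% confidence interval.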