Introduction to Bayesian statistics

Three approaches to probability
• Axiomatic: probability by definition and properties
• Relative frequency: repeated trials
• Degree of belief (subjective): personal measure of uncertainty
• Problems
  • The chance that a meteor strikes earth is 1%
  • The probability of rain today is 30%
  • The chance of getting an A on the exam is 50%

Problems of statistical inference
• Ho: θ = 1 versus Ha: θ > 1
• Classical approach
  • P-value = P(data at least as extreme as observed | θ = 1)
  • P-value is NOT P(Null hypothesis is true)
  • Confidence interval [a, b]: what does it mean?
• But the scientist wants to know:
  • P(θ = 1 | Data)
  • P(Ho is true) = ?
• Problem: in the classical approach θ is a fixed constant, “not random”, so these probabilities are not defined

Bayesian statistics
• Fundamental change in philosophy
• θ is assumed to be a random variable
• Allows us to assign a probability distribution for θ based on prior information
• 95% “confidence” interval [1.34 < θ < 2.97] means what we “want” it to mean: P(1.34 < θ < 2.97 | Data) = 95%
• The analogue of a P-value means what we want it to mean: P(Null hypothesis is true | Data)

Estimating P(Heads) for a biased coin
• Parameter p
• Data: 0, 0, 0, 1, 0, 1, 0, 0, 1, 0
• Classical estimate: p-hat = 3/10 = 0.3
• But what if we believe the coin is biased in favor of low probabilities?
• How do we incorporate prior beliefs into the model?
• We’ll see that p-hat = .22

Bayes Theorem
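
For reference, the standard statement of Bayes' theorem for two events $A$ and $B$ with $P(B) > 0$, with the denominator expanded by the law of total probability ($A^{c}$ denotes the complement of $A$; the symbols are chosen here for illustration):

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
            = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^{c})\,P(A^{c})}
```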

Example
• Population has 10% liars
• Lie detector gets it “right” 90% of the time
• Let A = {Actual liar}
• Let R = {Lie detector reports you are a liar}
• The lie detector reports that the suspect is a liar. What is the probability that the suspect actually is a liar?
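
The lie-detector question can be worked through numerically. This is a minimal sketch in Python, assuming that “right 90% of the time” means both P(R | A) = 0.9 and P(not R | not A) = 0.9, which the slide does not spell out; the variable names are illustrative.

```python
# Lie-detector example via Bayes' theorem.
# Assumption (not stated on the slide): the detector is "right" 90% of the
# time for liars and for non-liars alike.
p_liar = 0.10                 # P(A): prior probability of being a liar
p_report_given_liar = 0.90    # P(R | A)
p_report_given_honest = 0.10  # P(R | not A) = 1 - 0.90

# Law of total probability: P(R)
p_report = (p_report_given_liar * p_liar
            + p_report_given_honest * (1 - p_liar))

# Bayes' theorem: P(A | R)
p_liar_given_report = p_report_given_liar * p_liar / p_report
print(f"P(A | R) = {p_liar_given_report:.2f}")
```

Under these assumptions the answer is 0.5: a positive report is as likely to come from one of the many honest people as from one of the few liars.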

More general form of Bayes Theorem
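
For reference, the standard general form, assuming events $A_1, \dots, A_k$ partition the sample space and $B$ has positive probability (the symbols are chosen here for illustration):

```latex
P(A_i \mid B) = \frac{P(B \mid A_i)\,P(A_i)}{\sum_{j=1}^{k} P(B \mid A_j)\,P(A_j)},
\qquad i = 1, \dots, k
```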

Example
• Three urns
  • Urn A: 1 red, 1 blue
  • Urn B: 2 reds, 1 blue
  • Urn C: 2 reds, 3 blues
• Roll a fair die. If it’s 1, pick Urn A. If 2 or 3, pick Urn B. If 4, 5, or 6, pick Urn C. Then choose one ball.
• A ball was chosen and it’s red. What’s the probability it came from Urn C?
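
The urn question is a direct application of the general form of Bayes' theorem. The sketch below uses exact fractions; it assumes the ball is drawn uniformly at random from the chosen urn, and the dictionary names are illustrative.

```python
from fractions import Fraction

# Prior probability of picking each urn, from the die roll.
prior = {"A": Fraction(1, 6), "B": Fraction(2, 6), "C": Fraction(3, 6)}
# P(red | urn), from each urn's contents.
p_red = {"A": Fraction(1, 2), "B": Fraction(2, 3), "C": Fraction(2, 5)}

# Denominator of Bayes' theorem: total probability of drawing a red ball.
total_red = sum(prior[u] * p_red[u] for u in prior)

# Posterior probability of each urn given that the ball is red.
posterior = {u: prior[u] * p_red[u] / total_red for u in prior}
print(posterior["C"], float(posterior["C"]))
```

This gives P(Urn C | red) = 36/91, about 0.40, which is lower than the prior probability 1/2 of picking Urn C because Urn C has the smallest share of red balls.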

Bayes Theorem for Statistics
• Let θ represent parameter(s)
• Let X represent data
• $f(\theta \mid X) = \dfrac{f(X \mid \theta)\, f(\theta)}{f(X)}$
• Left-hand side is a function of θ
• Denominator on right-hand side does not depend on θ
• Posterior distribution ∝ Likelihood x Prior distribution
• Posterior dist’n = Constant x Likelihood x Prior dist’n
• Equation can be understood at the level of densities
• Goal: Explore the posterior distribution of θ

A simple estimation example
• Biased coin estimation: P(Heads) = p = ?
• 0-1 i.i.d. Bernoulli(p) trials
• Let X be the number of heads in n trials
• Likelihood is $f(X \mid p) = \binom{n}{x} p^{x} (1-p)^{n-x}$
• For prior distribution use uninformative prior
• Uniform distribution on (0,1): f(p) = 1
• So posterior distribution is proportional to $f(X \mid p)\, f(p) = \binom{n}{x} p^{x} (1-p)^{n-x}$
• $f(p \mid X) \propto p^{x} (1-p)^{n-x}$

Coin estimation (cont’d)
• Posterior density is of the form $f(p \mid X) = C\, p^{x} (1-p)^{n-x}$
• Beta distribution: Parameters x+1 and n-x+1
• http://mathworld.wolfram.com/BetaDistribution.html
• Data: 0, 0, 1, 0, 0, 0, 0, 1, 0, 1
• n=10 and x=3
• Posterior dist’n is Beta(3+1,7+1) = Beta(4,8)

Coin estimation (cont’d)
• Posterior dist’n: Beta(4,8)
• Mean: 0.33
• Mode: 0.30
• Median: 0.3238
• [ qbeta(.025,4,8), qbeta(.975,4,8) ] = [.11, .61] gives 95% credible interval for p
• P(.11 < p < .61 | X) = .95
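
The Beta(4,8) summaries can be checked without a statistics package. The slides use qbeta from S/R; the sketch below is a stand-in in plain Python, not the course's script. It assumes integer Beta parameters, so the CDF can be written as a binomial tail sum, and the helper names beta_cdf_int and beta_quantile_int are hypothetical.

```python
import math

def beta_cdf_int(p, a, b):
    """CDF of Beta(a, b) at p, for positive integers a and b.

    Uses the identity I_p(a, b) = P(Binomial(a + b - 1, p) >= a).
    """
    n = a + b - 1
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(a, n + 1))

def beta_quantile_int(q, a, b, tol=1e-10):
    """Quantile of Beta(a, b), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

a, b = 4, 8                      # posterior Beta(4, 8)
mean = a / (a + b)
mode = (a - 1) / (a + b - 2)
median = beta_quantile_int(0.5, a, b)
lower = beta_quantile_int(0.025, a, b)
upper = beta_quantile_int(0.975, a, b)
print(f"mean={mean:.2f} mode={mode:.2f} median={median:.4f}")
print(f"95% credible interval: [{lower:.2f}, {upper:.2f}]")
```

Bisection is slow compared with a library quantile function, but it keeps the sketch dependency-free and makes it explicit that a quantile is the point where the posterior CDF reaches the requested probability.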

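Prior distribution
• Choice of beta distribution for prior: p ~ Beta(a, b), with density $f(p) \propto p^{a-1} (1-p)^{b-1}$
• Posterior ∝ Likelihood x Prior
  $= [\, p^{x} (1-p)^{n-x} \,]\,[\, p^{a-1} (1-p)^{b-1} \,] = p^{x+a-1} (1-p)^{n-x+b-1}$
• Posterior distribution is Beta(x+a, n-x+b)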

Prior distributions
• Posterior summaries:
  • Mean = (x+a)/(n+a+b)
  • Mode = (x+a-1)/(n+a+b-2)
  • Quantiles can be computed by integrating the beta density
• For this example, prior and posterior distributions have the same general form
• Priors which have the same form as the posteriors are called conjugate priors
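
The conjugate update amounts to adding the counts to the prior parameters. A minimal Python sketch, with hypothetical function names, applied to the coin data (x = 3 heads in n = 10 trials) under the uniform Beta(1,1) prior:

```python
def beta_binomial_update(a, b, x, n):
    """Posterior Beta parameters after x successes in n trials,
    starting from a Beta(a, b) prior."""
    return a + x, b + (n - x)

def beta_mean(a, b):
    return a / (a + b)

def beta_mode(a, b):
    # This formula is valid only when a > 1 and b > 1.
    return (a - 1) / (a + b - 2)

# Uniform prior Beta(1, 1) combined with the coin data.
post_a, post_b = beta_binomial_update(1, 1, x=3, n=10)
print(post_a, post_b)
print(beta_mean(post_a, post_b), beta_mode(post_a, post_b))
```

With a = b = 1 the posterior is Beta(4,8), and the two summary functions give (x+a)/(n+a+b) and (x+a-1)/(n+a+b-2) evaluated at the posterior parameters.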

Data example
• Maternal condition placenta previa
• Unusual condition of pregnancy where the placenta is implanted very low in the uterus, preventing normal delivery
• Is this related to the sex of the baby?
• Proportion of female births in general population is 0.485
• Early study in Germany found that in 980 placenta previa births, 437 were female (437/980 = 0.4459)
• Ho: p = 0.485 versus Ha: p < 0.485

Placenta previa births
• Assume uniform prior Beta(1,1)
• Posterior is Beta(437+1, 543+1) = Beta(438,544)
• Posterior summaries
  • Mean = 0.446, Standard Deviation = 0.016
  • 95% credible interval: [ qbeta(.025,438,544), qbeta(.975,438,544) ] = [ .415, .477 ]
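
The Beta(438,544) summaries can be reproduced along the same lines. This is a sketch in plain Python rather than the S/R qbeta call on the slide; it assumes integer Beta parameters and builds the binomial terms in log space so that nothing overflows at this sample size. The helper names are hypothetical. It also computes the posterior probability that p is below the null value 0.485, which is the quantity a Bayesian analysis can report directly.

```python
import math

def beta_cdf_int(p, a, b):
    """CDF of Beta(a, b) at p for positive integers a, b, using
    I_p(a, b) = P(Binomial(a + b - 1, p) >= a), with terms in log space."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    n = a + b - 1
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(a, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1)
                    + k * log_p + (n - k) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)

def beta_quantile_int(q, a, b, tol=1e-8):
    """Quantile of Beta(a, b), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

a, b = 438, 544                  # posterior under the uniform prior
post_mean = a / (a + b)
post_sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
lower = beta_quantile_int(0.025, a, b)
upper = beta_quantile_int(0.975, a, b)
prob_below_null = beta_cdf_int(0.485, a, b)   # P(p < 0.485 | data)
print(f"mean={post_mean:.3f} sd={post_sd:.3f}")
print(f"95% credible interval: [{lower:.3f}, {upper:.3f}]")
print(f"P(p < 0.485 | data) = {prob_below_null:.3f}")
```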

Sensitivity of Prior
• Suppose we took a prior more concentrated about the null hypothesis value
• E.g., Prior ~ Normal(.485,.01)
• Posterior proportional to $p^{437} (1-p)^{543}$ x (Normal prior density at p)
• Constant of integration is about $10^{-294}$
• Mean, summary statistics, credible intervals, etc., require numerical methods
• See S-script: http://www.people.carleton.edu/~rdobrow/courses/275w05/Scripts/Bayes.ssc
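
One simple numerical method for a non-conjugate prior is a grid approximation. The sketch below is not the linked S-script. It assumes that the second argument of Normal(.485,.01) is the standard deviation (the slide does not say whether it is a standard deviation or a variance) and that the prior is restricted to 0 < p < 1; the names and the grid size are illustrative choices.

```python
import math

# Assumptions (not confirmed by the slide): .01 is the prior standard
# deviation, and the prior is truncated to the interval (0, 1).
PRIOR_MEAN, PRIOR_SD = 0.485, 0.01
X, N = 437, 980                  # female births, total births

def log_unnormalized_posterior(p):
    """log of (likelihood kernel x Normal prior kernel)."""
    log_lik = X * math.log(p) + (N - X) * math.log1p(-p)
    log_prior = -0.5 * ((p - PRIOR_MEAN) / PRIOR_SD) ** 2
    return log_lik + log_prior

# Midpoint grid on (0, 1).
GRID_SIZE = 20000
step = 1.0 / GRID_SIZE
grid = [(i + 0.5) * step for i in range(GRID_SIZE)]
log_post = [log_unnormalized_posterior(p) for p in grid]

# Subtract the maximum before exponentiating so the tiny values
# (around 10^-294) do not underflow.
log_max = max(log_post)
weights = [math.exp(lp - log_max) for lp in log_post]
total = sum(weights)
probs = [w / total for w in weights]

post_mean = sum(p * w for p, w in zip(grid, probs))
post_sd = math.sqrt(sum((p - post_mean) ** 2 * w for p, w in zip(grid, probs)))

def grid_quantile(q):
    """Smallest grid point whose cumulative posterior mass reaches q."""
    cumulative = 0.0
    for p, w in zip(grid, probs):
        cumulative += w
        if cumulative >= q:
            return p
    return grid[-1]

# log10 of the integral of the unnormalized kernel defined above.
log10_const = (log_max + math.log(total * step)) / math.log(10)

print(f"mean={post_mean:.3f} sd={post_sd:.4f}")
print(f"95% credible interval: [{grid_quantile(0.025):.3f}, {grid_quantile(0.975):.3f}]")
print(f"log10 of normalizing integral: {log10_const:.1f}")
```

Under these assumptions the posterior mean is pulled from the sample proportion toward the prior mean 0.485, which is the sensitivity the slide refers to. The size of the normalizing integral depends on which constant factors are kept in the kernel, so the printed value is only indicative of its order of magnitude.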
