
Introduction to Statistical Inference




Presentation Transcript


    1. Introduction to Statistical Inference J. Verducci MBI Summer Workshop August, 2005

    2. Recommended Text Fred L. RAMSEY and Daniel W. SCHAFER. The Statistical Sleuth: A Course in Methods of Data Analysis Belmont, CA: Duxbury, 2002, xxvi + 742 pp., $97.95 (H + CD), ISBN: 0-534-38670-9.

    3. Outline
       - Underlying philosophies about data
       - Basics of probability and random variables
       - Estimating a population proportion
       - Inferring a difference between two distributions
       - Assumptions about distributional forms: Normal theory; nonparametrics
       - Hypothesis testing: t-test; Mann-Whitney-Wilcoxon test
       - Multiple comparisons: Bonferroni; False Discovery Rate

    4. Affymetrix MAS 5.0 Expression Set

    5. Expression Data Matrix: 30,000 genes × 30 patients

    6. Philosophy
       Frequentist: The observed data X are an imperfect representation of an underlying, idealized, fixed truth θ. Law of Large Numbers: when an experiment is repeated faithfully, the average of the observations comes closer to the idealized value.
       Bayesian: The observed data X are fixed, and the unknown generating parameter θ is random. Certainty about θ depends on both the empirical information X and prior knowledge about θ.
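The Law of Large Numbers above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; theta = 0.3 and n = 100,000 are arbitrary illustrative values.

```python
import random

# Law of Large Numbers sketch: the running proportion of successes
# in repeated Bernoulli trials approaches the true parameter theta.
random.seed(42)          # fixed seed for reproducibility
theta = 0.3              # illustrative "idealized fixed truth"
n = 100_000              # number of repeated experiments
hits = sum(1 for _ in range(n) if random.random() < theta)
print(hits / n)          # close to 0.3
```

With 100,000 trials the standard error of the sample proportion is about 0.0014, so the printed value sits very close to the true theta.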

    7. Examples: θ as a Population Percentage or Average
       Parameter θ: percent of the population with a particular allele; percent of free throws made by Shaq over his entire career; mean expression level of the BRCA1 gene in breast cancer cells.
       Statistic x: percent observed in a sample of 100 people; the set of Shaq's yearly free-throw percentages up to June 2005; sample averages from patients in Stages 1-4, and from patients with high and low HER2 expression.

    8. Key Terms
       Population (sample space Ω): the set of all possible outcomes of an experiment.
       Sample: a subset of the population (an event) that is observed.
       (Generative / probability) Model: a description of how samples are obtained from the population.
       Parameter: a feature of the population used to describe the model.
       Statistic: a summary of the sample that conveys information about the parameter of interest.

    9. Axioms of Probability (needed to specify sampling and modeling)
       Definition: A probability measure P is a function from the set of all possible events into [0, 1] such that
       - P(∅) = 0
       - P(Ω) = 1
       - P(∪ Ai) = Σ P(Ai) for any countable collection of disjoint events {Ai}

    10. Random Variables
        A random variable X is a function from the sample space Ω into the real numbers R: X: Ω → R, X(ω) = x. The value x is called a realization of the random variable X. It can also be thought of as a statistic, since it is a function (summary) of the sample {ω}.

    11. Example
        Experiment: roll two dice (one red, one green).
        Ω = { (i, j) | i = 1,…,6; j = 1,…,6 }
        Probability model (based on symmetry): P({(i, j)}) = 1/36 for each ordered pair (i, j).
        Random variable: X((i, j)) = i + j.
        The probability model induces a probability function fX on the possible values x of X:
        fX(x) = (6 − |x − 7|) / 36, x = 2,…,12.

    12. Probability Function for Sum of Two Dice
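The induced probability function from the dice example can be checked by brute-force enumeration of the 36 ordered pairs. This sketch (not in the original slides) verifies the closed form fX(x) = (6 − |x − 7|) / 36:

```python
from collections import Counter
from fractions import Fraction

# Enumerate the 36 equally likely ordered pairs and tabulate X = i + j.
counts = Counter(i + j for i in range(1, 7) for j in range(1, 7))
fX = {x: Fraction(c, 36) for x, c in counts.items()}

# Check against the closed form fX(x) = (6 - |x - 7|) / 36.
for x in range(2, 13):
    assert fX[x] == Fraction(6 - abs(x - 7), 36)
print(fX[7])  # 1/6, the most likely sum
```

Using exact Fractions avoids floating-point round-off, so the comparison with the closed form is exact.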

    13. Independence
        Two random variables Y and Z are independent if, for all possible values y, z,
        P(Y = y and Z = z) = P(Y = y) · P(Z = z).
        Dice example: let Y((i, j)) = i and Z((i, j)) = j. Then, for y, z in {1, 2, 3, 4, 5, 6},
        P(Y = y and Z = z) = 1/36 = (1/6) · (1/6) = P(Y = y) · P(Z = z).
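The independence claim in the dice example can be verified exactly, pair by pair, rather than taken on faith. This sketch (not in the original slides) checks the factorization for all 36 (y, z) combinations:

```python
from fractions import Fraction

# Exact check of independence for the dice example:
# P(Y=y and Z=z) = 1/36 = P(Y=y) * P(Z=z) for every pair (y, z).
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = Fraction(1, 36)  # each ordered pair is equally likely

for y in range(1, 7):
    for z in range(1, 7):
        joint = sum(P for (i, j) in omega if i == y and j == z)
        pY = sum(P for (i, j) in omega if i == y)
        pZ = sum(P for (i, j) in omega if j == z)
        assert joint == pY * pZ  # factorization holds exactly
print("Y and Z are independent")
```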

    14. iid Sample
        iid: independent, identically distributed.
        Model: X1, X2, …, Xn are mutually independent, that is,
        P(X1 = x1, …, Xn = xn) = P(X1 = x1) · … · P(Xn = xn),
        and X1, X2, …, Xn have the same probability function f(· | θ):
        Xi ~ f(x | θ), i = 1,…,n.

    15. Estimating a Population Proportion θ
        Code Xi = 1 if the ith observation has the characteristic of interest, and Xi = 0 otherwise; i = 1,…,n.
        Bernoulli distribution:
        fX(x | θ) = θ for x = 1; (1 − θ) for x = 0; 0 otherwise.
        The Xi are iid Bernoulli(θ), i = 1,…,n.

    16. Maximum Likelihood Estimate
        Y = Σ Xi has the Binomial(n, θ) distribution:
        P(Y = y | θ) = (n choose y) θ^y (1 − θ)^(n−y), y = 0, 1, …, n.
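For the Binomial(n, θ) model, the maximum likelihood estimate is the sample proportion, θ̂ = y / n. A grid search over the likelihood confirms this numerically; this sketch is not from the original slides, and n = 20, y = 7 are illustrative values:

```python
import math

# Binomial likelihood L(theta) = C(n, y) * theta^y * (1 - theta)^(n - y).
# The MLE is the sample proportion theta_hat = y / n; a grid search
# over theta shows the likelihood peaks there.
n, y = 20, 7  # illustrative sample size and success count

def likelihood(theta):
    return math.comb(n, y) * theta**y * (1 - theta)**(n - y)

grid = [k / 1000 for k in range(1, 1000)]  # theta values in (0, 1)
best = max(grid, key=likelihood)
print(best)  # 0.35, which equals y / n
```

Taking the derivative of the log-likelihood and setting it to zero gives the same answer analytically: y/θ − (n − y)/(1 − θ) = 0 implies θ̂ = y / n.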
