Inference

Inference Mary M. Whiteside, Ph.D. Nonparametric Statistics

Two Sides of Inference • Parametric • Interval estimation, Xbar • Hypothesis testing, μ0 • Nonparametric • Interval estimates, EDF • Hypothesis testing, P(X<Y) > P(X>Y)

Meaning of Nonparametric • Not about parameters • Methods for non-normal distributions • Methods for ordinal data • Data Scales • Nominal, categorical, qualitative • Ordinal • Interval • Ratio - natural zero

Random Sample - Type 1 • Random sample from a finite population • Simple • Stratified • Cluster • Inferences are about the finite population • Audit consisting of a sample from a population of invoices • Public opinion polls • QC samples of delivered goods

Random Sample - Type 2 • Observations of independent and identically distributed (iid) random variables • Inferences are about the probability distributions of the random variables • Weekly average miles per gallon for your new Lexus • Chi-square tests of independence in medical treatment offered to men and women • Effect of female literacy on infant mortality worldwide

Transition from data sets to distributions • All random variables, by definition, have probability functions (pmf or pdf) and cumulative probability distributions • Random variables defined on a random sample (Type 1 or 2) are called statistics with probability distributions that are called sampling distributions

Sampling Distributions • Statistics support both sides of inference • Estimators - random variables used to create interval estimates • Test statistics - random variables used to test hypotheses

Consider Xbar - a parametric statistic • Type 1 sample - subset of invoices where X = sales tax paid on an invoice randomly selected from a finite population • Xbar is the average sales tax of n randomly selected invoices • Xbar is an estimator of μ, the average sales tax paid for the population of invoices (with standard deviation σ) • Xbar is a test statistic for testing hypotheses H0: μ = μ0 • Xbar is a random variable whose sampling distribution is asymptotically normal as n increases, with mean μ and standard deviation σ/√n

Consider Xbar - a parametric statistic • Type 2 sample - the complete set of miles per gallon observations made by you since buying your Lexus where X = mpg for your Lexus in a given week • Xbar is the average mpg for n observations of X • Xbar is an estimator of the expected value (μX) of the RV X • Xbar is a test statistic for testing hypotheses H0: μX = μ0 • Xbar is a random variable whose sampling distribution is asymptotically normal as n increases, with mean μX and standard deviation σX/√n
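The claim that Xbar has mean equal to the mean of X and standard deviation equal to the standard deviation of X divided by √n can be checked by simulation. The sketch below is only an illustration: the exponential population (mean 2, standard deviation 2), the sample size, and the function name `simulate_xbar` are assumptions, not part of the slides.

```python
import math
import random
import statistics


def simulate_xbar(n, reps=5000, seed=0):
    """Draw `reps` iid samples of size n and return the list of sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        # Hypothetical skewed population: exponential with mean 2 and sd 2.
        sample = [rng.expovariate(0.5) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


n = 40
means = simulate_xbar(n)
center = statistics.fmean(means)   # should be close to the population mean, 2
spread = statistics.stdev(means)   # should be close to 2 / sqrt(n)
print(center, spread, 2 / math.sqrt(n))
```

Even though the population is skewed, a histogram of `means` would look roughly bell-shaped at this sample size, which is the asymptotic normality the slide refers to.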

X in the Type 1 sample • If X from a Type 1 sample is regarded as a random variable, then it has the discrete uniform distribution • Prob [X = x] = 1/N for all x in the population (where the N values of x are assumed to be unique)
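A short simulation can make the discrete uniform statement concrete. The population values below are hypothetical sales-tax amounts chosen to be unique; with repeated single draws, each value should appear with relative frequency close to 1/N.

```python
import random
from collections import Counter

# Hypothetical finite population of N unique invoice values.
population = [12.5, 8.0, 15.25, 3.75, 9.9]
N = len(population)

rng = random.Random(0)
draws = [rng.choice(population) for _ in range(50_000)]

# Relative frequency of each population value among the draws.
freq = {x: count / len(draws) for x, count in Counter(draws).items()}
for x in population:
    print(x, round(freq[x], 3), "vs 1/N =", 1 / N)
```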

Order statistics of rank k - a nonparametric statistic • the kth order statistic is the kth smallest observation • the first order statistic is the smallest observation in a sample • the nth order statistic is the largest • Large body of literature on sampling distributions of order statistics
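As a minimal sketch of the definition, the kth order statistic can be read off the sorted sample. The helper name `order_statistic` and the data are illustrative assumptions.

```python
def order_statistic(sample, k):
    """Return the kth smallest observation (k = 1 is the minimum, k = n the maximum)."""
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the sample size")
    return sorted(sample)[k - 1]


sample = [7.1, 3.4, 9.8, 5.0, 6.2]  # hypothetical observations
smallest = order_statistic(sample, 1)
middle = order_statistic(sample, 3)
largest = order_statistic(sample, len(sample))
print(smallest, middle, largest)
```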

Estimation • Definitions • EDF • pth sample quantile • sample mean, variance, and standard deviation • unbiased estimators (S² and s²)
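The definitions listed above can be sketched in a few lines. Several details here are assumptions rather than the slides' own statements: the EDF is taken as the fraction of observations less than or equal to x, the pth sample quantile uses one common textbook convention (average the two middle order statistics when np is an integer, otherwise take the order statistic of rank ⌈np⌉), and both the divisor-n and divisor-(n-1) variances are computed without assuming which symbol the slide attaches to each.

```python
import math
import statistics


def edf(sample, x):
    """Empirical distribution function: fraction of observations <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def sample_quantile(sample, p):
    """pth sample quantile under one common convention (an assumption here)."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    xs = sorted(sample)
    pos = len(xs) * p
    if pos == int(pos):
        k = int(pos)
        return (xs[k - 1] + xs[k]) / 2   # average of ranks k and k + 1
    return xs[math.ceil(pos) - 1]        # order statistic of rank ceil(np)


data = [2, 4, 4, 5, 10]                  # hypothetical observations
mean = statistics.fmean(data)
var_n = statistics.pvariance(data)       # divisor n
var_n_minus_1 = statistics.variance(data)  # divisor n - 1 (unbiased for the variance)
print(edf(data, 4), sample_quantile(data, 0.5), mean, var_n, var_n_minus_1)
```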

Intervals for parameter estimation • (point estimate + r*standard error of the estimator, point estimate + q*standard error of the estimator) where r is the α/2 quantile and q is the (1-α/2) quantile from the sampling distribution of the estimator • r equals -q in symmetric distributions with mean 0 (z = +/- 1.96 or t = +/-2.02581) • r does not equal -q in skewed distributions such as Chi squared and F
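A minimal sketch of the interval recipe, assuming a standard normal sampling distribution for the standardized estimator. Because r is the α/2 quantile, it is negative for a symmetric mean-0 distribution, so the lower limit is the point estimate plus r times the standard error. The data and the helper name `normal_interval` are hypothetical.

```python
import math
from statistics import NormalDist, fmean, stdev


def normal_interval(sample, alpha=0.05):
    """Interval for the mean using standard normal quantiles r and q."""
    estimate = fmean(sample)
    se = stdev(sample) / math.sqrt(len(sample))  # estimated standard error of Xbar
    r = NormalDist().inv_cdf(alpha / 2)          # alpha/2 quantile, about -1.96
    q = NormalDist().inv_cdf(1 - alpha / 2)      # (1 - alpha/2) quantile, about +1.96
    return estimate + r * se, estimate + q * se


mpg = [24.1, 26.3, 25.0, 23.8, 27.2, 25.5, 24.9, 26.0]  # hypothetical weekly mpg
lower, upper = normal_interval(mpg)
print(lower, upper)
```

For a sample this small, t quantiles would normally replace the normal ones; the normal version is used here only to keep the sketch within the standard library.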

Sampling distribution of the estimator • Parametric procedures - assumed normal, or normal-based via the Central Limit Theorem and sample size • Xbar is approximately normal if n is large • The studentized Xbar, (Xbar - μ)/(s/√n), has a t distribution if X is normal and σ is unknown • Xbar's distribution is unknown if X's distribution is unknown and n is small

Sampling distribution of the estimator • Nonparametric distribution-free procedures, i.e., the sampling distribution of the statistic (estimator or test statistic) is "free" from the distribution of X • rank order statistics • bootstrapped distributions - α/2 and 1-α/2 quantiles
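The bootstrap bullet can be illustrated with a percentile interval: resample the data with replacement, recompute the statistic each time, and read off the α/2 and 1-α/2 quantiles of the resulting bootstrap distribution. This is a sketch of one simple variant, not necessarily the procedure the slides intend; the data, the choice of the median as the statistic, and the function name are assumptions.

```python
import random
import statistics


def bootstrap_percentile_interval(sample, stat=statistics.median,
                                  alpha=0.05, reps=2000, seed=0):
    """Percentile bootstrap interval for `stat` computed on `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Bootstrap distribution: the statistic recomputed on resamples drawn with replacement.
    boot = sorted(stat(rng.choices(sample, k=n)) for _ in range(reps))
    lower = boot[int((alpha / 2) * reps)]            # roughly the alpha/2 quantile
    upper = boot[int((1 - alpha / 2) * reps) - 1]    # roughly the 1 - alpha/2 quantile
    return lower, upper


observations = [3.1, 4.7, 2.9, 8.4, 5.5, 3.8, 6.1, 4.2, 12.0, 3.3,
                5.0, 4.4, 7.2, 3.9, 4.8]             # hypothetical skewed data
low, high = bootstrap_percentile_interval(observations)
print(low, high)
```

No normality assumption enters the calculation; the quantiles come from the resampled distribution itself.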

Parametric vs nonparametric sampling distributions • Parametric: exact distributions with approximate models • Nonparametric: exact distributions with exact models (but usually small samples) or • asymptotic distributions with exact models
