(Better) Bootstrap Confidence Intervals

Shachar Kaufman
Based on Efron and Tibshirani’s “An Introduction to the Bootstrap”, Chapter 14
TAU Bootstrap Seminar 2011, Dr. Saharon Rosset

Agenda
• What’s wrong with the simpler intervals?
• The (nonparametric) BCa method
• The (nonparametric) ABC method
  • Not really (only an introduction)

Example: simpler intervals are bad

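Example: simpler intervals are bad
• Under a parametric assumption (the observations are i.i.d. from a specified parametric family, e.g. normal)
  • Have an exact analytical interval
  • Can do a parametric bootstrap
• Under the assumption only that the observations are i.i.d. from an unknown distribution $F$
  • Can do a nonparametric bootstrap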

Why are the simpler intervals bad?
• The standard (normal) confidence interval assumes symmetry around $\hat\theta$
• Bootstrap-t is often erratic in practice
  • “Cannot be recommended for general nonparametric problems”
• The percentile interval suffers from low coverage
  • Assumes the nonparametric bootstrap distribution of $\hat\theta^*$ is representative of the distribution of $\hat\theta$ (e.g. $\hat\theta^*$ has mean $\hat\theta$, as $\hat\theta$ has mean $\theta$)
• The standard and percentile methods assume homogeneous behavior of $\hat\theta$, whatever $\theta$ is
  • (e.g. the standard deviation of $\hat\theta$ does not change with $\theta$)
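To make the two simplest intervals concrete, here is a minimal sketch in standard-library Python. The data are synthetic normal draws and the statistic is the plug-in variance; both are illustrative assumptions, not the dataset from the book. The helper names and the order-statistic rule used to pick percentiles are also assumptions (several conventions exist).

```python
import random
from statistics import NormalDist, fmean, stdev

def plug_in_variance(xs):
    # Plug-in (divide-by-n) variance estimate.
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def bootstrap_replicates(xs, stat, B, rng):
    # Nonparametric bootstrap: resample n points with replacement, B times.
    n = len(xs)
    return [stat([xs[rng.randrange(n)] for _ in range(n)]) for _ in range(B)]

def standard_interval(theta_hat, reps, alpha):
    # theta_hat -/+ z * se: symmetric around theta_hat by construction.
    z = NormalDist().inv_cdf(1 - alpha)
    se = stdev(reps)
    return theta_hat - z * se, theta_hat + z * se

def percentile_interval(reps, alpha):
    # Endpoints are order statistics of the bootstrap replicates.
    s = sorted(reps)
    B = len(s)
    return s[int(round(alpha * (B - 1)))], s[int(round((1 - alpha) * (B - 1)))]

rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(20)]   # synthetic sample
theta_hat = plug_in_variance(data)
reps = bootstrap_replicates(data, plug_in_variance, 2000, rng)
std_lo, std_hi = standard_interval(theta_hat, reps, 0.05)
pct_lo, pct_hi = percentile_interval(reps, 0.05)
print("standard  :", std_lo, std_hi)
print("percentile:", pct_lo, pct_hi)
```

The standard interval is forced to be symmetric around $\hat\theta$ and can extend below zero for a variance, while the percentile interval cannot leave the range of the replicates.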

A more flexible inference model
Account for higher-order statistics:
• Mean
• Standard deviation
• Skewness

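A more flexible inference model
• If $\hat\theta \sim N(\theta, \sigma^2)$ doesn’t work for the data, maybe we could find a monotone transform $\phi = m(\theta)$, $\hat\phi = m(\hat\theta)$ and constants $z_0$ and $a$ for which we can accept that $\hat\phi \sim N(\phi - z_0\sigma_\phi,\ \sigma_\phi^2)$ with $\sigma_\phi = 1 + a\phi$
• Additional unknowns
  • $m$ allows a flexible parameter-description scale
  • $z_0$ allows bias: $E[\hat\phi] - \phi = -z_0\sigma_\phi$
  • $a$ allows “$\sigma$” to change with $\phi$
• As we know, “more flexible” is not necessarily “better”
• Under broad conditions, in this case it is (TBD)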

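Where does this new model lead?
• Work on the transformed scale $\phi = m(\theta)$, where the model is $\hat\phi \sim N(\phi - z_0\sigma_\phi,\ \sigma_\phi^2)$ with $\sigma_\phi = 1 + a\phi$
• Assume known $z_0$ and $a$, and initially that $\phi = \hat\phi$, hence $\sigma_\phi = \sigma_{\hat\phi} = 1 + a\hat\phi$
• Calculate a standard $\alpha$-confidence endpoint from this: $\phi^{(1)} = \hat\phi + \sigma_{\hat\phi}\,(z_0 + z^{(\alpha)})$
• Now reexamine the actual stdev, this time assuming that $\phi = \phi^{(1)}$
• According to the model, it will be $1 + a\phi^{(1)} = \sigma_{\hat\phi}\,[1 + a(z_0 + z^{(\alpha)})]$

Where does this new model lead?
• OK, but this leads to an updated endpoint $\phi^{(2)} = \hat\phi + \sigma_{\hat\phi}\,(z_0 + z^{(\alpha)})\,[1 + a(z_0 + z^{(\alpha)})]$
• Which leads to an updated stdev $1 + a\phi^{(2)} = \sigma_{\hat\phi}\,[1 + a(z_0 + z^{(\alpha)}) + a^2(z_0 + z^{(\alpha)})^2]$
• If we continue iteratively to infinity this way (a geometric series, convergent when $|a(z_0 + z^{(\alpha)})| < 1$) we end up with the confidence interval endpoint $\phi_{lo} = \hat\phi + \sigma_{\hat\phi}\,\dfrac{z_0 + z^{(\alpha)}}{1 - a(z_0 + z^{(\alpha)})}$

Where does this new model lead?
• Do this exercise considering $z^{(\alpha)}$: under the model the bootstrap distribution is $\hat\phi^* \sim N(\hat\phi - z_0\sigma_{\hat\phi},\ \sigma_{\hat\phi}^2)$, so $\phi_{lo}$ is its $\alpha_1$ percentile, and we get $\alpha_1 = \Phi\!\left(z_0 + \dfrac{z_0 + z^{(\alpha)}}{1 - a(z_0 + z^{(\alpha)})}\right)$
• Similarly for $\alpha_2$ with $z^{(1-\alpha)}$: $\alpha_2 = \Phi\!\left(z_0 + \dfrac{z_0 + z^{(1-\alpha)}}{1 - a(z_0 + z^{(1-\alpha)})}\right)$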
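The iterative argument can be checked numerically. The sketch below assumes the model $\hat\phi \sim N(\phi - z_0\sigma_\phi, \sigma_\phi^2)$ with $\sigma_\phi = 1 + a\phi$, starts from the stdev evaluated at $\phi = \hat\phi$, and alternates between "compute the standard endpoint" and "re-evaluate the stdev at that endpoint". The function names and the numeric values of $\hat\phi$, $z_0$, $a$ are illustrative assumptions.

```python
from statistics import NormalDist

def iterate_endpoint(phi_hat, z0, a, alpha, n_iter=50):
    # w = z0 + z^(alpha): the shifted normal quantile used by the endpoint.
    w = z0 + NormalDist().inv_cdf(alpha)
    sigma = 1.0 + a * phi_hat          # initial stdev, evaluated at phi = phi_hat
    endpoint = phi_hat
    for _ in range(n_iter):
        endpoint = phi_hat + sigma * w  # standard endpoint with the current stdev
        sigma = 1.0 + a * endpoint      # model stdev re-evaluated at that endpoint
    return endpoint

def closed_form_endpoint(phi_hat, z0, a, alpha):
    # Limit of the iteration: a geometric series in a * w.
    w = z0 + NormalDist().inv_cdf(alpha)
    return phi_hat + (1.0 + a * phi_hat) * w / (1.0 - a * w)

phi_hat, z0, a, alpha = 0.3, 0.2, 0.1, 0.05   # illustrative values
it = iterate_endpoint(phi_hat, z0, a, alpha)
cf = closed_form_endpoint(phi_hat, z0, a, alpha)
print(it, cf)
```

The iteration only converges when $|a(z_0 + z^{(\alpha)})| < 1$, which holds comfortably for the small $a$ values typical in practice.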

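Enter BCa
• “Bias-corrected and accelerated”
• Like the percentile confidence interval
  • Both ends are percentiles $\hat\theta^{*(\alpha_1)}$, $\hat\theta^{*(\alpha_2)}$ of the bootstrap instances of $\hat\theta^*$
  • Just not the simple $\alpha_1 = \alpha$, $\alpha_2 = 1 - \alpha$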

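BCa
• Instead
  • $\alpha_1 = \Phi\!\left(\hat z_0 + \dfrac{\hat z_0 + z^{(\alpha)}}{1 - \hat a(\hat z_0 + z^{(\alpha)})}\right)$
  • $\alpha_2 = \Phi\!\left(\hat z_0 + \dfrac{\hat z_0 + z^{(1-\alpha)}}{1 - \hat a(\hat z_0 + z^{(1-\alpha)})}\right)$
• $\hat z_0$ and $\hat a$ are parameters we will estimate
• When both are zero, we get the good-old percentile CI
• Notice we never had to explicitly find the transform $m$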
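As a small illustration, the adjusted percentile levels can be computed directly from $\hat z_0$, $\hat a$ and $\alpha$. This is a sketch using the standard BCa level formula; the function name `bca_levels` is hypothetical.

```python
from statistics import NormalDist

def bca_levels(z0, a, alpha):
    """Return (alpha_1, alpha_2), the BCa-adjusted percentile levels."""
    nd = NormalDist()

    def adjust(z):
        w = z0 + z
        return nd.cdf(z0 + w / (1.0 - a * w))

    return adjust(nd.inv_cdf(alpha)), adjust(nd.inv_cdf(1.0 - alpha))

# With z0 = a = 0 the levels reduce to the plain percentile interval.
print(bca_levels(0.0, 0.0, 0.05))
# A positive bias correction shifts both levels upward.
print(bca_levels(0.1, 0.0, 0.05))
```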

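BCa
• $\hat z_0$ tackles bias; it can be estimated directly on the $\theta$ scale (since $m$ is monotone) as $\hat z_0 = \Phi^{-1}\!\left(\dfrac{\#\{\hat\theta^*(b) < \hat\theta\}}{B}\right)$
• $\hat a$ accounts for a standard deviation of $\hat\theta$ which varies with $\theta$ (linearly, on the “normal scale” $\phi$)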
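A minimal sketch of the bias-correction estimate: the normal quantile of the proportion of bootstrap replicates that fall below the original estimate. The function name is hypothetical, and the toy replicate list exists only to show the arithmetic.

```python
from statistics import NormalDist

def estimate_z0(theta_hat, reps):
    # Proportion of bootstrap replicates strictly below the original estimate.
    prop = sum(r < theta_hat for r in reps) / len(reps)
    # inv_cdf raises StatisticsError if prop is exactly 0 or 1,
    # which signals that B is too small or the statistic is degenerate.
    return NormalDist().inv_cdf(prop)

toy_reps = [1.0, 2.0, 3.0, 4.0]
print(estimate_z0(2.5, toy_reps))  # median-unbiased case: half the replicates below
print(estimate_z0(3.5, toy_reps))  # more than half below: positive z0
```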

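BCa
• One suggested estimator for $a$ is via the jackknife: $\hat a = \dfrac{\sum_{i=1}^{n} (\hat\theta_{(\cdot)} - \hat\theta_{(i)})^3}{6\left\{\sum_{i=1}^{n} (\hat\theta_{(\cdot)} - \hat\theta_{(i)})^2\right\}^{3/2}}$, where $\hat\theta_{(i)}$ is the estimate computed with the $i$-th observation deleted and $\hat\theta_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n} \hat\theta_{(i)}$
• You won’t find the rationale behind this formula in the book (though it is clearly related to one of the standard ways to define skewness)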
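The jackknife acceleration estimate is easy to compute by hand. The sketch below follows the usual leave-one-out formula; the function name and the toy datasets are illustrative assumptions.

```python
def jackknife_acceleration(xs, stat):
    """Jackknife estimate of the BCa acceleration constant."""
    n = len(xs)
    # Leave-one-out estimates theta_(i).
    jack = [stat(xs[:i] + xs[i + 1:]) for i in range(n)]
    mean_jack = sum(jack) / n
    d = [mean_jack - t for t in jack]
    num = sum(v ** 3 for v in d)
    den = 6.0 * sum(v ** 2 for v in d) ** 1.5
    return num / den

def mean(xs):
    return sum(xs) / len(xs)

a_symmetric = jackknife_acceleration([1.0, 2.0, 3.0, 4.0, 5.0], mean)
a_skewed = jackknife_acceleration([1.0, 1.0, 1.0, 10.0], mean)
print(a_symmetric, a_skewed)
```

For the sample mean the leave-one-out differences are proportional to $x_i - \bar x$, so $\hat a$ is a scaled skewness: zero for the symmetric sample and positive for the right-skewed one.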

Theoretical advantages of BCa
• Transformation respecting
  • If the interval for $\theta$ is $(\hat\theta_{lo}, \hat\theta_{up})$ then the interval for a monotone (increasing) $m(\theta)$ is $(m(\hat\theta_{lo}), m(\hat\theta_{up}))$
  • So no need to worry about finding transforms of $\theta$ where confidence intervals perform well
    • Which is necessary in practice with the bootstrap-t CI
    • And with the standard CI (e.g. Fisher’s transformation of the correlation coefficient)
  • The percentile CI is also transformation respecting
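The transformation-respecting property can be seen directly for intervals whose endpoints are order statistics of the bootstrap replicates. The sketch below uses the plain percentile interval with a log transform on synthetic positive data; the data, the helper name and the percentile-picking rule are assumptions. BCa endpoints are likewise percentiles of the same replicates, only at adjusted levels.

```python
import math
import random
from statistics import fmean

def percentile_interval(reps, alpha):
    # Endpoints are order statistics, so a monotone map commutes with them.
    s = sorted(reps)
    B = len(s)
    return s[int(round(alpha * (B - 1)))], s[int(round((1 - alpha) * (B - 1)))]

rng = random.Random(1)
data = [rng.expovariate(1.0) for _ in range(30)]   # synthetic positive data
n = len(data)
reps = [fmean([data[rng.randrange(n)] for _ in range(n)]) for _ in range(1000)]

lo, hi = percentile_interval(reps, 0.05)
log_lo, log_hi = percentile_interval([math.log(r) for r in reps], 0.05)

# Interval for log(theta) equals the log of the interval for theta.
print(math.log(lo), log_lo)
print(math.log(hi), log_hi)
```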

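Theoretical advantages of BCa
• Accuracy
  • We want $\hat\theta_{lo}$ s.t. $P(\theta < \hat\theta_{lo}) = \alpha$
  • But a practical $\hat\theta_{lo}$ is an approximation where $P(\theta < \hat\theta_{lo}) \approx \alpha$
  • BCa (and bootstrap-t) endpoints are “second order accurate”, where $P(\theta < \hat\theta_{lo}) = \alpha + c_{lo}/n$ for some constant $c_{lo}$
  • This is in contrast to the standard and percentile methods, which only converge at rate $1/\sqrt{n}$ (“first order accurate”), i.e. $P(\theta < \hat\theta_{lo}) = \alpha + c_{lo}/\sqrt{n}$, so errors are one order of magnitude greater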

But BCa is expensive
• The use of direct bootstrapping to calculate delicate statistics such as the tail percentiles $\hat\theta^{*(\alpha_1)}$ and $\hat\theta^{*(\alpha_2)}$ requires a large $B$ to work satisfactorily
• Fortunately, BCa can be analytically approximated (with a Taylor expansion, for differentiable $\hat\theta$) so that no Monte Carlo simulation is required
• This is the ABC method, which retains the good theoretical properties of BCa
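For reference, a compact end-to-end BCa sketch shows where the cost comes from: $B$ evaluations of the statistic for the bootstrap replicates plus $n$ more for the jackknife. It is a standard-library illustration on synthetic data, not an optimized or authoritative implementation; the function name, the default $B$, and the percentile-picking rule are assumptions.

```python
import random
from statistics import NormalDist, fmean

def plug_in_variance(xs):
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def bca_interval(xs, stat, alpha=0.05, B=2000, rng=None):
    rng = rng or random.Random()
    nd = NormalDist()
    n = len(xs)
    theta_hat = stat(xs)
    # B bootstrap replicates (the expensive part).
    reps = sorted(stat([xs[rng.randrange(n)] for _ in range(n)]) for _ in range(B))
    # Bias correction from the proportion of replicates below theta_hat.
    z0 = nd.inv_cdf(sum(r < theta_hat for r in reps) / B)
    # Acceleration from the jackknife (n more evaluations).
    jack = [stat(xs[:i] + xs[i + 1:]) for i in range(n)]
    mj = fmean(jack)
    a = sum((mj - t) ** 3 for t in jack) / (6.0 * sum((mj - t) ** 2 for t in jack) ** 1.5)

    def level(z):
        w = z0 + z
        return nd.cdf(z0 + w / (1.0 - a * w))

    a1, a2 = level(nd.inv_cdf(alpha)), level(nd.inv_cdf(1.0 - alpha))

    def pick(p):
        # Simple order-statistic rule; other percentile conventions exist.
        return reps[min(B - 1, max(0, int(round(p * (B - 1)))))]

    return pick(a1), pick(a2), (z0, a, a1, a2)

rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(20)]   # synthetic sample
bca_lo, bca_hi, (z0_hat, a_hat, alpha1, alpha2) = bca_interval(
    data, plug_in_variance, alpha=0.05, B=2000, rng=rng)
print(bca_lo, bca_hi, z0_hat, a_hat, alpha1, alpha2)
```

Because the endpoints sit in the tails of the replicate distribution, rerunning with a small $B$ (say 100) and different seeds makes the instability easy to see.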

The ABC method
• Only an introduction (Chapter 22)
• Discusses the “how”, not the “why”
• For additional details see DiCiccio and Efron 1992 or 1996

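The ABC method
• Given the estimator in resampling form $\hat\theta^* = T(\mathbf{P}^*)$
• Recall $\mathbf{P}^*$, the “resampling vector”, is an $n$-dimensional random variable with components $P_i^* = \#\{x_j^* = x_i\}/n$
• Recall $\mathbf{P}^0 = (1/n, \dots, 1/n)$, for which $T(\mathbf{P}^0) = \hat\theta$
• Second-order Taylor analysis of the estimate $T(\mathbf{P})$ around $\mathbf{P}^0$, as a function of the resampling vector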
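To illustrate the resampling form, the sketch below writes the plug-in variance as a function $T(\mathbf{P})$ of a weight vector and draws one resampling vector $\mathbf{P}^*$. The function names and the toy data are assumptions for illustration.

```python
import math
import random

def T_variance(P, xs):
    # Plug-in variance written as a function of the weights P (resampling form).
    mean = sum(p * x for p, x in zip(P, xs))
    return sum(p * (x - mean) ** 2 for p, x in zip(P, xs))

def draw_resampling_vector(n, rng):
    # P*_i = (number of times x_i appears in the bootstrap sample) / n.
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return [c / n for c in counts]

rng = random.Random(0)
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
n = len(data)
P0 = [1.0 / n] * n
P_star = draw_resampling_vector(n, rng)

theta_hat = T_variance(P0, data)        # original estimate, T(P0)
theta_star = T_variance(P_star, data)   # one bootstrap replicate, T(P*)
print(theta_hat, theta_star, math.fsum(P_star))
```

Evaluating $T$ at $\mathbf{P}^0$ recovers the ordinary estimate, and each bootstrap sample corresponds to one point $\mathbf{P}^*$ on the simplex, which is what makes a Taylor expansion around $\mathbf{P}^0$ possible.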

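The ABC method
• Can approximate all the BCa parameter estimates (i.e. estimate the parameters $\hat z_0$ and $\hat a$ in a different way)
• The estimates are built from first and second numerical derivatives of $T$ at $\mathbf{P}^0$
• One ingredient is a curvature term: something akin to a Hessian component, but along a specific direction not perpendicular to any natural axis (the “least favorable family” direction)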
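As a partial illustration of the derivative-based estimates, the sketch below computes empirical influence components $\dot T_i$ by finite differences at $\mathbf{P}^0$, then forms $\hat\sigma = (\sum_i \dot T_i^2)^{1/2}/n$ and $\hat a = \sum_i \dot T_i^3 / \{6(\sum_i \dot T_i^2)^{3/2}\}$. These are the usual influence-function forms and are assumed here rather than taken from the slide; the curvature and bias-correction terms of ABC are deliberately left out. Function names and the step size `eps` are hypothetical choices.

```python
import math

def influence_components(T, xs, eps=1e-3):
    # Finite-difference directional derivatives of T at P0 toward each vertex e_i.
    n = len(xs)
    P0 = [1.0 / n] * n
    t0 = T(P0, xs)
    out = []
    for i in range(n):
        P = [(1.0 - eps) * p for p in P0]
        P[i] += eps                      # P = (1 - eps) * P0 + eps * e_i
        out.append((T(P, xs) - t0) / eps)
    return out

def sigma_and_acceleration(T, xs, eps=1e-3):
    n = len(xs)
    t_dot = influence_components(T, xs, eps)
    s2 = sum(t * t for t in t_dot)
    sigma_hat = math.sqrt(s2) / n
    a_hat = sum(t ** 3 for t in t_dot) / (6.0 * s2 ** 1.5)
    return sigma_hat, a_hat

def T_mean(P, xs):
    # The sample mean in resampling form.
    return sum(p * x for p, x in zip(P, xs))

sigma_hat, a_hat = sigma_and_acceleration(T_mean, [1.0, 1.0, 1.0, 10.0])
print(sigma_hat, a_hat)
```

For the mean, $\dot T_i = x_i - \bar x$, so $\hat\sigma$ is the plug-in standard error of the mean and $\hat a$ matches the jackknife value for the same data, with no resampling at all.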

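The ABC method
• And the ABC interval endpoint is then a closed-form expression in these estimates and $z^{(\alpha)}$, obtained by evaluating $T$ at a suitably shifted resampling vector (see DiCiccio and Efron 1996 for the formulas)
• Simple and to the point, ain’t it?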
