Difficulties in Limit setting and the Strong Confidence approach

Giovanni Punzi, SNS and INFN - Pisa
Advanced Statistical Techniques in Particle Physics, Durham, 18-22 March 2002

Outline
• Motivations for a Strong CL
• Summary of properties of Strong CL
• Some examples
• Limits in the presence of systematic uncertainties
G. Punzi - Strong CL

Motivation
• The set of Neyman’s bands is large, and contains all sorts of inferences like: “I bought a lottery ticket. If I win, I will conclude that donkeys can fly @99.9999% CL”
• I want to get rid of those, but remain frequentist.

Why should you care?
• Wrong reason: to make the CL look more like p(hypothesis | data).
• Right reason: you don’t want to have to quote a conclusion you know is bad. If you think harder, you can do better:
• You are drawing conclusions based on irrelevant facts (like a bad fit).
• As a consequence, you are not making the best use of the information you have.
• Your results are counter-intuitive and convey little information.
• You must make sure your conclusions do not depend on irrelevant information.

SOLUTION: Impose a form of Likelihood Principle
• Take any two experiments whose pdfs are equal, up to a multiplicative constant, on some subset c of the observable values of x. Any valid Confidence Limits you can derive in one experiment from observing x in c must also be valid for the other experiment.
• If you require the Limits to be uniquely determined, there is no solution.

RESULT
[Figure labels: Non-coverage land; Neyman’s CL bands; Strong bands]
Surprise: a solution exists, and gives for any experiment a well-defined, unique subset of Confidence Bands.

Construction of CL bands
[Figure panels: Regular; Strong]

Strong CL vs. standard CL
• A new parameter emerges: sCL. Every valid band @xx% sCL is also a valid band @xx% CL.
• You can check sCL for a band built in any other way.
• The sCL requirement effectively amounts to re-applying the usual Neyman condition locally on every subsample of possible results. This ensures uniform treatment of all experimental results, but in a frequentist way.
• The Strong Band definition is not an ordering algorithm, and the answer is still not unique. You may need to add an ordering to obtain a unique solution.

It is similar to conditioning, a standard practice in modern frequentist statistics.
“There is a long history of attempts to modify frequentist theory by utilizing some form of conditioning. Earlier works are summarized in Kiefer (1977), Berger and Wolpert (1988) […] Kiefer (1977) formally established the conditional confidence approach”
“The first point to stress is the unreasonable nature of the unconditional test […] the unconditional test is arguably the worst possible frequentist test […] it is in some sense true that, the more one can condition, the better”
“It is sometimes argued that conditioning on non-ancillary statistics will ‘lose information’, but nothing loses as much information as use of unconditional testing” (J. Berger)
Strong CL vs. Neyman: [formulas not preserved in this transcript] (CR(x) is the accepted region for µ given the observation of x; c is an arbitrary subset of x space)
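For orientation, here is one way the two conditions can be written in the notation CR(x) and c used on this slide. This is a reconstruction under assumptions, not the slide's own formulas: it follows the form of the definition as recalled from hep-ex/9912048, and it reproduces the -0.34 limit quoted in the threshold example of this talk.

```latex
% Neyman: the usual coverage requirement, for every value of \mu
\[
  p\bigl(\mu \notin CR(x) \,\bigm|\, \mu\bigr) \;\le\; 1 - CL
  \qquad \forall\, \mu
\]
% Strong CL (assumed form): the same requirement applied to every subset c,
% normalized to the largest probability that any \mu' assigns to c
\[
  p\bigl(x \in c,\ \mu \notin CR(x) \,\bigm|\, \mu\bigr)
  \;\le\; (1 - sCL)\,\sup_{\mu'} p\bigl(x \in c \,\bigm|\, \mu'\bigr)
  \qquad \forall\, \mu,\ \forall\, c
\]
```

Taking c to be the whole x space makes the supremum equal to 1, so the strong condition reduces to Neyman's, which is why every valid band @xx% sCL is also a valid band @xx% CL.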

Summary of sCL properties (see CLW proceedings and hep-ex/9912048)
• 100% frequentist, completely general.
• The only frequentist method complying with the Likelihood Principle
• Invariant under any change of variables
• No empty regions, in full generality
• No “unlucky results”, no need to quote additional information on sensitivity. No pathologies.
• Robust against small changes of the pdf
• More information gives tighter limits
• Easier incorporation of systematics
• Price tag:
• Overcoverage
• Heavy computation

Invariance under change of the observable
• All classical bands are invariant under a change of variable in the parameter (unlike Bayesian limits)
• The CL definition is invariant under a change of variable in the observable, too. But most rules for constructing bands break this invariance!
• Strong CL is also invariant under any change of variable.
• Likelihood Ratio is also invariant (non-advertised property?), so it is a natural choice of ordering to select a unique Strong Band.

Effect of changing variables
[Figure labels: Non-coverage land; Neyman’s CL bands; Strong bands; LR-ordered bands]

Poisson + background
[Figure: upper limit @90% CL for n=0 versus expected background; curves labelled “sCL = 90%, or R.-W.” and “LR-ordering”]
• The upper limit on µ decreases with expected background in all unconditioned approaches.
• This is often criticized on the grounds that for n=0 the value of b should be irrelevant.
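The dependence on b can be checked numerically. The sketch below is an illustration under assumptions, not the calculation behind the slide's figure: it scans µ on a grid and applies plain likelihood-ratio ordering to a Poisson count with known background b, reporting the largest µ for which n=0 is still accepted. The function names, the grid step and the truncation `n_max` are all hypothetical choices. The last lines compute a b-independent bound that follows if the strong condition has the normalized form given in hep-ex/9912048 (assumed here): for the subset c = {n=0}, the factor exp(-b) cancels, so no µ below -ln(1 - sCL) can be excluded.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of observing n counts for a given mean."""
    return math.exp(-mean) * mean ** n / math.factorial(n)

def n0_accepted(mu, b, cl=0.90, n_max=60):
    """True if n = 0 falls in the LR-ordered acceptance region for signal mu."""
    ranked = []
    for n in range(n_max + 1):
        p = poisson_pmf(n, mu + b)
        mu_best = max(0.0, n - b)  # best physically allowed signal for this n
        ratio = p / poisson_pmf(n, mu_best + b)
        ranked.append((ratio, n, p))
    ranked.sort(reverse=True)  # highest likelihood ratio first
    total = 0.0
    for ratio, n, p in ranked:
        total += p  # this outcome joins the acceptance region
        if n == 0:
            return True
        if total >= cl:
            return False  # region is complete and n = 0 was not needed
    return False

def upper_limit_n0(b, step=0.01, mu_top=5.0):
    """Largest grid value of mu (0..mu_top) for which n = 0 is accepted."""
    n_points = int(round(mu_top / step)) + 1
    accepted = [i * step for i in range(n_points) if n0_accepted(i * step, b)]
    return max(accepted, default=0.0)

limits = {b: upper_limit_n0(b) for b in (0.0, 1.0, 3.0)}
for b, ul in limits.items():
    print(f"b = {b:.1f}: LR-ordering upper limit for n=0 is about {ul:.2f}")

# Assumed strong-CL bound for n = 0: exp(-mu - b) <= (1 - sCL) * exp(-b)
scl_floor = -math.log(1.0 - 0.90)
print(f"No mu below {scl_floor:.2f} can be excluded at n=0, whatever b is")
```

The scan is coarse (0.01 in µ), so the printed limits should be read as approximate; the point is only that they fall as b grows while the bound on the last line does not move.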

Behavior when new observables are added
• Do you expect limits to improve when you add extra information?
• A simple example shows that neither PO nor LRO has this property (conjecture: no ordering algorithm has it!)
• Example: comparing a signal level with Gaussian noise against some fixed thresholds
• Problem: the limit loosens dramatically when adding an extra threshold measurement.

Example
[Figure panels: L(µ); LR(µ)]
• Unknown electrical level µ plus Gaussian noise (σ = 1). µ is limited to |µ| < 0.5.
• Compare with a fixed threshold (2.5σ), get a (0,1) response.
• Observe > threshold:
• PO: empty region @90% CL
• LR: 0.49 < µ < 0.50 @90% CL
• sCL: -0.34 < µ < 0.50 @90% sCL
• N.B. you MUST overcover unless you want an empty region.
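The three results quoted above can be reproduced with a few lines. This is a minimal sketch under stated assumptions: the noise is Gaussian with σ = 1, the threshold sits at 2.5σ, and the strong condition for the single outcome "above threshold" is taken in the form p(1|µ) ≤ (1 - sCL) · sup p(1|µ'), following hep-ex/9912048. The helper names (`phi`, `p_above`, `bisect`) are hypothetical.

```python
import math

THRESHOLD = 2.5            # threshold position in units of sigma
MU_MIN, MU_MAX = -0.5, 0.5  # allowed range of the signal level
CL = 0.90

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_above(mu):
    """P(response = 1 | mu): level plus unit Gaussian noise exceeds the threshold."""
    return 1.0 - phi(THRESHOLD - mu)

def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Probability ordering: response 0 comes first and, alone, already has more
# than 90% probability for every allowed mu, so response 1 is never accepted.
po_accepts_one = any(
    1.0 - p_above(MU_MIN + 0.001 * i) < CL for i in range(1001)
)

# LR ordering: response 1 enters the acceptance region only where its
# likelihood ratio beats that of response 0.
def lr_gap(mu):
    lr_one = p_above(mu) / p_above(MU_MAX)                    # max of p(1|mu) is at MU_MAX
    lr_zero = (1.0 - p_above(mu)) / (1.0 - p_above(MU_MIN))   # max of p(0|mu) is at MU_MIN
    return lr_one - lr_zero

lr_lower = bisect(lr_gap, MU_MIN, MU_MAX)

# Strong CL (assumed form): mu may be excluded after observing 1 only if
# p(1|mu) <= (1 - sCL) * max over mu' of p(1|mu').
def scl_gap(mu):
    return p_above(mu) - (1.0 - CL) * p_above(MU_MAX)

scl_lower = bisect(scl_gap, MU_MIN, MU_MAX)

print("PO accepts the observed response for some mu:", po_accepts_one)
print(f"LR : {lr_lower:.2f} < mu < {MU_MAX:.2f}")
print(f"sCL: {scl_lower:.2f} < mu < {MU_MAX:.2f}")
```

With only two possible outcomes, each acceptance region is decided by which outcome is ranked first, which is why a single root-finding step per method is enough here.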

Add another threshold
[Figure panels: LR(µ); L(µ); resulting limit 0.27 < µ < 0.5]
• Now add a second independent threshold measurement at 0: the limit becomes much looser!
• The sCL limit is unaffected
• Conjecture: no ordering algorithm can provide a sensible answer in all cases.

Observations
• It may be impossible to get sensible results without accepting some overcoverage. Why blame sCL for overcoverage?
• Ordering algorithms alone seem unable to prevent very strange results: the inclusion of additional (irrelevant) information may produce a dramatic worsening of limits.

Adding systematics to CL limits
• Problem:
• My pdf p(x|µ) is actually a p(x|µ,ν), where ν is an unknown parameter I don’t care about, but which influences my measurement (nuisance)
• I may have some information on ν coming from another measurement y: q(y|ν)
• My problem is:
• p(x,y|µ,ν) = p(x|µ,ν)*q(y|ν)
• Many attempts to get rid of ν; three main routes:
• Integration/smearing (a la Bayes)
• Maximization (“profile Likelihood”)
• Projection (strictly classical)

Hybrid method: Bayesian Smearing
• 1) Define a new (smeared) pdf: p’(x|µ) = ∫ p(x|µ,ν) π(ν) dν, where π(ν) is obtained through Bayes:
• π(ν) = q(y|ν) p(ν) / q(y)
• Need to assume some prior p(ν)
• 2) Use p’ to obtain Conf. Limits as usual
• GOOD:
• Simple and fast
• Used in many places
• Intuitively appealing
• BAD:
• Intuitively appealing
• Interpretation: mixes Bayes and Neyman. Output results have neither coverage nor correct Bayesian probability => the effort of calculating a rigorous CL is wasted
• May undercover
• May exhibit paradoxical tightening of limits

A simple example + Bayes systematics
[Figure panels: LR(µ) with µ > 0.272; LR(µ) with µ > 0.294]
• Introduce a systematic uncertainty on the actual position of the 0 threshold. Assume a flat prior in [-1,1].
• Do smearing => get tighter limits!
• No reason to expect good behavior
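The smearing step in this example can be sketched as follows. This covers only step 1 of the hybrid method (building the smeared pdf), and it rests on assumptions not spelled out on the slide: unit Gaussian noise, a threshold whose true position is shifted by an unknown amount, and the flat prior on [-1,1] used directly as the weight in the average. The function names and the midpoint-rule integration are hypothetical choices; the limits 0.272 and 0.294 quoted above are not recomputed here.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_above(mu, shift=0.0):
    """P(response = 1 | mu) for a threshold located at `shift`, unit Gaussian noise."""
    return phi(mu - shift)

def smeared_p_above(mu, half_width=1.0, n_steps=2000):
    """Average p_above over a flat prior on the threshold position.

    Midpoint rule on [-half_width, half_width]; dividing the sum by n_steps
    is the same as integrating against the density 1 / (2 * half_width).
    """
    width = 2.0 * half_width / n_steps
    total = 0.0
    for i in range(n_steps):
        shift = -half_width + (i + 0.5) * width
        total += p_above(mu, shift)
    return total / n_steps

for mu in (-0.5, 0.0, 0.5):
    print(f"mu = {mu:+.1f}: nominal {p_above(mu):.4f}, smeared {smeared_p_above(mu):.4f}")
```

Averaging flattens the response as a function of µ, so the smeared pdf is a different model from the nominal one, and limits built from it need not relate to the nominal ones in any intuitive way.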

Approximate classical method: Profile Likelihood
• 1) Define a new (profile) pdf: p_prof(x|µ) = p(x,y0|µ,ν_best(µ)), where ν_best(µ) maximizes the value of either
a) p(x0,y0|µ,ν_best), or
b) p(x,y0|µ,ν_best) (so that ν_best = ν_best(µ,x)!)
• This means maximizing the likelihood with respect to the nuisance parameters, for each µ
• 2) Use p_prof to obtain Conf. Limits as usual
• GOOD:
• Reasonably simple and fast
• Approximation of an actual frequentist method
• BAD:
• Flip-flop in case a), non-normalized in case b)!!
• Only approximate for low statistics, which is when you need limits after all.
• You don’t know how far off it is unless you explicitly calculate the correct limits.
• Systematically undercovers

Exact Classical Treatment of Systematics in Limits
1) Use p(x,y|µ,ν) = p(x|µ,ν)*q(y|ν), and consider it as p( (x,y) | (µ,ν) )
2) Evaluate CR in (µ,ν) from the measurement (x0,y0)
3) Project on µ space to get rid of uninteresting information on ν
• It is clean and conceptually simple.
• It is well-behaved.
• No issues like Bayesian integrals
Why is it used so rarely?
1) It produces overcoverage
2) The idea is simple, but computation is heavy. Have to deal with large dimensions
3) Results may strongly depend on the ordering algorithm, even more than usual.

“profile method”

“Overcoverage”
• Projecting on µ effectively widens the CR ⇒ overcoverage. BUT:
• You chose to ignore information on ν - you cannot ask Neyman to give it all back to you as information on µ - the two things are just not interchangeable.
• ⇒ overcoverage is a natural consequence, not a weakness
• Q: can you find a smaller µ interval that does not undercover? (same situation as with discretization)

Optimization issue
• You want to stretch out the CR along the ν direction as far as possible.
• BUT:
• The choice of band is constrained by the need to avoid paradoxes (empty regions, and the like)!
• No method on the market allows you to treat µ and ν in a different fashion
• Strong CL allows you to specify µ as the parameter of interest, and to obtain the narrowest µ interval
• The solution does not require constructing a multidimensional region

Strong CL Band with systematics
• The solution does not require explicit construction of a multidimensional region
• The narrowest µ interval compatible with Strong CL is readily found.

Conclusions
• Strong Confidence bands have all the good properties you may ask for.
• Systematics can be included naturally and rigorously
• They can even be actually evaluated
