Look elsewhere effect. Ofer Vitells. Statistics miniworkshop at CERN, February 2013. LEE Topics. Introduction. Definition of Gaussian & Gaussian-related fields. Z-dependence of trial factor. Variance of m-hat. Bayesian comparison. Different possibilities for critical region.
Look elsewhere effect. Ofer Vitells. Statistics miniworkshop at CERN, February 2013
LEE Topics
• Introduction
• Definition of Gaussian & Gaussian-related fields
• Z-dependence of trial factor
• Variance of m-hat
• Bayesian comparison
• Different possibilities for critical region
  - Constant LR (“Tevatron” test statistic) curves
  - Leadbetter formula
• Location (“energy-scale”) uncertainties
  - Single channel
  - Combination
• Approximation/estimation problems
  - Sliding window effect on upcrossings counting
  - Uncertainty on observed number of upcrossings (Poisson?)
  - When asymptotic formulae break down in practice
Gaussian & Gaussian-related fields
- Gaussian field f(t): the joint distribution of any collection {f(t1), f(t2), …, f(tn)} is multivariate Gaussian
- Gaussian-related fields are functions of Gaussian fields, e.g. the chi-squared field (Wilks)
[Figure: f(t) versus t]
Z-dependence of the trial factor • Variance of m-hat: Example: In the large-sample limit
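The growth of the trial factor with Z can be sketched numerically. The block below is a minimal illustration, not taken from the slides: it assumes the upcrossing bound for a one-degree-of-freedom chi-squared field with a one-sided local p-value, p_global ≈ p_local + ⟨N(u0)⟩·exp(−(u − u0)/2) with u = Z². The function names and the values of `n_up` and `u0` are hypothetical.

```python
import math

def local_p_value(z):
    """One-sided Gaussian tail probability for a local significance z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def global_p_value(z, n_up, u0):
    """Upcrossing bound on the global p-value (assumed form, see lead-in).

    n_up is the expected number of upcrossings of the level u0 by q(m);
    the threshold is u = z**2. The bound is capped at 1.
    """
    u = z * z
    return min(1.0, local_p_value(z) + n_up * math.exp(-(u - u0) / 2.0))

def trial_factor(z, n_up, u0):
    """Ratio of global to local p-value."""
    return global_p_value(z, n_up, u0) / local_p_value(z)

# Hypothetical inputs: 5 expected upcrossings at the reference level u0 = 0.
for z in (3.0, 4.0, 5.0):
    print(f"Z = {z:.0f}: trial factor = {trial_factor(z, 5.0, 0.0):.1f}")
```

Under these assumptions the trial factor is not a constant: for large Z it grows roughly linearly in Z, because the local tail probability falls like exp(−Z²/2)/Z while the upcrossing term falls like exp(−Z²/2).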
Bayesian estimate. There is less posterior probability in the peak as it narrows (~1/Z).
• With a uniform prior: “Trial factor”
• Jeffreys prior: cancels the Z dependence
[Figure: example normalized likelihoods, for background only and for background + signal]
Bayesian estimate: integrate over m?
What exactly is meant by 'more impressive than observed in data'? q(PL) vs. qTEV.
Observed significance: Zobs. SM expected significance (sensitivity): ZSM.
At a given mass point the two tests are equivalent (1-to-1 functions of the observed local significance), but they give different answers to what is the “best fit” mass.
* Note qTEV = 0 if ZSM = 0. max[qTEV] is generally not at the point of largest local significance.
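The relation between the two test statistics can be made explicit in a simple Gaussian approximation. This is a sketch under stated assumptions rather than the formulas from the slides: the signal-strength estimator $\hat\mu$ is taken to be Gaussian with standard deviation $\sigma$ at each mass, the symbols $Z_{\mathrm{obs}} = \hat\mu/\sigma$ and $Z_{\mathrm{SM}} = 1/\sigma$ are introduced here for the observed and SM expected significance, and qTEV is taken with the sign convention in which signal-like data give large positive values.

```latex
\begin{align}
-2\ln L(\mu) &= \frac{(\hat\mu-\mu)^2}{\sigma^2} + \text{const}, \\
q_{\mathrm{PL}} &= -2\ln\frac{L(0)}{L(\hat\mu)} = \frac{\hat\mu^2}{\sigma^2} = Z_{\mathrm{obs}}^2
  \qquad (\hat\mu \ge 0), \\
q_{\mathrm{TEV}} &= -2\ln\frac{L(0)}{L(1)}
  = \frac{\hat\mu^2-(\hat\mu-1)^2}{\sigma^2}
  = 2\,Z_{\mathrm{obs}}Z_{\mathrm{SM}} - Z_{\mathrm{SM}}^2 .
\end{align}
```

In this approximation both statistics are monotonic in $Z_{\mathrm{obs}}$ at fixed mass, $q_{\mathrm{TEV}}$ vanishes when $Z_{\mathrm{SM}} = 0$, and the mass dependence of $Z_{\mathrm{SM}}$ is what lets the maximum of $q_{\mathrm{TEV}}$ sit away from the maximum of $Z_{\mathrm{obs}}$.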
Curves of constant qTEV. p-value = Prob( max[qTEV] > c ). Can be estimated with Leadbetter’s formula (upcrossings above a curve). A similar signal at 160 GeV would give a much smaller global significance (because it is less consistent with the SM), the same as a local 1σ at 600 GeV.
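The shape of a constant-qTEV curve in the plane of expected versus observed significance can be sketched in a few lines. The block assumes a Gaussian approximation in which qTEV = 2·Zobs·ZSM − ZSM² (an assumption made for illustration, not a formula quoted from the slides); the function names and the threshold `c = 9` are hypothetical.

```python
def q_tev(z_obs, z_sm):
    """Tevatron-style test statistic in the assumed Gaussian approximation."""
    return 2.0 * z_obs * z_sm - z_sm ** 2

def z_obs_on_curve(c, z_sm):
    """Local significance needed at sensitivity z_sm to reach q_TEV = c."""
    return (c + z_sm ** 2) / (2.0 * z_sm)

c = 9.0  # hypothetical threshold
for z_sm in (0.5, 1.0, 3.0, 6.0):
    print(f"Z_SM = {z_sm:>3}: Z_obs on the curve = {z_obs_on_curve(c, z_sm):.2f}")
```

In this sketch the required local significance is smallest where the sensitivity equals √c and rises on both sides: a low-sensitivity mass point needs a very large excess, and a high-sensitivity point needs an excess close to what the SM predicts.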
Energy-scale uncertainties. Likelihood at a fixed mass M0. Energy-scale nuisance parameter: “local” LEE (Leadbetter)
ATLAS combined Higgs workspace: toy sampling at 126.5 GeV with ES uncertainty. Leadbetter formula with a parabolic curve (Gaussian constraint). Similar to a LEE in the range defined by
Combination: 2D field (trial factor ) [Figure: axes m1 and m2]
• Sliding windows (mass-dependent cuts) ==> discontinuity in q(m) due to events getting in/out
[Figure: q(m) with the upcrossing level u]
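Why discontinuities matter for the counting can be seen with a small upcrossing counter. This is an illustrative sketch: the function name and the two toy scans are hypothetical, and the second scan simply pulls one point down to mimic an event leaving the selection window.

```python
def count_upcrossings(q_values, level):
    """Count the scan steps where q goes from at-or-below level to above it."""
    count = 0
    for prev, curr in zip(q_values, q_values[1:]):
        if prev <= level < curr:
            count += 1
    return count

# Toy scans of q(m) on a mass grid (made-up numbers).
smooth_scan = [0.1, 0.3, 0.6, 0.8, 0.7, 0.4, 0.2]
jumpy_scan = [0.1, 0.3, 0.6, 0.4, 0.7, 0.4, 0.2]  # one point drops out and back

print(count_upcrossings(smooth_scan, 0.5))
print(count_upcrossings(jumpy_scan, 0.5))
```

A single bump counted on the smooth scan becomes two upcrossings once the jump is present, so the estimated number of upcrossings is biased upwards by the windowing.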
Uncertainty on observed number of upcrossings
• Usually assumed Poisson
• Effect on significance is logarithmic
When & how asymptotic formulae break down in practice
• ?
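The weak (logarithmic) sensitivity of the significance to the upcrossing count can be checked numerically. The sketch below assumes the upcrossing bound p_global ≈ p_local + N·exp(−(Z² − u0)/2) for a one-degree-of-freedom field and a Poisson spread of √N on the observed count; the function name and all input values are hypothetical.

```python
import math
from statistics import NormalDist

def global_significance(z_local, n_up, u0):
    """Global significance from the assumed upcrossing bound.

    Only meaningful when the resulting global p-value is well below 1.
    """
    p_local = 0.5 * math.erfc(z_local / math.sqrt(2.0))
    p_global = p_local + n_up * math.exp(-(z_local ** 2 - u0) / 2.0)
    return NormalDist().inv_cdf(1.0 - p_global)

z_local, u0 = 4.0, 1.0      # hypothetical local significance and reference level
n_obs = 4.0                 # hypothetical observed number of upcrossings at u0
spread = math.sqrt(n_obs)   # Poisson standard deviation

for n_up in (n_obs - spread, n_obs, n_obs + spread):
    z_global = global_significance(z_local, n_up, u0)
    print(f"N = {n_up:.0f}: Z_global = {z_global:.2f}")
```

With these inputs, a factor of three between the low and high counts moves the global significance by only a few tenths of a sigma.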
Example of combination of channels with different mass resolutions
• Toy combination of two channels (both Gaussian signal + flat bkg):
- channel 1: σm = 1 GeV
- channel 2: σm = 10 GeV
Combination example
[Figure: q0(mH) and µ-hat(mH) versus mH]
Note the effect of the wide bump on the number of upcrossings at 1
Average number of upcrossings. Note that the average number of upcrossings in the combination is always smaller than in channel 1 alone. 25,000 toy simulations.
2-D example #2: resonance search with unknown width
• Gaussian signal on exponential background
• Toy model: 0<m<100, 2<σ<6
• Unbinned likelihood:
[Figure: axes σ and m]
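An unbinned likelihood for this kind of toy model could be written as follows. This is a sketch only: the likelihood used in the talk is not preserved in the text, so the parameterization (a signal fraction, an exponential slope, a Gaussian whose truncation at the range edges is neglected) and all function and parameter names are assumptions.

```python
import math

M_MIN, M_MAX = 0.0, 100.0  # mass range of the toy model

def background_pdf(x, slope):
    """Exponential exp(-slope * x), normalised on [M_MIN, M_MAX]."""
    norm = (math.exp(-slope * M_MIN) - math.exp(-slope * M_MAX)) / slope
    return math.exp(-slope * x) / norm

def signal_pdf(x, mass, width):
    """Gaussian peak; truncation at the range edges is neglected."""
    pull = (x - mass) / width
    return math.exp(-0.5 * pull * pull) / (width * math.sqrt(2.0 * math.pi))

def negative_log_likelihood(events, signal_fraction, mass, width, slope):
    """Unbinned -ln L for a signal + background mixture density."""
    total = 0.0
    for x in events:
        density = ((1.0 - signal_fraction) * background_pdf(x, slope)
                   + signal_fraction * signal_pdf(x, mass, width))
        total -= math.log(density)
    return total

# Made-up events: three clustered near 50, three spread over the range.
events = [48.0, 50.0, 52.0, 10.0, 30.0, 70.0]
print(negative_log_likelihood(events, 0.3, 50.0, 3.0, 0.03))
print(negative_log_likelihood(events, 0.3, 20.0, 3.0, 0.03))
```

Scanning this function over a grid in (mass, width) and comparing with the background-only value at `signal_fraction = 0` gives a two-dimensional q(m, σ) surface of the kind a search with unknown width explores.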
2-D example #2: resonance search with unknown width. Excellent approximation above the ~2σ level.
[Figure: p-value, with curves labelled u=0 and u=1]
(2nd term is dominant for )
[Figure: axes m1 and m2]
To have the distribution well defined for , take , since e.g. If take , such that constant ( ) In this limit is independent of up to e.g.