Program for evaluation of the significance, confidence intervals and limits by direct probabilities calculations

S.Bityukov (IHEP, Protvino), S.Erofeeva (MSA IECS, Moscow), N.Krasnikov (INR RAS, Moscow), A.Nikitenko (IC, London)

September, 2005 PhyStat 2005 Oxford, UK S.Bityukov

Introduction

When planning or analysing an experiment, we often test the statistical hypothesis H0: new physics is present in Nature against the hypothesis H1: new physics is absent in Nature. The uncertainty of our conclusion is defined by the probabilities
α = P(reject H0 | H0 is true), the Type I error, and
β = P(accept H0 | H0 is false), the Type II error.

There are many definitions of significance as a measure of the excess of signal events above background. There are also many approaches to constructing intervals and limits: confidence, tolerance, fiducial and so on. At one of the CMS meetings Gunter Quast formulated the practitioner's problem: “the only remaining problem: make a choice … chosen method should be "as simple as possible, but not wrong!"”

Motivation of the work (significance)

1. The Gaussian limit gives the wrong answer for low values of β (the tail of the Poisson distribution is heavier than the tail of the Gaussian).
2. Statistics like S_L (a likelihood-ratio-based test statistic) have poor statistical properties as estimators of significance. Here $S_L = \sqrt{2(\ln L_1 - \ln L_2)} = \sqrt{2\ln Q}$, where Q is the ratio of binned/unbinned likelihood fits for the hypotheses H0 (signal present) and H1 (no signal present).

The simplest significance is the significance S_cP described on the next slide. S_cP is a quite natural significance, which allows any uncertainties to be taken into account by direct calculation of probabilities.

Definition of the significance S_cP

S_cP is the probability, from a Poisson distribution with mean μ_b, to observe Nobs or more events, converted to the equivalent number of sigmas of a Gaussian distribution (see the report (page 8) by G. Quast at the CMS Physics analysis days, May 9-12, 2005, CERN, http://cmsdoc.cern.ch/~bityukov/talks/talks.html; see also I.Narsky, NIM A450 (2000) 444).

The program ScP presented here calculates this significance taking into account experimental systematics with statistical properties (Gaussian approximation) and theoretical systematics without any statistical properties. Optionally, the program also calculates the combined significance of several channels. All channels are assumed to be independent.
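The definition can be illustrated with a minimal Python sketch. It is not the ScP program (which is written in Fortran 77); the function names are hypothetical, Nobs is assumed to be an integer, and the conversion to sigmas assumes a one-sided Gaussian tail.

```python
import math
from statistics import NormalDist


def poisson_tail(n_obs, mu_b):
    """P(n >= n_obs) for a Poisson distribution with mean mu_b (integer n_obs >= 1)."""
    below = sum(math.exp(-mu_b) * mu_b**k / math.factorial(k) for k in range(n_obs))
    return 1.0 - below


def sigmas_from_tail(beta):
    """Number of sigmas z of a standard Gaussian with one-sided tail P(Z >= z) = beta."""
    return -NormalDist().inv_cdf(beta)


def s_cp(n_obs, mu_b):
    """Poisson tail probability expressed as an equivalent number of Gaussian sigmas."""
    return sigmas_from_tail(poisson_tail(n_obs, mu_b))


example = s_cp(7, 2.0)
print(f"S_cP(Nobs=7, mu_b=2) = {example:.3f}")
```

For μ_b = 2 and Nobs = 7 this gives about 2.61, close to the S_cP = 2.6095 quoted in the example output for bkg=2, sig=5.4. That agreement suggests, but does not prove, that the program truncates Nobs = 7.4 to an integer. The subtraction 1 - below loses precision for very small tails, so this sketch is only meant for moderate significances.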

Conception

The probability of making a Type II error (β) in the hypothesis test about the presence of signal in the experiment (H0) is used to determine the number of sigmas (of the background distribution) between the expected background and the observed number of events Nobs (formula 8 in CMS CR 2002/05). This probability determines the signal significance, i.e. S_cP is found by solving the equations

\[ \beta = \frac{1}{\sqrt{2\pi}} \int_{S_{cP}}^{\infty} e^{-x^2/2}\,dx , \qquad \text{where} \qquad \beta = \sum_{i=N_{obs}}^{\infty} \frac{\mu_b^{\,i}\, e^{-\mu_b}}{i!} . \]

The same probability can be used to combine results. Let us consider two possible approaches.

Combining of observed results: Approach 1

Suppose that the observed value is greater than the expected background. Let β_1 be the Type II error for channel 1 (event A = background ≥ Nobs_1, with P(A) = β_1) and β_2 be the Type II error for channel 2 (event B = background ≥ Nobs_2, with P(B) = β_2). Because event A is independent of event B, the probability that A and B occur simultaneously is β_12 = P(AB) = P(A)*P(B) = β_1*β_2. Once β_12 is determined, one can calculate S_cP.
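Approach 1 can be sketched in a few lines of Python. The helper names are hypothetical; the two channels are taken from the two-channel example output of this talk (background 1, signal 5 and background 5, signal 1, i.e. Nobs = 6 in both), and no uncertainties are included.

```python
import math
from statistics import NormalDist


def poisson_tail(n_obs, mu_b):
    """P(n >= n_obs) for a Poisson distribution with mean mu_b (integer n_obs >= 1)."""
    below = sum(math.exp(-mu_b) * mu_b**k / math.factorial(k) for k in range(n_obs))
    return 1.0 - below


def sigmas_from_tail(beta):
    """One-sided Gaussian equivalent of the tail probability beta."""
    return -NormalDist().inv_cdf(beta)


# (Nobs, mu_b) per channel; values taken from the two-channel example output
channels = [(6, 1.0), (6, 5.0)]

betas = [poisson_tail(n_obs, mu_b) for n_obs, mu_b in channels]
per_channel = [sigmas_from_tail(beta) for beta in betas]

# Independent channels: the joint probability is the product of the betas
beta_12 = math.prod(betas)
combined = sigmas_from_tail(beta_12)

print(per_channel, combined)
```

With these inputs the per-channel values come out near 3.24 and 0.29, and the combined value near 3.51, in line with the "without errors" rows of the example output (3.2417, .29489 and 3.5051).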

Combining of expected signals & backgrounds: Approach 2

Suppose that Nobs is the sum of the expected signal (μ_s) and the expected background (μ_b), i.e. Nobs = μ_s+μ_b (the case of a planned experiment). Then the sums of the expected numbers of signal (μ_s_i) and background (μ_b_i) events over the channels are used as the total μ_s and μ_b for the calculation of the combined significance. Note that in this case we take into account both the fluctuation of the expected background and the fluctuation of the expected signal. Once the corresponding β is determined, one can calculate S_cP.
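A minimal Python sketch of Approach 2, again with hypothetical helper names and the two channels of the example output. The summing rules for σ_b and δ_b follow the "Uncertainties" slide; truncating Nobs to an integer is an assumption of this sketch, not a documented property of the program.

```python
import math
from statistics import NormalDist


def poisson_tail(n_obs, mu_b):
    """P(n >= n_obs) for a Poisson distribution with mean mu_b (integer n_obs >= 1)."""
    below = sum(math.exp(-mu_b) * mu_b**k / math.factorial(k) for k in range(n_obs))
    return 1.0 - below


def sigmas_from_tail(beta):
    """One-sided Gaussian equivalent of the tail probability beta."""
    return -NormalDist().inv_cdf(beta)


# Per channel: (mu_s, mu_b, sigma_b, delta_b), values from the example output
channels = [(5.0, 1.0, 1.0, 0.05), (1.0, 5.0, 2.2361, 0.25)]

mu_s = sum(ch[0] for ch in channels)
mu_b = sum(ch[1] for ch in channels)
sigma_b = math.sqrt(sum(ch[2] ** 2 for ch in channels))  # variances add
delta_b = sum(ch[3] for ch in channels)                  # partial delta_b add

n_obs = math.floor(mu_s + mu_b)  # assumption: integer Nobs
combined = sigmas_from_tail(poisson_tail(n_obs, mu_b))

print(mu_s, mu_b, round(sigma_b, 3), round(delta_b, 3), round(combined, 3))
```

The summed inputs (6.00, 6.00, 2.449, 0.300) and the significance without uncertainties (about 2.05) agree with the "Sum" rows of the example output. The sketch does not propagate σ_b or δ_b into the significance.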

Uncertainties

The program takes into account two types of uncertainties:
a) experimental systematics with statistical properties (we assume that this systematics has a Gaussian distribution with known variance σ_b**2, according to the formula μ_b = expected background + N(0, σ_b)).
   Appr. 2: when channels are combined, the total variance σ_b**2 is the sum of the partial variances σ_b_i**2.
b) theoretical systematics (δ_b) without any statistical properties (we assume that the worst case occurs when the background is maximal, i.e. μ_b*(1+δ_b), while the signal plus the background is still taken as Nobs; more information can be found in S.Bityukov, N.Krasnikov, CMS CR 2002/05 or S.Bityukov, N.Krasnikov, Mod.Phys.Lett. A13 (1998) 3235).
   Appr. 2: the combined δ_b is the sum of the partial δ_b_i.

Main input and output parameters

Main input parameters:
1. expected background: μ_b
2. signal = observed value (Nobs) - expected background (μ_b): μ_s
3. experimental uncertainty (r.m.s.) of background with statistical properties: σ_b
4. systematics of theoretical origin in background: δ_b

Output parameters:
1. significance S_cP, calculated by formula: dsgnf
2. significance S_cP_MC, calculated by Monte Carlo: dsgnfm

Auxiliary input parameters

1. switch for choosing the type of calculation: iflag
   iflag = 1: calculations by formula (quick calculations)
   iflag = 2: Monte Carlo calculations
   iflag = 12: calculations by formula and by Monte Carlo
2. number of channels for calculations: nchan (from 1 up to 10)
3. number of channels for combined S_cP: ncombi (from 1 up to nchan)
4. parameter for Monte Carlo calculations: over
   over is the number of Monte Carlo trials that must yield a number of events greater than or equal to Nobs. This parameter (together with the internal value dbeta) determines the number of trials for given μ_s, μ_b, σ_b and δ_b in routine SCPMC.
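The role of over can be illustrated with a Monte Carlo sketch in Python. This is not the routine SCPMC: the stopping rule is one reading of the description of over, the max_trials cap and seed are hypothetical parameters introduced here, the role of dbeta is not modelled, and clipping a smeared background mean at zero is an assumption.

```python
import math
import random
from statistics import NormalDist


def poisson_sample(mu, rng):
    """Draw one Poisson variate by Knuth's multiplication method (fine for small mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def s_cp_mc(n_obs, mu_b, sigma_b=0.0, over=500, max_trials=2_000_000, seed=1):
    """Estimate the significance by running trials until `over` of them give n >= n_obs."""
    rng = random.Random(seed)
    hits = trials = 0
    while hits < over and trials < max_trials:
        trials += 1
        mu = mu_b
        if sigma_b > 0.0:
            # mu_b = expected background + N(0, sigma_b); negative means clipped (assumption)
            mu = max(0.0, rng.gauss(mu_b, sigma_b))
        if poisson_sample(mu, rng) >= n_obs:
            hits += 1
    if hits == 0:
        return None  # tail too small to be resolved within max_trials
    beta = hits / trials
    return -NormalDist().inv_cdf(beta)


plain = s_cp_mc(7, 2.0)
smeared = s_cp_mc(7, 2.0, sigma_b=1.4142)
print(plain, smeared)
```

Without smearing the estimate should land near the formula value of about 2.61 for Nobs = 7, μ_b = 2; with σ_b = 1.4142 it drops, as in the example output. The exact smeared value depends on how negative background means are treated, so this sketch is not expected to reproduce the S_cP_MC column digit for digit.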

The structure of program

Language: Fortran 77

Three different types of calculations, selected by iflag:
1. SCPFOR: calculations by formula
2. SCPMC: Monte Carlo calculations
12. SCPFOR + SCPMC

The main program processes the user requirements (defined in DATA statements) and calls the routines SCPFOR and/or SCPMC.

Problem and approximation

The problem that arises during the calculations is the restricted range of applicability of the standard procedure DGAUSN in CERNLIB. For values of S_cP > 6.2-7 the procedure gives an incorrect result. In this case we use as a good approximation the significance (MPL A13 (1998) 3235)

\[ S_{c12} = 2\left(\sqrt{\mu_s+\mu_b} - \sqrt{\mu_b}\right). \]

Accounting for the uncertainties is very simple:

theoretical systematics (δ_b)
\[ S_{c12t} = 2\left(\sqrt{\mu_s+\mu_b} - \sqrt{\mu_b(1+\delta_b)}\right), \]

experimental systematics (σ_b**2)
\[ S_{c12e} = 2\left(\sqrt{\mu_s+\mu_b} - \sqrt{\mu_b}\right)\sqrt{\frac{\mu_b}{\mu_b+\sigma_b^2}}. \]
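The approximations can be written as three short Python functions. The function names are hypothetical, and the square roots reflect the reading S_c12 = 2(√(μ_s+μ_b) - √μ_b) of the formulas on this slide, a reading that reproduces the S_c12 column of the example outputs.

```python
import math


def s_c12(mu_s, mu_b):
    """S_c12 = 2 (sqrt(mu_s + mu_b) - sqrt(mu_b))."""
    return 2.0 * (math.sqrt(mu_s + mu_b) - math.sqrt(mu_b))


def s_c12t(mu_s, mu_b, delta_b):
    """Theoretical systematics: background taken at its maximum mu_b (1 + delta_b)."""
    return 2.0 * (math.sqrt(mu_s + mu_b) - math.sqrt(mu_b * (1.0 + delta_b)))


def s_c12e(mu_s, mu_b, sigma_b):
    """Experimental systematics: S_c12 scaled by sqrt(mu_b / (mu_b + sigma_b^2))."""
    return s_c12(mu_s, mu_b) * math.sqrt(mu_b / (mu_b + sigma_b**2))


print(round(s_c12(5.4, 2.0), 3))           # example of G.Quast: bkg=2, sig=5.4
print(round(s_c12e(5.4, 2.0, 1.4142), 3))  # same channel with sigma_b = 1.4142
print(round(s_c12(100.0, 500.0), 3))       # channel with bkg=500, sig=100
```

For bkg=2, sig=5.4 these give 2.612 and, with σ_b = 1.4142, 1.847, matching the S_c12 column of the single-channel output. How the program combines δ_b and σ_b in the rows where both are non-zero is not stated on the slide and is not modelled here.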

Simplest example of program S_cP output

Example of G.Quast: bkg=2 sig=5.4. Here S1=3.8 S12=2.6 SL=2.7

Significance S_cP and/or S_cP_MC:
NN of channels = 1, Combining channels from 1 up to 1
calculation type = 12
types: (1) S_cP by formula and (2) S_cP_MC by Monte Carlo
σ_b - experimental uncertainty, i.e. μ_b = background + N(0, σ_b)
δ_b - systematics of theoretical origin without statistical properties

#ch  backgr.  signal  σ_b     δ_b         S_cP    S_cP_MC  S_c12
1    2.00     5.40    0.0000  0.0000      2.6095  2.5923   2.612
                      1.4142  0.0000      1.8581  1.8759   1.847
                      1.4142  .50000E-01  1.8373  1.8677   1.811

Program output with combining of channels

NN of channels = 2, Combining channels from 1 up to 2

#ch  backgr.  signal  σ_b     δ_b         S_cP    S_cP_MC  S_c12
1    1.00     5.00    0.0000  0.0000      3.2417  3.2363   2.899
                      1.0000  0.0000      2.2798  2.3116   2.050
                      1.0000  .50000E-01  2.2547  2.3177   1.990
2    5.00     1.00    0.0000  0.0000      .29489  .30434   .4248
                      2.2361  0.0000      .24597  .25775   .3018
                      2.2361  .25000      .17006  .17654   .2210

COMBINING of OBSERVED RESULTS
                                           S_cP    S_cP_MC
Combined channels(1-2) without errors      3.5051  3.5026
Combined channels(1-2) with stat. errors   2.6078  2.6402
Combined channels(1-2) both types of err.  2.5607  2.6198

COMBINING for EXPECTED SIGNAL and BACKGROUND
     backgr.  signal  σ_b     δ_b         S_cP    S_cP_MC  S_c12
Sum  6.00     6.00    0.000   0.000       2.052   2.045    2.029
                      2.449   0.000       1.486   1.500    1.435
                      2.449   0.300       1.406   1.398    1.333

Range of applicability of the program ScP

NN of channels = 3, Combining channels from 1 up to 1
calculation type = 12

#ch  backgr.  signal  σ_b     δ_b     S_cP    S_cP_MC  S_c12
1    500.00   100.00  0.0000  0.0000  4.3205  4.3144   4.268
                      22.361  0.0000  3.1131  3.0378   3.018
                      22.361  25.000  2.3040  2.3390   2.210
2    300.00   120.00  0.0000  0.0000  6.5145  0.0      6.347
                      17.321  0.0000  4.8231  4.6070   4.488
                      17.321  15.000  4.1453  4.0227   3.835
3    15000.0  1000.0  0.0000  0.0000  6.2873  0.0      8.033
                      122.47  0.0000  6.1213  0.0      5.680
                      122.47  575.00  2.4477  0.0      2.369

Motivation of the work (confidence intervals)

Suppose f(n;μ) describes the Poisson distribution of probabilities and g(μ;n) is the density of the Gamma distribution Γ_{1,1+n}. Then

\[ \sum_{i=n+1}^{\infty} f(i;\mu_1) + \int_{\mu_1}^{\mu_2} g(\mu;n)\,d\mu + \sum_{i=0}^{n} f(i;\mu_2) = 1 \qquad \text{(Eq. 1)} \]

where

\[ f(i;\mu) = \frac{\mu^{i} e^{-\mu}}{i!}, \qquad g(\mu;n) = \frac{\mu^{n} e^{-\mu}}{n!}, \qquad 0 \le \mu_1 \le \mu_2, \]

and n is the observed number of random events appearing in a Poisson flow during a certain period of time.

This identity shows that in our case the probability distribution of the true value of the Poisson parameter (the confidence density) for the observed value n is the Gamma distribution with mode n and mean value n+1, i.e. the observed value n corresponds to the most probable value of the parameter.

The Poisson and Gamma distributions are statistically dual distributions. As has been shown, for these distributions only a single confidence density can be reconstructed.
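The duality can be checked numerically. The sketch below, with hypothetical function names and an integer n, verifies one standard form of the Poisson-Gamma relation: the Poisson probability of observing at most n events at mean μ equals the probability, under the Gamma density with shape n+1, that the parameter exceeds μ. It also checks that this density has mean n+1.

```python
import math


def poisson_pmf(i, mu):
    """f(i; mu): Poisson probability of i events for mean mu."""
    return math.exp(-mu) * mu**i / math.factorial(i)


def gamma_density(mu, n):
    """g(mu; n): Gamma density with unit scale and shape n + 1 (integer n)."""
    return math.exp(-mu) * mu**n / math.factorial(n)


def simpson(func, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


n, mu = 3, 2.5

# Poisson side: P(at most n events | mu)
lhs = sum(poisson_pmf(i, mu) for i in range(n + 1))
# Gamma side: P(parameter > mu | n) = 1 - integral of g from 0 to mu
rhs = 1.0 - simpson(lambda x: gamma_density(x, n), 0.0, mu)

# Mean of the confidence density; the upper cut-off 60 is far in the tail for n = 3
mean = simpson(lambda x: x * gamma_density(x, n), 0.0, 60.0, steps=20000)

print(lhs, rhs, mean)
```

The two sides agree to numerical precision and the mean comes out at n+1 = 4, consistent with the statement that the confidence density for observed n is a Gamma distribution with mean n+1.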

Program Limsb

The uniqueness of the confidence density allows confidence intervals to be constructed in the simplest (and correct) way: for the observed value n we reconstruct the corresponding confidence density and determine the confidence intervals and/or confidence limits by direct calculation of probabilities.

At present the program Limsb constructs the central confidence interval and the confidence interval of minimal length for the observed value n.

Input: the values EPS and CL and the array DLAMB. The test set of observed values is given in the data array DLAMB. The value EPS determines the precision of the calculations. The value CL determines the confidence level of the intervals.
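The construction can be sketched in Python for an integer observed value n. This is not the Limsb code: the function names are hypothetical, the confidence density is taken to be the Gamma density with shape n+1, quantiles are found by bisection, and the shortest interval is located with a ternary search that assumes the interval length is unimodal in its left bound.

```python
import math


def gamma_cdf(x, n):
    """CDF of the Gamma confidence density with shape n + 1 (integer n)."""
    if x <= 0.0:
        return 0.0
    return 1.0 - sum(math.exp(-x) * x**k / math.factorial(k) for k in range(n + 1))


def quantile(p, n, eps=1e-10):
    """Solve gamma_cdf(x, n) = p by bisection to absolute precision eps."""
    lo, hi = 0.0, n + 1.0
    while gamma_cdf(hi, n) < p:
        hi *= 2.0
    while hi - lo > eps:
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def central_interval(n, cl):
    """Equal probability (1 - cl) / 2 in each tail."""
    tail = 0.5 * (1.0 - cl)
    return quantile(tail, n), quantile(1.0 - tail, n)


def shortest_interval(n, cl, iterations=100):
    """Interval of minimal length containing probability cl."""
    def right(left):
        return quantile(min(gamma_cdf(left, n) + cl, 1.0 - 1e-12), n)

    lo, hi = 0.0, quantile(1.0 - cl, n)
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if right(a) - a < right(b) - b:
            hi = b
        else:
            lo = a
    left = 0.5 * (lo + hi)
    return left, right(left)


central = central_interval(1, 0.9)
shortest = shortest_interval(1, 0.9)
print(central, shortest)
```

For n = 1 and CL = 0.9 the sketch gives a central interval close to (0.355, 4.744) and a shortest interval close to (0.084, 3.932), in line with the corresponding rows of the Limsb example output. Non-integer observed values, which appear in the DLAMB test set, would need an incomplete gamma function and are outside this sketch.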

Simplest example of program Limsb output

Confidence limits: eps, CL = 9.99999975E-05 0.899999976
central and shortest confidence intervals

NN ev       interval  left bound    right bound  left tail     upper prob.  length
.10000E-01  Central   0.05308130    3.015071     0.04998137    0.9500018    2.961990
            Minimal   1.30385E-08   2.319871     1.082756E-08  0.9000000    2.319871
.10000      Central   0.07074530    3.186507     0.04999527    0.9500008    3.115762
            Minimal   6.053597E-09  2.473754     8.71932E-10   0.9000000    2.473754
.50000      Central   0.17588568    3.907293     0.04998513    0.9499978    3.731407
            Minimal   0.00512708    3.128773     0.00027532    0.9002753    3.123645
1.0000      Central   0.35530150    4.743777     0.04998505    0.94999635   4.3884754
            Minimal   0.08397551    3.932307     0.00333463    0.90290034   3.8483307
10000.      Central   9837.07715    10166.06     0.04999995    0.94999993   328.98340
            Minimal   9836.24023    10165.22     0.04913389    0.94913387   328.97656

Conclusion

The programs ScP and Limsb can be found on the Web page http://cmsdoc.cern.ch/~bityukov

We are ready to include in the program Limsb the calculation of confidence intervals for the Poisson distribution parameter for signal events in the presence of background (the formula of O.Helene, which arises naturally in our approach, see hep-ex/0108020).

We are grateful to Vladimir Gavrilov, Vassili Katchanov, and Albert De Roeck for their interest in and support of this work. We would like to thank Bob Cousins, Vladimir Obraztsov and Claudia Wulz for discussions and useful comments.
