An update on the Statistical Toolkit

Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto Ribon, Paolo Viarengo
July 19th, 2005

G.A.P. Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, M.G. Pia, A. Pfeiffer, A. Ribon, P. Viarengo, “A Goodness-of-Fit Statistical Toolkit”, IEEE Transactions on Nuclear Science (2004), 51 (5): 2056-2063.
Release StatisticsTesting-V1-01-00 is downloadable from the web: http://www.ge.infn.it/geant4/analysis/HEPstatistics/

Tests based on maximum distance (supremum statistics)
• Kolmogorov-Smirnov test
• Goodman approximation of KS test
• Kuiper test
Applicable to unbinned distributions.
[Figure: original distributions and their empirical distribution functions; distance Dmn]
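The maximum-distance idea can be illustrated with a minimal sketch, written in plain Python rather than with the toolkit's own API (which this transcript does not show). The function names `edf` and `ks_and_kuiper` are hypothetical. The Kolmogorov-Smirnov statistic is taken as the largest absolute gap between the two empirical distribution functions, and the Kuiper statistic as the sum of the largest positive and largest negative gaps.

```python
from bisect import bisect_right


def edf(sample):
    """Return the empirical distribution function of a sample."""
    ordered = sorted(sample)
    size = len(ordered)
    # Fraction of observations less than or equal to x.
    return lambda x: bisect_right(ordered, x) / size


def ks_and_kuiper(x, y):
    """Two-sample Kolmogorov-Smirnov D and Kuiper V statistics."""
    f, g = edf(x), edf(y)
    # Both EDFs are step functions, so the extreme gaps occur at data points.
    pooled = sorted(set(x) | set(y))
    gaps = [f(z) - g(z) for z in pooled]
    d_plus = max(max(gaps), 0.0)    # largest excess of F over G
    d_minus = max(-min(gaps), 0.0)  # largest excess of G over F
    return max(d_plus, d_minus), d_plus + d_minus


if __name__ == "__main__":
    d, v = ks_and_kuiper([1, 4], [2, 3])
    print(d, v)
```

Turning these distances into p-values requires the corresponding null distributions, which are not part of this sketch.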

Tests containing a weighting function (quadratic statistics + weighting function)
• Fisz-Cramer-von Mises test
• Anderson-Darling test
The statistic is the sum/integral of all the distances. Applicable to binned/unbinned distributions.
[Figure: original distributions and their empirical distribution functions]
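As a rough illustration of a quadratic statistic, the sketch below computes the standard two-sample Cramer-von Mises statistic, which sums the squared gaps between the two empirical distribution functions over all pooled observations with a constant weight. The function name is hypothetical and the code is not the toolkit's implementation. The Anderson-Darling test uses the same construction but weights each squared gap so that the tails count more.

```python
from bisect import bisect_right


def cramer_von_mises_2samp(x, y):
    """Two-sample Cramer-von Mises statistic T = mn/(m+n)^2 * sum of squared EDF gaps."""
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    total = 0.0
    # Sum the squared distance between the EDFs at every pooled observation.
    for z in xs + ys:
        gap = bisect_right(xs, z) / m - bisect_right(ys, z) / n
        total += gap * gap
    return m * n / (m + n) ** 2 * total


if __name__ == "__main__":
    print(cramer_von_mises_2samp([1, 2, 3], [4, 5, 6]))
```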

1. Status of the existing tests
2. New GoF tests added
3. Description of the power study: phase I
4. Description of the power study: phase II
5. A concrete example: IMRT

1. Status of the existing tests: Fisz-Cramer-von Mises
binned/unbinned distributions
• Conover (book) + Darling (1957):
- The two-sample Cramer-von Mises test (Fisz test) has the same asymptotic distribution as the one-sample test (Cramer-von Mises test).
- The equation of the asymptotic distribution is available in the paper by Anderson and Darling (1952).

1. Status of the existing tests: two-sample Anderson-Darling
binned/unbinned distributions
• Scholz and Stephens (1987):
• The two-sample Anderson-Darling test can be written in different ways:
- exact formulation (for unbinned distributions only)
- approximate formulation (for binned/unbinned distributions)
• The approximate distance is already available in the toolkit.
• The asymptotic distributions of both the exact and the approximate formulations are available in the paper.
• The two-sample Anderson-Darling test has the same asymptotic distribution as the one-sample test.
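For orientation, here is a minimal sketch of a rank-based two-sample Anderson-Darling statistic for unbinned data without ties, in the form usually attributed to the two-sample literature. Whether this coincides term by term with the "exact formulation" meant on the slide is an assumption, and the function name is hypothetical.

```python
def anderson_darling_2samp(x, y):
    """Two-sample Anderson-Darling statistic, assuming no tied observations.

    A^2 = 1/(m*n) * sum_{i=1}^{N-1} (M_i*N - m*i)^2 / (i*(N-i)),
    where M_i counts the x observations among the i smallest pooled values.
    """
    m, n = len(x), len(y)
    total_size = m + n
    # Label each observation with its sample of origin, then sort the pool.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    a2 = 0.0
    m_i = 0
    for i, (_, label) in enumerate(pooled[:-1], start=1):
        if label == 0:
            m_i += 1
        # The denominator i*(N-i) gives extra weight to the tails.
        a2 += (m_i * total_size - m * i) ** 2 / (i * (total_size - i))
    return a2 / (m * n)


if __name__ == "__main__":
    print(anderson_darling_2samp([1, 2, 3], [4, 5, 6]))
```

The statistic is symmetric in the two samples, which makes a convenient sanity check.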

1. Status of the existing tests: Tiku test
binned/unbinned distributions
• Tiku (1965):
• Cramer-von Mises test in a chi-squared approximation.
• The Cramer-von Mises test statistic is converted into a central chi-square, bypassing the problem of integrating the weighting function.

2. New GoF tests: weighted Kolmogorov-Smirnov
unbinned distributions
• Canner (1975) & Buning (2001):
• Canner modified the KS test by introducing a weighting function identical to the one used in the AD test.
• Buning modified the KS test by introducing a weighting function similar to the one used in the AD test.
- The equation of the asymptotic distribution is not available in Canner’s paper, only a few critical values for some sample sizes (n=m).

2. New GoF tests: weighted Cramer-von Mises
unbinned distributions
• Buning (2001):
• Buning modified the CVM test by introducing a weighting function similar to the one used in the AD test.
- The equation of the asymptotic distribution is not available in the paper, only critical values for many sample sizes.

2. New GoF tests: Watson
• Watson (1975):
• Derived from the Cramer-von Mises test statistic.
• Like the Kuiper test, it can be applied to cyclic observations.
- The equation of the asymptotic distribution is not available in the paper, only critical values for many sample sizes.
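The relation to the Cramer-von Mises statistic can be made concrete with a short sketch of the standard two-sample Watson U² statistic: the squared gaps between the two empirical distribution functions are summed as before, but the mean gap is subtracted first, which is what removes the dependence on the origin for cyclic data. The function name is hypothetical and this is not the toolkit's code.

```python
from bisect import bisect_right


def watson_u2_2samp(x, y):
    """Two-sample Watson statistic U^2 = mn/N^2 * (sum d_k^2 - (sum d_k)^2 / N)."""
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    total_size = m + n
    # d_k is the gap between the two EDFs at the k-th pooled observation.
    gaps = [bisect_right(xs, z) / m - bisect_right(ys, z) / n for z in xs + ys]
    sum_sq = sum(d * d for d in gaps)
    sum_d = sum(gaps)
    # Subtracting the mean-gap term is the difference from Cramer-von Mises.
    return m * n / total_size ** 2 * (sum_sq - sum_d ** 2 / total_size)


if __name__ == "__main__":
    print(watson_u2_2samp([1, 2, 3], [4, 5, 6]))
```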

Other news
• New user layer dealing with ROOT histograms (Andreas is working on that).
• Paper to IEEE-TNS.
• Next release of the GoF Statistical Toolkit scheduled within the summer.

Future developments
• Fix some design-related problems.
• New design (add uncertainties).
• Extend the toolkit to the comparison of:
- experimental data versus theoretical functions,
- k-sample problem,
- many-dimensional one-, two-, k-sample problem.

What is the recipe to select the most suitable Goodness-of-Fit test among the ones available in the GoF Statistical Toolkit?

3. Description of the power study: phase I
[Flowchart: Parent 1 and Parent 2 → Sample 1 and Sample 2 → test; Monte Carlo replications, k = 1000]
• EDF statistics (unbinned data): KS, KSW, KSA, Kuiper, CVM, ADA
• “Empirical” power evaluation
• Results: location-scale alternative
• Results: general alternative
• Comparison with published results
• Real data examples
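The Monte Carlo scheme in this slide can be sketched as follows: draw one sample from each parent distribution, apply a test, repeat k times, and report the fraction of rejections as the empirical power. This is a minimal sketch, not the study's code. It uses only the KS test, rejects with the standard asymptotic 5% critical value 1.358 * sqrt((n+m)/(n*m)), and the names `empirical_power`, `draw1`, `draw2` and the sample sizes are assumptions chosen for illustration.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(x, y):
    """Largest absolute gap between the two empirical distribution functions."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, z) / len(xs) - bisect_right(ys, z) / len(ys))
        for z in xs + ys
    )


def empirical_power(draw1, draw2, n, m, k=1000, c_alpha=1.358, seed=0):
    """Fraction of k Monte Carlo replications in which the KS test rejects.

    draw1 and draw2 take a random.Random instance and return one value
    from parent 1 and parent 2 respectively.
    """
    rng = random.Random(seed)
    # Asymptotic two-sample KS critical value (c_alpha = 1.358 for 5%).
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    rejections = 0
    for _ in range(k):
        x = [draw1(rng) for _ in range(n)]
        y = [draw2(rng) for _ in range(m)]
        if ks_statistic(x, y) > critical:
            rejections += 1
    return rejections / k


if __name__ == "__main__":
    gaussian = lambda rng: rng.gauss(0.0, 1.0)
    shifted = lambda rng: rng.gauss(1.0, 1.0)
    print(empirical_power(gaussian, shifted, n=50, m=50))
```

Passing the same parent twice estimates the rejection rate under the null hypothesis, which is a useful check before comparing alternatives.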

Parent distributions
• Gaussian
• Uniform
• Double exponential
• Cauchy
• Exponential
• Contaminated Normal Distribution 1
• Contaminated Normal Distribution 2

Skewness and tailweight
[Figure panels: Skewness; Tailweight]

Comparative evaluation of tests
[Figure: axes Tailweight and Skewness; labels: Supremum statistics tests, Tests containing a weight function]

4. Description of the power study: phase II
[Flowchart: Parent 1 and Parent 2 → Sample 1 and Sample 2 → test; Monte Carlo replications, k = 1000]
• Binned/unbinned data: CHI2, KS, KSW, KSA, Kuiper, CVM, CVMA, ADA, AD
• “Empirical” + “MC” power evaluation
• Results: location-scale alternative
• Results: general alternative
• Linear power
• Isodynes
• Correlation between tests

5. A concrete example: IMRT
Example: unbinned data, lateral profiles (Michela Piergentili)
Which is the most suitable goodness-of-fit test?

GoF test selection
1. Classify the type of the distributions in terms of skewness S and tailweight T
Skewness:
- S = 1: symmetric distribution
- S < 1: left-skewed distribution
- S > 1: right-skewed distribution
Tailweight:
- T is always greater than 1; the longer the tail, the greater the value of T.
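The classification step can be illustrated with a small sketch. The formulas for S and T did not survive in this transcript, so the quantile-based definitions below are an assumption: S = (x0.975 - x0.5) / (x0.5 - x0.025) and T = (x0.975 - x0.025) / (x0.875 - x0.125). They are consistent with the properties stated above (S = 1 for a symmetric distribution, T greater than 1). The function names and the tolerance `tol` are hypothetical.

```python
def quantile(sorted_values, p):
    """Linearly interpolated sample quantile of an already sorted list."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def skewness_tailweight(sample):
    """Return (S, T) using the quantile-based definitions assumed above."""
    s = sorted(sample)
    q025, q125, q50, q875, q975 = (
        quantile(s, p) for p in (0.025, 0.125, 0.5, 0.875, 0.975)
    )
    skewness = (q975 - q50) / (q50 - q025)
    tailweight = (q975 - q025) / (q875 - q125)
    return skewness, tailweight


def skew_direction(skewness, tol=0.05):
    """Label the skew direction; tol is an illustrative tolerance around S = 1."""
    if abs(skewness - 1.0) <= tol:
        return "symmetric"
    return "right skewed" if skewness > 1.0 else "left skewed"


if __name__ == "__main__":
    s_hat, t_hat = skewness_tailweight(list(range(1001)))
    print(s_hat, t_hat, skew_direction(s_hat))
```

The thresholds that separate, for example, medium from long tails are not given in the slides and are therefore left out of the sketch.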

2. Choose the most appropriate test for the classified type of distribution
[Figure: comparative evaluation of test power; axes: Tailweight, Skewness]

GoF test: test selection & results
Results: unbinned data
• X-variable: Ŝ = 1.53, T̂ = 1.36
• Y-variable: Ŝ = 1.27, T̂ = 1.34
• Classification: moderately skewed, medium tail
• Kolmogorov-Smirnov test: D = 0.27, p > 0.05
