Re-testing data when the part exceeds the whole

Elgin Perry, Ph.D., Statistics Consultant, and Bill Romano, MD Dept. of Natural Resources
Analytical Methods and Quality Assurance Workgroup
21 September 2007

The issue
• On occasion, labs report results where PO4 exceeds TDP, or NO23 exceeds TDN
• Labs typically re-test these results if the difference (e.g., PO4 – TDP) exceeds the sum of the two method detection limits (MDLs), i.e., MDL(PO4) + MDL(TDP) or MDL(NO23) + MDL(TDN), indicating that the difference exceeds measurement error
• This is equivalent to saying that the confidence intervals for the two measurements do not overlap

Alternatives to sum of MDLs
• A slightly more powerful test for a difference can be obtained by computing a z-score
• The difference is converted to a z-score by dividing it by the standard deviation of the difference: z = (x1 – x2) / σdiff
• The z-score can be compared to a normal probability table to determine whether the difference exceeds measurement error
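The z-score above can be written out as a worked equation. This is a sketch under one added assumption that the slides do not state: the measurement errors of the two analyses are independent, so their variances add. The symbols σ1 and σ2 (the standard deviations of the two individual measurements) are introduced here for illustration.

```latex
\[
z = \frac{x_1 - x_2}{\sigma_{\text{diff}}},
\qquad
\sigma_{\text{diff}} = \sqrt{\sigma_1^{2} + \sigma_2^{2}}
\]
```

If the two errors were positively correlated, σdiff would be smaller than this expression and the z-score correspondingly larger.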

Important assumption
• For both the sum of MDLs test and the z-score test, an important assumption is that measurement precision remains constant
• If true, then the variance estimate used to create the MDL can be used for either method

MDL calculation
• MDLs are calculated from seven aliquots of a low-level sample that are processed through the entire analytical method
• The standard deviation (s) of the seven analytical results is calculated and then multiplied by the Student's t value t(n-1, 0.99), which is 3.143 for n = 7
• Both the sum of MDLs method and the z-score test rely on this standard deviation, so both assume that it is the same for low- and high-concentration samples
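The MDL calculation described above can be sketched in a few lines of Python. The aliquot values are hypothetical, and the function names `method_detection_limit` and `sd_from_mdl` are illustrative only; the multiplier 3.143 is the value quoted in the slide.

```python
from statistics import stdev

# Student's t for n - 1 = 6 degrees of freedom at the 0.99 level, as quoted in the slide
T_99_DF6 = 3.143


def method_detection_limit(aliquots):
    """Return the MDL as t(n-1, 0.99) times the sample standard deviation of seven aliquots."""
    if len(aliquots) != 7:
        raise ValueError("this sketch hard-codes the t value for exactly seven aliquots")
    return T_99_DF6 * stdev(aliquots)


def sd_from_mdl(mdl):
    """Back out the standard deviation that a reported MDL implies."""
    return mdl / T_99_DF6


# Hypothetical results for seven aliquots of a low-level sample (not from the source)
aliquots = [0.010, 0.011, 0.009, 0.010, 0.012, 0.010, 0.011]
mdl = method_detection_limit(aliquots)
print(f"s = {stdev(aliquots):.5f}, MDL = {mdl:.5f}")
```

Dividing a reported MDL by 3.143 recovers the low-level standard deviation, which is what a z-score test would reuse if precision really were constant across concentrations.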

A recent example
• NO23 = 2.712
• TDN = 2.697
• NO23 – TDN = 0.015
• Sum of MDLs = 0.003 + 0.006 = 0.009
• Difference exceeds sum of MDLs
• z-score = 7 (a really big number!)
• Percent difference = 0.56% (a really small number)
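The arithmetic in this example can be reproduced as follows. This is a sketch under stated assumptions: each standard deviation is backed out of its MDL by dividing by 3.143, the two errors are treated as independent, and the percent difference is taken relative to TDN. The slide does not say which MDL belongs to which analyte, so the names `mdl_a` and `mdl_b` are neutral; the result does not depend on the assignment.

```python
import math

T_99_DF6 = 3.143  # t(n-1, 0.99) for seven aliquots, as used in the MDL calculation

no23, tdn = 2.712, 2.697      # reported results from the slide
mdl_a, mdl_b = 0.003, 0.006   # the two MDLs from the slide (assignment not stated)

difference = no23 - tdn
sum_of_mdls = mdl_a + mdl_b
exceeds_sum_of_mdls = difference > sum_of_mdls

# Assumption: standard deviations implied by the MDLs, errors independent
sd_a = mdl_a / T_99_DF6
sd_b = mdl_b / T_99_DF6
sd_diff = math.sqrt(sd_a ** 2 + sd_b ** 2)
z = difference / sd_diff

# Assumption: percent difference expressed relative to TDN
percent_difference = 100 * difference / tdn

print(f"difference = {difference:.3f}, sum of MDLs = {sum_of_mdls:.3f}")
print(f"exceeds sum of MDLs: {exceeds_sum_of_mdls}")
print(f"z = {z:.2f}, percent difference = {percent_difference:.2f}%")
```

Under these assumptions the z-score comes out at about 7 and the percent difference at about 0.56%, matching the figures on the slide.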

Re-measurement of same sample
• NO23 = 2.779
• TDN = 2.788
• In the replicate analysis of the same sample, TDN is greater than NO23, as it should be
• Thus the NO23 > TDN result in the first analysis was probably due to measurement error

Examine constant variance assumption
• The preceding example calls into question the assumption of constant variance
• To examine this question, use graphical analysis of lab replicate data

Standard deviation versus the mean of NO23 replicates
• Standard deviation increases as the mean concentration increases

Standard deviation versus the mean of TDN replicates
• Standard deviation is fairly constant as concentration increases

Standard deviation versus the mean of PO4 replicates
• Standard deviation increases as concentration increases

Standard deviation versus the mean of TDP replicates
• Standard deviation increases as concentration increases

Testing the constant variance assumption
• Increasing variance was modeled as a step function of the mean concentration
• The step point was chosen from the graphs to separate the low-variance and high-variance groups
• The standard deviation of each group was estimated from the root mean square error
• An F-ratio was calculated to test for equal variance between the groups
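The procedure above can be sketched as follows. This is a minimal illustration, not the authors' code: the replicate values and the step point are hypothetical, and the mean square error is taken to be the pooled within-sample variance of the replicates, which is one common reading of "root mean square error" for replicate data.

```python
import math
from statistics import mean


def pooled_variance(replicate_sets):
    """Pooled within-sample variance (mean square error) and its degrees of freedom."""
    sum_of_squares = 0.0
    degrees_of_freedom = 0
    for reps in replicate_sets:
        m = mean(reps)
        sum_of_squares += sum((x - m) ** 2 for x in reps)
        degrees_of_freedom += len(reps) - 1
    return sum_of_squares / degrees_of_freedom, degrees_of_freedom


def f_ratio(replicate_sets, step_point):
    """Split replicate sets at step_point by their mean and return F = var_high / var_low."""
    low = [r for r in replicate_sets if mean(r) < step_point]
    high = [r for r in replicate_sets if mean(r) >= step_point]
    var_low, df_low = pooled_variance(low)
    var_high, df_high = pooled_variance(high)
    return var_high / var_low, df_high, df_low, math.sqrt(var_low), math.sqrt(var_high)


# Hypothetical duplicate results (not from the source)
replicates = [
    (0.10, 0.12), (0.20, 0.22), (0.30, 0.28),
    (2.00, 2.10), (2.50, 2.40), (3.00, 3.10),
]
f, df_high, df_low, sd_low, sd_high = f_ratio(replicates, step_point=1.0)
print(f"F = {f:.1f} on ({df_high}, {df_low}) degrees of freedom")
print(f"RMSE low = {sd_low:.4f}, RMSE high = {sd_high:.4f}")
```

The resulting F value would then be compared with an F table at the stated degrees of freedom; a large ratio indicates that the two groups do not share a common variance.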

Testing for constant variance
• In all cases the degrees of freedom exceeded 100

Histogram of NO23 residuals for low variance group
• The distribution of the residuals is symmetric about the mean, but heavy-tailed

Plot of the standard normal probability density function
From: Engineering Statistics Handbook, http://www.itl.nist.gov/div898/handbook/eda/section3/eda3661.htm

Normal probability plot of NO23 residuals for low variance group

Histogram of NO23 residuals for high variance group
• The distribution of the residuals is symmetric about the mean, but heavy-tailed

Normal probability plot of NO23 residuals for high variance group

Comparing the methods
• Using a z-score of 2 would require re-testing the most pairs of data (most conservative, most work)
• Using a z-score of 3 would require re-testing fewer pairs of data
• Using the sum of MDLs would require re-testing still fewer pairs
• The stratified variance approach appears to require re-testing the fewest pairs

Recommendations
• MDLs underestimate the measurement error at higher concentrations
• Heavy-tailed distributions suggest that large differences between replicates should be trimmed before estimating precision
• Use a z-score test based on stratified variance estimates
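A z-score test based on stratified variance estimates might look like the sketch below. Everything numeric here is an assumption for illustration: the step points, the low and high standard deviations, and the critical value of 3 are hypothetical, and the two measurement errors are treated as independent. The function names are not from the source.

```python
import math
from statistics import NormalDist


def stratum_sd(concentration, step_point, sd_low, sd_high):
    """Pick the standard deviation for the variance group the concentration falls in."""
    return sd_low if concentration < step_point else sd_high


def retest_check(part, whole, part_strata, whole_strata, z_crit=3.0):
    """Return (flag, z, tail) for the test that the part exceeds the whole.

    part_strata and whole_strata are (step_point, sd_low, sd_high) tuples.
    """
    sd_part = stratum_sd(part, *part_strata)
    sd_whole = stratum_sd(whole, *whole_strata)
    sd_diff = math.sqrt(sd_part ** 2 + sd_whole ** 2)
    z = (part - whole) / sd_diff
    tail = 1.0 - NormalDist().cdf(z)  # one-sided normal tail probability
    return z > z_crit, z, tail


# Hypothetical stratified estimates: (step point, SD below it, SD at or above it)
NO23_STRATA = (1.0, 0.001, 0.02)
TDN_STRATA = (1.0, 0.002, 0.03)

# High-concentration pair from the example slide: not flagged under these assumptions
flag_high, z_high, _ = retest_check(2.712, 2.697, NO23_STRATA, TDN_STRATA)

# Hypothetical low-concentration pair: flagged for re-testing
flag_low, z_low, _ = retest_check(0.050, 0.040, NO23_STRATA, TDN_STRATA)

print(f"high pair: z = {z_high:.2f}, re-test = {flag_high}")
print(f"low pair:  z = {z_low:.2f}, re-test = {flag_low}")
```

Because the residuals were found to be heavy-tailed, the normal tail probability returned here should be read as approximate rather than exact.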
