This study examines the deviations observed in Hadlum Jr. compared to normal patterns of pregnancy duration, utilizing a dataset of 13,634 samples. By applying univariate analysis and cumulative probability plots, we illustrate how deviations from a normal distribution can indicate potential outliers or non-normal distributions. We also interpret the findings within the context of skull capacity among the Maoris, exploring hypotheses regarding population differences and the influence of outliers on statistical predictions. This comparative analysis emphasizes the necessity of robust regression techniques to handle deviations effectively.
Screening (Significant Effects)
Hadlum vs. Hadlum: a univariate example that illustrates deviation from a normal pattern.
Figure: normal duration of pregnancy, percentage of cases (n = 13,634) against duration of pregnancy. Source: Barnett (1978), Appl. Statist. 27, 242-250.
Comparison of Hadlum Jr. to the normal pattern. Figure: normal duration of pregnancy, percentage of cases (n = 13,634), with the Hadlum Jr. observation marked.
Model validation: deviation (residual) = observed value - predicted value, i.e. e = y - ŷ, where y is the measurement and ŷ is the value predicted by the model.
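The residual definition above can be sketched in a few lines of Python. This is a minimal illustration, assuming a straight-line model fitted by ordinary least squares; the helper names `fit_line` and `residuals` and the five data points are made up for the example and do not come from the slides.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(x, y, slope, intercept):
    """Deviation = observed value - predicted value, one per observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Illustrative data only (hypothetical): the third measurement is too high.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 9.0, 8.1, 9.9]

slope, intercept = fit_line(x, y)
res = residuals(x, y, slope, intercept)

# Index of the observation that deviates most from the fitted model.
worst = max(range(len(res)), key=lambda i: abs(res[i]))
print(worst, [round(r, 2) for r in res])
```

Note that a least-squares fit is itself pulled toward deviating points, so the largest residual does not always sit on the true outlier; the residuals should be inspected, not just ranked.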
Normal population: cumulative plots. Figure: the same cumulative distribution drawn on traditional graph paper and on normal distribution (probability) paper.
Normal plot
1) Sort the observations in increasing order.
2) Let each observation represent a percent interval equal to 100/n % of the normal distribution, where n is the number of observations.
If the observations are normally distributed, they fall on a straight line in the normal plot. Deviation from a straight line implies outlying observations or a non-normal distribution.
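The two steps above can be sketched as follows. This is a minimal sketch using only the Python standard library. It assumes each observation is placed at the midpoint of its interval, (i - 0.5)/n, which is one common plotting-position convention and not necessarily the one used on the slides. The function names and the sample values are hypothetical.

```python
from statistics import NormalDist


def normal_plot_points(observations):
    """Return (standard normal quantile, observation) pairs for a normal plot."""
    ordered = sorted(observations)          # step 1: sort in increasing order
    n = len(ordered)
    std_normal = NormalDist()               # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        # Step 2: observation i represents an interval of width 1/n;
        # assumption: plot it at the interval midpoint (i - 0.5) / n.
        p = (i - 0.5) / n
        points.append((std_normal.inv_cdf(p), value))
    return points


def pearson_r(pairs):
    """Pearson correlation of the plotted points, a rough straightness measure."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / (sxx * syy) ** 0.5


# Illustrative values only (hypothetical), with one large value appended.
clean = [4.1, 4.6, 4.8, 5.0, 5.1, 5.3, 5.5, 5.9]
contaminated = clean + [9.5]

r_clean = pearson_r(normal_plot_points(clean))
r_contaminated = pearson_r(normal_plot_points(contaminated))
print(round(r_clean, 3), round(r_contaminated, 3))
```

Plotting the pairs from `normal_plot_points` gives the normal plot itself. The correlation is only a convenience here: the appended large value bends the upper end of the plot away from the line and lowers it noticeably.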
Skulls from a cemetery. Figure: skull capacities, with the maximum marked. Source: Karl Pearson (1931), Tables for Statisticians and Biometricians, Biometric Lab., London.
Is the largest skull from a Maori? Hypothesis: the Maoris have a smaller skull capacity than the whites, so the largest skull is a contaminant. A shipwrecked sailor or missionary?
Probability plot of skull capacity
Example: P. Garrigues, R. De Sury, M. L. Angelin, J. Bellocq, J. L. Oudin, M. Ewald, Geochimica et Cosmochimica Acta 52 (1988) 375-384.
Data. Figure: scatter plot with two points marked "?".
Robust regression? Two outliers. A useful tool to avoid thinking? A sloppy data analyst can find relief in robust regression.
Result of “pooled” regression: r = 0.995
Observation: r = 0.865. Two phenomena influence the ratio (the predictor). No prediction possible!
Parallel displacement - perfect result for the one who wants to be “straight-lined”
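One way to check whether a single regression hides two phenomena is to compare the correlation of the pooled data with the correlation inside each group. The sketch below does this for two synthetic groups that lie on parallel lines with different intercepts. The data are invented for illustration and do not reproduce the values on the slides, and the group labels "A" and "B" are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


# Synthetic data (assumption): same slope, intercepts displaced by 3 units.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = {
    "A": (x_values, [2.0 * xi + 1.0 for xi in x_values]),
    "B": (x_values, [2.0 * xi + 4.0 for xi in x_values]),
}

# Correlation within each group separately.
r_within = {name: pearson_r(gx, gy) for name, (gx, gy) in groups.items()}

# Correlation after pooling both groups into one data set.
pooled_x = groups["A"][0] + groups["B"][0]
pooled_y = groups["A"][1] + groups["B"][1]
r_pooled = pearson_r(pooled_x, pooled_y)

print(r_within, round(r_pooled, 3))
```

In this synthetic case each group is perfectly linear on its own, while the pooled correlation is lower because the displacement between the lines appears as scatter. A single pooled r therefore says little unless the groups are also examined separately.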