
Internal and External Variance

Break Position Errors in Climate Records. Ralf Lindau & Victor Venema, University of Bonn, Germany. Internal and External Variance. Consider the differences of one station compared to a neighbor reference. Breaks are defined by abrupt changes in the station-reference time series.

Presentation Transcript


  1. Break Position Errors in Climate Records. Ralf Lindau & Victor Venema, University of Bonn, Germany.

  2. Internal and External Variance. Consider the differences of one station compared to a neighbor reference. Breaks are defined by abrupt changes in the station-reference time series. Internal variance: within the subperiods. External variance: between the means of different subperiods. Break criterion: maximum external variance. (12th International Meeting on Statistical Climatology, Jeju, Korea, 24 June 2013)
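
A minimal sketch of the criterion on slide 2, assuming variances normalized by the total length n. The function names `variance_parts` and `best_single_break` are illustrative and do not come from the presentation.

```python
from statistics import fmean


def variance_parts(series, breaks):
    """Return (internal, external) variance for a given segmentation.

    `breaks` lists the indices at which a new subperiod starts.
    Both parts are normalized by the total length n (an assumption).
    """
    n = len(series)
    total_mean = fmean(series)
    edges = [0, *sorted(breaks), n]
    internal = external = 0.0
    for start, stop in zip(edges, edges[1:]):
        segment = series[start:stop]
        seg_mean = fmean(segment)
        # Scatter of the values around their own subperiod mean.
        internal += sum((x - seg_mean) ** 2 for x in segment) / n
        # Scatter of the subperiod mean around the overall mean.
        external += len(segment) * (seg_mean - total_mean) ** 2 / n
    return internal, external


def best_single_break(series):
    """Break index that maximizes the external variance."""
    return max(range(1, len(series)),
               key=lambda k: variance_parts(series, [k])[1])
```

For a difference series such as `[0.1, -0.1, 0.0, 0.1, 2.0, 1.9, 2.1, 2.0]`, `best_single_break` selects index 4, the position of the jump.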

  3. Decomposition of Variance. n: total number of years. N: number of subperiods. n_i: years within a subperiod. The sum of external and internal variance is constant.
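
The slide's equation is not preserved in the transcript. The standard decomposition it most likely refers to can be written as follows, assuming a normalization by n; here x_ij denotes year j of subperiod i, and the bars denote the subperiod and overall means (notation chosen for this sketch).

```latex
\[
\underbrace{\frac{1}{n}\sum_{i=1}^{N}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}\right)^2}_{\text{total variance}}
=
\underbrace{\frac{1}{n}\sum_{i=1}^{N}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}_i\right)^2}_{\text{internal variance}}
+
\underbrace{\frac{1}{n}\sum_{i=1}^{N} n_i\left(\bar{x}_i-\bar{x}\right)^2}_{\text{external variance}}
\]
```

The left-hand side does not depend on the segmentation, so maximizing the external variance is the same as minimizing the internal variance.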

  4. Position errors. Two segments of lengths n1 and n2 with means x1 and x2. A subsegment of length m with mean x0 is erroneously exchanged from segment 2 to segment 1. x1 is strongly reduced, x2 changes only slightly. x1 and x2 converge. This reduces the external variance, and the wrong segmentation is rejected.
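
The noise-free case of slide 4 can be checked with a few lines. This is only a sketch: `exchange` and `external_variance` are hypothetical helper names, and the numbers (two segments of 50 values, means 2 and 0, m = 5) are made up for illustration.

```python
def exchange(n1, x1, n2, x2, m, x0):
    """Move a subsegment (length m, mean x0) from segment 2 to segment 1."""
    new_x1 = (n1 * x1 + m * x0) / (n1 + m)
    new_x2 = (n2 * x2 - m * x0) / (n2 - m)
    return new_x1, new_x2


def external_variance(n1, x1, n2, x2):
    """External variance of a two-segment series, normalized by n."""
    n = n1 + n2
    mean = (n1 * x1 + n2 * x2) / n
    return (n1 * (x1 - mean) ** 2 + n2 * (x2 - mean) ** 2) / n


# Illustrative values: the exchanged subsegment has exactly the mean of segment 2.
n1, x1, n2, x2, m = 50, 2.0, 50, 0.0, 5
new_x1, new_x2 = exchange(n1, x1, n2, x2, m, x0=x2)
before = external_variance(n1, x1, n2, x2)
after = external_variance(n1 + m, new_x1, n2 - m, new_x2)
print(before, after)  # the external variance drops after the wrong exchange
```

Here x1 falls from 2 to about 1.82 while x2 stays at 0, so the means converge and the shifted break position scores worse.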

  5. Change of external variance. The change of external variance Δv is only a function of the means and lengths of the two segments and the exchanged subsegment.
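
The formula on this slide is not preserved in the transcript. One possible reconstruction, assuming the external variance is normalized by n = n1 + n2 and using primes for the means after the exchange (notation chosen here), follows from the fact that the overall mean is unchanged by the exchange. The layout on the original slide may differ.

```latex
\begin{align*}
n\,v &= n_1 x_1^2 + n_2 x_2^2 - n\,\bar{x}^2 \\
x_1' &= \frac{n_1 x_1 + m x_0}{n_1 + m}, \qquad
x_2' = \frac{n_2 x_2 - m x_0}{n_2 - m} \\
n\,\Delta v &= \frac{(n_1 x_1 + m x_0)^2}{n_1 + m}
             + \frac{(n_2 x_2 - m x_0)^2}{n_2 - m}
             - n_1 x_1^2 - n_2 x_2^2 \\
            &= m\left[\frac{n_2\,(x_0 - x_2)^2}{n_2 - m}
             - \frac{n_1\,(x_0 - x_1)^2}{n_1 + m}\right]
\end{align*}
```

Only the two segment means, the two lengths, and the mean and length of the exchanged subsegment appear, in line with the statement on the slide.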

  6. Express x0 by x2 plus scatter. The mean of the exchanged subsegment x0 is equal to x2, the mean of the segment it stems from, plus a random scatter variable δ. δ depends on the internal variance σ² and the length m, because it is a mean over m random numbers.
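
A quick numerical check of this dependence, assuming independent Gaussian noise with standard deviation σ (an assumption; the slides do not state the noise model explicitly). Under that assumption the scatter of a mean over m values has standard deviation σ/√m. `scatter_std` is a hypothetical helper.

```python
import random
from statistics import fmean, pstdev


def scatter_std(m, sigma=1.0, trials=20_000, seed=0):
    """Empirical standard deviation of the mean of m Gaussian noise values."""
    rng = random.Random(seed)
    means = [fmean([rng.gauss(0.0, sigma) for _ in range(m)])
             for _ in range(trials)]
    return pstdev(means)


# Expected under the Gaussian assumption: sigma / sqrt(m), i.e. about 0.5 for m = 4.
print(scatter_std(4))
```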

  7. Quadratic function for Δv. Replace x0 by δ and normalize by the square of the jump height d. The change of the normalized external variance v*, which is the decision criterion for break detection, is a quadratic function of a random variable ε, which depends on the signal-to-noise ratio and the length of the exchanged segment.

  8. Zero points. If the parabola becomes positive, shifting the break position by m items increases the external variance, so that this solution is preferred by mistake. Zero points at: [equation not preserved in the transcript]
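
Since the slide's expression for the zero points is missing, the sketch below derives them from a reconstructed parabola rather than from the original formula. It assumes x0 = x2 + δ, a jump height d = x1 - x2 > 0, and the 1/n normalization, which gives n·Δv = m·[n2·δ²/(n2 - m) - n1·(δ - d)²/(n1 + m)]. The function names are hypothetical.

```python
from math import sqrt


def n_delta_v(delta, d, n1, n2, m):
    """n times the change of external variance (reconstructed, see assumptions)."""
    return m * (n2 * delta ** 2 / (n2 - m) - n1 * (delta - d) ** 2 / (n1 + m))


def zero_points(d, n1, n2, m):
    """Scatter values delta at which the reconstructed parabola crosses zero."""
    r = sqrt(n1 * (n2 - m) / (n2 * (n1 + m)))
    negative = -r * d / (1 - r)   # far from zero when m is small against n1, n2
    positive = r * d / (1 + r)    # close to half the jump height
    return negative, positive


neg, pos = zero_points(d=2.0, n1=50, n2=50, m=1)
print(neg, pos)
```

Under these assumptions the positive zero point lies just below d/2, while the negative one is so far out that ordinary noise rarely reaches it.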

  9. Simulated data. 10,000 random time series of length 100. Internal σ = 1 and jump height = 2, i.e. SNR = 1. The data confirm the existence of different parabolas for different m. But the data cover only scatter near zero and never reach the negative solution. [Figure: (n Δv) / 4 versus δ for m = 1, m = 2, m = 3]
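
A minimal sketch of such an experiment, not the authors' code: each series has one true break in the middle, Gaussian noise is assumed, and the detected break is the position with maximum external variance. `detect_break` and `position_errors` are hypothetical names; SNR is taken as half the jump height divided by σ, as in the settings above.

```python
import random


def detect_break(series):
    """Single break position that maximizes the external variance."""
    n = len(series)
    total = sum(series)
    best_k, best_score = 1, float("-inf")
    left = 0.0
    for k in range(1, n):
        left += series[k - 1]
        right = total - left
        # n1*mean1^2 + n2*mean2^2 equals n * external variance up to a
        # constant, because the overall mean does not depend on k.
        score = left * left / k + right * right / (n - k)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


def position_errors(snr, n=100, trials=10_000, sigma=1.0, seed=0):
    """Relative frequency of each break position error (detected - true)."""
    rng = random.Random(seed)
    true_break = n // 2
    jump = 2.0 * snr * sigma
    counts = {}
    for _ in range(trials):
        series = [rng.gauss(0.0, sigma) + (jump if i >= true_break else 0.0)
                  for i in range(n)]
        error = detect_break(series) - true_break
        counts[error] = counts.get(error, 0) + 1
    return {e: c / trials for e, c in sorted(counts.items())}
```

Calling `position_errors(1.0)` reproduces the setting on the slide (σ = 1, jump height 2); the entry for error 0 is the simulated hit rate.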

  10. The negative solution. Typical situation: extremely low SNR. A drastically disturbed measurement near the break. Its exchange leads to x1' < x2 and x2' > x1. The two means diverge, so that the external variance grows. [Figure: levels x2', x2, x1, x1']

  11. The positive solution. A subsegment adjacent to the true break is randomly lifted by more than half of the jump height. Including it in the neighboring segment will reduce the internal variance. An erroneous break position is concluded. Criterion: maximum hatched area.

  12. Brownian motion with drift. Mathematical formulation of the criterion: [equation not preserved in the transcript]. Drift = −SNR.
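
The slide's formulation is not preserved. A simplified route to the same statement, which is not necessarily the one taken in the talk, assumes segments much longer than the exchanged subsegment (n1, n2 ≫ m), a jump height d = x1 - x2 > 0, x0 = x2 + δ, and independent standard normal noise ε_i scaled by σ:

```latex
\begin{align*}
n\,\Delta v &= m\left[\frac{n_2\,\delta^2}{n_2 - m} - \frac{n_1\,(\delta - d)^2}{n_1 + m}\right]
  \;\approx\; m\left[\delta^2 - (\delta - d)^2\right] = m\,d\,(2\delta - d) \\
m\,\delta &= \sigma \sum_{i=1}^{m} \varepsilon_i
  \quad\Longrightarrow\quad
n\,\Delta v \;\approx\; 2\,d\,\sigma \sum_{i=1}^{m}\left(\varepsilon_i - \frac{d}{2\sigma}\right)
  = 2\,d\,\sigma \sum_{i=1}^{m}\left(\varepsilon_i - \mathrm{SNR}\right)
\end{align*}
```

As a function of m this is a random walk with unit-variance steps and drift −SNR, where SNR = d / (2σ) is half the jump height divided by the internal standard deviation. A shifted break position wins whenever the walk rises above its starting value.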

  13. Theoretical retrace. Steps: parabola equation; linear approximation around the zero point; inserting the known slope and the (positive) zero point; replacing f1 + f2 by 2m; multiplying by the signal-to-noise ratio; Brownian motion with drift.

  14. Distribution of the time of the maximum of a Brownian motion with drift. Strictly valid only for continuous processes (Buffet, 2003, J Appl Math Stoch Anal). [Figure: panels for SNR = 0.5, SNR = 1 and SNR = 2. Legend: _ _ _ _ _ Buffet, 2003; 0 0 0 numerical simulation of a discrete Brownian motion with drift; + + + complete break search simulation]

  15. Two more problems. The hit rate is not accurately reproduced. Break errors are a two-sided symmetric process: both too early and too late breaks are possible. (Buffet, 2003)

  16. Hit rate. The hit rate h can be estimated for all drifts d by: [equation not preserved in the transcript]. [Figure legend: true; + + + estimated]

  17. Two-sided processes. Deviations are caused by random scatter independently on both sides. The hit rate h is reduced to h². One-sided deviations have the probability: [equation not preserved in the transcript] (with + without competitor). For two-sided deviations the probability is halved if a competitor occurs on the other side: [equation not preserved in the transcript]. All other probabilities are reduced by [factor not preserved in the transcript].
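
The two-sided picture can be illustrated with two independent random walks, one per side of the true break. This is a sketch under assumptions: Gaussian steps with mean −SNR and unit variance, a finite number of steps, and the rule that the side with the higher maximum determines the detected position. The function names are hypothetical.

```python
import random


def walk_maximum(rng, snr, steps):
    """Maximum and step of the maximum of a walk with N(-snr, 1) steps, started at 0."""
    position, best, best_step = 0.0, 0.0, 0
    for step in range(1, steps + 1):
        position += rng.gauss(-snr, 1.0)
        if position > best:
            best, best_step = position, step
    return best, best_step


def two_sided_errors(snr, steps=30, trials=20_000, seed=0):
    """Relative frequency of signed position errors for two competing sides."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        left_max, left_step = walk_maximum(rng, snr, steps)
        right_max, right_step = walk_maximum(rng, snr, steps)
        # The side with the higher maximum wins; a tie only happens when
        # neither walk ever exceeds its start, which counts as a hit.
        error = right_step if right_max >= left_max else -left_step
        counts[error] = counts.get(error, 0) + 1
    return {e: c / trials for e, c in sorted(counts.items())}
```

The entry for error 0 estimates the two-sided hit rate, which should be close to the square of the one-sided rate because both sides must stay below their starting value.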

  18. Practical application. The hit rate drops from 95% for SNR = 2 to 29% for SNR = 0.5. SNR > 1: break positions quickly become very exact. SNR < 1: break positions quickly become very inexact. [Figure: panels for SNR = 2, SNR = 1 and SNR = 0.5. Legend: true; + + + estimated]

  19. Conclusions. • Break position errors can be described by the distribution of the time of the maximum of a Brownian motion with drift. • The drift parameter is equal to the signal-to-noise ratio, given by half the jump height between homogeneous subperiods divided by the internal standard deviation within them.

  20. Hit rate simulation. The hit rate is the probability that the initial value is never exceeded. For realistic drift sizes the value converges after a few steps.
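
A sketch of such a simulation for one side of the break, assuming a discrete walk with Gaussian steps of mean −SNR and unit variance. `hit_rate_by_step` is a hypothetical name; it returns the fraction of walks that have not exceeded their starting value after 1, 2, ... steps.

```python
import random


def hit_rate_by_step(snr, steps=30, trials=20_000, seed=0):
    """Fraction of walks that never exceeded the initial value, per step count."""
    rng = random.Random(seed)
    survivors = [0] * steps
    for _ in range(trials):
        position = 0.0
        for step in range(steps):
            position += rng.gauss(-snr, 1.0)
            if position > 0.0:
                break  # the initial value has been exceeded
            survivors[step] += 1
    return [count / trials for count in survivors]


rates = hit_rate_by_step(1.0)
print(rates[0], rates[4], rates[-1])
```

With a drift of −1, the printed values for 5 and for 30 steps are nearly the same, which illustrates the convergence after a few steps.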

  21. Preliminary maximum. Instead of multiplying by h < 1, we can alternatively stop the summation earlier. k = 2 works well. p_ik is defined as the probability that the kth member of a Brownian motion is the preliminary maximum after i steps. The probability to be also the absolute maximum is lower by a factor of h. Thus: [equation not preserved in the transcript]

  22. Hit rate estimate. Define the drift-dependent exceeding probability: [equation not preserved in the transcript]. The preliminary maxima after 1 and after 2 steps are known.
