
Seasonal Adjustment and DEMETRA

ESTP course, 04/05/2011. Session 2, May 4th. Dario Buono, Enrico Infante. Topics: identification of types of Outliers; Calendar Effects and their determinants; X-12 ARIMA and TRAMO/SEATS.


Presentation Transcript


    2. Identification of types of Outliers; Calendar Effects and their determinants; X-12 ARIMA and TRAMO/SEATS

    3. Outliers
       Outliers are data that do not fit the tendency of the observed Time Series: they fall outside the range expected on the basis of the typical pattern of the Trend and Seasonal Components.
       - Additive Outlier (AO): the value of only one observation is affected. An AO may be caused by random effects or by an identifiable cause such as a strike, bad weather or war.
       - Temporary Change (TC): the value of one observation is extremely high or low, then the size of the deviation decreases gradually (exponentially) over the subsequent observations until the Time Series returns to its initial level.
       - Level Shift (LS): starting from a given time period, the level of the Time Series undergoes a permanent change. Causes could include changes in the concepts and definitions of the survey population, in the collection method, in economic behaviour, in legislation or in social traditions.
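The three outlier types can be illustrated as regression variables. The following is a minimal Python sketch, not the exact implementation used by TRAMO or X-12 ARIMA; the function names are hypothetical, the 0/1 coding of the Level Shift is one possible convention, and the decay rate `delta=0.7` is an assumed value.

```python
def additive_outlier(n, t0):
    """AO: only the observation at index t0 is affected."""
    return [1.0 if t == t0 else 0.0 for t in range(n)]


def level_shift(n, t0):
    """LS: permanent change of level from index t0 onwards (0/1 coding assumed)."""
    return [1.0 if t >= t0 else 0.0 for t in range(n)]


def temporary_change(n, t0, delta=0.7):
    """TC: full effect at t0, then exponential decay at rate delta (assumed value)."""
    return [delta ** (t - t0) if t >= t0 else 0.0 for t in range(n)]


if __name__ == "__main__":
    print(additive_outlier(6, 2))
    print(level_shift(6, 2))
    print(temporary_change(6, 2))
```

Each list has one value per observation and can be used as a column of the regression matrix for a series of length `n`.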

    4-8. Outliers (slides with no text in the transcript)

    9. Outliers
       TRAMO and X-12 ARIMA use regARIMA models in which Outliers are regressors.
       Xt is the raw series; Zt is the "linearized" series (corrected for Outliers and other effects).
       Zt follows an ARIMA(p,d,q) model [equation not preserved in the transcript]. In the model, each Outlier is described by its rupture date, its rupture impact, and a function of the backshift operator B that models the kind of rupture (its "shape").
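One standard way to write such a model is sketched below. The symbols are assumptions, since the slide's own notation was lost: $\tau_k$ is the rupture date, $\omega_k$ the rupture impact, $\xi_k(B)$ the shape, $I_t(\tau_k)$ an indicator equal to 1 when $t = \tau_k$ and 0 otherwise, and $\delta$ the decay rate of a Temporary Change.

```latex
X_t = \sum_{k} \omega_k \, \xi_k(B) \, I_t(\tau_k) + Z_t ,
\qquad
\phi(B)\,(1-B)^d Z_t = \theta(B)\,\varepsilon_t ,
\qquad
\xi_k(B) =
\begin{cases}
1 & \text{(AO)} \\
\dfrac{1}{1-\delta B},\ 0<\delta<1 & \text{(TC)} \\
\dfrac{1}{1-B} & \text{(LS)}
\end{cases}
```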

    10. Outliers
        Automatic, iterative detection and estimation procedures are available in TRAMO and X-12 ARIMA.

    11. Outliers
        - The smoothness of the series can be decided by statisticians, and the policy must be defined in advance.
        - Consult the users.
        - This choice can dramatically influence credibility.
        - Outliers in the last quarter are very difficult to identify.
        Some suggestions:
        - Look at the growth rates.
        - Conduct a continuous analysis of external sources to identify the reasons for Outliers.
        - Where possible, always add an economic explanation.
        - Be transparent (LS, AO, TC).

    12. Calendar Effects
        - Time Series: usually a daily activity measured only on a monthly or quarterly basis.
        - Flow: monthly or quarterly sum of the observed variable.
        - Stock: the variable is observed at a precise date (example: first or last day of the month).
        - Some movements in the series are due to the variation in the calendar from one period to another.
        - These movements can be observed especially in flow series.
        - Example: the production for a month often depends on the number of days.

    13. Calendar Effects: Trading Day Effect
        - Can be observed in production activities or retail sales.
        - Trading Days (Working Days) = days usually worked according to business practice.
        - Often these days are the weekdays (Monday to Friday) that are not public holidays.
        - Production usually increases with the number of working days in the month.

    14. Calendar Effects
        "Day of the week" effect
        - Example: retail sales turnover is likely to be higher on Saturdays than on other days of the week.
        Statutory (Public) Holidays and Moving Holidays
        - Most statutory holidays are linked to a date, not to a day of the week (Christmas).
        - Some holidays move across the year (Easter, Ramadan), and their effect is not completely seasonal.
        Therefore, months and quarters are not equivalent and not directly comparable.

    15-17. Calendar Effects – Trading Days (slides with no text in the transcript)

    18. Calendar Effects – Trading Days
        Calendar Effects can be partly considered seasonal:
        - The number of Trading Days is almost always smaller in February than in March.
        - Some months may have more public holidays (e.g. May in France), and therefore fewer trading days than others.
        Thus, we can either:
        - estimate only the non-seasonal Calendar Effects, or
        - correct for both Seasonal and Calendar Effects.
        [Figure: spectra of the "Weekday" series, i.e. the number of Mondays, Tuesdays, ..., Fridays per month]

    19. Calendar Effects – Trading Days
        - Average length of a year: 365.25 days
        - Average length of a month: 365.25 / 12 = 30.4375 days
        - Average number of weeks per month: 30.4375 / 7 ≈ 4.348

    20-21. Calendar Effects – Trading Days (slides with no text in the transcript)

    22. Calendar Effects – Trading Days
        Hypothesis: for the most common models, it is assumed that the effect of a given day of the week is constant over the span of the series.
        Basic model [equation not preserved in the transcript]: the series is regressed on the day counts, where
        - Ni,t = number of Mondays (i=1), ..., Sundays (i=7) in month t
        - the coefficient of Ni,t is the effect of day i (e.g. average production for a Monday, i=1)
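The day counts Ni,t used as regressors can be computed from the calendar alone. A minimal Python sketch using the standard library follows; the function name `weekday_counts` is hypothetical, and public holidays are not handled.

```python
import calendar


def weekday_counts(year, month):
    """Return [N_1, ..., N_7]: the number of Mondays, ..., Sundays in the month."""
    # monthrange gives the weekday of the 1st (Monday = 0) and the month length.
    first_weekday, n_days = calendar.monthrange(year, month)
    counts = [0] * 7
    for day in range(n_days):
        counts[(first_weekday + day) % 7] += 1
    return counts


if __name__ == "__main__":
    # May 2011 starts on a Sunday and has 31 days.
    print(weekday_counts(2011, 5))
```

A 28-day February always gives four of each day, which is why the trading-day regressors carry no information in such months.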

    23. Calendar Effects – Trading Days
        In some cases (particular activities) you may have to distinguish between working and non-working Mondays, Tuesdays, etc.
        You can of course adapt the model to other simpler or more complex cases.

    24. Calendar Effects – Trading Days
        Certain groupings could be more sensible than others:
        - Public holidays
        - Non-working weekdays
        - Trading days and weekends
        - Etc.
        It is much better to use contrasts (W. Bell) [equation not preserved in the transcript], in which the average effect of a day (the mean of the daily coefficients) multiplies mt, the length of month t.
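The contrast form can be sketched as follows for the basic seven-day model. The symbols are assumptions, since the slide's notation was lost: $\alpha_i$ is the effect of day $i$, $\bar\alpha$ their mean, and $m_t = \sum_{i=1}^{7} N_{i,t}$ the length of month $t$.

```latex
X_t = \sum_{i=1}^{7} \alpha_i N_{i,t} + \varepsilon_t
    = \bar{\alpha}\, m_t
      + \sum_{i=1}^{6} (\alpha_i - \bar{\alpha})\,(N_{i,t} - N_{7,t})
      + \varepsilon_t ,
\qquad
\bar{\alpha} = \frac{1}{7}\sum_{i=1}^{7} \alpha_i .
```

The two forms are equal because the deviations $\alpha_i - \bar\alpha$ sum to zero, so the Sunday deviation is minus the sum of the other six. The contrasts $N_{i,t} - N_{7,t}$ are far less correlated with each other than the raw day counts.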

    25. Calendar Effects – Trading Days
        The variables capture part of the Seasonality: the length of the month is seasonal, Whit Monday is always a Monday, etc.
        If we want to estimate the non-seasonal Calendar Effects, we can remove from the variables their long-term average.
        In this case, the explanatory variables have no Trend or Seasonality. Thus, either the dependent variable must be stationary, for example an Irregular Component (X-11), or a regARIMA model should be used (X-12 ARIMA and TRAMO-SEATS).

    26. Coffee break

    27. Calendar Effects – Moving Holidays
        - The date of some public holidays (usually religious events) moves across the year and can affect different months or quarters depending on the year.
        - Examples: Ramadan, Chinese New Year, Easter (Catholic or Orthodox), etc.
        - The effect of the event can be gradual and can affect the days before or after the event itself, e.g. chocolate sales around Easter.
        - In the following we focus on the treatment of Easter, but the methodology and the models can easily be adapted.

    28. Calendar Effects – Moving Holidays
        Catholic Easter
        - Easter Sunday can fall between March 22 (last time: 1818, next time: 2285) and April 25 (last time: 1943, next time: 2038), i.e. during the first or second quarter.
        Orthodox Easter
        - Between April 4 (last time: 1915, next time: 2010) and May 9 (last time before 1600, next time: 2173), i.e. always during the second quarter.
        Easter affects the production of many economic sectors: statutory holidays (transportation, hotel occupancy), consumption habits (chocolate, flowers, lamb, etc.).
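The extreme dates quoted for Catholic Easter can be checked with a date computation. The sketch below uses the widely published "anonymous Gregorian" algorithm, which is an addition here and not part of the course material; the function name `gregorian_easter` is hypothetical, and the Orthodox date (which follows a different rule) is not covered.

```python
from datetime import date


def gregorian_easter(year):
    """Date of Easter Sunday in the Gregorian calendar (anonymous Gregorian algorithm)."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


if __name__ == "__main__":
    for y in (1818, 1943, 2011, 2038, 2285):
        print(y, gregorian_easter(y))
```

Running it over a long span of years and counting the months gives the March/April split needed for the long-term averages used later.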

    29-30. Calendar Effects – Moving Holidays (slides with no text in the transcript)

    31. Calendar Effects – Moving Holidays
        Usually models are based on the number of days "affected" by Easter in the various months (March or April for Catholic Easter):
        - Immediate Impact
        - Gradual Impact: effect on the n days preceding Easter
        Easter can affect the w preceding days; the impact can therefore fall in 2 months.
        The impact is assumed to be constant over the w days.

    32. Calendar Effects – Moving Holidays
        The 2 Easter models are estimated using a regARIMA model [equation not preserved in the transcript], where:
        - Xt represents a regressor modeling Calendar Effects, Outliers (AO, LS, TC) and other user-specified effects, including Easter Effects
        - Dt represents a priori adjustments such as leap year, strikes, etc.

    33. Calendar Effects – Moving Holidays
        The w-day period during which the activity is affected does NOT include Easter Sunday.
        The activity level remains constant over the w-day period.

    34. Calendar Effects – Moving Holidays
        If i denotes the year and j the month, consider the number of the w days falling in month j of year i [symbol not preserved in the transcript].
        The regressor associated with this model is built from that count [equation not preserved in the transcript].
        The regressor values are 0 except for February, March and April, depending on the value of w.
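A minimal Python sketch of such a regressor follows. It assumes the regressor is the share of the w affected days falling in each month (count divided by w), which is one common form and may differ from the slide's lost formula; the function name `easter_regressor` is hypothetical, and the Easter date is supplied by the caller.

```python
from datetime import date, timedelta


def easter_regressor(easter, w):
    """Share of the w days before Easter Sunday (Sunday excluded) falling in each month.

    Returns {(year, month): share}; months not listed have a regressor value of 0.
    """
    counts = {}
    for k in range(1, w + 1):
        day = easter - timedelta(days=k)
        key = (day.year, day.month)
        counts[key] = counts.get(key, 0) + 1
    return {key: n / w for key, n in counts.items()}


if __name__ == "__main__":
    # Easter on 4 April 2010 with w = 6: three affected days in March, three in April.
    print(easter_regressor(date(2010, 4, 4), 6))
```

With an early Easter and a large w, some of the affected days can reach back into February, which is why February appears in the list of non-zero months.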

    35-36. Calendar Effects – Moving Holidays (slides with no text in the transcript)

    37. Calendar Effects – Moving Holidays
        - Orthodox Easter falls more often in April than in May (83% versus 17%).
        - Catholic Easter falls more often in April than in March (78% versus 22%).
        There is therefore a "Seasonal Effect", which one can correct by removing the long-term monthly average [equation not preserved in the transcript].

    38. Calendar Effects
        Definitions:
        - The Seasonally Adjusted series is logically defined as the raw series from which the Seasonality (St) has been removed.
        - The "Working Day" adjusted series is defined as the raw data from which the Calendar Effects (TDt, MHt) have been removed.
        - The definition of the Seasonally Adjusted series can vary depending on whether or not the Calendar Effects are included.

    39. X-12 ARIMA vs TRAMO/SEATS
        Seasonal Adjustment is usually done with an off-the-shelf program. Three popular tools are:
        - X-12 ARIMA (Census Bureau)
        - TRAMO/SEATS (Bank of Spain)
        - DEMETRA+ (Eurostat), an interface to X-12 ARIMA and TRAMO/SEATS
        X-12 ARIMA is filter based: it always estimates a Seasonal Component and removes it from the series, even if no Seasonality is present, so not all estimates of the Seasonally Adjusted series will be good.
        TRAMO/SEATS is a model-based method, one of the variants of decomposition of Time Series into non-observed components.

    40. X-12 ARIMA
        Auto-projective Models [equation not preserved in the transcript]: the current value of the series is expressed as a function f of its past. What is f?
        - Under certain hypotheses (stationarity), it is possible to find an "ARMA function" which gives a good approximation of f.
        - These models usually depend on only a small number of parameters.
        - The Box and Jenkins methodology makes it possible to estimate the parameters and gives quality measures of the fit.

    41. X-12 ARIMA
        - AutoRegressive Model of order p, AR(p) [equation not preserved in the transcript]
        - Moving Average Model of order q, MA(q) [equation not preserved in the transcript]
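The textbook definitions of these two models are given below as a reminder. The notation is an assumption (the slide's own was lost): $\varepsilon_t$ is white noise, and the minus signs in the MA part follow the Box and Jenkins convention; some references use plus signs.

```latex
\text{AR}(p):\quad X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \dots + \phi_p X_{t-p} + \varepsilon_t
\\[4pt]
\text{MA}(q):\quad X_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \dots - \theta_q \varepsilon_{t-q}
```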

    42. X-12 ARIMA
        - ARMA(p,q): identification by the corner method, triangle method, ODQ, etc.
        - ARIMA(p,d,q): non-stationary Time Series, with a Trend
        - SARIMA(p,d,q)(P,D,Q)S: non-stationary Time Series, with Trend and Seasonality

    43. X-12 ARIMA
        A regARIMA model is a regression model with ARIMA errors. When we use regression models to estimate some of the components in a Time Series, the errors from the regression model are correlated, and we use ARIMA models to model the correlation in the errors. ARIMA models are one way to describe the relationships between points in a Time Series.
        Besides using regARIMA models to estimate regression effects (such as Outliers, Trading Day, and Moving Holidays), we also use ARIMA models to forecast the series. Research has shown that using forecasted values gives smaller revisions at the end of the series.

    44. X-12 ARIMA
        X-12 ARIMA runs through the following steps:
        - The series is modified by any user-defined prior adjustments.
        - The program fits a regARIMA model to the series in order to detect and adjust for outliers and other distorting effects, to improve forecasts and the Seasonal Adjustment.
        - It detects and estimates additional components (e.g. Calendar Effects) and extrapolates the series forwards (forecast) and backwards (backcast).
        - The program then uses a series of Moving Averages to decompose the Time Series into three components. It does this in three iterations, getting successively better estimates of the components. During these iterations extreme values are identified and replaced.
        - In the last step a wide range of diagnostic statistics is produced, describing the final Seasonal Adjustment and giving pointers to possible improvements.

    45. Moving Averages
        A Moving Average of order p+f+1 is defined by p+f+1 coefficients [equation not preserved in the transcript].
        The value at date t is therefore replaced by a weighted average of p past values, the current value and f future values.
        Examples: simple moving averages of order 3.
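The definition can be sketched in a few lines of Python. This is only an illustration: the function name `moving_average` and the choice of returning `None` where the window does not fit are assumptions.

```python
def moving_average(x, coeffs, p):
    """Apply a moving average with p past terms, the current term and f future terms.

    coeffs lists the p + f + 1 coefficients from the oldest value to the newest.
    Positions where the full window is not available are returned as None.
    """
    f = len(coeffs) - p - 1
    out = [None] * len(x)
    for t in range(p, len(x) - f):
        out[t] = sum(c * x[t - p + k] for k, c in enumerate(coeffs))
    return out


if __name__ == "__main__":
    series = [3, 6, 9, 12, 15]
    third = 1 / 3
    # Centred simple average of order 3: one past value, the current one, one future value.
    print(moving_average(series, [third] * 3, p=1))
    # One-sided simple average of order 3: two past values and the current one.
    print(moving_average(series, [third] * 3, p=2))
```

The centred version loses one point at each end of the series, while the one-sided version loses two points at the start and none at the end.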

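    46. Moving Averages
        Let us suppose a decomposition model into Trend, Seasonality and Irregular [equation not preserved in the transcript].
        One may want to remove the Seasonality and the Irregular by using a Moving Average [equation not preserved in the transcript].
        So, we want to cancel some frequencies!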

    47. Moving Averages
        We would like to find a Moving Average which preserves the Trend and removes both Seasonality and Irregular.
        Example (preservation of constants): apply a Moving Average M to a constant series [equations not preserved in the transcript]. For the constant to be preserved, the coefficients must sum to 1.
        Property: to preserve polynomials of degree d, the Moving Average coefficients must satisfy a set of conditions [equation not preserved in the transcript].
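The standard form of these conditions is recalled below, with assumed notation: $\theta_i$, $i = -p, \dots, f$, are the Moving Average coefficients.

```latex
\sum_{i=-p}^{f} \theta_i = 1 ,
\qquad
\sum_{i=-p}^{f} i^{k}\, \theta_i = 0 \quad \text{for } k = 1, \dots, d .
```

The first condition alone gives the preservation of constants. Adding the condition for $k = 1$ preserves straight lines, which any symmetric Moving Average with coefficients summing to 1 satisfies automatically.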

    48. X-12 ARIMA
        A Filter is a weighted average where the weights sum to 1.
        Seasonal Filters are the filters used to estimate the Seasonal Component. Ideally, Seasonal Filters are computed using values from the same month or quarter; for example, an estimate for January would come from a weighted average of the surrounding Januaries.
        The Seasonal Filters available in X-12 ARIMA consist of seasonal Moving Averages of consecutive values within a given month or quarter. An n x m Moving Average is an m-term simple average taken over n consecutive spans.

    49. X-12 ARIMA
        An example of a 3x3 filter (5 terms) for January 2003 (or Quarter 1, 2003) is:
            (  2001.1 + 2002.1 + 2003.1
             + 2002.1 + 2003.1 + 2004.1
             + 2003.1 + 2004.1 + 2005.1 ) / 9
        An example of a 3x5 filter (7 terms) for January 2003 (or Quarter 1, 2003) is:
            (  2000.1 + 2001.1 + 2002.1 + 2003.1 + 2004.1
             + 2001.1 + 2002.1 + 2003.1 + 2004.1 + 2005.1
             + 2002.1 + 2003.1 + 2004.1 + 2005.1 + 2006.1 ) / 15

    50. X-12 ARIMA
        Trend Filters are weighted averages of consecutive months or quarters used to estimate the trend component.
        An example of a 2x4 filter (5 terms) for First Quarter 2005:
            (  2004.3 + 2004.4 + 2005.1 + 2005.2
             + 2004.4 + 2005.1 + 2005.2 + 2005.3 ) / 8
        Notice that we are using the closest points, not just the closest points within the First Quarter as with the Seasonal Filters above.
        Notice also that every quarter has a weight of 1/4, though the Third Quarter uses values in both 2004 and 2005.
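The weights implied by the 3x3, 3x5 and 2x4 examples can be obtained by combining two simple averages. The following Python sketch is an illustration only; the function names are hypothetical, and exact fractions are used so the weights can be read off directly.

```python
from fractions import Fraction


def simple_average(m):
    """Weights of an m-term simple average."""
    return [Fraction(1, m)] * m


def convolve(a, b):
    """Weights of the filter obtained by applying filter a to the output of filter b."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, wa in enumerate(a):
        for j, wb in enumerate(b):
            out[i + j] += wa * wb
    return out


def composite_weights(n, m):
    """Weights of an n x m Moving Average: an n-term average of m-term averages."""
    return convolve(simple_average(n), simple_average(m))


if __name__ == "__main__":
    for n, m in ((3, 3), (3, 5), (2, 4)):
        print(f"{n}x{m}:", [str(w) for w in composite_weights(n, m)])
```

For a Seasonal Filter the weights apply to the same month or quarter in successive years; for a Trend Filter they apply to consecutive months or quarters. The 2x4 result gives 1/8 to each end point and 1/4 to the three central quarters, matching the remark about the Third Quarter.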

    51. X-12 ARIMA
        Keep in mind that the data used in the Seasonal and the Trend Filters can go back several years. Let's look at an example using X-12 ARIMA's seasonal Moving Average Filters: if the last point in the series is January 2006, and you're using 3x5 Seasonal Filters, the value at January 2006 will affect the estimates for the Januaries of 2004 and 2005. You can see the value for January 2006 in the equation below.
        The 3x5 filter for January 2004:
            (  2001.1 + 2002.1 + 2003.1 + 2004.1 + 2005.1
             + 2002.1 + 2003.1 + 2004.1 + 2005.1 + 2006.1
             + 2003.1 + 2004.1 + 2005.1 + 2006.1 + 2007.1 ) / 15

    52. X-12 ARIMA
        Step 1.1: Initial Trend Estimate
        Compute a centred 12-term (13-term) Moving Average as a first estimate of the Trend [equation not preserved in the transcript]. The first Trend estimation is always 2x12 (monthly series) or 2x4 (quarterly series).
        Step 1.2: Initial Seasonal-Irregular component or "SI Ratio"
        The ratio of the original series to the estimated trend is the first estimate of the de-trended series [equation not preserved in the transcript].
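Steps 1.1 and 1.2 can be sketched for a monthly series under a multiplicative model. This is a simplified Python illustration, not the X-12 ARIMA code: the function names are hypothetical, and the six points at each end, where the centred average cannot be computed, are simply left as `None`.

```python
def centred_2x12(x):
    """Step 1.1 (sketch): centred 2x12 Moving Average as a first Trend estimate."""
    # 13 terms: weight 1/24 at both ends, 1/12 for the 11 central months.
    weights = [1 / 24] + [1 / 12] * 11 + [1 / 24]
    trend = [None] * len(x)
    for t in range(6, len(x) - 6):
        trend[t] = sum(w * x[t - 6 + k] for k, w in enumerate(weights))
    return trend


def si_ratios(x, trend):
    """Step 1.2 (sketch): ratio of the original series to the estimated Trend."""
    return [None if tr is None else value / tr for value, tr in zip(x, trend)]


if __name__ == "__main__":
    # Artificial series: constant level 100 times a fixed seasonal pattern averaging 1.
    pattern = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 1.0, 1.0, 0.9, 1.1, 1.2, 0.8]
    series = [100 * pattern[t % 12] for t in range(36)]
    trend = centred_2x12(series)
    print(trend[6], si_ratios(series, trend)[6])
```

On this artificial series the Trend estimate is the constant level and the SI ratios reproduce the seasonal pattern, because every calendar month receives a total weight of 1/12 in the 2x12 average.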

    53. X-12 ARIMA
        Step 1.3: Initial Preliminary Seasonal Factor
        A 5-term weighted Moving Average (3x3) is calculated for each month of the Seasonal-Irregular ratios (SI) to obtain preliminary estimates of the Seasonal Factors [equation not preserved in the transcript].
        Step 1.4: Initial Seasonal Factor
        Crude "unbiased" seasonal factors are obtained from Step 1.3 via centering [equation not preserved in the transcript].
        Step 1.5: Initial Seasonal Adjustment [equation not preserved in the transcript]

    54. X-12 ARIMA
        Step 2.1: Intermediate Trend
        Calculate an intermediate Trend (Henderson) of length 2H+1, for data-determined H, using the (2H+1)-term Henderson weights [equation not preserved in the transcript].
        Step 2.2: Final SI Ratios
        Calculate the de-trended series from the Henderson trend [equation not preserved in the transcript].
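The symmetric Henderson weights have a closed form, sketched below in Python. The formula is the commonly published one and is an addition here, not taken from the slides; the function name `henderson_weights` is hypothetical, and the asymmetric weights needed at the ends of a series are not covered.

```python
def henderson_weights(terms):
    """Symmetric Henderson weights for an odd number of terms (terms = 2H + 1)."""
    if terms < 3 or terms % 2 == 0:
        raise ValueError("terms must be an odd number >= 3")
    h = (terms - 1) // 2
    n = h + 2
    denominator = (8 * n * (n**2 - 1) * (4 * n**2 - 1)
                   * (4 * n**2 - 9) * (4 * n**2 - 25))
    weights = []
    for j in range(-h, h + 1):
        numerator = (315 * ((n - 1) ** 2 - j**2) * (n**2 - j**2)
                     * ((n + 1) ** 2 - j**2) * (3 * n**2 - 16 - 11 * j**2))
        weights.append(numerator / denominator)
    return weights


if __name__ == "__main__":
    for terms in (9, 13, 23):
        w = henderson_weights(terms)
        print(terms, "terms, central weight:", round(w[len(w) // 2], 5))
```

The weights sum to 1 and some of the outer ones are negative, which is what allows the filter to follow a locally cubic Trend while remaining smooth.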

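    55. X-12 ARIMA
        Step 2.3: Preliminary Seasonal Factor
        Calculate final "biased" seasonal factors via a "3x5" Seasonal Moving Average [equation not preserved in the transcript].
        Step 2.4: Seasonal Factor
        Calculate final "unbiased" seasonal factors via centering [equation not preserved in the transcript].
        Step 2.5: Seasonal Adjustment [equation not preserved in the transcript]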

    56. X-12 ARIMA
        Step 3.1: Final Trend
        The final Trend comes from a Henderson Trend Filter determined from the data [equation not preserved in the transcript].
        Step 3.2: Final Irregular
        The final Irregular Factors are the ratios between the Seasonally Adjusted series from Stage 2 and the final Trend from Step 3.1 [equation not preserved in the transcript].

    57. X-12 ARIMA (slide with no text in the transcript)

    58. X-12 ARIMA
        - The first Trend estimation (in Stage 1) is always 2x12 or 2x4.
        - The Henderson Trend Filter choices (Steps 2.1 and 3.1) are based on noise-to-signal ratios: the size of the irregular variations relative to those of the Trend, labelled I/C in the X-12 ARIMA output.
        - If I/C < 1, the 9-term Henderson Filter (H=4) is used; otherwise, in Stage 2, the 13-term filter (H=6) is used.
        - In Stage 3, the 13-term Filter is used when 1 ≤ I/C < 3.5, but the 23-term Henderson Filter (H=11) is used when I/C ≥ 3.5.
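The selection rule can be written as a small function. This Python sketch only restates the thresholds listed above for monthly series; the function name `henderson_terms` is hypothetical, and assigning the boundary values I/C = 1 and I/C = 3.5 to the longer filter is an assumption.

```python
def henderson_terms(ic_ratio, stage):
    """Number of terms of the Henderson filter chosen from the I/C ratio.

    stage is 2 (intermediate Trend, Step 2.1) or 3 (final Trend, Step 3.1).
    """
    if stage not in (2, 3):
        raise ValueError("stage must be 2 or 3")
    if ic_ratio < 1:
        return 9
    if stage == 2 or ic_ratio < 3.5:
        return 13
    return 23


if __name__ == "__main__":
    for ratio in (0.6, 2.0, 4.2):
        print(ratio, henderson_terms(ratio, 2), henderson_terms(ratio, 3))
```

A noisier series (larger I/C) gets a longer filter, which smooths more at the cost of reacting more slowly to turning points.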

    59. X-12 ARIMA
        The criterion for selecting the seasonal Moving Average is based on the global I/S ratio, which measures the relative size of irregular movements and seasonal movements averaged over all months or quarters. It determines which seasonal Moving Average is applied, according to a set of ranges [table not preserved in the transcript]:
        - The global I/S ratio is calculated using data that ends in the last full calendar year available.
        - If 2.5 < I/S < 3.5 or 5.5 < I/S < 6.5, the I/S ratio is recalculated using one year less of data, to see whether it then falls into one of the ranges.
        - Removing a year is repeated until the I/S ratio falls into one of the ranges; if it still does not after five years, a 3x5 Moving Average is used.
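The procedure can be sketched in Python. The three ranges used below (3x3 up to 2.5, 3x5 from 3.5 to 5.5, 3x9 from 6.5) are an assumption based on commonly documented X-11 style defaults, since the slide's own table is missing; they are at least consistent with the two gaps quoted above. The function names are hypothetical, and the I/S ratios themselves are taken as given.

```python
def filter_for_ratio(ratio):
    """Seasonal filter for a global I/S ratio, or None if it falls in a gap (assumed ranges)."""
    if ratio <= 2.5:
        return "3x3"
    if 3.5 <= ratio <= 5.5:
        return "3x5"
    if ratio >= 6.5:
        return "3x9"
    return None


def select_seasonal_filter(ratios, max_years_removed=5):
    """Pick the seasonal Moving Average from successive global I/S ratios.

    ratios[k] is the I/S ratio computed after removing k years from the end.
    Falls back to 3x5 if no ratio lands in a range within the allowed removals.
    """
    for ratio in ratios[: max_years_removed + 1]:
        choice = filter_for_ratio(ratio)
        if choice is not None:
            return choice
    return "3x5"


if __name__ == "__main__":
    print(select_seasonal_filter([3.1, 2.9, 2.4]))
```

In the usage line the first two ratios fall in the gap between 2.5 and 3.5, so two years are dropped before the third ratio selects a filter.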

    60. TRAMO/SEATS
        The objective of the procedure is to automatically identify the model fitting the Time Series and estimate the model parameters. This includes:
        - The selection between additive and multiplicative model types (log-test)
        - Automatic detection and correction of outliers, and interpolation of missing values where needed
        - Testing and quantification of the Trading Day effect
        - Regression with user-defined variables
        - Identification of the ARIMA model fitting the Time Series, that is, selection of the order of differencing (unit root test) and of the number of autoregressive and Moving Average parameters, and the estimation of these parameters

    61. TRAMO/SEATS
        - The application belongs to the ARIMA model-based variants of decomposition of Time Series into non-observed components.
        - The decomposition procedure of the SEATS method is built on spectrum decomposition.
        - The components are estimated using the Wiener-Kolmogorov Filter.
        - SEATS assumes that the Time Series to be seasonally adjusted is linear, with normal White Noise innovations.
        - If this assumption is not satisfied, SEATS can work together with TRAMO to eliminate special effects from the series, identify and eliminate outliers of various types, and interpolate missing observations.
        - In that case the ARIMA model is also taken from TRAMO.

    62. TRAMO/SEATS
        The application decomposes the series into several components. The decomposition may be either multiplicative or additive.
        The components are characterized by the spectrum, or by the pseudo-spectrum in the non-stationary case:
        - The Trend Component represents the long-term development of the Time Series and appears as a spectral peak at zero frequency. One could say that the trend is a cycle with an infinitely long period.
        - The Seasonal Component is represented by spectral peaks at the seasonal frequencies.
        - The Irregular Component represents the irregular White Noise behaviour, thus its spectrum is flat (constant).
        - The Cyclic Component represents the deviations from the trend of the Seasonally Adjusted series that differ from pure White Noise.

    63. TRAMO/SEATS
        First, SEATS decomposes the ARIMA model of the observed Time Series, that is, it identifies the ARIMA models of the components. This operation takes place in the frequency domain: the spectrum is divided into the sum of the spectra of the various components.
        In practice SEATS decides on the basis of the argument of the roots, which usually lies near the frequency of a spectral peak:
        - The roots of high absolute value related to the 0 frequency are assigned to the Trend Component.
        - The roots related to the seasonal frequencies are assigned to the Seasonal Component.
        - The roots of low absolute value related to the 0 frequency, those related to cyclic frequencies (between 0 and the first seasonal frequency), and those related to frequencies between the seasonal ones are assigned to the Cyclic Component.
        - The Irregular Component is always assumed to be white noise.

    64. Questions?
