Some Implementation Issues of Scanner Data

Some Implementation Issues of Scanner Data. Muhanad Sammar, Anders Norberg & Can Tongur. Some Background. There are 3 major outlet chains in Sweden. Statistics Sweden has received scanner data since 2009. The first principal issue is to decide how to use scanner data (S.D.).

Presentation Transcript


  1. Some Implementation Issues of Scanner Data. Muhanad Sammar, Anders Norberg & Can Tongur

  2. Some Background • 3 major outlet chains in Sweden • Statistics Sweden has received scanner data since 2009 • First principal issue: how to use scanner data (S.D.) • The Swedish CPI Board approved the use of scanner data in 2011 • Second principal issue: how to aggregate data

  3. The First Principal Issue – How to Use Scanner Data • Replace the manually collected price data with scanner data for the sample of outlets and products • Use scanner data as auxiliary information • Compute index based on scanner data for all products (and outlets) • Use scanner data for auditing and quality control

  5. The First Principal Issue – How to Use Scanner Data • Replace the manually collected price data with scanner data for the sample of outlets and products. Sample of 32 supermarkets and local shops and 4 hypermarkets. 3 negatively coordinated samples of 500 products, identified by EAN. A. is the Swedish idea
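The slide does not say how the negatively coordinated product samples are drawn. As a minimal sketch, assuming equal selection probabilities and the permanent random number (PRN) technique, disjoint samples can be taken as consecutive blocks in PRN order. The function name, the seed and the per-sample size are hypothetical; the slide does not make clear whether 500 is the size of each sample or of all three together, so the size is left as a parameter.

```python
import random


def negatively_coordinated_samples(eans, n_samples, sample_size, seed=0):
    """Draw disjoint equal-probability samples of EAN codes.

    Assumption (not from the slides): each product gets one permanent
    random number (PRN), and the samples are consecutive blocks in PRN
    order, so no product can appear in two samples.
    """
    eans = list(eans)
    if n_samples * sample_size > len(eans):
        raise ValueError("frame is too small for disjoint samples")
    rng = random.Random(seed)
    # One PRN per product; EAN codes are assumed to be unique.
    prn = {ean: rng.random() for ean in eans}
    ordered = sorted(eans, key=prn.get)
    return [
        ordered[i * sample_size:(i + 1) * sample_size]
        for i in range(n_samples)
    ]


# Hypothetical frame of 2 000 EAN codes, three samples of 500 each.
frame = [f"EAN{i:04d}" for i in range(2000)]
samples = negatively_coordinated_samples(frame, n_samples=3, sample_size=500)
overlap = set(samples[0]) & set(samples[1]) & set(samples[2])
print(len(samples), len(overlap))
```

Keeping the PRNs fixed between years would also allow controlled rotation of the samples, but that goes beyond what the slide states.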

  7. The First Principal Issue – How to Use Scanner Data • Replace the manually collected price data with scanner data for the sample of outlets and products • Use scanner data as auxiliary information: $\text{Index} = \dfrac{\text{Index(M.C.P.)}}{\text{Index(S.D.)}_{\text{small sample}}} \times \text{Index(S.D.)}_{\text{big sample}}$

  8. The First Principal Issue – How to Use Scanner Data • Replace the manually collected price data with scanner data for the sample of outlets and products • Use scanner data as auxiliary information: $\text{Index} = \text{Index(M.C.P.)} \times \dfrac{\text{Index(S.D.)}_{\text{big sample}}}{\text{Index(S.D.)}_{\text{small sample}}}$
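The auxiliary-information option adjusts the manually collected index by the ratio of the scanner data index on the big sample to the scanner data index on the small sample. A minimal sketch of that arithmetic follows; the function name and the index values are made up for illustration and are not taken from the paper.

```python
def combined_index(index_mcp, index_sd_small, index_sd_big):
    """Index(M.C.P.) * Index(S.D., big sample) / Index(S.D., small sample).

    index_mcp:      index from manually collected prices (small sample)
    index_sd_small: scanner data index on the same small sample
    index_sd_big:   scanner data index on the big sample
    """
    if index_sd_small <= 0:
        raise ValueError("the small-sample scanner data index must be positive")
    return index_mcp * index_sd_big / index_sd_small


# Made-up values: scanner data show the big sample rising 1% more than
# the small sample, so the manual index is scaled up accordingly.
print(combined_index(102.0, 100.0, 101.0))
```

If the scanner data index is the same on both samples, the ratio is 1 and the manually collected index is returned unchanged.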

  10. The First Principal Issue – How to Use Scanner Data • Replace the manually collected price data with scanner data for the sample of outlets and products • Use scanner data as auxiliary information • Compute index based on scanner data for all products (and outlets). Problems: - COICOP classification of all products - Products with deposits must be identified - New products might hide price changes

  12. The First Principal Issue – How to Use Scanner Data • Replace the manually collected price data with scanner data for the sample of outlets and products • Use scanner data as auxiliary information • Compute index based on scanner data for all products (and outlets) • Use scanner data for auditing and quality control. We have seen variation between price collectors as regards quality of delivery

  13. The Second Principal Issue – Data Aggregation • Scanner data are weekly aggregates of data for each product and outlet in the sample • Each week has ca. 8 500 price observations • Weekly data require aggregation to month. Natural choices of aggregation: i. Unweighted geometric mean value, or ii. Quantity-weighted arithmetic mean value. Motives: • In line with rest of CPI for daily necessities • In line with data
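The two aggregation choices can be sketched as follows. This is an illustration only: the row layout (product, outlet, month, weekly price, weekly quantity) and all names are assumptions, and the weekly price is taken as given rather than derived from turnover.

```python
import math
from collections import defaultdict


def geometric_mean(prices):
    """Unweighted geometric mean of the weekly prices."""
    return math.exp(sum(math.log(p) for p in prices) / len(prices))


def quantity_weighted_mean(prices, quantities):
    """Arithmetic mean of the weekly prices, weighted by quantity sold."""
    return sum(p * q for p, q in zip(prices, quantities)) / sum(quantities)


def monthly_prices(weekly_rows):
    """Aggregate weekly rows to one monthly price per product and outlet.

    weekly_rows: iterable of (product, outlet, month, price, quantity).
    Returns {(product, outlet, month): (geometric, weighted_arithmetic)}.
    """
    groups = defaultdict(list)
    for product, outlet, month, price, quantity in weekly_rows:
        groups[(product, outlet, month)].append((price, quantity))
    result = {}
    for key, observations in groups.items():
        prices = [p for p, _ in observations]
        quantities = [q for _, q in observations]
        result[key] = (
            geometric_mean(prices),
            quantity_weighted_mean(prices, quantities),
        )
    return result


# Made-up example: one product in one outlet, four weeks, one discount week.
rows = [
    ("ean-1", "outlet-1", "2012-01", 10.0, 100),
    ("ean-1", "outlet-1", "2012-01", 10.0, 100),
    ("ean-1", "outlet-1", "2012-01", 8.0, 500),
    ("ean-1", "outlet-1", "2012-01", 10.0, 100),
]
print(monthly_prices(rows))
```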

  14. The Two Mean Values • The geometric mean value: • The weighted arithmetic mean value: • We compared the two methods irrespective of their inherent differences
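The formulas on this slide did not survive in the transcript. For reference, the standard definitions of the two means are given below. The notation is an assumption, not the authors': $p_w$ is the price and $q_w$ the quantity sold in week $w$, and $n$ is the number of weeks in the month.

```latex
\bar{p}^{\,G} = \Bigl(\prod_{w=1}^{n} p_w\Bigr)^{1/n},
\qquad
\bar{p}^{\,A} = \frac{\sum_{w=1}^{n} q_w\, p_w}{\sum_{w=1}^{n} q_w}
```

When the price is the same in every week, both expressions reduce to that price.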

  15. Some Statistics • 2% Geometric > Arithmetic in base while Geometric=Arithmetic in Jan, Feb, Mar • 3% Geometric = Arithmetic in base while Geometric > Arithmetic in Jan, Feb, Mar • > 98% of observations (weekly prices) without variations between days • Ca. 9% of monthly prices had variations between weeks
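An unweighted arithmetic mean can never fall below the geometric mean, so cases with Geometric > Arithmetic can only arise through the quantity weights, typically when most units are sold in a low-price week. A small made-up example (the prices and quantities are not from the paper):

```python
import math

# Made-up weekly prices and quantities: one discount week with high sales.
prices = [10.0, 10.0, 8.0, 10.0]
quantities = [100, 100, 500, 100]

geometric = math.prod(prices) ** (1 / len(prices))
unweighted_arithmetic = sum(prices) / len(prices)
weighted_arithmetic = (
    sum(p * q for p, q in zip(prices, quantities)) / sum(quantities)
)

# Weighted arithmetic is 8.75 and unweighted arithmetic is 9.5;
# the geometric mean lies between them at about 9.46.
print(weighted_arithmetic, geometric, unweighted_arithmetic)
```

Here the quantity weights pull the arithmetic mean below the geometric mean, whereas with constant prices all three values would coincide.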

  16. Figure 5.1 in the paper: Logarithmic ratios of mean prices in current month relative to base period. Unweighted geometric mean on vertical axis and quantity-weighted arithmetic mean on horizontal axis. Eight sectors are numbered for analysis purposes.

  17. Figure 5.2 in the paper. Monthly price indices for product groups in supermarkets and hypermarkets, based on geometric and arithmetic mean prices per month.

  18. Indices by Different Methods. Quantity weighting seems to have some impact…

  19. Figure 5.3 in the paper. Distribution of price changes during January – April 2012 with base in December 2011. Unweighted geometric mean.

  20. Data Quality Variation between outlets for scanner data (left) and manually collected data (right). Individual prices on vertical axis and monthly average prices per product on horizontal axis. The year 2010.

  21. Data Quality (2) Scanner Data (S.D.) and Manually Collected Prices (M.C.P.) in comparison. Product-offers, outlets and weeks. January – December, 2009 and 2010. Number of comparable product-offers is 36 102 and 38 786 respectively.

  22. EAN code maintenance • S.D. = Vast amounts of data ≠ Large samples • Data extraction = EAN code probing • Yearly EAN survival rate (base-to-base): 70-80% • Some 500 products identified and maintained • Until now, 35 of 538 products changed EAN code during 2012 (= 6.5%) • Fixed basket implication: always up to date with S.D.!
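The two rates on this slide are simple shares. A minimal sketch with hypothetical function names: the survival rate is the fraction of base-period EAN codes still present in the later period, and the 6.5% follows directly from 35 of 538.

```python
def ean_survival_rate(base_eans, current_eans):
    """Share of base-period EAN codes that still appear in the current period."""
    base = set(base_eans)
    if not base:
        raise ValueError("the base period has no EAN codes")
    return len(base & set(current_eans)) / len(base)


def share_changed(n_changed, n_products):
    """Share of maintained products whose EAN code changed."""
    return n_changed / n_products


# Figures from the slide: 35 of 538 products changed EAN code during 2012.
print(round(100 * share_changed(35, 538), 1))  # 6.5

# Made-up EAN lists: three of the four base codes survive.
print(ean_survival_rate(["a", "b", "c", "d"], ["a", "b", "c", "x"]))  # 0.75
```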
