Misinterpretation of data, the importance of metadata and STC math

Misinterpretation of data, the importance of metadata and STC math. DLI Atlantic Training, April 2005. Data Misinterpretation: Crime Rates. Ebert & Roeper reviewed the Michael Wilson movie “Michael Moore Hates America”; Ebert doubted the claim that the Canadian crime rate is twice the US rate.

drew-bray

Presentation Transcript


  1. Misinterpretation of data, the importance of metadata and STC math • DLI Atlantic Training, April 2005

  2. Data Misinterpretation: Crime Rates • Ebert & Roeper reviewed the Michael Wilson movie “Michael Moore Hates America”; Ebert doubted the claim that the Canadian crime rate is twice the US rate • Moorelies.com | News: Whoa; Stuart Didn't See That One Coming • Ebert conceded that the statistics supported the claim: the figures were right • BUT a comparison of the STC and US Bureau of Justice Statistics websites shows how the statistics were misinterpreted

  3. Comparative Crime Rates: Simplistic Comparison • The violent and property crime categories have similar titles but different definitions • Violent crime is 2-3 times higher in the US; property crime rates are close • Bureau of Justice Statistics: Crime & Justice Data Online • Canadian Statistics: Crimes by type of offence

  4. US Crime Data

  5. Canadian Crime Data

  6. Data Misinterpretation: Drinking Habits of Canadians • Initial analysis of the 1990 Health Promotion Survey indicated that Canadians enjoyed an average of 60 drinks per day…

  7. Data Misinterpretation: Importance of Metadata • The 1990 Health Promotion Survey included a series of questions about alcohol consumption. Respondents were first asked whether they EVER drank alcohol; if YES, whether they drank within the last 12 months; and if YES, the number of drinks on each of the past 7 days. The code book showed the number of drinks per day as:

     81 F4MON 2 0096‑0097 HOW MANY DRINKS DID YOU HAVE ON: MONDAY
        Code   Label                    Raw   Weighted
        00     NONE                    4651    7334907
        01:40  NUMBER OF DRINKS        1403    2585080
        41     MORE THAN 40 DRINKS        1        106
        98     QUESTION NOT ASKED      7648   10567910
        99     NOT STATED                89     155377

     82 F4TUE 2 0098‑0099 HOW MANY DRINKS DID YOU HAVE ON: TUESDAY
        Code   Label                    Raw   Weighted
        00     NONE                    4608    7306101
        01:40  NUMBER OF DRINKS        1447    2613991
        98     QUESTION NOT ASKED      7648   10567910
        99     NOT STATED                89     155377
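
The arithmetic behind the inflated average can be sketched in a few lines of Python. This is only an illustration built from the raw (unweighted) F4TUE counts in the code book excerpt; the code book does not give the distribution inside the 01:40 group, so the value `ASSUMED_DRINKS` is an invented placeholder, and the names `mean_drinks`, `naive_mean` and `cleaned_mean` are hypothetical.

```python
# Raw (unweighted) respondent counts for F4TUE, taken from the code book excerpt.
F4TUE_COUNTS = {
    0: 4608,    # 00 NONE
    98: 7648,   # 98 QUESTION NOT ASKED
    99: 89,     # 99 NOT STATED
}
DRINKERS_1_TO_40 = 1447  # respondents coded 01:40 (distribution not published)

# ASSUMPTION: every respondent in the 01:40 group is given the same
# placeholder value. The real per-respondent values are not in the excerpt.
ASSUMED_DRINKS = 2

# Codes the code book reserves for "not asked" and "not stated".
MISSING_CODES = frozenset({98, 99})


def mean_drinks(counts, drop_codes=frozenset()):
    """Mean of the coded values, weighting each code by its respondent count."""
    kept = {code: n for code, n in counts.items() if code not in drop_codes}
    total_drinks = sum(code * n for code, n in kept.items())
    respondents = sum(kept.values())
    return total_drinks / respondents


counts = dict(F4TUE_COUNTS)
counts[ASSUMED_DRINKS] = DRINKERS_1_TO_40

# Treating 98 and 99 as real drink counts, as naive software would.
naive_mean = mean_drinks(counts)

# Dropping the reserved codes before averaging.
cleaned_mean = mean_drinks(counts, drop_codes=MISSING_CODES)

print(f"naive mean:   {naive_mean:.1f} drinks")
print(f"cleaned mean: {cleaned_mean:.1f} drinks")
```

Under this assumption the naive mean comes out at roughly 55 drinks, almost all of it contributed by the 7648 respondents coded 98, while the cleaned mean stays below one drink. The exact figures depend on the placeholder, but the size of the distortion does not.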

  8. Metadata for PUMFs • With Public Use Microdata Files (PUMFs), the code book is very important • It gives the questions asked and the codes used for responses • Numeric codes are often assigned to “missing values”, “refusals”, “don’t know” and “not applicable” • The numeric codes used are not consistent • To most software, these numeric codes look like valid responses
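
Because the reserved codes differ from one variable to the next, one way to handle them is a per-variable lookup built by hand from the code book. The sketch below assumes that approach; `RESERVED_CODES` and `recode_missing` are hypothetical names, the F4TUE entry comes from the code book excerpt on slide 7, and the `HYPOTHETICAL_AGE` entry is invented purely to show a variable with different codes.

```python
from typing import Dict, Iterable, List, Optional, Set

# Reserved (non-response) codes per variable, transcribed from the code book.
RESERVED_CODES: Dict[str, Set[int]] = {
    "F4TUE": {98, 99},                    # QUESTION NOT ASKED, NOT STATED
    "HYPOTHETICAL_AGE": {997, 998, 999},  # invented example, not a real variable
}


def recode_missing(variable: str, values: Iterable[int]) -> List[Optional[int]]:
    """Replace a variable's reserved codes with None, leaving real answers as-is.

    Raises KeyError if the variable has no code book entry, so a variable
    is never silently analysed without its metadata.
    """
    reserved = RESERVED_CODES[variable]
    return [None if value in reserved else value for value in values]


cleaned = recode_missing("F4TUE", [0, 3, 98, 99, 12])
print(cleaned)
```

Note that the same number can be a reserved code for one variable and a valid answer for another: here 98 is dropped for F4TUE but would be kept for the invented age variable.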

  9. Metadata: STC Policy on Informing Users of Data Quality • In place since 1978 • Tightened up in 2000 in response to the 1999 AG report • Recognition that “All statistics are to some extent estimates” • Statistics are to be used with awareness of their strengths and weaknesses: “fitness for use” • The key tool is the Integrated Meta Database (Definitions, data sources and methods)

  10. Metadata • Important to find STC metadata and use it • Definitions, Data Sources and Methods • Questionnaire and reporting guides • Survey Description • Data sources and methodology • Data Accuracy • Documentation • Contact us

  11. Definitions, Data Sources and Methods

  12. Online Catalogue • Canadian Community Health Survey: public use microdata file: Product main page

  13. DLI Website • DLI - Canadian Community Health Survey Cycle 1.1 • DLI listserv: Ask and we will find out from the Division!

  14. Data Quality Symbols

  15. Use metadata to avoid key pitfalls • Collection methodology • Questionnaire • Data quality: sample size, response rates • Definitions • Conceptual changes • Survey coverage • Reweighting/rebasing

  16. STC Math • Random rounding • Percentages and percentage points • Central tendencies (mean, median and mode) • Current vs constant dollars • Raw vs seasonally adjusted
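
Two of the items above lend themselves to a short worked example. The sketch below is an illustration only: the rates used are invented, all function names are hypothetical, and the rounding rule shown is the textbook unbiased scheme (round to a multiple of 5, rounding up with probability equal to the remainder divided by 5), which is assumed here rather than taken from STC documentation.

```python
import random


def percentage_point_change(old_pct, new_pct):
    """Difference between two percentages, in percentage points."""
    return new_pct - old_pct


def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


def random_round(count, base=5, rng=random):
    """Randomly round a non-negative count to a multiple of `base`.

    ASSUMPTION: unbiased scheme. A count with remainder r is rounded up
    with probability r / base and down otherwise, so the expected value
    equals the original count.
    """
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower


# Invented example: a rate that moves from 6.0% to 7.5%.
points = percentage_point_change(6.0, 7.5)  # change in percentage points
relative = percent_change(6.0, 7.5)         # change in percent
print(f"{points} percentage points, {relative:.0f}% increase")

# A count of 12 is published as either 10 or 15.
print(random_round(12))
```

The same movement in a rate is a rise of 1.5 percentage points but a 25% increase, which is why the two terms cannot be swapped. Random rounding also means that published cells may not add up exactly to their published totals.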
