
<odesi> project? Microdata? Say what?

TRY Conference, May 5, 2008. Suzette Giles, Data Librarian, Ryerson University; Laine Ruus, Data Librarian, University of Toronto. <odesi> is an acronym of “Ontario Data Documentation, Extraction Service and Infrastructure Initiative”.


Presentation Transcript


  1. <odesi> project? Microdata? Say what? TRY Conference, May 5, 2008. Suzette Giles, Data Librarian, Ryerson University; Laine Ruus, Data Librarian, University of Toronto

  2. Acronym of “Ontario Data Documentation, Extraction Service and Infrastructure Initiative” • A collaborative project between OCUL (Ontario Council of University Libraries) and Ontario Buys • A new product delivered through Scholars Portal • Provides web-based resource discovery for a growing collection of Canadian data • 3rd generation statistics and data extraction system

  3. What we’ll cover: • Why are “data” so important that Ontario Buys and OCUL (Ontario Council of University Libraries) are investing over $1 million to provide access? • How will <odesi> support teaching and research in quantitative methods and contribute to statistical literacy? • How will <odesi> help me at the Reference Desk?

  4. Why are Ontario Buys and OCUL investing over $1 million? • 23 universities and colleges in Ontario belong to Statistics Canada’s DLI (Data Liberation Initiative), which is the source of the majority of the survey data (microdata) available at the moment. Fewer than half have full-time staff “doing” data • <odesi> equalizes access for all Ontario universities with centralized storage and a single interface for these data • The metadata format used in <odesi> is an international standard that should reduce technological obsolescence

  5. How does <odesi> support teaching and research in quantitative methods and contribute to statistical literacy? • By enabling searches across collections at a more detailed level than Statistics Canada provides • By providing access to resources that have not previously been readily accessible, e.g. Canadian Gallup Polls • By displaying descriptive statistics, whether the resource is aggregate statistics (tables) or microdata (survey data)

  6. How does <odesi> support teaching and research in quantitative methods and contribute to statistical literacy? • Web-based access encourages in-classroom use by faculty to support learning • Access 24x7 supports research on and off campus • A centralised resource supports intra- and inter-university research projects

  7. How can <odesi> help me at the Reference Desk? • Access to a collection of statistics and data in a uniform interface • Searchable by keyword to see if there are statistics or data on a topic • Enables the creation, on the fly, of aggregate statistics (tables) that have not been published elsewhere. • Blurs the distinction between aggregate statistics and microdata

  8. What are microdata and why do I need to know about them? • Microdata are the actual responses that survey or census respondents give to a questionnaire • Usually translated into a numeric format so that one can do arithmetic with them For example: • Income: high, medium, low → Average = ??? • Income: $13,725, $118,297, $63,958 → Average = $65,327
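The income example on this slide can be reproduced in a few lines. This is only an illustrative sketch: the variable names are made up here, and the three dollar values are the ones shown on the slide.

```python
from statistics import mean

# Incomes recorded as categories: labels can be counted, but not averaged.
income_categories = ["high", "medium", "low"]

# The same kind of answer recorded as numbers (values taken from the slide).
income_dollars = [13_725, 118_297, 63_958]

# With numeric microdata, arithmetic such as an average becomes possible.
average_income = mean(income_dollars)
print(f"Average = ${average_income:,.0f}")
```

The point is the contrast: the categorical list supports counting only, while the numeric list supports any arithmetic summary.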

  9. Descriptive statistics are generated from microdata • Microdata - A person is working or not working • Aggregate statistics (table) - A count of the number of persons not working (in a geographic area) - The number of unemployed persons (not working but looking for work) divided by the number of persons in the labour force = the unemployment rate
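As a rough sketch of how one aggregate figure falls out of individual records, the snippet below counts statuses in a tiny, invented set of microdata. The record layout and the status labels are assumptions made for illustration, not an actual survey coding scheme; the labour force is taken to be employed plus unemployed persons.

```python
# Hypothetical microdata: one record per respondent in one geographic area.
respondents = [
    {"id": 1, "status": "employed"},
    {"id": 2, "status": "unemployed"},
    {"id": 3, "status": "employed"},
    {"id": 4, "status": "not in labour force"},
    {"id": 5, "status": "employed"},
]


def unemployment_rate(records):
    """Return unemployed / (employed + unemployed) for the given records."""
    employed = sum(1 for r in records if r["status"] == "employed")
    unemployed = sum(1 for r in records if r["status"] == "unemployed")
    labour_force = employed + unemployed
    if labour_force == 0:
        return float("nan")  # no one in the labour force, rate is undefined
    return unemployed / labour_force


rate = unemployment_rate(respondents)
print(f"Unemployment rate: {rate:.0%}")
```

Respondent 4 is left out of the denominator because that person is not in the labour force.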

  10. … and more descriptive statistics • Microdata - A family has a gross annual income in year 2005 • Aggregate (descriptive) statistics - Families in a geographic area have an average income - 50% of families in a geographic area have an income above (or below) the median income - The low-income rate is the % of families in a geographic area that have an income below the low income cut-off (LICO) for that geographic area and family size
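The three statistics on this slide can be sketched from a handful of family records. All numbers below are invented, and the single cut-off value is a simplifying assumption: real low income cut-offs vary by geographic area and family size.

```python
from statistics import mean, median

# Hypothetical gross annual family incomes for one geographic area.
family_incomes = [18_000, 32_000, 47_000, 61_000, 140_000]

# Hypothetical cut-off; an actual LICO depends on area and family size.
LOW_INCOME_CUTOFF = 35_000

average_income = mean(family_incomes)
median_income = median(family_incomes)

# Share of families whose income falls below the cut-off.
families_below = sum(1 for income in family_incomes if income < LOW_INCOME_CUTOFF)
percent_below = 100 * families_below / len(family_incomes)

print(f"Average income: ${average_income:,.0f}")
print(f"Median income:  ${median_income:,.0f}")
print(f"Below cut-off:  {percent_below:.0f}% of families")
```

The one high income pulls the average well above the median, which is why both figures are usually reported together.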

  11. Let’s look at an example of what you can do with microdata

  12. Table on the Statistics Canada website: High blood pressure, by sex (fixed content)

  13. Information on where the data were derived from: Statistics Canada, Canadian Community Health Survey (CCHS 3.1), 2005

  14. Let’s find the survey using <odesi>: www.odesi.ca

  15. But first let’s look at this page: • Help • Simple Search (one word) • Tutorials

  16. List of datasets

  17. List of Datasets

  18. Looking for CCHS 3.1

  19. Need to select the variables we are interested in

  20. Switch to TABULATION option

  21. Put the mouse over the selected variable. Choose where in the table you want the variable to go

  22. Does high blood pressure vary by gender?
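What the tabulation option does with two selected variables can be approximated as a simple cross-tabulation. The sketch below is not the <odesi> implementation; the records are invented, and "sex" and "high blood pressure" are just illustrative labels for the row and column variables.

```python
from collections import Counter

# Hypothetical microdata: (sex, has high blood pressure) for each respondent.
records = [
    ("Male", "Yes"), ("Male", "No"), ("Male", "No"),
    ("Female", "Yes"), ("Female", "Yes"), ("Female", "No"), ("Female", "No"),
]


def crosstab(pairs):
    """Count how many records fall into each (row, column) combination."""
    return Counter(pairs)


table = crosstab(records)
row_labels = sorted({row for row, _ in records})
column_labels = sorted({column for _, column in records})

# Print a small table: one row per sex, one column per blood pressure answer.
print(f"{'':8}" + "".join(f"{column:>6}" for column in column_labels))
for row in row_labels:
    print(f"{row:8}" + "".join(f"{table[(row, column)]:>6}" for column in column_labels))
```

Adding a third variable such as BMI would mean counting triples instead of pairs and deciding whether it becomes a layer, a row, or a column.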

  23. So what happens if we look at BMI in relation to sex and high blood pressure?

  24. And those who are underweight…

  25. Compared with those who are normal weight…

  26. But if we bring BMI in as a column, the table looks like …

  27. But Wait - Weight • All the tables produced so far have been based on the people who answered the survey – all 4,165 of them • These counts do not reflect the distribution in the whole population of Canada – only in the sample • To describe the population of Canada, the results must be weighted

  28. …the weighting icon

  29. …and the counts are now very different!

  30. • With weighting off: counts reflect the sample • With weighting on: counts reflect the population
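The difference between the two settings can be sketched as follows. The weights are invented for illustration; in a real survey file each respondent carries a weight supplied by the data producer, roughly the number of people in the population that respondent represents.

```python
# Hypothetical microdata: each respondent has an answer and a survey weight.
sample = [
    {"high_bp": "Yes", "weight": 250.0},
    {"high_bp": "No", "weight": 400.0},
    {"high_bp": "No", "weight": 350.0},
    {"high_bp": "Yes", "weight": 150.0},
]


def counts(records, weighted):
    """Tally answers, adding either 1 or the respondent's weight per record."""
    totals = {}
    for record in records:
        increment = record["weight"] if weighted else 1
        answer = record["high_bp"]
        totals[answer] = totals.get(answer, 0) + increment
    return totals


unweighted_counts = counts(sample, weighted=False)  # describes the sample
weighted_counts = counts(sample, weighted=True)     # estimates the population

print("Weighting off:", unweighted_counts)
print("Weighting on: ", weighted_counts)
```

In this made-up sample the unweighted split is even, while the weighted estimate is not, which is the same effect the slides show on a larger scale.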

  31. Which can also be graphed… (Graph icon)

  32. Two systems • <odesi>: available in all Ontario universities; searching metadata across files; descriptive statistics only; best for novice users; download system files (user needs to have software) • SDA (hosted at U. of T): 10 universities subscribe, including Ryerson & York; more advanced statistical analysis functions; best for advanced users; download raw data & syntax files (user needs to create system files)

  33. What’s changed? • In the “Bad old days” - Statistics were published in books/periodicals; data were “published” mainly as files of microdata or very extensive aggregate statistics - You needed access to a mainframe or PC or Mac - You needed special software (SAS, SPSS, Stata) - You needed training in the production of descriptive statistics (weighting, types of data and appropriate types of descriptive statistics)

  34. But nowadays! • Statistics are published as Excel files, Beyond 20/20 files, and microdata are available in 3rd generation interfaces • All you need is a computer and a web browser • Generate descriptive statistics from microdata with a few mouse clicks – tho’ you still need to know how to interpret them (and knowing about weighting is a good idea too!) • Users can download data 24x7 for further processing on their own workstations with appropriate software.
