
How is data generated?


baina

Presentation Transcript


  1. How is data generated? • Recap ‘What is data’ • Methods of data collection, setting down, storage • Varies across disciplines How do researchers get data? • Sliding scale of accessibility and formality • How-to guide – example cheat sheet

  2. Defining research data Collection method → Quantitative → Qualitative (& quant) → Quant/Qual → Quant/Qual → Quantitative → Qualitative → Qualitative → ? → Qualitative → ? → ? → ? Other terms Observational? Mixed methods? Secondary? Case study? Cross-sectional? Longitudinal? ‘Big data’ Examples • Numbers • Words/texts • Survey results • Interviews • Machine readings • Voice recordings • Voice transcripts • Images • Video • Sound • Artifacts • Specimens • Samples (medical, paleo, geo, …) • …

  3. Defining research data Different data formats (should be documented!) • Raw • Transcribed • Converted (in format, by analysis) • Derived (e.g., confidentialised, de-sensitised) • Physical or Digitised • Single, multiple, combined datasets • Same ‘research input’ may have multiple data outputs (e.g., ancient/historical scripture – image, digital image, transcription, interpretation)

  4. Common features of data • ‘building blocks of information’ • As information varies with discipline, so do the main kinds of data and methods of collection • E.g., Medical science: bloods + readings = disease presence • E.g., Anthropology: recorded interviews + observations = cultural practices http://www.dcc.ac.uk/sites/default/files/documents/publications/DCC_Howto_Discover_Requirements.pdf

  5. What is data, recap • Formats: Can be physical/analog (e.g. paper) or digital (e.g., Papyrology can be both) • Original or transcribed/described/representative • Methodology – cross-sectional vs. longitudinal, survey vs. administrative • Can be created by and for a range of people and services Data questions?

  6. How and where data is stored • Data storage vs. metadata • Continuums of data storage • *Does not necessarily relate to accessibility Formal (conventions around capture, vocab) Informal (much variability) Individual researcher Stores/repositories Screenshot from: ada.edu.au [Accessed 28/04/2014].

  7. Who manages stored data? • Cultural institutions • Researchers • On institutional file storage networks or portable media • Captured by third parties - storage or social media service providers, e.g. Dropbox or Flickr, Figshare, or data repositories, e.g. Australian Data Archive (NCI, RDSI), VicNode (RDSI) • More examples of databases/repositories after lunch

  8. Continuums of metadata storage Formal Informal Project website Registries/Commons Screenshot from: researchdata.ands.org.au [Accessed 28/04/2014]. Screenshot from: rsha.anu.edu.au [Accessed 28/04/2014].

  9. Accessibility & quality of metadata and data don’t align http://libguides.library.curtin.edu.au/

  10. Accessing data When this might be harder – Sharing and accessing sensitive data

  11. Getting data How do people Find/Discover data? • Movable feast / changing beast • No established methods like other scholarly outputs • No standard practice or vocab • Databases are non-exhaustive • Search methods and terms are driven by why people are looking (e.g., may start with direct contact from a project website), and by subject matter, methodology, accessibility etc.

  12. Finding data • Have you already identified the data, or are you still exploring? • Search formal databases (public/private mix): • Research Data Australia (RDA), Australian Bureau of Statistics (ABS), Australian Data Archive (ADA), Figshare, Trove • data.gov.au, data.gov, data.gov.uk • http://databib.org/index.php • Think about search terms by data topics AND characteristics • Informal searching: • ‘Googling’ • From publications • Peer networks • Cold calling Why metadata counts!
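The advice to search by data topics AND characteristics can be sketched in code. This is a minimal illustration only: the `Record` fields, the `search` helper and the three catalogue entries are made up for the example and do not reflect the metadata schema or search interface of any repository named on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A hypothetical catalogue entry; the field names are illustrative only."""
    title: str
    topics: set = field(default_factory=set)
    method: str = ""   # e.g. "survey", "interview"
    design: str = ""   # e.g. "cross-sectional", "longitudinal"


def search(records, topic, **characteristics):
    """Return records that match the topic AND every requested characteristic."""
    hits = []
    for rec in records:
        if topic not in rec.topics:
            continue
        if all(getattr(rec, key, None) == value
               for key, value in characteristics.items()):
            hits.append(rec)
    return hits


# Made-up entries, standing in for metadata a repository might expose.
catalogue = [
    Record("Youth wellbeing survey", {"mental health", "young people"},
           "survey", "cross-sectional"),
    Record("Adolescent health panel", {"mental health"},
           "survey", "longitudinal"),
    Record("Community interviews", {"cultural practices"},
           "interview", "cross-sectional"),
]

matches = search(catalogue, "mental health", design="longitudinal")
print([rec.title for rec in matches])
```

A topic-only search returns both mental health records; adding a characteristic such as the study design narrows the result, which only works if that characteristic was captured in the metadata in the first place.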

  13. Case study • Student approaches ANU library staff to access Child and Adolescent Component (1998) of the National Survey of Mental Health and Wellbeing after reading a study that uses the data • Google locates researcher in WA… • …who says data is in Australian Data Archive… in Canberra • (but have to know to look there! – not found via Google search) • Link to request permission for license (once registered with ADA)

  14. Accessing data • So you’ve found an interesting dataset. How do you GET it? • Repository catalogue entries (derived from metadata) will typically provide info about how to obtain the data • …or at least a contact… • Access varies depending on access policy of the owner Why metadata counts! Open Access (public/public) No access Conditional/Mediated (public/sort-of private) Highly sensitive data (e.g., not de-identified medical records) Download from website May need to pay fee and/or sign contract

  15. Conditional or mediated access to data May be held by: • Custodian of data • Login or approval required (e.g., ADA) • Licensed = reuse is (legally) conditional • AusGOAL • Organisational licenses (or repository or data manager) What is a license?

  16. AusGOAL licences • Australian Governments Open Access and Licensing Framework • Ready-made licences with legal surety. Endorsed by CAUL. • Apply least restrictive • 6 levels of Creative Commons license • Least restrictive = CC BY (Default Licence for AustGovt) • Most restrictive = CC BY-NC-ND • Restricted License (template) - for data that contains personal or other confidential information
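The "apply least restrictive" rule can be sketched as a small decision function. This is an illustration under stated assumptions, not the framework's own selection tool: the function name and its parameters are hypothetical, and the only facts relied on are the six standard Creative Commons licence codes (CC BY, CC BY-SA, CC BY-ND, CC BY-NC, CC BY-NC-SA, CC BY-NC-ND) plus the restricted template mentioned on the slide.

```python
def suggest_licence(confidential=False, allow_commercial=True,
                    allow_derivatives=True, share_alike=False):
    """Suggest the least restrictive licence that still meets the constraints.

    Hypothetical helper: each restriction is added only when the data owner
    actually needs it, so the default outcome is CC BY.
    """
    if confidential:
        # Personal or other confidential information: outside Creative Commons.
        return "Restricted License (template)"
    code = "CC BY"
    if not allow_commercial:
        code += "-NC"   # NonCommercial
    if not allow_derivatives:
        code += "-ND"   # NoDerivatives (ShareAlike does not apply here)
    elif share_alike:
        code += "-SA"   # ShareAlike: derivatives must carry the same licence
    return code


print(suggest_licence())
print(suggest_licence(allow_commercial=False, allow_derivatives=False))
print(suggest_licence(confidential=True))
```

Starting from CC BY and adding conditions one at a time mirrors the slide's guidance: a restriction appears in the licence only when there is a reason for it.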

  17. Sensitive data • Sensitive data is data that can be used to identify an individual or object and so place them at risk of discrimination, harm or unwanted attention • Invokes law (Privacy Act) and research ethics • Examples: • Survey data including names and criminal records • Hospital records • Location of endangered species • * sensitive by context

  18. Can sensitive data be shared? • Typically, Yes! • But How? When? • When consent is explicitly given, and/or • When data is de-sensitised (‘de-identified’) • When data is modified • When an appropriate license is applied • Different issues when data is new vs. existing
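One way data might be de-sensitised before sharing is sketched below. It assumes a survey record stored as a dictionary; the field names, the choice of which fields count as direct identifiers, and the five-year age bands are all assumptions made for the example. Dropping or coarsening fields like this reduces risk but does not by itself guarantee that a record cannot be re-identified.

```python
from datetime import date

# Assumption: fields treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"name", "school"}


def age_band(dob, on, width=5):
    """Convert a date of birth into a coarse age band such as '15-19'."""
    age = on.year - dob.year - ((on.month, on.day) < (dob.month, dob.day))
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def de_identify(record, on):
    """Drop direct identifiers and replace the exact DOB with an age band."""
    cleaned = {key: value for key, value in record.items()
               if key not in DIRECT_IDENTIFIERS and key != "dob"}
    if "dob" in record:
        cleaned["age_band"] = age_band(record["dob"], on)
    return cleaned


# Made-up respondent, for illustration only.
respondent = {
    "name": "Example Person",
    "dob": date(1996, 3, 14),
    "school": "Example High",
    "internet_hours_per_week": 12,
}

print(de_identify(respondent, on=date(2014, 4, 28)))
# → {'internet_hours_per_week': 12, 'age_band': '15-19'}
```

Whether the remaining fields are safe to release still depends on context, for example how many respondents share the same combination of values.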

  19. Stay tuned… ANDS Guide to Sharing Sensitive Data Safely is on the way

  20. Case Study A group of researchers at University of Timbuktoo were interested in the links between mental health, activity, and internet use in young people. They surveyed 986 young people aged 16-20 years. The survey asked about their age (DOB), school, physical and mental health, eating habits, physical activity, computer/internet use, educational achievement, family structure and parents’ cultural background. Paper surveys were used and then destroyed when the data was entered into an electronic database. The researchers would like to make their data available to other researchers – particularly to form new collaborations and link with similar datasets on young people.

  21. Is the data sensitive? • Barriers to sharing/publishing • What can be done now towards sharing?
