Collaborative Data Management for Longitudinal Studies

Stephen Brehm [coauthors: L. Philip Schumm & Ronald A. Thisted], University of Chicago (Supported by National Institute on Aging Grant P01 AG18911-01A1).

Presentation Transcript


  1. Collaborative Data Management for Longitudinal Studies Stephen Brehm [coauthors: L. Philip Schumm & Ronald A. Thisted] University of Chicago (Supported by National Institute on Aging Grant P01 AG18911-01A1)

  2. Agenda 1. Background on Study 2. Problem – Data Management Deficiencies 3. Solution – Collaborative Data Management 4. STATA Programs – maketest & makedata

  3. Background on Study • NIH-funded Longitudinal Study • Loneliness & Health • Thousands of Measures • Loneliness • Depression • 230 subjects • Repeated Yearly

  4. Problem – Data Management Deficiencies • Code Not Modular: difficult to manage the data cleaning code; limited code reuse from year to year; difficult to collaborate among interns • No Established Set of Data Cleaning Steps: difficult for research assistants (turnover); inconsistent data cleaning techniques; data cleaning code difficult to read

  5. Problem – Data Management Deficiencies [Diagram labels: Research Assistant (×5), Core File Set]

  6. Solution – Collaborative Data Management • Process • Established Steps • File System Layout • Automated Tests • Collaboration • Concepts • Module • Batch • “Data Certification” • STATA Programs • maketest • makedata

  7. Solution – Collaborative Data Management • Process • Established Steps • File System Layout • Automated Tests • Collaboration • Concepts • Module Ex:loneliness • Batch • “Data Certification” • STATA Programs • maketest • makedata

  8. Solution – Collaborative Data Management • Process • Established Steps • File System Layout • Automated Tests • Collaboration • Concepts • Module Ex:loneliness • Batch Ex:yr1, yr2, yr3 • “Data Certification” • STATA Programs • maketest • makedata

  9. Solution – Collaborative Data Management • Set of Files for Each Module: acquire-[module].do & fix-[module].do, test-[module].do, derive-[module].do, label-[module].do • [Diagram labels: Year-Specific; 60% Code Reuse – Files Shared Between Years] • Steps: Acquire & Fix, Test, Derive, Label
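
The per-module file set on this slide suggests a fixed order of steps. Below is a minimal sketch of running those steps in sequence for one module and one batch. It is not the makedata source: the assumption that all do-files sit in the current working directory, the idea of passing the batch name as an argument to each do-file, and the macro names are all hypothetical. The module name (loneliness) and batch name (yr1) are the examples given on the earlier slides.

```stata
* Hypothetical sketch: run the per-module steps in a fixed order.
* Assumes the do-files live in the current working directory and
* follow the [step]-[module].do naming shown on the slide.
local module "loneliness"
local batch  "yr1"

foreach step in acquire fix test derive label {
    local dofile "`step'-`module'.do"

    * Skip steps whose do-file does not exist
    capture confirm file "`dofile'"
    if _rc {
        display as text "skipping `step': `dofile' not found"
        continue
    }

    * Pass the batch name as an argument (assumption); inside the
    * do-file it would be available through the args command
    display as text "running `dofile' for batch `batch'"
    do "`dofile'" `batch'
}
```

Because an error inside a do-file stops the calling loop, a failed assertion in the test step would prevent the derive and label steps from running on uncertified data.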

  10. STATA Program – maketest • Purpose: • Auto-generation of Data Certifying Tests • Functionality: • Tests Variable Type • Checks Consistency of Value Labels • Verifies Existence of Variable
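
To make the three kinds of checks concrete, here is a hand-written sketch using only built-in Stata commands. The slides do not show what maketest actually writes, so this is an assumption about the style of test, not its real output. The example uses Stata's bundled auto dataset in place of the study data.

```stata
* Illustrative only: hand-written checks of the kind listed on the slide,
* run against Stata's bundled auto dataset (not the study data).
sysuse auto, clear

* 1. Verify that the variable exists
confirm variable foreign

* 2. Test the variable's type
confirm numeric variable foreign
confirm string variable make

* 3. Check consistency of value labels:
*    the label attached to foreign should be named "origin" ...
local lbl : value label foreign
assert "`lbl'" == "origin"

*    ... and should map each code to the expected text
assert "`: label origin 0'" == "Domestic"
assert "`: label origin 1'" == "Foreign"
```

Each command exits with an error if its condition fails, so a do-file made of such lines either runs to completion or stops at the first violated expectation.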

  11. STATA Program – maketest • Syntax: • maketest [varlist] using filename [, REQuire(varlist) append replace] • Example: • maketest using filename.do, replace • Options: • using: specifies file to write • REQ: requires presence of variables in list • append: add to existing test .do file • replace: overwrite existing .do file
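
One way a test do-file could be generated from the data in memory is sketched below. This is a hypothetical illustration built from standard Stata commands (file write and the type extended macro function), not the maketest implementation. The output file name test-auto.do and the choice of variables are assumptions, and the bundled auto dataset again stands in for the study data.

```stata
* Hypothetical sketch of test generation; not the maketest source.
sysuse auto, clear

tempname fh
file open `fh' using "test-auto.do", write text replace

foreach v of varlist price mpg foreign {
    * Record the current storage type so that later data
    * can be checked against it
    local t : type `v'
    file write `fh' "confirm `t' variable `v'" _n
}

file close `fh'

* Show the generated do-file, then run it against the same data
type "test-auto.do"
do "test-auto.do"
```

Running the generated file against a later batch would then flag any variable that is missing or whose storage type has changed.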

  12. STATA Program – makedata “Bringing it all together”

  13. STATA Program – makedata • Syntax: • makedata [namelist], Pattern(string) [replace clear Noisily Batch(namelist) TESTonly] • Example: • makedata ats, p("acquire-*.do") b(yr1) clear replace • Options: • p: pattern – file naming convention • replace: overwrite existing data file • clear: clear current data in memory • Noisily: full output (default = summary) • b: batch – year, wave, center • TESTonly: only run tests step

  14. Other Applications • Beyond Longitudinal Data • Teaching Data Cleaning with STATA • Contact Information • Stephen Brehm: sbrehm@uchicago.edu • L. Philip Schumm: pschumm@uchicago.edu • Ronald A. Thisted: thisted@health.bsd.uchicago.edu • Supported by National Institute on Aging Grant P01 AG18911-01A1
