Epi 202: Designing Clinical Research Data Management for Clinical Research. Thomas B. Newman, MD,MPH Professor of Epidemiology & Biostatistics and Pediatrics, UCSF September 4, 2012. Outline. Data management steps Advantages of database vs spreadsheet entry REDCap demonstration
Epi 202: Designing Clinical Research. Data Management for Clinical Research. Thomas B. Newman, MD, MPH, Professor of Epidemiology & Biostatistics and Pediatrics, UCSF. September 4, 2012
Outline • Data management steps • Advantages of database vs spreadsheet entry • REDCap demonstration • Take-home message: Pretest should include data entry and analysis
Data Management Steps • Design data collection form • Capture data • Enter data • Clean data • Then can do data analysis
Traditional paper method • Data collection form design -- Word • Data capture -- pen • Data entry -- keyboard transcription into Excel • Data cleaning -- painful
Oophorectomy • Advantage of paper form: ability to write in answers you had not anticipated • Subject might leave it blank or guess if forced to choose
Race coding: Problems • Free text for “other”: hispanic, latina • “Asian” and “asian” are different values for a string variable
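To make the string-variable problem concrete, here is a minimal Python sketch. The responses are made up for illustration, and the `normalize` helper is a hypothetical name, not part of any tool mentioned in the slides. It tallies free-text entries before and after trimming whitespace and standardizing case.

```python
from collections import Counter

# Hypothetical free-text responses, invented for illustration only.
responses = ["Asian", "asian", " Asian ", "hispanic", "latina", "White"]


def normalize(value: str) -> str:
    """Trim surrounding whitespace and apply title case so spelling variants collapse."""
    return value.strip().title()


raw_counts = Counter(responses)                          # every variant is its own value
clean_counts = Counter(normalize(r) for r in responses)  # variants merged

print(len(raw_counts), "distinct raw values")
print(len(clean_counts), "distinct values after normalizing")
```

Normalizing case only merges spelling variants. Whether "hispanic" and "latina" belong in the same category is still a coding decision for the investigator.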
Data cleaning before transcription: study staff • Person making changes identified • Different color ink
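Data cleaning (Stata example)
replace race = "Asian" if race == "asian"
* weightchange is a string variable here, so store "7.5" as text and convert afterwards
replace weightchange = "7.5" if weightchange == "5-10 pounds"
destring weightchange, replace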
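The same kind of recoding can be sketched in Python. This is only an illustration under stated assumptions: the function name `to_number` and the rule "replace a range with its midpoint" are choices made here to mirror the 7.5 used for "5-10 pounds" in the Stata example. Answers that cannot be parsed are returned as `None` so they can be reviewed by hand.

```python
import re
from typing import Optional

# Assumed pattern: two non-negative numbers separated by a hyphen, e.g. "5-10 pounds".
RANGE_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)")


def to_number(answer: str) -> Optional[float]:
    """Return a numeric value for a free-text answer, or None if it cannot be parsed."""
    match = RANGE_PATTERN.match(answer)
    if match:
        low, high = float(match.group(1)), float(match.group(2))
        return (low + high) / 2  # midpoint of the reported range
    try:
        return float(answer)     # already a plain number such as "12"
    except ValueError:
        return None              # leave for manual review


for answer in ["5-10 pounds", "12", "a lot"]:
    print(repr(answer), "->", to_number(answer))
```

Every rule like this has to be written after the fact for each unexpected answer, which is why restricting the response options on the form is usually less work.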
Exercise These variables will be hard to analyze. This is what we are trying to avoid.
Data cleaning before transcription: study staff • Simple coding
Advantages of paper • Rapid data entry anywhere • Readily understood • Permanent record • Allows ready annotation
Disadvantages of paper • No immediate quality control • Branching logic harder • Data entry required • Allows you to postpone thinking about data analysis when you should be thinking about it now!
Consider data analysis early • Restrict options • Provide range and logic checks • Include coding on the paper form • PRETEST data entry and analysis!
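Range and logic checks can be tried out during the pretest with a few lines of code. The sketch below is hypothetical throughout: the field names (`age`, `sex`, `pregnant`), the permitted codes, and the age limits are assumptions chosen for illustration, not values from the course.

```python
from typing import Dict, List

ALLOWED_SEX = {"F", "M"}        # assumed coding: restricted options
AGE_MIN, AGE_MAX = 18, 110      # assumed plausible range


def check_record(record: Dict[str, object]) -> List[str]:
    """Return a list of problems found in one data-entry record."""
    errors = []

    # Restricted options: only the listed codes are accepted.
    if record.get("sex") not in ALLOWED_SEX:
        errors.append("sex must be F or M")

    # Range check: the value must be numeric and plausible.
    age = record.get("age")
    if not isinstance(age, (int, float)) or not AGE_MIN <= age <= AGE_MAX:
        errors.append("age outside %d-%d" % (AGE_MIN, AGE_MAX))

    # Logic check: two answers that cannot both be true.
    if record.get("sex") == "M" and record.get("pregnant") == 1:
        errors.append("pregnant = 1 is inconsistent with sex = M")

    return errors


print(check_record({"sex": "F", "age": 34, "pregnant": 1}))
print(check_record({"sex": "M", "age": 340, "pregnant": 1}))
```

Running checks like these on a handful of pretest forms shows which questions need tighter response options before real data collection starts.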
Data Dictionary • Variable name • Type of variable (binary, integer, real, string, etc.) • Variable label (longer name) • Value labels (e.g., 0 = No, 1 = Yes) • Permitted values • Notes
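One way to picture a data dictionary is as a small table with one entry per variable. The Python sketch below is an illustration only: the `Variable` class, its attribute names, and the two example variables (`diabetes`, `weight_kg`) are hypothetical and are not the format REDCap or any other package uses.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Variable:
    """One data-dictionary entry, following the elements listed on the slide."""
    name: str                      # variable name
    var_type: str                  # binary, integer, real, string, ...
    label: str                     # variable label (longer name)
    value_labels: Dict[int, str] = field(default_factory=dict)
    minimum: Optional[float] = None  # permitted values: lower bound
    maximum: Optional[float] = None  # permitted values: upper bound
    notes: str = ""

    def permits(self, value) -> bool:
        """Check a value against the value labels or the permitted range."""
        if self.value_labels:
            return value in self.value_labels
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


# Hypothetical entries, invented for illustration.
data_dictionary = {
    "diabetes": Variable("diabetes", "binary", "History of diabetes",
                         value_labels={0: "No", 1: "Yes"}),
    "weight_kg": Variable("weight_kg", "real", "Weight in kilograms",
                          minimum=20, maximum=300, notes="Measured at enrollment"),
}

print(data_dictionary["diabetes"].permits(2))        # code not in the value labels
print(data_dictionary["weight_kg"].permits(72.5))    # inside the permitted range
```

Writing the dictionary before data collection forces the decisions about coding and permitted values that otherwise get postponed to the cleaning stage.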
Research Electronic Data Capture (REDCap) • Design survey or data collection form • Creates data dictionary • Can track subjects and responses • Exports to statistical packages • Available with MyResearch account • Other options: Access (PC), Epi Info (PC), FileMaker Pro
REDCap Creates a Stata do file
clear
insheet participant_id redcap_survey_timestamp redcap_survey_identifier mas_or_ticr want_attend_review dates_available___1 dates_available___2 dates_available___3 dates_available___4 field comments survey_complete using "DATA_DCR_FINAL_REVIEW_SESSION_SURVEY_COPY_2_TNEWMAN_2011-08-10-22-39-34.CSV", nonames
label data "DATA_DCR_FINAL_REVIEW_SESSION_SURVEY_COPY_2_TNEWMAN_2011-08-10-22-39-34.CSV"
label define mas_or_ticr_ 1 "No" 2 "Yes ===> Exit this survey"
label define want_attend_review_ 1 "No ====> Exit this survey" 2 "Yes"
label define dates_available___1_ 0 "Unchecked" 1 "Checked"
label define field_ 1 "Clinical pharmacology" 2 "Community medicine" 3 "Dentistry" 4 "Dermatology" 5 "Emergency medicine" 6 "Endocrinology" 7 "Epidemiology/environmental health" 8 "Family medicine" 9 "Global health" 10 "Hospital medicine" 11 "Infectious disease" 12 …
label variable mas_or_ticr "Are you in either the Masters Degree in Clinical Research program or the ATCR (Advanced Training in Clinical Research) program?"
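The `label define` lines in the do file pair each stored numeric code with readable text. For readers who do not use Stata, the following Python sketch shows the same idea. The value labels are copied from the do file; the sample rows, the header row, and the `decode` helper are made up for illustration and do not reproduce the actual export.

```python
import csv
import io

# Value labels copied from the label define lines in the do file.
VALUE_LABELS = {
    "mas_or_ticr": {1: "No", 2: "Yes ===> Exit this survey"},
    "want_attend_review": {1: "No ====> Exit this survey", 2: "Yes"},
    "dates_available___1": {0: "Unchecked", 1: "Checked"},
}

# Made-up rows; a header row is assumed here to keep the sketch short.
SAMPLE_CSV = (
    "participant_id,mas_or_ticr,want_attend_review,dates_available___1\n"
    "1,1,2,1\n"
    "2,2,,0\n"
)


def decode(row: dict) -> dict:
    """Replace numeric codes with their value labels; blanks stay blank."""
    decoded = dict(row)
    for name, labels in VALUE_LABELS.items():
        raw = row.get(name, "")
        if raw != "":
            decoded[name] = labels.get(int(raw), raw)
    return decoded


rows = [decode(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for r in rows:
    print(r)
```

Keeping codes in the data and labels in a separate mapping means the analysis works on clean numeric values while tables and reports still show the readable text.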
Most Important Message: • Pretest!
Main decisions • Electronic capture vs paper • Optical form reading vs keyboard transcription • Enter data into database, spreadsheet or statistical package Highly recommended!
Advantages of database vs spreadsheet • Restricts choices • Error checking • Can track study progress, produce reports, export to statistical package • Safer -- harder to accidentally alter data
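A small example shows what "restricts choices" and "error checking" mean in practice. The sketch below uses SQLite through Python's built-in `sqlite3` module purely as a stand-in for a database. The table, the column names, the list of permitted race codes, and the weight-change limits are all hypothetical. A spreadsheet cell would accept any of these entries; the database refuses the ones that break a rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    """
    CREATE TABLE subject (
        id INTEGER PRIMARY KEY,
        -- hypothetical list of permitted codes
        race TEXT NOT NULL CHECK (race IN ('Asian', 'Black', 'White', 'Other')),
        -- hypothetical plausible range, in pounds
        weight_change REAL CHECK (weight_change BETWEEN -100 AND 100)
    )
    """
)

entries = [("Asian", 7.5), ("asian", 2.0), ("White", 500.0)]
accepted, rejected = [], []
for race, change in entries:
    try:
        conn.execute(
            "INSERT INTO subject (race, weight_change) VALUES (?, ?)",
            (race, change),
        )
        accepted.append((race, change))
    except sqlite3.IntegrityError:
        # A CHECK constraint failed, so the row is not stored.
        rejected.append((race, change))

print("accepted:", accepted)
print("rejected:", rejected)
```

Because the rules live in the table definition, they apply to every person and program that enters data, not only to those who remember the coding conventions.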