
Introduction to Statistical Computing in Clinical Research


  1. Introduction to Statistical Computing in Clinical Research Biostatistics 212 Lecture 1

  2. Today... • Course overview • Course objectives • Course details: grading, homework, etc • Schedule, lecture overview • Where does Stata fit in? • Basic data analysis with Stata • Stata demos • Lab

  3. Course Objectives • Introduce you to using STATA and Excel for • Data management • Basic statistical and epidemiologic analysis • Turning raw data into presentable tables, figures and other research products • Prepare you for Fall courses • Start analyzing your own data

  4. Course Objectives • NOT Statistical theory • You’ll get a bit today and later in the course, but we don’t focus on this. • I’m a clinical epidemiologist, not a statistician

  5. Course details • Biostats 212 • 1 Unit Course • Satisfactory/Unsatisfactory vs. Grades • 7 Sessions – Lecture + Lab • Online Office Hours • Online Forum

  6. Course details • Course Teaching Staff • Brief introductions • Mark Pletcher, MD MPH • Jen Cocohoba, PharmD, MAS • Mohammad Al Komser, MD • Sanoj Punnen, MD • Elizabeth Rogers, MD • Melissa Rosenstein, MD • Mandana Khalili, Barbara Grimes, Nancy Hills

  7. Course details • Lectures • Tuesdays 1:15-2:45, but most will be shorter • Simulcast to 6704! (new this year) • Rationale (why not move to a lecture hall?) • 30 second delay • “Pop over” to 6702 to ask a question? • Both didactic and “demos”, with time for questions • Jen gives last lecture – special format?

  8. Course details • Recorded Lectures • Audio + video of lecturer + video of screen • Available same day for viewing • Links posted on website syllabus

  9. Course details • Labs • Tuesdays 3:00-4:00 officially, but usually starts earlier • 6702 and 6704 • TAs staff from start to 4:00 • Lab instructors staff from 3:00-4:00 • Most important part of the course!

  10. Course details • Online Office Hours: • Thursdays, 8:00-9:30AM • Jennifer Cocohoba (Asst Course Director) will lead • Serves as the lab session for off-site (online only) students • Will use GoToMeeting • See instructions posted on the Syllabus • Drop in if you need help with lab!

  11. Course details • Forum • Demo • Post all questions here! • TA turnaround time • Before you post, see if it’s already there and answered • Consider turning ON all your alerts around lab time? • Quick demo

  12. Course details • Course Requirements • Hand in all six Labs (even if late) • Satisfactory Final Project • Not required • Reading • Attendance

  13. Course details • Grading (not relevant for all students) • Letter grades: Standard cutoffs • 90-100% A • 80-89% B • 70-79% C • 60-69% D • <60% or Course Requirements not met: F • Satisfactory/Unsatisfactory • >80% Satisfactory

  14. Overview of lecture topics • 1- Introduction to STATA • 2- Do files, log files, and workflow in STATA • 3- Generating variables and manipulating data with STATA • 4- Using Excel • 5- Basic epidemiologic analysis with STATA • 6- Making tables and figures with STATA • 7- Advanced Programming Topics

  15. Overview of labs • Lab 1 – Load a dataset and analyze it • Lab 2 – Learn how to use do and log files • Lab 3* – Import data from Excel, generate new variables, manipulate data, and document everything with do and log files • Lab 4 – Using and creating Excel spreadsheets • Lab 5* – Epidemiologic analysis using Stata • Lab 6 – Making a figure with Stata • Last lab session will be dedicated to working on the Final Project • * Labs 3 and 5 are significantly longer and harder than the others

  16. Overview of labs, cont • Official In-Person Lab time is 3:00-4:00 on Tuesday, but we will start right after lecture, and you can leave when you are done.

  17. Overview of labs, cont • Labs are due the following week, prior to lecture. Labs turned in up to 1 week late will receive only half credit; after that, no points will be awarded. However, ALL labs must be turned in to pass the class (even if no points are awarded). • Lab 1 is paper • Labs 2-6 are electronic files, and should be emailed to your section leader’s course email address: biostat212_section1@yahoo.com (Melissa/Sanoj) or biostat212_section2@yahoo.com (Elizabeth/Mohammad)

  18. Final Project • Create a Table and a Figure using your own data, document analysis using Stata. • Due 1 week after last lab session, 20 points docked for each 1 day late. • See 1-page description in Syllabus • Start looking for data!

  19. Course Materials • Online Syllabus (http://www.epibiostat.ucsf.edu/courses/schedule/biostat212.html) • Lectures and Labs/Datasets (“just in time”) • Miscellaneous handouts • Final Project • Short demo

  20. Getting started with STATA Session 1

  21. Types of software packages used in clinical research • Statistical analysis packages • Spreadsheets • Database programs • Custom applications • Cost-effectiveness analysis (TreeAge, etc) • Survey analysis (SUDAAN, etc)

  22. Software packages for analyzing data • STATA • SAS • S-plus and R • SPSS • SUDAAN • Epi Info • JMP • MATLAB • StatXact

  23. Why use STATA? • Quick start, user friendly • Immediate results, response • You can look at the data • Menu-driven option • Good graphics • Log and do files • Good manuals, help menu

  24. Why NOT use STATA? • SAS is used more often? • SAS does some things STATA does not? • Programming easier with S-plus and R? • R is free • Complicated data structure and manipulation easier with SAS? • Epi-info is free and even easier than STATA?

  25. STATA – Basic functionality • Holds data for you • Stata holds 1 “flat” file dataset only (.dta file) • Listens to what you want • Type a command, press enter • Does stuff • Statistics, data manipulation, etc • Shows you the results • Results window

  26. Demo #1 • Open the program • Entering vs. loading data • Look at data • Run a command • Orient to windows and buttons

  27. STATA - Windows • Two basic windows • Command • Results • Optional windows • Variable list • Properties • History of commands • Other functions • Data browser/editor • Variables Manager • Do file editor • Viewer (for log, help files, etc)

  28. STATA - Buttons • The usual – open, save, print • Log-file open/suspend/close • Do-file editor • Browse and Edit • Break

  29. STATA - Menus • Almost every command can be accessed via menu → dialog box

  30. Menu vs. Command line • Menu advantages • Browse for commands you don’t know already • See the options for each command in dialog boxes • Good way to learn syntax for complex commands • Command line advantages • MUCH faster • ONLY way to write “do” files • Document and repeat analyses

  31. Demo #2 • Load a STATA dataset • (intro to CARDIA) • Explore the data • Describe the data • Answer some simple research questions • Variables: male sex, smoking, binge drinking, BMI and systolic blood pressure

  32. STATA commands: Describing your data • describe [varlist] • Displays variable names, types, labels • list [varlist] • Displays the values of all observations • codebook [varlist] • Displays labels and codes for all variables
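
The three describing commands can be tried on any dataset. A minimal sketch follows, assuming the auto example dataset that ships with Stata rather than the CARDIA data used in the demo; the variable names (make, price, mpg, foreign, rep78) belong to that example file.

```stata
* Load the example dataset that ships with Stata (assumption: not the course data)
sysuse auto, clear

* Variable names, storage types, and labels for the whole dataset
describe

* Values of three variables, limited to the first 10 observations
list make price mpg in 1/10

* Codes, value labels, ranges, and missing counts for two variables
codebook foreign rep78
```

Leaving the varlist off any of these commands applies it to every variable, which is why list is usually restricted with a varlist or an "in" range on larger datasets.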

  33. STATA commands: Descriptive statistics – continuous data • summarize [varlist] [, detail] • # obs, mean, SD, range • “, detail” gets you more detail (median, etc) • ci [varlist] • Mean, standard error of mean, and confidence intervals • Actually works for dichotomous variables, too.
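
A short sketch of these commands, again assuming the auto example dataset shipped with Stata. The "version 12" line is an assumption made to match the Stata 12 server listed at the end of the lecture: the ci syntax on this slide is the older form, and in Stata 14 and later the documented equivalent is "ci means price".

```stata
* Interpret the commands below with Stata 12 syntax (assumption: matches the course server)
version 12

* Example dataset shipped with Stata (not the course data)
sysuse auto, clear

* Number of observations, mean, SD, min, and max
summarize price mpg

* Adds percentiles (including the median), variance, skewness, and kurtosis
summarize price, detail

* Mean, standard error of the mean, and 95% confidence interval
ci price

* For a 0/1 variable the mean is the proportion coded 1
ci foreign

* Exact binomial interval for the same 0/1 variable
ci foreign, binomial
```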

  34. STATA commands: Graphical exploration – continuous data • histogram varname • Simple histogram of your variable • graph box varlist • Box plot of your variable • qnorm varname • Quantile plot of your variable to check normality
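
The graph commands can be sketched the same way, assuming the auto example dataset shipped with Stata. The over() option on the box plot is an addition for illustration and is not on the slide.

```stata
* Example dataset shipped with Stata (not the course data)
sysuse auto, clear

* Histogram of miles per gallon
histogram mpg

* Box plot of mpg, one box per category of foreign (over() is optional)
graph box mpg, over(foreign)

* Quantiles of mpg plotted against quantiles of a normal distribution
qnorm mpg
```

Each command replaces the graph currently shown in the Graph window, so run them one at a time when exploring.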

  35. STATA commands: Descriptive statistics – categorical data • tabulate [varname] • Counts and percentages • (see also, table - this is very different!)

  36. STATA commands: Analytic statistics – 2 categorical variables

  37. STATA commands: Analytic statistics – 2 categorical variables • tabulate [var1] [var2] • “Cross-tab” • Descriptive options • , row (row percentages) • , col (column percentages) • Statistics options • , chi2 (chi-squared test) • , exact (Fisher’s exact test)
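
A minimal sketch of one-way and two-way tabulate, assuming the auto example dataset shipped with Stata; foreign and rep78 are categorical variables in that file, chosen only for illustration.

```stata
* Example dataset shipped with Stata (not the course data)
sysuse auto, clear

* One-way table: counts and percentages
tabulate foreign

* Cross-tab with row percentages
tabulate foreign rep78, row

* Cross-tab with column percentages
tabulate foreign rep78, col

* Cross-tab with a chi-squared test and Fisher's exact test
tabulate foreign rep78, chi2 exact
```

Options can be combined after a single comma, so row, col, chi2, and exact may all appear on one command.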

  38. Getting help • Try to find the command on the pull-down menus • Help menu • If you don’t know the command – “Search...” • If you know the command – “Stata command...” • Try the manuals • PDF files with more detail, theoretical underpinnings, etc • Accessed through the help menu

  39. STATA commands: Analytic statistics – 1 categorical, 1 continuous

  40. STATA commands: Analytic statistics – 1 categorical, 1 continuous • bysort catvar: summarize [contvar] • Mean, SD, and range of contvar within each subgroup of catvar • ttest [contvar], by(catvar) • t-test • oneway [contvar] [catvar] • ANOVA • table [catvar] [, contents(mean [contvar] …)] • Table of statistics
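
These four commands might be tried as follows, assuming the auto example dataset shipped with Stata. The "version 12" line is an assumption: the contents() option is the older table syntax, which Stata 17 and later accept only under version control.

```stata
* Interpret the commands below with Stata 12 syntax (needed for table, contents() in Stata 17+)
version 12

* Example dataset shipped with Stata (not the course data)
sysuse auto, clear

* Summary statistics of mpg within each category of foreign
bysort foreign: summarize mpg

* Two-sample t-test comparing mean mpg between the two groups
ttest mpg, by(foreign)

* One-way ANOVA of mpg across the categories of rep78
oneway mpg rep78

* Table of mean, SD, and non-missing count of mpg by foreign
table foreign, contents(mean mpg sd mpg n mpg)
```

Note that ttest requires a by() variable with exactly two groups, while oneway handles more than two.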

  41. STATA commands: Analytic statistics – 2 continuous

  42. STATA commands: Analytic statistics – 2 continuous • scatter [var1] [var2] • Scatterplot of the two variables • pwcorr [varlist] [, sig] • Pairwise correlations between variables • “sig” option gives p-values • spearman [varlist] [, stats(rho p)] • lowess yvar xvar
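
A brief sketch of the two-continuous-variable commands, assuming the auto example dataset shipped with Stata; mpg, weight, and price are variables in that file, chosen only for illustration.

```stata
* Example dataset shipped with Stata (not the course data)
sysuse auto, clear

* Scatterplot: first variable on the y axis, second on the x axis
scatter mpg weight

* Pairwise Pearson correlations with p-values
pwcorr mpg weight price, sig

* Spearman rank correlation, reporting rho and its p-value
spearman mpg weight, stats(rho p)

* Scatterplot of mpg against weight with a lowess smoothed curve
lowess mpg weight
```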

  43. In Lab Today… • Expect some chaos! • IT will be here to help with wireless, logins, etc • All ATCR and MAS students need logins for our network • Familiarize yourself with Stata • Load a dataset • Use Stata commands to analyze data and fill in the blanks

  44. Next week • Do files, log files, and workflow in Stata • Start looking for a dataset

  45. Website addresses • Course website • http://www.epibiostat.ucsf.edu/courses/schedule/biostat212.html • Computing information • http://www.epibiostat.ucsf.edu/courses/ChinaBasinLocation.html#computing • Download RDP for Macs (for Stata Server) • http://www.microsoft.com/mac/remote-desktop-client • Citrix Web Server • http://apps.epi-ucsf.org/ • Stata 12 Server • 65.175.48.75
