
Introduction to Statistical Computing in Clinical Research



  1. Introduction to Statistical Computing in Clinical Research Biostatistics 212 Lecture 1

  2. Today... • Course overview • Course objectives • Course details: grading, homework, etc • Schedule, lecture overview • Where does Stata fit in? • Basic data analysis with Stata • Stata demos

  3. Course Objectives • Learn how to use STATA and Excel for • Data management • Basic epidemiologic analysis • Turning raw data into presentable tables, figures and other research • Prepare you for Fall courses • Start analyzing your own data

  4. New this year! • Summer (not Fall) • Every week (not every other) • Biostat I in Fall will use Stata • Two TAs • Implications • Faster pace, shift burden off Fall to Summer • You need your own data SOON!

  5. Course details • Introduction to Statistical Computing - 1 unit • Schedule – 7 lectures, 7 labs • Dates: August 1, 8, 15, 22, 29; September 5, 12 • Lectures 1:15-2:45 • Labs 3:00-4:00 • All in China Basin (CBL 6702, 6704) • Final Project due 9/19/06

  6. Course details • Introduction to Statistical Computing - 1 unit • Grading: Satisfactory/Unsatisfactory • Requirements: hand in all five labs (even if late); satisfactory Final Project; 80% of total points • Reading: optional

  7. Course details, cont • Faculty • Course Director: Mark Pletcher, MD, MPH, 514-8008, mpletcher@epi.ucsf.edu • Teaching Assistants: Diana Antoniucci, MD, dantoniucci@psg.ucsf.edu, biostat212_section1@yahoo.com; Grace Lin, MD, glin@medsfgh.ucsf.edu, biostat212_section2@yahoo.com • Lab Instructors: Mandana Khalili, MD, MAS, mandana.khalili@ucsf.edu; Alan Bostrom, PhD

  8. Overview of lecture topics • 1- Introduction to STATA • 2- Do files, log files, and workflow in STATA • 3- Generating variables and manipulating data with STATA • 4- Using Excel • 5- Basic epidemiologic analysis with STATA • 6- Organizing a project, making a table • 7- Making a figure with STATA or Excel

  9. Overview of labs • Lab 1 – Load a dataset and analyze it • Lab 2 – Learn how to use do and log files • Lab 3* – Import data from Excel, generate new variables and manipulate data, document everything with do and log files • Lab 4 – Using and creating Excel spreadsheets • Lab 5* – Epidemiologic analysis using Stata • * Labs 3 and 5 are significantly longer and harder than the others • Note: Course is front-loaded – the last 2 lab sessions are dedicated to the Final Project

  10. Overview of labs, cont • Official Lab time is 3:00-4:00, but we will start right after lecture, and you can leave when you are done. • Lab sections led by Diana Antoniucci and Grace Lin • Diana’s section is designated as the “Mac” section • Labs also staffed by Alan Bostrom, Mandana Khalili, and me

  11. Overview of labs, cont • Labs are due the following week prior to lecture. Labs turned in late (less than 1 week) will receive only half credit; after that, no points will be awarded. However, ALL labs must be turned in to pass the class (even if no points are awarded). • Lab 1 is paper • Labs 2-5 are electronic files, and should be emailed to your section leader’s course email address: biostat212_section1@yahoo.com (Diana) or biostat212_section2@yahoo.com (Grace)

  12. Final Project • Create a Table and a Figure using your own data, and document your analysis using Stata. • Due 1 week after the last lab session; 20 points docked for each day late.

  13. Orientation to binder • Course Overview • Final Project • Lectures and Labs (just in time) • Other handouts

  14. Getting started with STATA Session 1

  15. Types of software packages used in clinical research • Statistical analysis packages • Spreadsheets • Database programs • Custom applications • Cost-effectiveness analysis (TreeAge, etc) • Survey analysis (SUDAAN, etc)

  16. Software packages for analyzing data • STATA • SAS • S-PLUS, and “R” • SPSS • SUDAAN • Epi Info • JMP • MATLAB • StatXact

  17. Why use STATA? • Quick start, user friendly • Immediate results, response • You can look at the data • Menu-driven option • Good graphics • Log and do files • Good manuals, help menu

  18. Why NOT use STATA? • SAS is used more often? • SAS does some things STATA does not • Programming easier with S-PLUS • Complicated data structure and manipulation easier with SAS • Epi Info (free) is even easier than STATA?

  19. STATA – Basic functionality • Holds data for you • Stata holds 1 “flat” file dataset only (.dta file) • Listens to what you want • Type a command, press enter • Does stuff • Statistics, data manipulation, etc • Shows you the results • Results window

  20. Demo #1 • Open the program • Load some data • Look at it • Run a command
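
Once Stata is open, the remaining steps of Demo #1 might look like the following when typed at the Command window. This is only a sketch: it assumes the `auto` example dataset that ships with Stata, which is not necessarily the dataset used in the lecture.

```stata
* Load some data (assumption: auto.dta, the example dataset bundled with Stata)
sysuse auto, clear

* Look at it: an overview of the variables, then the first five observations
describe
list make price mpg in 1/5

* Run a command: summary statistics for one variable
summarize price
```

Output from each command appears in the Results window; `browse` opens the same data in the Data Browser.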

  21. STATA - Windows • Two basic windows • Command • Results • Optional windows • Variable list • History of commands • Other functions • Data browser/editor • Do file editor • Viewer (for log, help files, etc)

  22. STATA - Buttons • The usual – open, save, print • Log-file open/suspend/close • Do-file editor • Browse and Edit • Break

  23. STATA - Menus • Almost every command can be accessed via menu

  24. Demo #2 • Enter in some data • Look at it • Run a couple of commands
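
Demo #2 may well have used the Data Editor to type in values. One command-line alternative is `input`, sketched here with hypothetical variable names and made-up values that are not from the lecture.

```stata
* Start from an empty dataset
clear

* Enter some data (hypothetical variables: id, age in years, systolic BP)
input id age sbp
1 54 132
2 61 145
3 47 118
4 70 160
end

* Look at it
list

* Run a couple of commands
summarize age sbp
count if sbp >= 140
```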

  25. Menu vs. Command line • Menu advantages • Look for commands you don’t know about • See the options for each command • Complex commands easier – learn syntax • Command line advantages • Faster (if you know the command!) • “Closer” to the program • Only way to write “do” files • Document and repeat analyses
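
To illustrate "document and repeat analyses", here is a minimal sketch of a do-file that records its own output in a log file. The file names `demo.do` and `demo.log` are hypothetical, and the bundled `auto` dataset is an assumption rather than the lecture's data.

```stata
* demo.do (hypothetical file name)
* Close any log left open by an earlier run, then start a fresh text log
capture log close
log using demo.log, text replace

* Assumption: Stata's bundled example dataset
sysuse auto, clear
summarize mpg
tabulate foreign

* Stop logging; demo.log now holds the commands and their results
log close
```

Typing `do demo.do` at the Command window reruns the whole analysis and rewrites the log.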

  26. STATA commands: Describing your data • describe [varlist] • Displays variable names, types, labels • list [varlist] • Displays the values of all observations • codebook [varlist] • Displays labels and codes for all variables
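
As a rough illustration of these three commands, the sketch below applies them to a few variables from Stata's bundled `auto` dataset (an assumption; any loaded dataset would do).

```stata
sysuse auto, clear

* Variable names, storage types, and labels
describe make price foreign

* Values of the listed variables, limited here to the first 10 observations
list make price foreign in 1/10

* Codes, value labels, and missing-value counts
codebook foreign rep78
```

Leaving out the variable list applies each command to every variable in the dataset.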

  27. STATA commands: Descriptive statistics – continuous data • summarize [varlist] [, detail] • # obs, mean, SD, range • “, detail” gets you more detail (median, etc) • histogram varname • Simple histogram of your variable • ci [varlist] • Mean, standard error of mean, and confidence intervals • Actually works for dichotomous variables, too.
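
A short sketch of these commands on the bundled `auto` dataset (an assumption). Note that `ci` changed syntax after this lecture was given: the block assumes Stata 14.1 or later, where the subcommands `means` and `proportions` are required, while older releases accept the slide's form `ci mpg`.

```stata
sysuse auto, clear

* Number of observations, mean, SD, and range
summarize mpg

* Adds percentiles (including the median), skewness, and kurtosis
summarize mpg, detail

* Simple histogram of one variable
histogram mpg

* Mean, standard error, and 95% confidence interval (Stata 14.1+ syntax)
ci means mpg

* Confidence interval for a dichotomous 0/1 variable
ci proportions foreign
```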

  28. STATA commands: Descriptive statistics – categorical data • tabulate [varname] • Counts and percentages • (see also table – this is very different!)
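
For example, using the repair-record variable `rep78` from the bundled `auto` dataset (an assumption), which has a few missing values:

```stata
sysuse auto, clear

* Counts and percentages; observations with missing rep78 are left out
tabulate rep78

* Same table with missing values shown as their own category
tabulate rep78, missing

* table is a separate command for building tables of statistics;
* with one variable and no options it reports frequencies
table rep78
```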

  29. STATA commands: Analytic statistics – 2 categorical variables

  30. STATA commands: Analytic statistics – 2 categorical variables • tabulate [var1] [var2] • “Cross-tab” • Descriptive options , row (row percentages) , col (column percentages) • Statistics options , chi2 (chi-squared test) , exact (Fisher’s exact test)
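
A sketch of a cross-tab with these options, again assuming the bundled `auto` dataset. The first variable defines the rows and the second the columns.

```stata
sysuse auto, clear

* Plain cross-tab of counts
tabulate foreign rep78

* Add row and column percentages
tabulate foreign rep78, row col

* Add the chi-squared test and Fisher's exact test
tabulate foreign rep78, chi2 exact
```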

  31. Getting help • Try to find the command on the pull-down menus • Help menu • If you don’t know the command - Search... • If you know the command - Stata command... • Try the manuals • more detail, theoretical underpinnings, etc

  32. STATA commands: Analytic statistics – 1 categorical, 1 continuous

  33. STATA commands: Analytic statistics – 1 categorical, 1 continuous • bysort catvar: sum [contvar] • Mean, SD, range of the continuous variable within each subgroup • ttest [contvar], by(catvar) • t-test • oneway [contvar] [catvar] • ANOVA • table [catvar] [, contents(mean [contvar] …)] • Table of statistics
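
The sketch below runs these commands on the bundled `auto` dataset (an assumption), with `mpg` as the continuous variable. It assumes Stata 17 or later, where `table` uses `statistic()` in place of the `contents()` option shown on the slide; in older releases the equivalent is `table foreign, contents(mean mpg sd mpg)`.

```stata
sysuse auto, clear

* Mean, SD, and range of mpg within each level of foreign
bysort foreign: summarize mpg

* Two-sample t-test comparing mean mpg between the two groups
ttest mpg, by(foreign)

* One-way ANOVA of mpg across the repair-record categories
oneway mpg rep78

* Table of statistics by group (Stata 17+ syntax)
table foreign, statistic(mean mpg) statistic(sd mpg)
```

`ttest` needs a grouping variable with exactly two levels, which is why `rep78` (five levels) is paired with `oneway` instead.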

  34. STATA commands: Analytic statistics – 2 continuous

  35. STATA commands: Analytic statistics – 2 continuous • scatter [var1] [var2] • Scatterplot of the two variables • pwcorr [varlist] [, sig] • Pairwise correlations between variables • “sig” option gives p-values
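
As a brief example on the bundled `auto` dataset (an assumption), note that `scatter` puts the first variable on the y-axis and the second on the x-axis.

```stata
sysuse auto, clear

* Scatterplot with mpg on the y-axis and weight on the x-axis
scatter mpg weight

* Pairwise correlations among three variables, with p-values
pwcorr mpg weight price, sig
```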

  36. Demo #3 • Load a STATA dataset • Explore the data • Describe the data • Answer some simple research questions

  37. In Lab Today… • Familiarize yourself with Stata • Load a dataset • Use Stata commands to analyze data and fill in the blanks

  38. Next week • Do files, log files, and workflow in Stata • Find a dataset!
