Teaching with Stata



Presentation Transcript


  1. Teaching with Stata
     Peter A. Lachenbruch & Alan C. Acock
     Oregon State University
     peter.lachenbruch@oregonstate.edu
     alan.acock@oregonstate.edu

  2. First Course Requirement: Data Entry
     • I want a first course to cover the things I want students to be able to do:
     • Enter and edit data: must be a “want to know” topic
       • Students can do a small survey to get data on topics of interest to them.
         • Voter poll
         • Attitudes toward diversity issues on campus
         • Beliefs about regulating the internet
     • Learn how to create a codebook; use codebook and codebook, compact
     • Where possible use “real” data
     WCSUG Presentation

  3. First Course Requirement: Data Management
     • Balance statistical content with proper data management content (a hard decision)
     • Storing the original dataset and creating a working dataset
     • Keeping a record of every data modification they make, using a do-file
       • Menu system is an aid
       • Do-files are the requirement
     • Missing values: distinguish types
     • Variable names, labels, and value labels
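A do-file along the following lines is one way to meet these requirements. It is only a sketch: the file names, the label name replbl, and the meaning given to the extended missing value .a are assumptions made for illustration, and Stata's shipped auto dataset stands in for a class dataset.

```stata
* Sketch of a do-file that records every change; file names are hypothetical.
capture log close
log using "session1.log", replace text

sysuse auto, clear                     // stand-in for the class's original data
save "auto_original.dta", replace      // the original is stored and never edited

use "auto_original.dta", clear
label variable rep78 "Repair record 1978 (1-5)"
replace rep78 = .a if missing(rep78)   // .a marks one type of missing (assumed meaning)
label define replbl .a "not reported"
label values rep78 replbl
save "auto_working.dta", replace       // all analysis uses the working copy

log close
```

Rerunning the do-file rebuilds the working dataset from the original, which is the point of keeping the record.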

  4. First Course Requirements: Data Management
     • Transformations – log, , exp
     • Logical editing – beware of logical transformations when missing values are present (gen y = x < 10 turns a missing x, “.”, into y = 0, because Stata treats missing as larger than any number)
     • Appending
       • Append student generated datasets
     • Merging
       • Merging two waves of data
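The missing-value trap in logical transformations can be shown with three made-up observations. The variable names y_naive and y_safe are hypothetical; the guard with !missing() is one common way to keep missing values missing.

```stata
* Toy data: two observed values and one missing value.
clear
input x
5
12
.
end

gen y_naive = x < 10                   // missing x is "large", so y_naive = 0
gen y_safe  = x < 10 if !missing(x)    // missing x stays missing
list
```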

  5. First Course Requirements: Data Management
     • Constructing Measures
       • When to use egen newvar = rowtotal(var1 var2 var3)
       • When to use egen newvar = rowmean(var1 var2 var3)
       • When to use misschk command, what it does
     • Suppose the variable category is 0 or 1
     • If there are missing values in category, there is a difference between
       • gen y = 1 if category
       • gen y = 1 if (category==1)
       • gen y = 1 if (category>0)
     • The first and third give y = 1 when category is missing, because “.” counts as nonzero and as larger than any number. The second leaves y missing in that case - BEWARE
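A small made-up dataset makes both points concrete. The names tot, avg, y1, y2, and y3 are hypothetical; note that egen's row functions take a space-separated variable list.

```stata
* Toy data: the third row is missing on everything.
clear
input category var1 var2 var3
1 2 4 6
0 1 . 3
. . . .
end

egen tot = rowtotal(var1 var2 var3)    // missing counts as 0; all-missing row gives 0
egen avg = rowmean(var1 var2 var3)     // mean of nonmissing; all-missing row gives .

gen y1 = 1 if category                 // true for any nonzero value, including missing
gen y2 = 1 if category == 1            // true only for an observed 1
gen y3 = 1 if category > 0             // missing is larger than 0, so true
list
```

The choice between rowtotal and rowmean matters most for rows with some items missing: the total silently shrinks, while the mean is computed over whatever items were answered.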

  6. First Course Requirements: Data Management
     • edit command; insheet, input, infile (csv files)
     • gen newvar = ln(oldvar)
     • Rarely use replace oldvar = sqrt(oldvar) – only when correcting an error – don’t replace data
     • merge ptid assessment using file, update (both datasets must be sorted by the key variables)
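The sketch below illustrates generating rather than replacing, followed by a two-wave merge on ptid and assessment. It assumes Stata 11 or later, where merge takes a 1:1 prefix and no longer needs sorted data; the file name wave2.dta and the variables score1 and score2 are made up.

```stata
* Transform into a new variable; the original price is left untouched.
sysuse auto, clear
gen lnprice = ln(price)

* Hypothetical wave 2 file.
clear
input ptid assessment score2
1 1 10
2 1 12
end
save "wave2.dta", replace

* Hypothetical wave 1 data in memory, then merge on the two keys.
clear
input ptid assessment score1
1 1 8
2 1 9
3 1 7
end
merge 1:1 ptid assessment using "wave2.dta"
tabulate _merge                        // shows which records matched
```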

  7. First Course Requirement (2)
     • Data presentation, numerical summary measures – summarize, detail; list; browse; edit; describe; codebook; codebook, compact
     • Graphic presentation: bar chart, histogram, box plot seem the minimum
     • Probability computations – binomial, binomialtail, chi2, chi2tail, F, Ftail, normal – use of the inverse functions for these.

  8. Examples
     • summarize sp, detail; list sp; describe s*; codebook s*
     • display binomial(10,3,0.1) for cumulative or display Binomial(10,3,.1) for reverse cumulative. Note disp 1-binomial(10,2,.1) gives the same result as the reverse cumulative (also binomialtail(10,3,.1))
     • display normal(1.2)
     • gen y = invnormal(uniform())*5+20
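The probability functions can be checked against one another in a few lines. This sketch uses runiform(), the newer name for uniform(); the seed and the sample size of 1000 are arbitrary choices.

```stata
display binomial(10, 3, 0.1)           // P(X <= 3)
display binomialtail(10, 3, 0.1)       // P(X >= 3)
display 1 - binomial(10, 2, 0.1)       // also P(X >= 3)
display normal(1.2)                    // standard normal CDF at 1.2
display invnormal(0.975)               // about 1.96

* Simulate normal data with mean 20 and SD 5.
clear
set seed 12345
set obs 1000
gen y = 20 + 5*invnormal(runiform())
summarize y
```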

  9. First Course Requirement (3)
     • Confidence intervals
       • Binomial – ci: ci variable, binomial
       • Normal – ci: ci variable
       • Poisson – ci: ci variable, poisson
     • Percentiles
       • summarize, detail
       • centile price, c(10(10)90)

  10. Examples
     • cii 20 4
     • cii 20 4, agresti
       • Sometimes we want to use the Agresti formulation. The exact interval is usually preferable
     • ci varname, level(99)
     • summarize weakness, detail
       • Can use su weakn,d (i.e. abbreviate commands, options and variables)
     • centile weakness,c(20,40,60,80)
       • Or centile weakness,c(20(20)80)
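For a version that runs without class data, the percentile commands can be tried on Stata's shipped auto dataset. The sketch sticks to summarize and centile because, as far as I know, the ci and cii syntax changed in later Stata releases, so the exact form of those commands depends on the version installed.

```stata
sysuse auto, clear
summarize price, detail                // quartiles and other percentiles
display r(p50)                         // stored median

centile price, centile(10(10)90)       // deciles with confidence intervals
centile price, centile(20 40 60 80)    // the same idea with an explicit list
```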

  11. First Course Requirements (4)
     • Hypothesis Testing:
     • Normal r.v.s
       • One sample (including paired data) - ttest
       • Two sample - ttest
       • K samples – ANOVA
     • Binomial variables
       • One sample – proportion
       • Two samples – tabulate, chi2

  12. Examples
     • ttest sp = 120 [one-sample]
     • ttest spmen = spfem [paired]
     • ttest spmen = spfem, unpaired unequal welch
     • ttest sp, by(sex) [unequal welch etc.]
     • Also immediate form – see help
     • anova sp agegrp
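The same tests can be run on datasets that ship with Stata. This is a sketch using auto and bpwide as stand-ins for the course data; the test value of 20 and the choice of rep78 as the grouping factor are arbitrary.

```stata
sysuse auto, clear
ttest mpg == 20                        // one-sample
ttest mpg, by(foreign)                 // two-sample, equal variances
ttest mpg, by(foreign) unequal welch   // two-sample, Welch approximation
anova mpg rep78                        // k samples

sysuse bpwide, clear
ttest bp_before == bp_after            // paired
```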

  13. Examples
     • bitest success = 0.8 [one sample binomial]
     • tabulate success group, chi2 row col
     • prtest success, by(group) [two sample binomial]
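A runnable version needs a 0/1 outcome, so this sketch builds one from the auto dataset. The variable highmpg, its cutoff of 20, and the null value of 0.5 are invented for illustration.

```stata
sysuse auto, clear
gen byte highmpg = mpg > 20            // hypothetical 0/1 "success" indicator

bitest highmpg == 0.5                  // one-sample binomial test
tabulate highmpg foreign, chi2 row col // two-way table with chi-squared test
prtest highmpg, by(foreign)            // two-sample test of proportions
```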

  14. First Course Requirements (5)
     • Hypothesis Testing (cont.)
       • Power considerations – sampsi (or spreadsheet – nice exercise for some good ones)
       • Nonparametric methods – sign, signrank, ranksum
       • Contingency tables – tabulate, epitab

  15. Examples
     • sampsi 132.86 127.44, p(0.8) r(2) sd1(15.34) sd2(18.23)
     • ranksum sp, by(survive)
     • signrank before = after
     • When should we supplement Stata with other software, such as G*Power 3 (free and more flexible than sampsi), PASS, or nQuery Advisor?
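The spreadsheet-style exercise can also be done with display and scalars. The sketch below applies the usual normal-approximation formula for comparing two means to the inputs of the sampsi example, assuming a two-sided 5% test and a ratio n2/n1 of 2; it is meant as a hand check, not as a reproduction of sampsi's internals.

```stata
* n1 = (z_alpha + z_power)^2 * (sd1^2 + sd2^2/ratio) / delta^2
scalar delta = 132.86 - 127.44         // difference in means to detect
scalar za    = invnormal(0.975)        // two-sided alpha = 0.05 (assumed)
scalar zb    = invnormal(0.80)         // power = 0.80
scalar n1    = (za + zb)^2 * (15.34^2 + 18.23^2/2) / delta^2

display "group 1: " ceil(n1)
display "group 2: " 2*ceil(n1)
```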

  16. First Course Requirements (6)
     • Simple linear regression – regress, rvfplot, other diagnostics
     • Correlation – corr, spearman, ktau – I tend not to use corr because of the sensitivity to the normality assumption for tests and confidence intervals
     • Only pwcorr, not corr, provides a test of significance

  17. Examples
     • regress mpg weight
     • rvfplot
     • Stata’s “type a little, get a little” is very different from other packages
     • correlate mpg weight or pwcorr mpg weight (especially when you have more than 2 variables – can specify sig and obs; note that these only work with pwcorr)
     • spearman mpg weight – would be nice to have Stata produce a Spearman correlation matrix
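Put together on the auto dataset, the sequence looks like this. The yline(0) option and the third variable, length, are additions for illustration.

```stata
sysuse auto, clear
regress mpg weight
rvfplot, yline(0)                      // residuals versus fitted values

correlate mpg weight length            // correlations only
pwcorr mpg weight length, sig obs      // adds p-values and pairwise Ns
spearman mpg weight                    // rank correlation
```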

  18. Examples
     • It’s easy to use permutation tests

       . permute anyhcq t=r(t): ttest ald7 if adult==1 & assnum==1, by(anyhcq)
       (running ttest on estimation sample)

       Monte Carlo permutation results                 Number of obs = 97

             command:  ttest ald7, by(anyhcq)
                   t:  r(t)
         permute var:  anyhcq

       ------------------------------------------------------------------------------
       T            |     T(obs)       c       n   p=c/n   SE(p) [95% Conf. Interval]
       -------------+----------------------------------------------------------------
                  t |   1.648305      13     100  0.1300  0.0336  .071073   .2120407
       ------------------------------------------------------------------------------
       Note: confidence interval is with respect to p=c/n.
       Note: c = #{|T| >= |T(obs)|}

     • One can do similar things with the bootstrap
     • These are easy to use and intuitive for students
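A bootstrap counterpart might look like the following sketch, which resamples the difference in group means from ttest on the auto dataset. The statistic name diff, the 200 replications, and the seed are arbitrary choices.

```stata
sysuse auto, clear

* Bootstrap the difference in mean mpg between domestic and foreign cars.
bootstrap diff = (r(mu_1) - r(mu_2)), reps(200) seed(12345): ///
    ttest mpg, by(foreign)

estat bootstrap, percentile            // percentile confidence interval
```

Students can compare the bootstrap interval with the interval from the ordinary ttest output to see how the two approaches relate.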

  19. Use of Stata in the Classroom
     • Use Stata sparingly
     • It’s not easy to follow commands typed or used from menus – students will get confused
     • Have handouts of what you do – make spacing large enough that students can annotate – even if only to write nasty things about the instructor
     • Balancing coverage of Stata (e.g. data management) with coverage of statistics is a constant issue
     • Remember – it’s a course in statistics, not in Stata

  20. Data Sets
     • Place data sets on a LAN or common drive, or make them available for copying to flash drive or CD
     • Use real data
     • Not too many variables
     • May have missing values – but should not affect main analyses – unless you want to demonstrate the problems with missing values

  21. In the Classroom
     • Using CD rather than flash drive is better(?)
       • Many desktops have USB port located inconveniently (darn you Dell!)
       • Sometimes newer PCs have USB port on monitor, and laptops usually have an easy slot for the flash drive
     • Light level in the room should allow students to read easily
       • Days of dim projectors are over

  22. In the Classroom (2)
     • Enlarge the Stata font by using right mouse button
       • I have found that 14 point is pretty good
       • Be careful about wraparound of output – if needed, reduce point size temporarily
     • Don’t ever use red on blue font
       • See what I mean? It’s more difficult to read
     • Show how to move and fix windows

  23. In the Classroom (3)
     • Optimizing visibility with projector
     • Use rich color background
       • Edit > Preferences > General Preferences. The blue background option is good, but it relies on red for errors and green for standard text, and doesn’t bold fonts.
       • Custom may be better because you can make fonts bold and pick colors that do not disadvantage students who are colorblind.

  24. Virtual Lab
     • A server supporting 30 simultaneous sessions of Stata is remarkably inexpensive.
     • A department can require students to have laptops or provide a cart with enough laptops
     • Because the laptops are really “dumb” terminals to the server, they can be cheap and need not be updated very often
     • Any room becomes a lab
     • Students should have 24/7 access to the server

  25. Handouts and Data Sets
     • Have handouts of your lecture notes
     • Have handouts of your data analysis demonstrations
       • Include commands as well as output!
     • Data sets
       • Online – LAN, CD, or floppy disk. Lots of laptops don’t have floppy drives any more; flash drives are inexpensive
       • Include
         • Student generated datasets
         • Datasets with large Ns and relatively few variables

  26. Emphasis in Course
     • Lectures devoted to statistics
     • Labs devoted to learning Stata, working on homework, and discussion
     • Proper printing of output
       • Don’t split output between two pages if possible (at least, find a good break point)
       • Always use a monotype font (such as Courier New)

  27. Some Final Issues
     • Multiple testing can distort inference (e.g. doing 100 tests at the 5% level practically guarantees some significant results – but they may be meaningless) – Worry about this
     • Controlling the digits in the output. Use outreg, estout, esttab
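The multiple-testing point can be made with two lines of arithmetic and a small simulation. The sketch assumes 100 independent tests at the 5% level with every null hypothesis true, in which case the p-values are uniform on (0,1); the seed is arbitrary.

```stata
display 1 - 0.95^100                   // chance of at least one false positive, about 0.994
display 100*0.05                       // expected number of false positives: 5

* Simulate 100 p-values under the null and count the "significant" ones.
clear
set seed 2718
set obs 100
gen p = runiform()
count if p < 0.05
```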

  28. The End
