This guide by Anke Huss presents techniques for generating automatic tables in Stata using do-files. It emphasizes the benefits of programming tables, such as less writing for later tables and fewer errors when the data are updated. It covers designing tables, storing results, and exporting them to text or Excel files, using the Caerphilly Prospective Study dataset. Practical examples show how to create variables, replace their contents, and format results.
Tricks in Stata
Anke Huss
Generating "automatic" tables in a do-file
Why programming tables?
• It means much more writing in the do-file!
• BUT: once you have done it, the next table will be faster (copy & paste...)
• No more trouble with updates of your data
• No more copying mistakes, because Stata does it for you
Caerphilly castle
Data used: Caerphilly Prospective Study (CAPS)
Download at: www.blackwellpublishing.com/essentialmedstats/datasets.htm
Basic idea
• Use the Stata data sheet for your table-to-be
Stored results in r() and e()
• Use stored results, usually from:
• r-class: results after general commands such as summarize are saved in r() and generally must be used before executing more commands. For an overview type: return list
• e-class: results from estimation commands (regress, logistic, …) are saved in e() until the next model is fitted. Overview: ereturn list
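A minimal sketch of reading stored results. It uses the auto dataset shipped with Stata as a stand-in for CAPS, and the local name mean_mpg is only illustrative:

```stata
* auto.dta ships with Stata; it is a stand-in here, not the CAPS data
sysuse auto, clear

* r-class: summarize leaves its results in r()
summarize mpg
return list
* copy the mean into a local before another command overwrites r()
local mean_mpg = r(mean)

* e-class: regress leaves its results in e() until the next model is fitted
regress price mpg
ereturn list

* both results are still available here
display "mean mpg = " %5.2f `mean_mpg' ", slope = " %8.2f _b[mpg]
```

Copying r(mean) into a local right away is the safe habit, because the next r-class command replaces everything in r().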
Steps
• DESIGN TABLE FIRST: what do I want my table to look like?
• Generate a new variable for each column
• Replace each cell with the number of interest
• Use "outsheet" to write your new variables to a text/Excel file
Example 1
1. DESIGN FIRST: what do I want my table to look like? E.g.:
Example 1
2. Generate a new variable for each column
    gen str illness = ""
    gen percent = .
Example 1
3. Replace cell with contents/number of interest: first column
    sort id
    replace illness = "myocardial inf" in 1
    replace illness = "diabetes" in 2
Example 1
3. Replace cell with contents/number of interest: second column
    sum mi
    sort id
    replace percent = r(mean)*100 in 1
    sum diabetes
    sort id
    replace percent = r(mean)*100 in 2
    format percent %9.2f
Example 1
4. Use "outsheet" to write your new variables to a text/Excel file
    outsheet illness percent in 1/2 using textres/illns.txt
For further
*comment 1: the relative path works only if you have set Stata's working directory to a specific folder, e.g.:
    cd "d:/Statistisches/automatic_tables/STATA"
*comment 2: you can also export as an Excel file (*.xls), but automatic import of a new text file lets graphics survive...
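In Stata 13 and later, export delimited supersedes outsheet, and export excel writes a workbook directly. A hedged sketch of both, using the auto dataset as a stand-in and hypothetical file names table1.txt and table1.xlsx:

```stata
* auto.dta is a stand-in for CAPS; variable and file names are illustrative
sysuse auto, clear
gen str name = ""
gen percent = .

* one table row: percentage of foreign cars
summarize foreign
replace name = "foreign cars" in 1
replace percent = r(mean)*100 in 1
format percent %9.2f

* tab-delimited text file, overwriting any earlier version
export delimited name percent using "table1.txt" in 1, delimiter(tab) replace

* Excel workbook with the variable names in the first row
export excel name percent using "table1.xlsx" in 1, firstrow(variables) replace
```

The replace option lets the do-file be rerun after a data update without stopping on an existing file.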
Example 1
*Alternative way to do the same: program a small loop:
    gen str name = ""
    gen percent = .
    local i = 1
    foreach var of varlist mi diabetes {
        replace name = "`var'" in `i'
        sum `var'
        sort id
        replace percent = r(mean)*100 in `i'
        local i = `i' + 1
    }
    format percent %9.2f
Example 2
1. DESIGN TABLE FIRST:
Example 2
2. Generate a new variable for each column
    gen str category = ""
    gen percent = .
Example 2
3. Replace cell with contents/number of interest: first column
    sort id
    replace category = "underweight" in 1
    replace category = "normal" in 2
    replace category = "overweight" in 3
    replace category = "obese" in 4
Example 2
3. Replace cell with numbers: second column
    ta bmicat, gen(bminew)
    *4 lines with percentages
    *4 variables with names ending in numbers from 1 to 4 --- LOOP!
    forvalues i = 1/4 {
        sum bminew`i'
        sort id
        replace percent = r(mean)*100 in `i'
    }
    format percent %9.2f
Example 2
4. Outsheet
...same as in example 1
Less writing...
    label list bmicat
    capture drop percent category bminew*
    ta bmicat, gen(bminew)
    gen category = .
    gen percent = .
    forvalues i = 1/4 {
        sum bminew`i'
        sort id
        replace category = `i' in `i'
        replace percent = r(mean)*100 in `i'
    }
    label values category bmicat
    format percent %9.2f
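If the category column should hold the label text itself rather than a labelled number, the extended macro function `: label` can look the text up. A sketch under assumptions: the auto dataset and its labelled variable foreign stand in for CAPS and bmicat, and percentages come from count rather than from the indicator variables used above:

```stata
* auto.dta and its labelled variable foreign stand in for CAPS and bmicat
sysuse auto, clear
gen str category = ""
gen percent = .

* denominator: all non-missing observations
quietly count if !missing(foreign)
local total = r(N)

* one table row per level of the categorical variable
levelsof foreign, local(levels)
local row = 1
foreach lev of local levels {
    * text attached to this value in the variable's value label
    local lbl : label (foreign) `lev'
    quietly count if foreign == `lev'
    replace category = "`lbl'" in `row'
    replace percent = 100 * r(N) / `total' in `row'
    local row = `row' + 1
}
format percent %9.2f
list category percent in 1/2
```

Because levelsof reads the actual codes, this does not rely on the categories being numbered 1, 2, 3, ...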
Example 3
1. THINK FIRST: table after logistic regression
Example 3
2. Generate a new variable for each column
    gen str currsmok = ""
    gen OR = .
    gen uci = .
    gen lci = .
    gen pval = .
Example 3
3. Replace cell with contents/number of interest: first column
    sort id
    replace currsmok = "current smoking" in 1
    replace currsmok = "current smoking + age" in 2
    replace currsmok = "current smoking + age + bmi" in 3
Example 3
3. Replace cell with numbers: remaining columns
    logistic mi cursmoke
    sort id
    replace OR = exp(_b[cursmoke]) in 1
    replace lci = exp(_b[cursmoke] - 1.96*_se[cursmoke]) in 1
    replace uci = exp(_b[cursmoke] + 1.96*_se[cursmoke]) in 1
    est store A
    logistic mi
    est store B
    lrtest A B
    sort id
    replace pval = r(p) in 1
    ... and likewise for lines 2 and 3
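One possible way to fill all three rows without repeating the block by hand is a loop over the adjustment sets. This sketch assumes the CAPS data are in memory with the variables mi, cursmoke and id used above; the covariate names age and bmi are assumptions taken from the row labels, not confirmed variable names:

```stata
* Assumes CAPS is loaded; the covariate names age and bmi are assumptions
capture drop OR lci uci pval
gen OR = .
gen lci = .
gen uci = .
gen pval = .

* adjustment set for each table row (row 1 is unadjusted)
local adj1 ""
local adj2 "age"
local adj3 "age bmi"

sort id
forvalues row = 1/3 {
    * model including the exposure
    quietly logistic mi cursmoke `adj`row''
    replace OR  = exp(_b[cursmoke]) in `row'
    replace lci = exp(_b[cursmoke] - 1.96*_se[cursmoke]) in `row'
    replace uci = exp(_b[cursmoke] + 1.96*_se[cursmoke]) in `row'
    estimates store full
    * same model without the exposure, fitted on the same observations
    quietly logistic mi `adj`row'' if e(sample)
    estimates store reduced
    * likelihood-ratio test for the exposure
    lrtest full reduced
    replace pval = r(p) in `row'
}
format OR lci uci %9.2f
format pval %9.3f
```

Restricting the reduced model to e(sample) keeps both models on identical observations, which lrtest needs when covariates have missing values.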
Example 3
4. Outsheet
...as in example 1
Other way to save results after estimation commands
• Use the statsby command, e.g.:
    statsby "logistic mi diabetes smoking" _b _se, saving (D:\Statistisches\automatic_tables\STATA\data\caerphillystatsby.dta) replace
• statsby will collapse your dataset! Store the results in a new dataset and open the original file again.
• Rerun statsby with the next variables and append the data to the first stored results.
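The call above appears to follow an older form of statsby with the command in quotes. In current Stata releases statsby is written as a prefix. A minimal sketch of that form, using the auto dataset as a stand-in and a hypothetical file name statsby_results.dta:

```stata
* auto.dta is a stand-in for CAPS; the results file name is illustrative
sysuse auto, clear

* with saving(), the results go to a separate file
statsby _b _se, saving("statsby_results.dta", replace): logistic foreign mpg weight

* open the results file: one observation holding coefficients and standard errors
use "statsby_results.dta", clear
describe
list
```

The stored _b values are on the log-odds scale, so exponentiate them to get odds ratios. Results from further models can be added to this file with append.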