This guide provides best practices for troubleshooting log files in Stata, emphasizing saving logs in text format for better readability. It details handling missing values effectively, illustrating how Stata excludes observations marked with a period (.) from statistical analysis by default. The guide advises on safely recoding responses to missing values, ensuring observations aren't dropped inadvertently. It also introduces useful Stata commands and loops, enhancing data processing efficiency. Gain insights into effective workflows for managing missing data and optimizing log file usage.
Problem Set 1 Troubleshooting
Log Files
Save logs in text format for readability:
    log using ps1.log, replace
or:
    log using ps1, text
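A typical do-file wraps the analysis between opening and closing the log. The sketch below is one possible pattern, not a required one: the name ps1 comes from the slide, while `sysuse auto` is only a stand-in dataset (it ships with Stata) used here so the example runs on its own.

```stata
* Close any log left open by an earlier run;
* capture suppresses the error if no log is open
capture log close

* Open a plain-text log (ps1.log), overwriting any previous version
log using ps1, text replace

* ... analysis commands go here ...
sysuse auto, clear      // stand-in dataset, an assumption for this sketch
summarize price

* Close the log so the file is written out completely
log close
```

Combining `text` and `replace` means the do-file can be rerun without Stata stopping because ps1.log already exists.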
Handling Missing Values
• By default, Stata excludes an observation from a statistical command when a variable used in that command is missing, shown as a period (.).
• Best practices:
  • Recode appropriate survey responses to missing. Safest:
        replace v1 = . if v1 == 6
  • Do not drop observations with missing values.
  • Be careful not to recode missing values by accident.
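The recode-then-analyze step can be tried on a small made-up dataset. The values below are invented for illustration, and treating 6 as a nonsubstantive answer (such as "don't know") is an assumption; the slide only says to recode v1 == 6.

```stata
clear
input v1
2
4
6
5
6
end

* Recode the nonsubstantive code (assumed to be 6) to missing
replace v1 = . if v1 == 6

* summarize uses only the nonmissing observations of v1
summarize v1
display "Observations used by summarize: " r(N)

* The observations are still in the dataset, so other variables
* on those rows would remain available for other analyses
display "Observations in the dataset: " _N
```

Because nothing was dropped, the number of observations used by `summarize` is smaller than `_N`, which is the behavior the bullets above describe.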
Handling Missing Values
Problematic (observations with missing v1 are coded 0):
    gen dummy1 = 0
    replace dummy1 = 1 if v1 == 4 | v1 == 5
Safe:
    gen dummy1 = .
    replace dummy1 = 1 if v1 == 4 | v1 == 5
    replace dummy1 = 0 if v1 <= 3
Also safe:
    gen dummy1 = 1 if v1 == 4 | v1 == 5
    replace dummy1 = 0 if v1 <= 3
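To see the difference between the two approaches, both can be run side by side on a toy variable. The data and the names dummy_bad and dummy_safe are hypothetical, chosen only so the two versions can coexist in one dataset.

```stata
clear
input v1
2
4
5
.
end

* Problematic: every observation starts at 0, including the missing one
gen dummy_bad = 0
replace dummy_bad = 1 if v1 == 4 | v1 == 5

* Safe: values are assigned only where v1 is observed;
* the row with missing v1 stays missing
gen dummy_safe = 1 if v1 == 4 | v1 == 5
replace dummy_safe = 0 if v1 <= 3

* The missing option makes tabulate show missing values as a category
tabulate dummy_bad, missing
tabulate dummy_safe, missing
list
```

The `missing` option on `tabulate` is a quick check after building any dummy: if the source variable has missing values, the new variable should show them too.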
Handling Missing Values
Stata treats missing (.) as larger than any numeric value, so this will also recode observations where v1 is missing:
    replace dummy1 = 1 if v1 > 6
Safe:
    replace dummy1 = 1 if v1 > 6 & v1 != .
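The comparison rule can be checked directly with `count`. The sketch below uses invented values and also includes an extended missing value (.a), which the slide does not mention: since `.a` is not equal to `.`, the test `v1 != .` lets it through, whereas `!missing(v1)` excludes every kind of missing value.

```stata
clear
set obs 4
gen v1 = 3
replace v1 = 7  in 2
replace v1 = .  in 3
replace v1 = .a in 4

* Missing values sort above all numbers, so both satisfy v1 > 6
count if v1 > 6                  // counts 7, . and .a

* Excludes the system missing value but not .a
count if v1 > 6 & v1 != .

* Excludes all missing values
count if v1 > 6 & !missing(v1)
```

If the data contain only the plain period, `v1 != .` and `!missing(v1)` behave identically; the latter is simply the more defensive habit.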
Optional: PS2 Time Saver
Stata supports loops:
    foreach x of numlist 1800 1850 1870 1900 1920 1950 1970 1975 {
        sum spending if year == `x', detail
    }
    foreach x of varlist gdpcap pop taxhead race* {
        gen log_`x' = log(`x')
        sum `x', detail
        hist `x', name(`x')
    }
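The same two loop forms can be tried without the problem-set data. This sketch assumes the auto dataset that ships with Stata; the variables price, mpg, weight and rep78 belong to that dataset, not to the problem set.

```stata
sysuse auto, clear

* Loop over a list of numbers: `y' takes each value in turn
foreach y of numlist 3 4 5 {
    summarize price if rep78 == `y', detail
}

* Loop over variables: `x' is replaced by each variable name,
* so log_`x' becomes log_price, log_mpg, log_weight
foreach x of varlist price mpg weight {
    gen log_`x' = log(`x')
    summarize log_`x'
}
```

Note the quoting: a local macro is opened with a backtick and closed with a straight apostrophe. When a loop that draws named graphs is rerun in the same session, adding `replace` inside `name()`, as in `name(`x', replace)`, avoids an error about the graph already existing.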