CIQLE Workshop: Introduction to longitudinal data analysis with stata panel models and event history analysis Silke Aisenbrey, Yale University. Goals for the workshop: -Intro to stata -Modeling Change over time: Panel Regression Models (fixed, between and random)

Presentation Transcript


  1. CIQLE Workshop: Introduction to longitudinal data analysis with Stata: panel models and event history analysis. Silke Aisenbrey, Yale University

  2. Goals for the workshop:
     - Intro to Stata
     - Modeling change over time: panel regression models (fixed, between and random effects)
     - Modeling whether and/or when events occur: event history analysis (data management for event history data, Kaplan-Meier, Cox, piecewise constant)

  3. open Stata, main windows:
     - VARIABLES: variables of the open file
     - RESULTS: results and syntax
     - REVIEW: review of syntax (commands or menu)
     - COMMAND

  4. open data, with menu (stata data--> eventex.dta)

  5. to see the real data; to make changes directly in the data: erase variables or cases, make single changes in cases -->

  6. basic descriptive commands: relational and logical operators in Stata
     ==   is equal to
     ~=   is not equal to (also !=)
     >    greater than
     <    less than
     >=   greater than or equal to
     <=   less than or equal to
     &    and
     |    or
     ~    not (also !)

  7. basic descriptive commands
     - sum var
     - tab var1 var2
     - tab var1 var2, col
     - combine with: … if var1==2 & var3>0
     - by var1: …
     - sort …
     exercise, e.g.:
       tab abitur sex, col
       tab abitur sex if cohort==1930, col
       sort cohort
       by cohort: tab abitur sex
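
The commands on this slide can be tried without the workshop file. The sketch below assumes Stata's built-in auto dataset (loaded with sysuse) instead of eventex.dta, so the variable names price, mpg, foreign and rep78 are stand-ins, not workshop variables.

```stata
* Assumption: built-in example data, not the workshop file eventex.dta
sysuse auto, clear

* Summary statistics for one variable
sum price

* Two-way table with column percentages
tab foreign rep78, col

* Same table, restricted with relational and logical operators
tab foreign rep78 if price > 5000 & mpg >= 20, col

* Repeat a command within groups: sort first, then use the by prefix
sort foreign
by foreign: sum mpg
```

The by prefix requires the data to be sorted on the grouping variable, which is why the slide's exercise runs sort cohort before by cohort.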

  8. basic commands for data management
     help "command"
     gen var1 = var2
     recode var1 (0=.) (1/8=2) (9=3)
     rename var1 var100
     use the following variables:
       cohort (indicator of cohort membership)
       sex (1=male, 2=female)
       agemaryc (age @ first marriage)
     exercise, e.g.:
       sum agemaryc
       recode age @ marriage into groups:
       - generate a new variable
       - recode the new variable into groups
       - recode if marcens==0
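
One possible way to work through the recode exercise is sketched below. The six typed-in cases, the cut points, the new variable names agegrp and marrgrp, and the reading of marcens==0 as "not married, so the age group is set to missing" are all assumptions made for illustration, not part of the workshop data.

```stata
* Hypothetical toy data with the slide's variable names
clear
input caseid agemaryc marcens
1 19 1
2 23 1
3 27 1
4 31 1
5 35 0
6 22 0
end

* Generate a new variable so the original stays untouched
gen agegrp = agemaryc

* Recode the new variable into three groups (cut points are hypothetical)
recode agegrp (min/21=1) (22/29=2) (30/max=3)

* Recode only the cases with marcens==0 (assumed here to be censored cases)
recode agegrp (1/3=.) if marcens==0

* Give the variable a more descriptive name and check the result
rename agegrp marrgrp
tab marrgrp, missing
```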

  9. possible break

  10. Intro to panel regression with Stata:
      - panel data
      - fixed effects
      - between effects
      - random effects
      - fixed or random?

  11. panel data (panelex1.dta)

  12. Panel data: Panel data, also called cross-sectional time series data, are data where multiple cases (people, firms, countries, etc.) were observed at two or more time periods.
      Cross-sectional data: only information about variance between subjects
      Panel data: two kinds of information, between and within subjects --> two sources of variance

  13. Janet: Basics of panel regression models

  14. cross-sectional vs. panel analyses
      open panelex1.dta
      ignore the fact that we have repeated measures:
      regress income childrn
      conclusion: more children --> higher income

  15. Fixed effects model
      Answers the question: What is the effect of x when x changes within persons over time? E.g., Person A has two children at the first point in time and three children at the second; what effect does this change have on income?
      Information used: fixed effects estimates use the time-series information in the data
      Variance analyzed: within
      Problems: only time-variant variables can be included

  16. Fixed effects exercise: run a separate regression for each unit and then average the results:
      regress income childrn if id==1
      regress income childrn if id==2

  17. average of the two slopes: (b for id 1 + b for id 2) / 2 = -2.5
      conclusion: more children --> lower income
      exercise: generate a dummy variable for each person and regress with the dummy variables
      tab id, g(iddum)
      reg income childrn iddum1 iddum2

  18. Fixed effects
      - define the data set as panel data: tsset id t
      - regression with the fixed effects command: xtreg income childrn, fe
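
The contrast between the pooled and the fixed effects result can be reproduced on a small scale. The sketch below uses six made-up observations, not panelex1.dta; the numbers are chosen so that the two person-specific slopes average to -2.5, as on the slide, while the pooled slope is positive.

```stata
* Hypothetical toy panel: 2 persons, 3 time points each
clear
input id t childrn income
1 1 1 12
1 2 2 10
1 3 3 8
2 1 4 23
2 2 5 20
2 3 6 17
end

* Pooled OLS ignores the repeated measures: slope on childrn is +2 here
regress income childrn

* Separate regressions per person: slopes are -2 (id 1) and -3 (id 2)
regress income childrn if id==1
regress income childrn if id==2

* Dummy-variable regression; Stata drops one dummy because of collinearity
tab id, g(iddum)
regress income childrn iddum1 iddum2

* Fixed effects (within) estimator: slope on childrn is -2.5
tsset id t
xtreg income childrn, fe
```

In this toy panel the simple average of the two slopes equals the fixed effects slope only because both persons have the same within-person spread in childrn; in general xtreg, fe weights units by their within variation.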

  19. Between effects model
      Answers the question: What is the effect of x when x is different (changes) between persons? Person A has "on average" three children and Person B has "on average" five children; what effect does this difference have on their income?
      In the between effects model we model the mean response, where the means are calculated for each of the units.
      Information used: cross-sectional information (between subjects)
      Variance analyzed: between variance
      Time-variant and time-invariant variables

  20. Between effects
      average ---> regress income childrn
      conclusion: more children --> more income
      define data as panel data
      xtreg dependent independent, be
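
The "average, then regress" idea can be checked against xtreg, be directly. The sketch below assumes a made-up balanced panel of three persons with two waves each (not the workshop data); in a balanced panel the coefficient from the regression on person means should match the between estimate.

```stata
* Hypothetical toy panel: 3 persons, 2 waves each
clear
input id t childrn income
1 1 1 9
1 2 3 11
2 1 4 19
2 2 6 21
3 1 7 22
3 2 9 26
end
tsset id t

* Between estimator via the built-in command
xtreg income childrn, be

* Same idea by hand: average within each person, then regress the means
collapse (mean) income childrn, by(id)
regress income childrn
```

Note that collapse replaces the data in memory with one row per person, so it comes last in this sketch.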

  21. Random effects model:
      Assumption: no difference between the answers to the two questions:
      1) What is the effect of x when x changes within the person? Person A has two children at the first point in time and three children at the second; what effect does this change have on their income?
      2) What is the effect of x when x is different (changes) between persons? Person A has two children and Person B has three children; what effect does this difference have on their income?
      Information used: panel and cross-sectional (between and within subjects)
      Variance analyzed: between variance and within variance
      Time-variant and time-invariant variables

  22. Random effects model:
      - matrix-weighted average of the fixed and the between estimates
      - assumes b1 has the same effect in the cross section as in the time series
      - requires that the individual error terms are treated as random variables and follow the normal distribution
      use: xtreg dependent independent if var==x, re
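
How the three estimators relate can be written out as a sketch. The notation below is an assumption, not taken from the slides: a balanced panel with T waves, person effect u_i with variance sigma_u^2, and idiosyncratic error epsilon_it with variance sigma_epsilon^2.

```latex
\begin{align*}
\text{model:}   \quad & y_{it} = \alpha + \beta x_{it} + u_i + \varepsilon_{it} \\
\text{fixed:}   \quad & y_{it} - \bar{y}_i = \beta\,(x_{it} - \bar{x}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i) \\
\text{between:} \quad & \bar{y}_i = \alpha + \beta\,\bar{x}_i + u_i + \bar{\varepsilon}_i \\
\text{random:}  \quad & y_{it} - \theta\bar{y}_i = \alpha(1-\theta) + \beta\,(x_{it} - \theta\bar{x}_i)
                        + \bigl[(1-\theta)u_i + \varepsilon_{it} - \theta\bar{\varepsilon}_i\bigr] \\
                      & \theta = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{T\sigma_u^2 + \sigma_\varepsilon^2}}
\end{align*}
```

With theta = 0 the random effects estimate reduces to pooled OLS, and with theta = 1 it coincides with the fixed effects (within) estimate, which is one way to read the statement that random effects lies between the two.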

  23. possible break

  24. open data: panelex2.dta
      varlist:

  25. tell Stata the structure of the data:
      tsset X Y
      X = caseid
      Y = time/wave
      summary statistics:
      xtdes
      xtsum

  26. use the effects
      xtreg dependent independent if sex==1, fe
      xtreg dependent independent if sex==1, be
      xtreg dependent independent if sex==1, re
      exercise: compare/discuss the models
      e.g.: xtreg indvar1 indvar2 … if sex==1, fe
      try to include time-invariant variables
      try to make a theoretical/empirical argument for why you use which model

  27. Problems/Tests/Solutions: What's the right model: fixed or random effects?
      Test: Hausman test
      Null hypothesis: the coefficients estimated by the efficient random effects estimator are the same as those estimated by the consistent fixed effects estimator.
      If they are the same (insignificant p-value, Prob>chi2 larger than .05) --> safe to use random effects.
      If the p-value is significant --> use fixed effects.
      xtreg y x1 x2 x3 ... , fe
      estimates store fixed
      xtreg y x1 x2 x3 ... , re
      estimates store random
      hausman fixed random
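
A concrete run of this sequence might look like the sketch below. It assumes Stata's example panel nlswork, fetched with webuse (which needs internet access), rather than the workshop data; the choice of ln_wage, age, tenure and ttl_exp is purely illustrative.

```stata
* Assumption: Stata's example dataset nlswork, not panelex2.dta
webuse nlswork, clear
tsset idcode year

* Consistent estimator: fixed effects
xtreg ln_wage age tenure ttl_exp, fe
estimates store fixed

* Efficient estimator under the null: random effects
xtreg ln_wage age tenure ttl_exp, re
estimates store random

* Hausman test: list the consistent estimator first
hausman fixed random
```

The order of the two stored estimates matters: hausman expects the always-consistent model first and the efficient model second.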

  28. Problems/Tests/Solutions: Autocorrelation?
      What is autocorrelation: last time period's values affect current values
      Test: xtserial
      Install the user-written program: type findit xtserial or net search xtserial
      xtserial depvar indepvars

  29. A significant test statistic indicates the presence of serial correlation.
      Solution: use a model correcting for autocorrelation: xtregar instead of xtreg

  30. possible break

  31. different data structure
      panel data (waves):
      - number of children @ wave 1 / 2 / 3 / 4
      - employed @ wave 1 / 2 / 3 / 4
      - income @ wave 1 / 2 / 3 / 4
      regression models: dependent variable continuous
      event data (dates of events):
      - birth of first child @ 1963
      - birth of second child @ 1966 …
      - start of first employment @ …
      - start of unemployment @ …
      - start of second employment @ …
      time information in event data is more precise; dependent variable: event happens 0/1

  32. Different Faces of Event History Data

  33. Types of censoring
      - Subject does not experience the event of interest
      - Incomplete follow-up
        - Lost to follow-up
        - Withdraws from study
      - Left or right censored

  34. open data eventex.dta

  35. tell Stata that our data is "survival data": stset
      stset X, failure(Y) id(Z)
      X = time at which the event happens or the case is right censored; this is always needed
      Y = 0 or missing means censored; all other values are interpreted as an event taking place (failure)
      Z = id
      three examples:
      - stset ageendsch
        event: end of school; time: age @ end of school
      - stset agemaryc, failure(marcens) id(caseid)
        event: marriage
      - stset agestjob, failure(stjob) id(caseid)
        event: first job
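
What stset does can be inspected on a small scale. The sketch below assumes ten made-up cases that reuse the slide's variable names (caseid, sex, agemaryc, marcens); the values are hypothetical and not taken from eventex.dta.

```stata
* Hypothetical toy event data: one record per person
clear
input caseid sex agemaryc marcens
1  1 24 1
2  1 27 1
3  1 30 0
4  1 22 1
5  1 35 0
6  2 21 1
7  2 23 1
8  2 26 1
9  2 29 0
10 2 20 1
end

* Declare survival data: time = age at marriage, marcens==0 means censored
stset agemaryc, failure(marcens) id(caseid)

* stset adds _t0 (entry), _t (exit), _d (event indicator), _st (record used)
list caseid agemaryc marcens _t0 _t _d _st

* Summary of time at risk and number of events
stsum
```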

  36. DATA MANAGEMENT HANNAH

  37. Different Models of Event History

  38. survivor function and hazard function
      Survivor function, S(t): defines the probability of surviving longer than time t
      Hazard (instantaneous hazard, force of mortality): the risk that an event will occur during a time interval (Δ(t)) at time t, given that the subject did not experience the event before that time
      Survivor and hazard functions can be converted into each other
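
The conversion between the two functions can be made explicit. The sketch below assumes a continuous event time T; the cumulative hazard H(t) is extra notation introduced here for the conversion.

```latex
\begin{align*}
S(t) &= \Pr(T > t) \\
h(t) &= \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} \\
H(t) &= \int_0^t h(u)\,du \\
S(t) &= \exp\{-H(t)\}, \qquad h(t) = -\frac{d}{dt}\ln S(t)
\end{align*}
```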

  39. non-parametric: Kaplan-Meier
      List the Kaplan-Meier survivor function
      . sts list
      . sts list, by(sex) compare
      Graph the Kaplan-Meier survivor function
      . sts graph
      . sts graph, by(sex)

  40. non-parametric: Kaplan-Meier
      exercise: stset your data for marriage, end of school or first job
      e.g.:
      1) sts list
      2) sts graph
      3) sts list, by(…) compare
      4) sts graph, by(…)

  41. non-parametric: Nelson-Aalen
      List the Nelson-Aalen cumulative hazard function
      . sts list, na
      . sts list, na by(sex) compare
      Graph the Nelson-Aalen cumulative hazard function
      . sts graph, na
      . sts graph, na by(sex)

  42. non-parametric: Nelson-Aalen
      exercise: stset your data for marriage, end of school or first job
      1) sts list, na
      2) sts graph, na
      3) sts list, na by(…) compare
      4) sts graph, na by(…)
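
For reference, the two non-parametric estimators behind these listings can be written as follows. The notation is an assumption introduced here: t_j are the distinct observed event times, d_j is the number of events at t_j, and n_j is the number of subjects still at risk just before t_j.

```latex
\begin{align*}
\text{Kaplan-Meier:} \quad & \hat{S}(t) = \prod_{j:\,t_j \le t} \left(1 - \frac{d_j}{n_j}\right) \\
\text{Nelson-Aalen:} \quad & \hat{H}(t) = \sum_{j:\,t_j \le t} \frac{d_j}{n_j}
\end{align*}
```

Censored cases contribute no event but leave the risk set, so they lower n_j for all later event times.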

  43. non-parametric: Kaplan-Meier
      Comparing Kaplan-Meier curves
      - The log-rank test can be used to compare survival curves
      - Hypothesis test (test of significance)
        H0: the curves are statistically the same
        H1: the curves are statistically different
      - Compares observed to expected cell counts
      for age@marr:

  44. non-parametric: Kaplan-Meier
      Comparing Kaplan-Meier curves
      exercise: test the equality of survivor functions
      e.g.: sts test abitur
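
A complete pass from stset to the log-rank test might look like the sketch below. It assumes ten made-up cases with the slide's variable names and compares groups by sex rather than abitur; none of the values come from eventex.dta.

```stata
* Hypothetical toy event data: one record per person
clear
input caseid sex agemaryc marcens
1  1 24 1
2  1 27 1
3  1 30 0
4  1 22 1
5  1 35 0
6  2 21 1
7  2 23 1
8  2 26 1
9  2 29 0
10 2 20 1
end

stset agemaryc, failure(marcens) id(caseid)

* Kaplan-Meier survivor function, listed and graphed by group
sts list, by(sex) compare
sts graph, by(sex)

* Log-rank test of equality of survivor functions (the default of sts test)
sts test sex
```

With only ten cases the test has very little power; the point of the sketch is the command sequence, not the result.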
