Simple Logit; Simple Logit with Corrected SE’s

Longitudinal and Multilevel Methods for Models with Discrete Outcomes with Parametric and Non-Parametric Corrections for Unobserved Heterogeneity David K. Guilkey.


Presentation Transcript


  1. Longitudinal and Multilevel Methods for Models with Discrete Outcomes with Parametric and Non-Parametric Corrections for Unobserved Heterogeneity
     David K. Guilkey

  2. Focus of this talk:
     Binary dependent variables
     Unordered categorical dependent variables
     Models will be logit based. Probit, Poisson, and negative binomial models will not be discussed, although Stata has methods for these estimators as well.
     Empirical example uses data from the Indonesian Family Life Survey.
     Two outcomes:
     Binary indicator for whether the respondent uses contraception
     Unordered categorical variable for method choice

  3. Data Set Overview
     Four waves of data: 1993, 1997, 2000, and 2007
     Individual level information on fertility, education, migration
     Community and facility level data on health and family planning providers
     Data from 321 enumeration areas, which we will treat as communities

  4. Basic Model for Longitudinal Logit:
     Where:
     Yti: observed binary variable (respondent i in time period t)
     Xti: time varying explanatory variables (age and education level)
     Pti: time varying program variable (posyandus)
     Zi: time invariant regressors (Muslim)
     i = 1, 2, …, N (individuals)
     t = 1, 2, …, Ti (observations per individual; unbalanced panel)
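The model equation on this slide did not survive in the transcript. A form consistent with the variable definitions above would be the latent-variable logit below. The coefficient symbols β, δ, γ and the latent index Y* are assumptions introduced here for illustration; μ and ε follow the notation that appears on later slides.

```latex
Y^{*}_{ti} = X_{ti}\beta + P_{ti}\delta + Z_{i}\gamma + \mu_{i} + \varepsilon_{ti},
\qquad
Y_{ti} =
\begin{cases}
1 & \text{if } Y^{*}_{ti} > 0 \\
0 & \text{otherwise}
\end{cases}
```

Here μ_i is the time invariant individual error (the unobserved heterogeneity) and ε_ti is the period-specific error.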

  5. Assumptions:
     for the parametric logit in Stata (xtlogit, melogit, and one variant of GLLAMM)
     and:
     Note that observations for the same individual will be correlated because of the time invariant error, sometimes referred to as unobserved heterogeneity.
     Given the assumptions, estimation options are:
     1. Simple logit yields consistent point estimates but incorrect SE’s
     2. Simple logit with cluster option corrects SE’s
     3. Parametric or semi-parametric maximum likelihood
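Options 1 and 2 can be illustrated with a small simulation. The sketch below is not the talk's Stata code: it fits a pooled logit by Newton-Raphson in NumPy and compares the usual inverse-Hessian standard errors with a cluster-robust sandwich that sums scores within each individual. The simulated data, the parameter values, the function names, and the G/(G-1) small-sample factor are all assumptions made for illustration.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Pooled ("simple") logit fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        hessian = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    return beta

def naive_and_clustered_se(X, y, cluster):
    """Return coefficients, inverse-Hessian SEs, and cluster-robust SEs."""
    beta = fit_logit(X, y)
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    bread = np.linalg.inv((X * (p * (1.0 - p))[:, None]).T @ X)
    naive = np.sqrt(np.diag(bread))
    # Sum each cluster's score contributions before taking the outer product.
    ids, idx = np.unique(cluster, return_inverse=True)
    scores = np.zeros((ids.size, X.shape[1]))
    np.add.at(scores, idx, X * (y - p)[:, None])
    g = ids.size
    meat = scores.T @ scores * g / (g - 1.0)  # assumed small-sample factor
    clustered = np.sqrt(np.diag(bread @ meat @ bread))
    return beta, naive, clustered

# Hypothetical panel: 500 individuals, 4 waves, strong time invariant heterogeneity.
rng = np.random.default_rng(0)
n_people, n_waves = 500, 4
person = np.repeat(np.arange(n_people), n_waves)
z = np.repeat(rng.integers(0, 2, n_people), n_waves).astype(float)  # time invariant
x = rng.normal(size=n_people * n_waves)                             # time varying
mu = np.repeat(rng.normal(scale=2.0, size=n_people), n_waves)       # heterogeneity
index = -0.5 + 0.8 * x + 0.5 * z + mu
y = (rng.random(n_people * n_waves) < 1.0 / (1.0 + np.exp(-index))).astype(float)
X = np.column_stack([np.ones_like(x), x, z])

beta, naive, clustered = naive_and_clustered_se(X, y, person)
print("naive SEs:    ", naive)
print("clustered SEs:", clustered)
```

The clustered standard error for the time invariant regressor should come out larger than the naive one, because the naive formula treats the four observations per person as independent. The pooled coefficients are on a population-averaged scale, so they need not match the values used to simulate the data.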

  6. The likelihood function for this model is derived as follows:
     This is the probability that individual i at time t is using contraception, conditional on time invariant heterogeneity.
     For individual i, we observe Ti binary responses that we can write as:
     Yi = (1,0,0,1) for a woman who is observed for 4 time periods and used contraception at times 1 and 4.
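The conditional probability that this slide refers to is missing from the transcript. Under a logit specification it would plausibly take the form below, where π denotes the probability (to avoid a clash with the program variable P) and the coefficient symbols β, δ, γ are assumed names rather than the author's.

```latex
\pi_{ti}(\mu_i) \equiv \Pr(Y_{ti}=1 \mid \mu_i)
  = \frac{\exp(X_{ti}\beta + P_{ti}\delta + Z_{i}\gamma + \mu_i)}
         {1+\exp(X_{ti}\beta + P_{ti}\delta + Z_{i}\gamma + \mu_i)}

\Pr\bigl(Y_i = (1,0,0,1) \mid \mu_i\bigr)
  = \pi_{1i}(\mu_i)\,\bigl[1-\pi_{2i}(\mu_i)\bigr]\,\bigl[1-\pi_{3i}(\mu_i)\bigr]\,\pi_{4i}(\mu_i)
```

The product form holds because, once μ_i is held fixed, the period-specific errors are independent across time.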

  7. Let Yi be the set of observed outcomes for individual i, then:
     The joint probability must be approximated, which amounts to approximating the area under a curve. With the assumption of normality, the approximation method is Gaussian quadrature, or Hermite integration.
     Points:
     1. More accurate with more Hermite points, but execution time is longer.
     2. You need more points as Ti gets larger.
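The joint probability itself is not preserved in the transcript. A version consistent with the text, assuming the heterogeneity term μ is normal with mean zero and standard deviation σ_μ, and writing π_ti(μ) for the logit probability that Y_ti = 1 given μ (both symbols are assumptions), would be:

```latex
\Pr(Y_i) = \int_{-\infty}^{\infty}
  \Biggl[\,\prod_{t=1}^{T_i} \pi_{ti}(\mu)^{Y_{ti}}
  \bigl[1-\pi_{ti}(\mu)\bigr]^{1-Y_{ti}} \Biggr]
  \frac{1}{\sigma_\mu}\,\phi\!\left(\frac{\mu}{\sigma_\mu}\right) d\mu
```

The integral has no closed form, which is why it has to be approximated numerically.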

  8. Hermite integration replaces the integral with a sum:
     where the weights (wm’s) and the mass points (μm’s) are known because of the assumption of normality.
     Alternative:
     The discrete factor approximation searches over weights and mass points along with the other parameters of the model.
     Must impose a normalization:
     1. Weights sum to one
     2. Either set one mass point to zero (Fortran program) or set the mean of the distribution to zero (GLLAMM)
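Both approximations have the same shape, Pr(Yi) ≈ Σm wm · Pr(Yi | μm), and differ only in where the weights and mass points come from. The sketch below illustrates this with NumPy's Gauss-Hermite nodes. The function names, the linear-index values, and the value of σ are hypothetical; the change of variables μ = √2·σ·x with weights divided by √π is the standard way to map Gauss-Hermite nodes onto a N(0, σ²) density.

```python
import numpy as np

def sequence_prob_given_mu(y, index, mu):
    """Pr(Y_i | mu): product of logit probabilities over the T_i periods."""
    p = 1.0 / (1.0 + np.exp(-(index + mu)))
    return float(np.prod(np.where(y == 1, p, 1.0 - p)))

def hermite_likelihood(y, index, sigma, n_points):
    """Integrate over mu ~ N(0, sigma^2) with Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    mass_points = np.sqrt(2.0) * sigma * nodes   # fixed by the normality assumption
    probs = weights / np.sqrt(np.pi)             # these sum to one
    return sum(w * sequence_prob_given_mu(y, index, m)
               for w, m in zip(probs, mass_points))

def discrete_factor_likelihood(y, index, mass_points, weights):
    """Same sum, but mass points and weights are free parameters.

    In estimation they would be searched over with the other parameters;
    here they are simply supplied by the caller.
    """
    return sum(w * sequence_prob_given_mu(y, index, m)
               for w, m in zip(weights, mass_points))

# The example from the talk: used contraception at times 1 and 4 only.
y = np.array([1, 0, 0, 1])
index = np.array([0.2, -0.1, 0.0, 0.3])   # hypothetical linear-index values

for n_points in (4, 8, 16, 32):
    print(n_points, hermite_likelihood(y, index, sigma=1.0, n_points=n_points))

# A two-point discrete factor distribution with mean zero and weights summing to one.
print(discrete_factor_likelihood(y, index, mass_points=[-1.0, 1.0], weights=[0.5, 0.5]))
```

Printing the likelihood for an increasing number of points shows how quickly the approximation settles, which is the accuracy versus execution time trade-off noted on the previous slide.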

  9. Simple Logit
     Simple Logit with Corrected SE’s

  10. Parametric Maximum Likelihood

  11. Semi-Parametric Maximum Likelihood

  12. Multilevel Panel Models
      Basic form of the model:
      where
      j = 1, 2, …, J (communities)
      i = 1, 2, …, Nj (individuals from community j)
      t = 1, 2, …, Tij (observations for person i from community j)

  13. Xtij: individual level variables (some could be fixed through time)
      Ptij: time varying program variable
      Zj: time invariant community level variables
      μij: time invariant individual level unobserved heterogeneity
      λj: time invariant community level unobserved heterogeneity
      This model allows observations on the same individual to be correlated and observations from the same community to be correlated.
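The multilevel model equation is missing from the transcript. Given the definitions above, it would plausibly take the following form; the coefficient symbols β, δ, γ, the latent index Y*, and the period-specific error ε are assumptions introduced for illustration.

```latex
Y^{*}_{tij} = X_{tij}\beta + P_{tij}\delta + Z_{j}\gamma
            + \mu_{ij} + \lambda_{j} + \varepsilon_{tij},
\qquad
Y_{tij} = 1 \ \text{if } Y^{*}_{tij} > 0, \quad Y_{tij} = 0 \ \text{otherwise}
```

Two observations on the same person share both μ_ij and λ_j, while two different people in the same community share only λ_j.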

  14. Assumptions:
      1. Simple logit yields consistent point estimates but incorrect SE’s
      2. Simple logit with cluster option corrects SE’s (at community level)
      3. Parametric or semi-parametric maximum likelihood
      The maximum likelihood estimator is a straightforward extension of the longitudinal data model:

  15. You need the unconditional joint probability of the observed set of outcomes for the set of individuals in each community:
      Conditional on the unobservables at the community level, the probability of the set of observed outcomes for person i from community j is:
      The unconditional joint probability of the set of observed outcomes for all individuals in community j is then:
      We then use either Hermite integration or the discrete factor method to approximate the integral.
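The two probabilities described on this slide are not preserved in the transcript. A sketch of the nested structure is given below, writing π_tij(μ, λ) for the logit probability that Y_tij = 1 given both heterogeneity terms, and f and g for the densities of the individual and community level heterogeneity. All of these symbols are assumptions.

```latex
\Pr(Y_{ij} \mid \lambda_j) = \int
  \Biggl[\,\prod_{t=1}^{T_{ij}} \pi_{tij}(\mu,\lambda_j)^{Y_{tij}}
  \bigl[1-\pi_{tij}(\mu,\lambda_j)\bigr]^{1-Y_{tij}} \Biggr] f(\mu)\, d\mu

\Pr(Y_j) = \int
  \Biggl[\,\prod_{i=1}^{N_j} \Pr(Y_{ij} \mid \lambda) \Biggr] g(\lambda)\, d\lambda
```

With Hermite integration each integral becomes a weighted sum, so the community likelihood is a sum over community mass points of products of sums over individual mass points.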

  16. Simple Logit
      Simple Logit with Corrected SE’s

  17. Parametric Maximum Likelihood

  18. Non-parametric Maximum Likelihood

  19. Testing for Program Targeting
      Programs may target high need areas or areas where they feel residents would be receptive to family planning.
      For example: family planning programs may concentrate on high fertility areas.
      The result is that simple methods may understate or overstate program impact.
      Statistical implication of program targeting:
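The statement of the statistical implication is missing from the transcript. In a model with a program variable P_tij and community level unobserved heterogeneity λ_j, targeting would most naturally be expressed as a correlation between the two, which is an assumption about what the slide showed:

```latex
\operatorname{Cov}(P_{tij}, \lambda_j) \neq 0
```

When this holds, the program variable is endogenous, and estimators that treat λ_j as independent of the regressors give biased estimates of program impact.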

  20. Solutions:
      Explicitly model program placement and estimate placement simultaneously with program impact equations (Angeles, Guilkey, and Mroz, 1998).
      Treat as fixed effects and include dummies for communities, or use some other fixed effects method (Gertler and Molyneaux, 1994).
      Angeles, Guilkey, and Mroz show that the joint modeling approach yields smaller standard errors in Tanzania, but the two methods gave similar results.

  21. Example (fixed effects) plus Hausman test for endogenous placement:
      Efficient estimator under the null of no endogeneity (random effects):

  22. Consistent estimator under the alternative (fixed effects):

  23. Hausman test results:
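The Hausman statistic compares the estimator that is consistent under both hypotheses (fixed effects) with the one that is efficient under the null (random effects): H = d′(V_FE − V_RE)⁻¹d, where d is the difference in coefficients, and H is chi-squared with as many degrees of freedom as coefficients compared. The sketch below computes it. The function name is hypothetical and the numbers are made up for illustration; they are not the results from the talk.

```python
import numpy as np

def hausman(b_consistent, b_efficient, v_consistent, v_efficient):
    """Return the Hausman statistic and its degrees of freedom."""
    diff = np.asarray(b_consistent, float) - np.asarray(b_efficient, float)
    v_diff = np.asarray(v_consistent, float) - np.asarray(v_efficient, float)
    stat = float(diff @ np.linalg.solve(v_diff, diff))
    return stat, diff.size

# Made-up values for a single program coefficient (not from the talk).
b_fe, v_fe = [0.30], [[0.02]]   # fixed effects: consistent under the alternative
b_re, v_re = [0.10], [[0.01]]   # random effects: efficient under the null

stat, df = hausman(b_fe, b_re, v_fe, v_re)
print(f"H = {stat:.3f} with {df} degree(s) of freedom")
```

With one degree of freedom the 5% critical value is 3.84, so a statistic above that would reject the null of no endogenous placement. In finite samples the variance difference is not guaranteed to be positive definite, in which case the statistic is not reliable.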

  24. State Dependence and Unobserved Heterogeneity
      Consider the simple model:
      Note:
      Implies:
      Unless (no time invariant unobserved heterogeneity)
      Now consider:
      Now:
      Very difficult to distinguish between the two models
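The equations on this slide are missing from the transcript. The argument appears to be the standard one contrasting spurious and true state dependence, which can be sketched as follows. The symbols β, α, and σ_μ are assumptions, and the exact expressions on the original slide may differ.

```latex
% Model 1: heterogeneity only, no lagged outcome
Y^{*}_{ti} = X_{ti}\beta + \mu_i + \varepsilon_{ti}
\;\Rightarrow\;
\operatorname{Cov}(\mu_i + \varepsilon_{ti},\; \mu_i + \varepsilon_{t-1,i}) = \sigma^2_\mu

% so past use predicts current use unless \sigma^2_\mu = 0:
\Pr(Y_{ti}=1 \mid Y_{t-1,i}=1) \neq \Pr(Y_{ti}=1 \mid Y_{t-1,i}=0)

% Model 2: true state dependence, no heterogeneity
Y^{*}_{ti} = X_{ti}\beta + \alpha Y_{t-1,i} + \varepsilon_{ti}
\;\Rightarrow\;
\Pr(Y_{ti}=1 \mid Y_{t-1,i}=1) \neq \Pr(Y_{ti}=1 \mid Y_{t-1,i}=0)
\ \text{whenever } \alpha \neq 0
```

Both models predict that past use is associated with current use, which is why the observed data alone make them hard to tell apart.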

  25. The same problem would exist if the unobserved heterogeneity were at the community level.
      Solution is to estimate a comprehensive model:
      Initial conditions problem:
      Must either be able to set or jointly estimate the equation of interest with an equation of the form:
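Neither equation on this slide is preserved in the transcript. One common way to write a comprehensive model with an initial condition equation is sketched below. The symbols α, β, θ, and κ are assumptions, and the specification on the original slide may differ.

```latex
% Equation of interest, t = 2, \dots, T_i: lagged outcome plus heterogeneity
Y^{*}_{ti} = X_{ti}\beta + \alpha Y_{t-1,i} + \mu_i + \varepsilon_{ti}

% Initial condition, t = 1: no lag is available, but the same heterogeneity enters
Y^{*}_{1i} = X_{1i}\theta + \kappa\,\mu_i + \varepsilon_{1i}
```

Because μ_i appears in both equations, the first observed outcome cannot be treated as exogenous, so the two equations are estimated jointly.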

  26. Often it is reasonable to set the initial value:
      Observations start at the beginning of the woman’s childbearing years.
      In this example it is not, since women enter the year one data set at different ages.
      Joint estimation is basically a simultaneous equations problem subject to standard identification issues.
      However, time varying exogenous variables provide identification (age and education in this case).
      Example follows:

  27. Estimation with no controls for unobserved heterogeneity and initial conditions:

  28. Estimation with Controls:

  29. Estimation with Controls (continued)

  30. Basic Model: Longitudinal Multinomial Logit with 3 Choices:
      Individual i at time t makes choice 3 (for example) if:
      If we assume that the ε’s follow independent extreme value distributions and impose the restriction that:

  31. So that the probabilities sum to one then:
      for k=2,3.
      The discrete factor model allows a more general pattern of correlation:
      for m=1,2,…,M and a common set of weights: allows for correlation in the μ’s
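The multinomial logit probabilities and their discrete factor mixture can be sketched in a few lines. The code below assumes choice 1 is the base category with its index and heterogeneity term normalized to zero, so that Pr(choice k) = exp(v_k) / (1 + Σ exp(v_j)) for k = 2, 3. Each mass point is a pair (μ2m, μ3m) sharing one weight wm, which is what lets the two heterogeneity terms be correlated. The function names and the numbers are hypothetical.

```python
import math

def mnl_probabilities(indices):
    """Choice probabilities for choices 1..K, with choice 1 as the base.

    `indices` holds the linear index for choices 2..K; the base index is zero.
    """
    exps = [1.0] + [math.exp(v) for v in indices]
    total = sum(exps)
    return [e / total for e in exps]

def discrete_factor_mnl_likelihood(choices, indices_by_period, mass_points, weights):
    """Mixture likelihood for one individual's sequence of choices.

    mass_points[m] is a tuple (mu_2m, mu_3m, ...) added to the indices;
    weights[m] is the common weight attached to that tuple.
    """
    total = 0.0
    for w, mu in zip(weights, mass_points):
        prob = 1.0
        for choice, idx in zip(choices, indices_by_period):
            p = mnl_probabilities([v + m for v, m in zip(idx, mu)])
            prob *= p[choice - 1]          # choices are coded 1..K
        total += w * prob
    return total

# Hypothetical individual observed for three periods choosing 3, 1, 3.
choices = [3, 1, 3]
indices_by_period = [[0.1, 0.4], [0.0, 0.2], [-0.2, 0.5]]
mass_points = [(-0.5, -1.0), (0.5, 1.0)]   # pairs move together: positive correlation
weights = [0.5, 0.5]

print(mnl_probabilities(indices_by_period[0]))
print(discrete_factor_mnl_likelihood(choices, indices_by_period, mass_points, weights))
```

In estimation the mass points and weights would be free parameters subject to a normalization; here they are fixed only to show how the likelihood is assembled.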

  32. Unfortunately, GLLAMM estimates a needlessly restrictive version of the model:
      Parametric:
      If there are more than 3 choices, all ρ’s are restricted
      Non-parametric:
      for all m.

  33. Extension to Multilevel Panel Model:
      Parametric:
      Semi-parametric:

  34. The empirical example estimates a model with four choices:
      1 = Non-use
      2 = Temporary methods (pill, condom, injection)
      3 = Long lasting methods (IUD, sterilization)
      4 = Traditional methods
      We show the complete results for the most general model and then report partial results for other models:

  35. Comparison of Posyandu effects across estimation methods:
