
Models for Count Outcomes


pete

Presentation Transcript


  1. Models for Count Outcomes

  2. Models for Count Outcomes • Count variables indicate how many times something has happened. • The number of bills vetoed by a U.S. president • The number of papers published by a professor • The number of coups d'état in African countries

  3. Estimates from the linear regression model applied to count outcomes can be inefficient, inconsistent, and biased • Functional form • Nonsensical predictions (e.g., negative predicted counts)

  4. A frequently adopted remedy for the linear regression model is to take the natural logarithm of the dependent variable, which yields a log-linear function • Because zero is one of the observed values, a constant c is often added to the dependent variable Yi, i.e., ln(Yi + c)
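
The dependence on the added constant can be seen with a minimal Python sketch (standard library only). The counts and the two values of c below are made up for illustration and are not taken from couart2; `log_transform` is a hypothetical helper name.

```python
import math

def log_transform(counts, c):
    """Return ln(y + c) for each count y; c > 0 keeps ln defined at y = 0."""
    return [math.log(y + c) for y in counts]

counts = [0, 1, 2, 5]                 # hypothetical article counts
half = log_transform(counts, 0.5)     # c = 0.5
one = log_transform(counts, 1.0)      # c = 1

# The transformed value of a zero count depends entirely on the arbitrary c.
print(half[0], one[0])
```

Because the transformed zeros move with c, any regression fitted to ln(Yi + c) inherits that arbitrary choice.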

  5. Example: article counts (file name: couart2). The data record the number of publications produced by Ph.D. biochemists.

  6. Count Models • Poisson Regression Model (PRM) • Negative Binomial Regression Models

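  7. Poisson Distribution • If the dependent variable y counts how many times an event of interest occurs within a given period, so that its values are non-negative integers (0 included) with no theoretical upper bound, a variable of this type follows the Poisson distribution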

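  8. A key feature of the Poisson distribution is that its expected value is $\mu$ and its variance is also $\mu$ • The link function for the Poisson distribution is the log function (log link)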

  9. The variance of the Poisson distribution depends on the size of the mean; this property is often called equidispersion (the variance equals the expected value)
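
Equidispersion can be checked numerically. The Python sketch below assumes the standard Poisson probability function Pr(y) = exp(-mu) mu^y / y!, with a hypothetical mean of 1.7, and computes the mean and variance directly from the probabilities. The support is truncated at 60, which is an approximation that is harmless for a mean this small.

```python
import math

def poisson_pmf(y, mu):
    """Pr(Y = y) for a Poisson variable with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

mu = 1.7                      # hypothetical mean count
support = range(0, 60)        # truncated support; the tail beyond is negligible here
probs = [poisson_pmf(y, mu) for y in support]

mean = sum(y * p for y, p in zip(support, probs))
variance = sum((y - mean) ** 2 * p for y, p in zip(support, probs))
print(round(mean, 6), round(variance, 6))
```

Both quantities agree with mu up to rounding, which is the equidispersion property in numbers.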

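  10. Poisson Regression Model (PRM): set the systematic component of the GLM to a linear combination of the independent variables, $x_i\beta$, and substitute it into the link function: $\ln \mu_i = x_i\beta$, i.e., $\mu_i = \exp(x_i\beta)$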
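
As a rough illustration of how such a model can be estimated, the Python sketch below fits a log-link Poisson regression with an intercept and one predictor by Newton-Raphson. The data are hypothetical, `fit_poisson` is an illustrative name, and Newton-Raphson is chosen here for simplicity; nothing in the slides says this is how Stata's poisson command proceeds.

```python
import math

def fit_poisson(x, y, iterations=25):
    """Fit ln(mu_i) = b0 + b1 * x_i by Newton-Raphson (one predictor)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0   # start at the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: first derivatives of the log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix [[a, b], [b, d]]: minus the second derivatives.
        a = sum(mu)
        b = sum(mi * xi for xi, mi in zip(x, mu))
        d = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = a * d - b * b
        # Newton step: beta += inverse(information) * score.
        b0 += (d * g0 - b * g1) / det
        b1 += (a * g1 - b * g0) / det
    return b0, b1

# Hypothetical data: x is a 0/1 indicator, y is a count.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [2, 1, 3, 2, 0, 1, 2, 1]
b0, b1 = fit_poisson(x, y)
print(b0, b1, math.exp(b1))
```

With a single 0/1 predictor the fitted means reproduce the two group means (2.0 and 1.0 in this toy data), so exp(b1) is simply their ratio.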

  11. Interpretation of PRM • the expected value of the count variable (rate of occurrence): listcoef, prchange • the probability of counts: prvalue • predicted count: prtab

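  12. Interpretation of PRM 1. Change in the expected count for changes in the independent variables • factor (or percent) change in expected count using listcoef • Holding all other variables constant, the mean number of papers for female scientists is 0.8 times that of male scientists (that is, 20% lower)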

  13. For a standard deviation increase in the mentor's productivity, a scientist's mean productivity increases by 27 percent, holding all other variables constant
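
Factor and percent changes follow from the log link: raising a predictor by delta multiplies the expected count by exp(beta * delta). The Python sketch below illustrates the arithmetic. The coefficient for the female indicator is back-solved from the 0.8 quoted above rather than taken from any output, and `beta_x` and `sd_x` are hypothetical values for a continuous predictor.

```python
import math

def factor_change(beta, delta=1.0):
    """Multiplicative change in the expected count when x rises by delta."""
    return math.exp(beta * delta)

def percent_change(beta, delta=1.0):
    """Percent change in the expected count when x rises by delta."""
    return 100.0 * (factor_change(beta, delta) - 1.0)

# Back-solved so that the factor change equals the 0.8 quoted on the slide.
beta_female = math.log(0.8)
print(factor_change(beta_female), percent_change(beta_female))

# Hypothetical coefficient and standard deviation for a continuous predictor.
beta_x, sd_x = 0.025, 9.5
print(percent_change(beta_x, delta=sd_x))
```

Setting delta to the predictor's standard deviation gives the "one standard deviation increase" reading used for the mentor's productivity.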

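  14. Marginal and discrete change in the expected count (predicted rate) using prchange • In the typical case (other variables held at their means), female scientists publish on average 0.36 fewer papers than male scientists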
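
A discrete change is the difference between two predicted rates, and a marginal change is the derivative of the rate with respect to a predictor. The Python sketch below shows both under a log link. All coefficients and the sample mean are hypothetical and are not estimates from couart2; the variable names only echo the example.

```python
import math

def expected_count(b0, b_female, b_ment, female, ment):
    """mu = exp(b0 + b_female * female + b_ment * ment)."""
    return math.exp(b0 + b_female * female + b_ment * ment)

# Hypothetical coefficients and sample mean; not estimates from couart2.
b0, b_female, b_ment = 0.4, -0.22, 0.025
ment_mean = 8.8

mu_male = expected_count(b0, b_female, b_ment, female=0, ment=ment_mean)
mu_female = expected_count(b0, b_female, b_ment, female=1, ment=ment_mean)

discrete_change = mu_female - mu_male      # change in mu as female goes 0 -> 1
marginal_change_ment = b_ment * mu_male    # d mu / d ment, evaluated for men
print(discrete_change, marginal_change_ment)
```

Unlike the factor change, both quantities depend on where the other variables are held, which is why the slide specifies that they are kept at their means.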

  15. 2. Creating ideal types with prvalue and prtab

  16. Negative Binomial Model • The problem of overdispersion • In its theoretical model, Poisson regression sets the variance equal to the expected value

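  17. In practice, the variance of empirical data is often larger than theory predicts, i.e., $\mathrm{Var}(y) > \mathrm{E}(y)$; this is called overdispersion • If left uncorrected, the standard errors of the coefficients are underestimated, so tests reach statistical significance more easily than they should, which leads to mistaken inferences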

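  18. One of the many causes of overdispersion is that the event rate, besides being affected by the observed independent variables, is also affected by heterogeneity the researcher has not observed (unobserved heterogeneity)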

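  19. There are two remedies: • Do not use the standard errors from the Poisson regression itself; instead compute a variance-covariance matrix of the estimator (VCE) that is not underestimated, and use it to obtain robust standard errors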

  20. Treat the event rate itself as a random variable with a gamma probability distribution; substituting it back into the Poisson distribution combines the two into a new "negative binomial" probability model
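
The gamma mixture can be simulated to see the overdispersion it produces. The Python sketch below assumes a gamma multiplier with mean 1 and variance alpha, so that the mixed counts have mean mu and variance mu + alpha * mu^2. The values of mu and alpha, the seed, and the helper names are all hypothetical, and the Poisson draws use Knuth's multiplication method purely to stay within the standard library.

```python
import math
import random

def draw_poisson(rng, mu):
    """Draw one Poisson(mu) value by Knuth's multiplication method."""
    limit, k, product = math.exp(-mu), 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def draw_gamma_poisson(rng, mu, alpha):
    """Poisson draw whose rate is mu times a gamma variable (mean 1, variance alpha)."""
    rate = mu * rng.gammavariate(1.0 / alpha, alpha)   # shape 1/alpha, scale alpha
    return draw_poisson(rng, rate)

rng = random.Random(2024)            # fixed seed so the run is repeatable
mu, alpha, n = 2.0, 1.0, 20000       # hypothetical values
sample = [draw_gamma_poisson(rng, mu, alpha) for _ in range(n)]

mean = sum(sample) / n
variance = sum((v - mean) ** 2 for v in sample) / (n - 1)
print(mean, variance, mu + alpha * mu ** 2)
```

The sample mean stays near mu while the sample variance is several times larger, which a plain Poisson model cannot reproduce.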

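  21. Re-estimating the Poisson regression with robust standard errors • In Stata, adding the vce(robust) option after the poisson command yields robust standard errors for the coefficients: poisson y x1 x2 x3, vce(robust)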

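  22. Two negative binomial regression models • (Negbin 2 or NB2) $\mathrm{E}(y_i \mid x_i) = \mu_i$, $\mathrm{Var}(y_i \mid x_i) = \mu_i + \alpha\mu_i^2$. The conditional expectation of the negative binomial distribution is the same as in the Poisson regression model, but the conditional variance differs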

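  23. (Negbin 1 or NB1) $\mathrm{E}(y_i \mid x_i) = \mu_i$, $\mathrm{Var}(y_i \mid x_i) = \mu_i + \alpha\mu_i$. The conditional expectation of the negative binomial distribution is the same as in the Poisson regression model, but the conditional variance differs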

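  24. Test: • When $\alpha = 0$, the variance of the negative binomial distribution equals the variance of the Poisson distribution, so the Poisson model is appropriate • But whenever $\alpha > 0$, the variance of the negative binomial distribution exceeds that of the Poisson distribution (overdispersion), so the negative binomial model is appropriate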
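
The role of the dispersion parameter can be tabulated directly. The Python sketch below assumes the common parameterization in which NB2 has variance mu + alpha * mu^2 and NB1 has variance mu + alpha * mu; the means and the value of alpha are hypothetical.

```python
def nb2_variance(mu, alpha):
    """NB2: the variance is quadratic in the mean."""
    return mu + alpha * mu ** 2

def nb1_variance(mu, alpha):
    """NB1: the variance is a constant multiple of the mean."""
    return mu + alpha * mu

for mu in (0.5, 2.0, 8.0):   # hypothetical conditional means
    print(mu, nb2_variance(mu, 0.0), nb2_variance(mu, 0.5), nb1_variance(mu, 0.5))
```

At alpha = 0 both functions return mu, the Poisson variance; for any positive alpha the variance exceeds the mean, and under NB2 the excess grows faster as the mean rises.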

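  25. Stata's built-in command for the negative binomial regression model: • nbreg y x1 x2 x3 • The bottom of the output reports the estimate of the dispersion parameter (alpha) and the LR test statistic. If H0 (alpha = 0) is rejected, the variance is statistically significantly larger than the expected value, so negative binomial regression should be used.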
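
The logic of that LR test can be sketched by hand in Python. The two log-likelihoods below are hypothetical, not values from any nbreg run, and `lr_test_alpha` is an illustrative name. Because alpha = 0 lies on the boundary of the parameter space, the sketch halves the chi-square(1) tail probability, which is the usual boundary adjustment.

```python
import math

def lr_test_alpha(loglik_poisson, loglik_nb):
    """LR statistic for H0: alpha = 0 and its boundary-adjusted p-value."""
    stat = max(0.0, 2.0 * (loglik_nb - loglik_poisson))
    if stat == 0.0:
        return stat, 1.0
    # The chi-square(1) upper tail equals erfc(sqrt(stat / 2)); it is halved
    # because alpha = 0 sits on the boundary of the parameter space.
    p_value = 0.5 * math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods for the Poisson and negative binomial fits.
stat, p_value = lr_test_alpha(loglik_poisson=-1650.0, loglik_nb=-1560.0)
print(stat, p_value)
```

A small p-value rejects alpha = 0 and points to the negative binomial model; a statistic of zero means the negative binomial fit adds nothing over Poisson.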

  26. Stata's nbreg command fits the NB2 model by default. To estimate the NB1 model, add the dispersion(constant) option

  27. Interpretation of NBM • the expected value of the count variable (rate of occurrence): listcoef, prchange • the probability of counts: prvalue • predicted count: prtab

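  28. Interpretation of NBM 1. Change in the expected count for changes in the independent variables • factor (or percent) change in expected count using listcoef • Holding all other variables constant, the mean number of papers for female scientists is 0.8 times that of male scientists (that is, 20% lower)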

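  29. Marginal and discrete change in the expected count (predicted rate) using prchange • In the typical case (other variables held at their means), female scientists publish on average 0.34 fewer papers than male scientists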
