
LOGISTIC REGRESSION

LOGISTIC REGRESSION. A statistical procedure to relate the probability of an event to explanatory variables. Used in epidemiology to describe and evaluate the effect of a risk factor on the occurrence of a disease event. Example: the Framingham Heart Study, coronary heart disease and blood pressure.

hewitt

Presentation Transcript


  1. LOGISTIC REGRESSION A statistical procedure to relate the probability of an event to explanatory variables. Used in epidemiology to describe and evaluate the effect of a risk factor on the occurrence of a disease event. Example: the Framingham Heart Study, coronary heart disease and blood pressure.

  2. LOGISTIC REGRESSION: AN EXAMPLE Event: coronary heart disease (CHD). Occurrence is the dependent variable, which takes two values: yes or no. Risk factor: blood pressure. Systolic blood pressure (SBP) is the independent variable X, a continuous measurement. The probability of getting coronary heart disease depends on blood pressure.

  3. DATA

  4. SCATTER PLOT

  5. LINEAR REGRESSION FOR Prob.(CHD): NOT A GOOD IDEA!

  6. PROPORTION WITH CHD BY SBP GROUP

     Systolic BP range   With CHD / in group   Proportion
     130-149 mmHg        0/3                   0.00
     150-169 mmHg        2/4                   0.50
     170-189 mmHg        3/3                   1.00

  7. LOGISTIC REGRESSION PROBABILITY MODEL
     $$p(X) = \frac{1}{1 + \exp(-b_0 - b_1 X)}$$
     The probability of the event varies as an S-shaped function of the risk factor X: the logistic curve.
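As a minimal sketch of the curve on this slide, the Python function below evaluates p(X) for given coefficients. The function name and the coefficient values are illustrative assumptions chosen to make the S-shape easy to see; they are not taken from the slides.

```python
import math

def logistic_probability(x, b0, b1):
    """Return p(X) = 1 / (1 + exp(-b0 - b1*X))."""
    return 1.0 / (1.0 + math.exp(-b0 - b1 * x))

# Illustrative coefficients (assumed for this sketch only).
b0, b1 = -4.0, 0.5

# The curve is S-shaped: close to 0 for small X, exactly 0.5 at X = -b0/b1,
# and close to 1 for large X.
for x in (0, 4, 8, 12, 16):
    print(x, round(logistic_probability(x, b0, b1), 3))
```

With these assumed values the midpoint of the curve sits at X = -b0/b1 = 8, where the linear term b0 + b1X is zero.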

  8. LOGISTIC CURVE MODEL: OCCURRENCE OF CHD AS A FUNCTION OF SBP

  9. LOGISTIC MODEL: LOG ODDS
     $$\log \frac{p(X)}{1 - p(X)} = b_0 + b_1 X$$
     The log of the odds of the event is a linear function of X.
     Log(odds of CHD) = -6.08 + 0.0243(SBP)
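The log-odds form follows directly from the probability model on slide 7. A short derivation, using the same symbols as the slides, might look like this:

```latex
\begin{align*}
p(X) &= \frac{1}{1 + e^{-(b_0 + b_1 X)}} \\
1 - p(X) &= \frac{e^{-(b_0 + b_1 X)}}{1 + e^{-(b_0 + b_1 X)}} \\
\frac{p(X)}{1 - p(X)} &= e^{\,b_0 + b_1 X} \\
\log \frac{p(X)}{1 - p(X)} &= b_0 + b_1 X
\end{align*}
```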

  10. ODDS The odds of an event is the chance that the event occurs divided by the chance of its not occurring: Odds = p/(1 - p) = p/q
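The conversion between probability and odds can be sketched in a few lines of Python. The function names are hypothetical helpers introduced for this illustration.

```python
def odds_from_probability(p):
    """Odds = p / (1 - p); undefined when p = 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)

def probability_from_odds(odds):
    """Invert the relation: p = odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)

print(odds_from_probability(0.5))   # 1.0  (an even chance)
print(odds_from_probability(0.75))  # 3.0  (3 to 1)
print(probability_from_odds(3.0))   # 0.75
```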

  11. b1: KEY PARAMETER OF THE LOGISTIC MODEL
     $$\log \frac{p(X)}{1 - p(X)} = b_0 + b_1 X$$
     The parameter b1 is like the slope of a linear regression model. b1 = 0 indicates that X has no effect on the probability, e.g., a man’s chance of CHD does not depend on his SBP.

  12. b1: KEY PARAMETER
     $$\log \frac{p(X)}{1 - p(X)} = b_0 + b_1 X$$
     The coefficient b1 measures the amount of change in the log of the odds per unit change in X.

  13. b1: KEY PARAMETER
     $$\log \text{odds}(X+1) = b_0 + b_1 (X+1) = b_0 + b_1 X + b_1$$
     $$\log \text{odds}(X) = b_0 + b_1 X$$
     $$\text{Difference in log odds} = b_1$$
     E.g., the log of the odds of getting CHD increases by 0.0243 for an increase of 1 mmHg of systolic blood pressure. (Hard to explain to a patient!)

  14. THE COEFFICIENT b1 AND THE ODDS RATIO The difference in log odds given by b1 translates into the odds ratio (OR): exp(b1) = OR = ratio of the odds at risk level X+1 to the odds at risk level X. b1 = 0 ⇔ OR = 1.

  15. THE COEFFICIENT b1 AND THE ODDS RATIO For example, the odds of CHD are multiplied by the factor exp(0.0243) = 1.025 for every increase of 1 mmHg in SBP. A difference of 10 mmHg multiplies the odds of CHD by exp(10 × 0.0243) = (1.0246)^10, or 1.275.
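The arithmetic on this slide can be checked with a short Python sketch. The coefficient 0.0243 is the one quoted in the slides; the function name and its `delta` parameter are hypothetical.

```python
import math

B1 = 0.0243  # log-odds change per 1 mmHg of SBP, as quoted in the slides

def odds_ratio(b1, delta=1.0):
    """Odds ratio for an increase of `delta` units in X: exp(b1 * delta)."""
    return math.exp(b1 * delta)

print(round(odds_ratio(B1), 3))      # 1.025 for a 1 mmHg increase
print(round(odds_ratio(B1, 10), 3))  # 1.275 for a 10 mmHg increase
```

Multiplying the coefficient by the size of the change before exponentiating avoids compounding the rounding error in 1.025.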

  16. ESTIMATION OF THE PARAMETERS Technique: maximum likelihood estimation. For large sample sizes, the normal distribution is used to put a confidence interval around the estimate of the coefficient b1.
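One common way to carry out maximum likelihood estimation for this model is Newton-Raphson iteration on the log-likelihood. The slides do not say which algorithm was used, so the following is only a sketch under that assumption. The ten observations are hypothetical, made up for illustration, and are not the data from the slides.

```python
import math

# HYPOTHETICAL data for illustration only (not the data from the slides):
# systolic blood pressure in mmHg and CHD status (1 = yes, 0 = no).
sbp = [135, 140, 145, 152, 158, 160, 165, 172, 180, 185]
chd = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def score(b0, b1, xs, ys):
    """Gradient of the log-likelihood with respect to (b0, b1)."""
    g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys))
    g1 = sum(x * (y - sigmoid(b0 + b1 * x)) for x, y in zip(xs, ys))
    return g0, g1

def fit_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Maximize the log-likelihood by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0, g1 = score(b0, b1, xs, ys)
        # Information matrix [[h00, h01], [h01, h11]] with weights p(1 - p).
        w = [sigmoid(b0 + b1 * x) * (1.0 - sigmoid(b0 + b1 * x)) for x in xs]
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, xs))
        h11 = sum(wi * x * x for wi, x in zip(w, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * step = gradient.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Center and rescale the predictor (units of 10 mmHg) for numerical stability.
mean_sbp = sum(sbp) / len(sbp)
z = [(x - mean_sbp) / 10.0 for x in sbp]
a0, a1 = fit_logistic(z, chd)

# Convert back to the original scale: log odds = b0_hat + b1_hat * SBP.
b1_hat = a1 / 10.0
b0_hat = a0 - a1 * mean_sbp / 10.0
print(b0_hat, b1_hat)
```

At the maximum the gradient returned by `score` is zero, which gives a simple convergence check. In practice a statistics package would be used instead, and it would also report the standard errors needed for the confidence interval.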

  17. HYPOTHESIS TESTING H0: b1 = 0. No difference in risk at different levels of the risk factor X. No association between risk factor X and probability of occurrence.

  18. HYPOTHESIS TESTING Ha: b1 ≠ 0, or b1 > 0 (risk increases with X), or b1 < 0 (risk goes down as X increases).

  19. HYPOTHESIS TESTING H0: OR = 1. Ha: OR ≠ 1, or OR > 1 (risk increases with X), or OR < 1 (X is protective).

  20. RESULTS OF LOGISTIC REGRESSION The OR with its confidence interval and p value indicates whether there is a significant association between the level of the risk factor and the chance of occurrence: OR = 1.025 (1.015, 1.034), p < 0.001.
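A normal-approximation (Wald) interval and test reproduce numbers of this kind. The sketch below rests on two assumptions that the slides do not state: a standard error of 0.0047 for b1, chosen here because it is consistent with the reported interval, and a 95% confidence level.

```python
import math

b1 = 0.0243      # coefficient quoted in the slides
se_b1 = 0.0047   # ASSUMED standard error (not given in the slides)
z_crit = 1.96    # normal quantile for an ASSUMED 95% confidence level

# Build the interval on the log-odds scale, then exponentiate.
odds_ratio = math.exp(b1)
ci_low = math.exp(b1 - z_crit * se_b1)
ci_high = math.exp(b1 + z_crit * se_b1)

# Wald test of H0: b1 = 0, with a two-sided p value from the normal distribution.
z_stat = b1 / se_b1
p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))

print(round(odds_ratio, 3), round(ci_low, 3), round(ci_high, 3))  # 1.025 1.015 1.034
print(p_value < 0.001)  # True
```

The interval is symmetric around b1 on the log-odds scale and becomes slightly asymmetric around the OR after exponentiation.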

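  21. RESULTS OF LOGISTIC REGRESSION Can be used to predict an individual’s risk, e.g., the probability of CHD when SBP = 180: p/q = exp{-6.082 + 0.0243(180)} = exp(-1.708) ≈ 0.181. Solving for p gives p = 0.181/1.181 ≈ 0.153. (The slide as transcribed gives 0.125, which is the value these coefficients produce at SBP = 170.)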
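The "solve for p" step can be sketched in Python as follows, using the coefficients exactly as printed on this slide. The function names are hypothetical, and the result is sensitive to rounding in the coefficients and to the SBP value plugged in.

```python
import math

B0, B1 = -6.082, 0.0243  # coefficients as printed on the slide

def chd_odds(sbp):
    """Odds of CHD at a given systolic blood pressure: exp(b0 + b1 * SBP)."""
    return math.exp(B0 + B1 * sbp)

def chd_probability(sbp):
    """Solve p / (1 - p) = odds for p, giving p = odds / (1 + odds)."""
    odds = chd_odds(sbp)
    return odds / (1.0 + odds)

for sbp in (140, 160, 180):
    print(sbp, round(chd_probability(sbp), 3))
```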

  22. MULTIVARIATE LOGISTIC REGRESSION Model with additional risk factors:
     $$\log \frac{p(X)}{1 - p(X)} = b_0 + b_1 X_1 + b_2 X_2$$
     Log(odds of CHD) = b0 + b1(SBP) + b2(CHOL) + b3(smoker)
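To show how such a model is evaluated, here is a minimal Python sketch. All coefficient values are hypothetical placeholders, since the slides give no fitted values for this model. The cholesterol units (mg/dL) and the 0/1 coding of the smoker variable are also assumptions.

```python
import math

# HYPOTHETICAL coefficients for illustration only; not from the slides.
coef = {"intercept": -7.0, "sbp": 0.02, "chol": 0.005, "smoker": 0.6}

def chd_log_odds(sbp, chol, smoker):
    """Linear predictor: b0 + b1*SBP + b2*CHOL + b3*smoker (smoker coded 0/1)."""
    return (coef["intercept"]
            + coef["sbp"] * sbp
            + coef["chol"] * chol
            + coef["smoker"] * (1 if smoker else 0))

def chd_probability(sbp, chol, smoker):
    """Convert the log odds to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-chd_log_odds(sbp, chol, smoker)))

# Each exp(b) is an odds ratio for that factor with the others held fixed.
adjusted_or = {name: math.exp(b) for name, b in coef.items() if name != "intercept"}

print(round(chd_probability(160, 220, smoker=True), 3))
print(round(chd_probability(160, 220, smoker=False), 3))
print({name: round(value, 3) for name, value in adjusted_or.items()})
```

Under these placeholder values, switching the smoker indicator from 0 to 1 raises the log odds by exactly b3, whatever the SBP and cholesterol levels are.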
