
Logistic Regression



  1. Logistic Regression Database Marketing Instructor: N. Kumar

  2. Logistic Regression vs TGDA • Two-Group Discriminant Analysis • Implicitly assumes that the Xs are Multivariate Normally (MVN) Distributed • This assumption is violated if Xs are categorical variables • Logistic Regression does not impose any restriction on the distribution of the Xs • Logistic Regression is the recommended approach if at least some of the Xs are categorical variables

  3. Data

  4. Contingency Table (counts implied by the probabilities on the following slides)

                     Preferred   Not Preferred   Total
      Large company      10             1          11
      Small company       2            11          13
      Total              12            12          24

  5. Basic Concepts • Probability • Probability of being a preferred stock = 12/24 = 0.5 • Probability that a company’s stock is preferred given that the company is large = 10/11 = 0.909 • Probability that a company’s stock is preferred given that the company is small = 2/13 = 0.154

  6. Concepts … contd. • Odds • Odds of a preferred stock = 12/12 = 1 • Odds of a preferred stock given that the company is large = 10/1 = 10 • Odds of a preferred stock given that the company is small = 2/11 = 0.182

  7. Odds and Probability • Odds(Event) = Prob(Event)/(1-Prob(Event)) • Prob(Event) = Odds(Event)/(1+Odds(Event))
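For example, these probability-odds conversions can be checked in a few lines of Python (a minimal sketch; the counts come from the contingency table above):

    # Counts from the contingency table: preferred stock vs. company size
    preferred_large, total_large = 10, 11
    preferred_small, total_small = 2, 13

    p_large = preferred_large / total_large   # 0.909
    p_small = preferred_small / total_small   # 0.154

    def odds(p):
        """Odds(Event) = Prob(Event) / (1 - Prob(Event))"""
        return p / (1 - p)

    def prob(o):
        """Prob(Event) = Odds(Event) / (1 + Odds(Event))"""
        return o / (1 + o)

    print(odds(p_large))         # 10.0
    print(odds(p_small))         # 0.1818... = 2/11
    print(prob(odds(p_large)))   # recovers 0.909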

  8. Logistic Regression • Take Natural Log of the odds: • ln(odds(Preferred|Large)) = ln(10) = 2.303 • ln(odds(Preferred|Small)) = ln(0.182) = -1.704 • Combining these relationships • ln(odds(Preferred|Size)) = -1.704 + 4.007*Size • Log of the odds is a linear function of size • The coefficient of size can be interpreted like the coefficient in regression analysis
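The two conditional log-odds pin down the intercept and slope exactly (a sketch, assuming Size is coded 0 = small and 1 = large, as the equation implies):

    import math

    ln_odds_large = math.log(10)       # 2.303
    ln_odds_small = math.log(2 / 11)   # -1.704

    # With Size = 0 for small and Size = 1 for large companies:
    intercept = ln_odds_small                 # about -1.704
    slope = ln_odds_large - ln_odds_small     # 2.303 - (-1.704) = 4.007

    print(intercept, slope)   # -> ln(odds) = -1.704 + 4.007 * Size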

  9. Interpretation • Positive sign ⇒ ln(odds) is increasing in the size of the company, i.e. a large company is more likely to have a preferred stock than a small company • The magnitude of the coefficient measures how much more likely

  10. General Model • ln(odds) = β0 + β1X1 + β2X2 + … + βkXk (1) • Recall: • Odds = p/(1-p) • ln(p/(1-p)) = β0 + β1X1 + β2X2 + … + βkXk (2) • p = exp(β0 + β1X1 + … + βkXk) / (1 + exp(β0 + β1X1 + … + βkXk)) • p = 1 / (1 + exp(-(β0 + β1X1 + … + βkXk)))

  11. Logistic Function
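The slide's plot is not reproduced in the transcript; a minimal sketch of the logistic (sigmoid) curve it shows, mapping the linear predictor to a probability:

    import math

    def logistic(x):
        """Map the linear predictor b0 + b1*X1 + ... to a probability in (0, 1)."""
        return 1 / (1 + math.exp(-x))

    print(logistic(-1.704))           # P(preferred | small) = 0.154
    print(logistic(-1.704 + 4.007))   # P(preferred | large) = 0.909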

  12. Estimation • In linear regression, coefficients are estimated by minimizing the sum of squared errors • Since p is non-linear in the parameters, we need a non-linear estimation technique • Maximum-Likelihood Approach • Non-Linear Least Squares

  13. Maximum Likelihood Approach • Conditional on the parameter vector β, write out the probability of observing the data • Write this probability out for each observation • Multiply the probabilities of all observations together to get the joint probability of observing the data conditional on β • Find the β that maximizes this conditional probability of realizing the data
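A minimal sketch of this recipe for the one-predictor model (the (size, preferred) pairs are reconstructed from the contingency table; the crude grid search stands in for a proper optimizer such as Newton-Raphson):

    import math

    # (size, preferred) pairs reconstructed from the contingency table
    data = [(1, 1)] * 10 + [(1, 0)] * 1 + [(0, 1)] * 2 + [(0, 0)] * 11

    def log_likelihood(b0, b1):
        """Log of the joint probability of the data, conditional on (b0, b1)."""
        total = 0.0
        for size, y in data:
            p = 1 / (1 + math.exp(-(b0 + b1 * size)))   # P(preferred | size)
            total += math.log(p if y == 1 else 1 - p)
        return total

    # Crude grid search for the maximizing pair; a real routine would use
    # Newton-Raphson or scipy.optimize on the negative log-likelihood.
    best = max(((b0 / 100, b1 / 100)
                for b0 in range(-300, 0, 5)
                for b1 in range(200, 600, 5)),
               key=lambda b: log_likelihood(*b))
    print(best)   # close to (-1.70, 4.00)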

  14. Logistic Regression • Logistic Regression with one categorical explanatory variable reduces to an analysis of the contingency table

  15. Interpretation of Results • Look at the –2 Log L statistic • Intercept only: 33.271 • Intercept and Covariates: 17.864 • Difference: 15.407 with 1 DF (p=0.0001) • The drop is highly significant: the size variable adds substantial explanatory power
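The quoted p-value can be verified against the chi-square distribution (a sketch using scipy; the –2 Log L values are from the slide):

    from scipy.stats import chi2

    lr_statistic = 33.271 - 17.864          # 15.407
    p_value = chi2.sf(lr_statistic, df=1)   # survival function = 1 - CDF
    print(lr_statistic, p_value)            # 15.407, about 0.0001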

  16. Do the Variables Have a Significant Impact? • Like testing whether the coefficients in the regression model are different from zero • Look at the output from Analysis of Maximum Likelihood Estimates • Loosely, the column Pr>Chi-Square gives the probability of realizing an estimate as large as the one in the Parameter Estimate column if the true coefficient were zero; if this value is < 0.05 the estimate is considered significant

  17. Other things to Look for • Akaike’s Information Criterion (AIC) and Schwarz’s Criterion (SC) are like adjusted R2: they penalize additional covariates • The larger the drop from the intercept-only column to the intercept-and-covariates column, the better the model fit

  18. Interpretation of the Parameter Estimates • ln(p/(1-p)) = -1.705 + 4.007*Size • p/(1-p) = e^(-1.705) · e^(4.007*Size) • For a unit increase in Size, the odds of being a preferred stock are multiplied by e^4.007 = 54.982
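A quick numerical check of the odds-ratio interpretation:

    import math

    odds_ratio = math.exp(4.007)            # about 54.982
    odds_small = math.exp(-1.705)           # about 0.182
    odds_large = odds_small * odds_ratio    # about 10
    print(odds_ratio, odds_small, odds_large)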

  19. Predicted Probabilities and Observed Responses • The response variable (success) classifies an observation into an event or a no-event • A concordant pair is an (event, no-event) pair in which the event has the higher predicted probability (PHAT) • The higher the concordant-pair percentage, the better the model discriminates
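A minimal sketch of how the concordant-pair percentage can be computed from PHAT values and observed responses (illustrative helper; tied pairs are ignored for brevity):

    def concordant_percent(phat, y):
        """Percent of (event, no-event) pairs where the event has the higher PHAT."""
        events = [p for p, yi in zip(phat, y) if yi == 1]
        non_events = [p for p, yi in zip(phat, y) if yi == 0]
        pairs = len(events) * len(non_events)
        concordant = sum(pe > pn for pe in events for pn in non_events)
        return 100 * concordant / pairs

    # With one binary predictor there are only two distinct PHAT values:
    phat = [0.909] * 11 + [0.154] * 13
    y = [1] * 10 + [0] * 1 + [1] * 2 + [0] * 11
    print(concordant_percent(phat, y))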

  20. Classification • For a set of new observations where you have information on size alone • You can use the model to predict the probability that success = 1, i.e. the stock is preferred • If PHAT > 0.5, classify success = 1; else success = 2
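The cutoff rule as a short sketch (coding follows the slide: 1 = preferred, 2 = not preferred):

    def classify(phat, cutoff=0.5):
        """Predict success = 1 (preferred) if PHAT exceeds the cutoff, else 2."""
        return 1 if phat > cutoff else 2

    print(classify(0.909))   # large company -> 1 (preferred)
    print(classify(0.154))   # small company -> 2 (not preferred)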

  21. Logistic Regression with multiple independent variables • Independent variables are a mixture of continuous and categorical variables

  22. Data

  23. General Model • ln(odds) = β0 + β1Size + β2FP • ln(p/(1-p)) = β0 + β1Size + β2FP • p = exp(β0 + β1Size + β2FP) / (1 + exp(β0 + β1Size + β2FP)) • p = 1 / (1 + exp(-(β0 + β1Size + β2FP)))

  24. Estimation & Interpretation of the Results • Identical to the case with one categorical variable
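For illustration, a hedged sketch of fitting the two-variable model with statsmodels (the data generated here is synthetic, not the course dataset; Size and FP are the variable names from the slides):

    import numpy as np
    import statsmodels.api as sm

    # Synthetic illustration: Size is binary, FP is a continuous score
    rng = np.random.default_rng(0)
    size = rng.integers(0, 2, 100)          # 0 = small, 1 = large
    fp = rng.normal(0, 1, 100)              # continuous covariate
    logit_p = -1.7 + 4.0 * size + 0.5 * fp
    y = (rng.random(100) < 1 / (1 + np.exp(-logit_p))).astype(int)

    X = sm.add_constant(np.column_stack([size, fp]))
    result = sm.Logit(y, X).fit()
    print(result.summary())   # coefficients, -2 Log L based fit statistics, AIC/BIC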

  25. Summary • Logistic Regression or Discriminant Analysis • Techniques differ in underlying assumptions about the distribution of the explanatory (independent) variables • Use logistic regression if you have a mix of categorical and continuous variables
