Forecasting Choices


Types of Variable
• Quantitative
  • Continuous
  • Discrete (counting)
• Qualitative
  • Ordinal
  • Nominal

Nominal or Ordinal Dependent Variable
• Indicates the “choices” of a decision maker (d.m.), say a consumer.
• Response categories are:
  • Mutually exclusive
  • Collectively exhaustive
  • Finite in number
• Desired regression outputs:
  • Probability that the d.m. chooses each category
  • Coefficient of each independent variable

Generalized Linear Models (GLM)
• Regression model for a continuous $Y$: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$, with $\varepsilon$ following $N(0, \sigma)$
• GLM formulation:
  • Model for $Y$: $Y$ is $N(\mu, \sigma)$
  • Link function (model for the predictors): $\mu = \beta_0 + \beta_1 X_1 + \beta_2 X_2$

Estimation of Parameters of GLM
• Maximum Likelihood Estimation (MLE)
• For normal $Y$, MLE coincides with least-squares (LS) estimation
• Maximize the sum of the log-likelihoods $L_i$ of the individual observations: $\sum_i L_i$

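MLE for Regression Model
• $Y_i$ is $N(\mu_i, \sigma)$ with $\mu_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}$
• MLE: Maximize $\sum_i L_i = \sum_i \left[ -\ln\left(\sigma\sqrt{2\pi}\right) - \frac{(Y_i - \mu_i)^2}{2\sigma^2} \right]$, reading $\sigma$ as the standard deviation
• For a fixed $\sigma$, this is the same as minimizing $\sum_i (Y_i - \mu_i)^2$, the LS criterion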

GLM for Binary Dependent Variable, $Y$
• Model for response: $Y$ is $B(n, p)$
• Model for predictors (link function): $\operatorname{logit}(p) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_K X_K = g$
• Probability: $p = \frac{\exp(g)}{1 + \exp(g)}$
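The step from the link function to the probability can be sketched in a few lines of Python. This is only an illustration: the function names and the layout of `beta` as `[b0, b1, ..., bK]` are assumptions, not part of the slides, and $\operatorname{logit}(p)$ is taken in its usual sense, $\ln\frac{p}{1-p}$.

```python
import math

def logit(p):
    """Log-odds ln(p / (1 - p)) of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(g):
    """Probability p = exp(g) / (1 + exp(g)), arranged to avoid overflow."""
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    e = math.exp(g)
    return e / (1.0 + e)

def choice_probability(x, beta):
    """p for covariates x = [X1, ..., XK] and beta = [b0, b1, ..., bK] (assumed layout)."""
    g = beta[0] + sum(b * xk for b, xk in zip(beta[1:], x))  # link function g
    return inv_logit(g)
```

For example, `choice_probability([1.0, 2.0], [0.0, 1.0, -0.5])` evaluates the link function at $g = 0$ and therefore returns a probability of one half.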

X: Covariates
• Independent variables are often referred to as “covariates.”
• Example:
  • SPSS binary logistic regression routine
  • SPSS multinomial logistic regression routine

A. Logistic Regression for Ungrouped Data ($n_i = 1$)
• Model for the $i$-th observation:
  • $Y_i = 1$: choose category 1, with probability $p_i$
  • $Y_i = 0$: choose category 2, with probability $1 - p_i$
• Log-likelihood function for the $i$-th observation: $L_i = Y_i \ln p_i + (1 - Y_i) \ln(1 - p_i)$

MLE
• Maximize: $\sum_i L_i$

Setting Up a Worksheet for MLE
• Define an array for storing the parameters of the link function, and enter an initial estimate for each parameter.
• Then, for each observation, compute:
  • Link function, $g_i$
  • Parameters of the likelihood (here $p_i$)
  • ln(likelihood), $L_i$
• Sum the log-likelihoods $L_i$ and invoke the solver to maximize the sum by changing the parameters.
• Multiply the maximized value by $-2$ for the test of significance of the regression.
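The worksheet procedure can also be sketched outside a spreadsheet. The Python below is a minimal illustration under stated assumptions: the data are hypothetical, there is a single covariate, and Newton-Raphson iterations stand in for the spreadsheet solver (the slides do not say which algorithm the solver uses).

```python
import math

# Hypothetical data: one covariate x and a 0/1 choice y per observation.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]

def log_likelihood(b0, b1):
    """Sum of the per-observation log-likelihoods L_i."""
    total = 0.0
    for xi, yi in zip(x, y):
        g = b0 + b1 * xi                    # link function g_i
        p = 1.0 / (1.0 + math.exp(-g))      # parameter of the likelihood p_i
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)  # L_i
    return total

def fit(steps=25):
    """Maximize the summed log-likelihood by Newton-Raphson (stand-in for the solver)."""
    b0, b1 = 0.0, 0.0                       # initial estimates
    for _ in range(steps):
        s0 = s1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            s0 += yi - p                    # score with respect to b0
            s1 += (yi - p) * xi             # score with respect to b1
            h00 += w                        # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * s0 - h01 * s1) / det   # Newton step: information^-1 times score
        b1 += (h00 * s1 - h01 * s0) / det
    return b0, b1

b0_hat, b1_hat = fit()
minus_two_log_lik = -2.0 * log_likelihood(b0_hat, b1_hat)  # used in the significance test
```

Newton-Raphson is chosen here only because it is short to write for two parameters; any general-purpose maximizer applied to `log_likelihood` should reach the same estimates.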

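Test of Significance
• Hypotheses:
  • $H_0$: $\beta_1 = \beta_2 = \dots = \beta_K = 0$
  • $H_1$: at least one $\beta_j \neq 0$
• Test statistic: the difference in $-2 \sum_i L_i$ between the model with only $\beta_0$ and the full model
• Distribution under $H_0$: $\chi^2$ with $DF = K$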
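As a rough illustration of the test, the sketch below computes the likelihood-ratio statistic from two maximized log-likelihoods and its upper-tail $\chi^2$ probability. The function names are hypothetical, and the statistic is assumed to be the usual difference in $-2 \ln(\text{likelihood})$ between the intercept-only model and the full model. The tail probability uses the closed forms for 1 and 2 degrees of freedom and a standard recurrence for higher integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with integer df > x)."""
    if x <= 0:
        return 1.0
    # Closed forms for df = 1 and df = 2.
    if df % 2 == 1:
        q, nu = math.erfc(math.sqrt(x / 2.0)), 1
    else:
        q, nu = math.exp(-x / 2.0), 2
    # Recurrence: Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) * exp(-x/2) / Gamma(nu/2 + 1)
    while nu < df:
        q += (x / 2.0) ** (nu / 2.0) * math.exp(-x / 2.0) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q

def likelihood_ratio_test(loglik_null, loglik_full, k):
    """Return (statistic, p-value) for H0: all k slope coefficients are zero."""
    stat = (-2.0 * loglik_null) - (-2.0 * loglik_full)
    return stat, chi2_sf(stat, k)
```

A small p-value leads to rejecting $H_0$, that is, to concluding that at least one covariate helps explain the choice.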

Standard Errors of Logistic Regression Coefficients (Optional)
• Estimate of the information matrix, $I$ ($K = 2$)

Deviance Residuals and Deviance for Logistic Regression (Optional)
• Deviance (corresponds to SSE)
• Deviance residual

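B. Logistic Regression for Grouped Data Using WLS
• The observation for the $i$-th group: the number $r_i$ of decision makers, out of $n_i$, who choose category 1, which gives the observed proportion $\hat{p}_i = r_i / n_i$ and the observed logit $\ln\frac{\hat{p}_i}{1 - \hat{p}_i}$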

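WLS for Logistic Regression
• Regress the observed logit $\ln\frac{\hat{p}_i}{1 - \hat{p}_i}$ on $X_{1i}, \dots, X_{Ki}$ with weights $w_i = n_i \hat{p}_i (1 - \hat{p}_i)$, where $\hat{p}_i$ is the observed proportion choosing category 1 in group $i$ and $n_i$ is the group size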

WLS for Unequal Variance Data
[Figure: scatter plot of Y against X with observations 1 and 2 marked]
• Observation 2 is subject to a larger variance than observation 1, so it makes sense to give it a lower weight.
• In WLS, the weight is proportional to 1/variance.
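The weighting idea can be illustrated for grouped logistic data with a single covariate. The sketch below is an assumption-laden example, not the slides' own worksheet: the data are hypothetical, the response is the observed logit of each group, and the weight $n_i \hat{p}_i (1 - \hat{p}_i)$ is used because the variance of an observed logit is approximately its reciprocal. Every group must have $0 < r_i < n_i$ for the logit to exist.

```python
import math

# Hypothetical grouped data: (covariate value, group size n_i, number choosing category 1).
groups = [
    (1.0, 50, 8),
    (2.0, 50, 15),
    (3.0, 50, 27),
    (4.0, 50, 38),
]

def wls_logit(groups):
    """Weighted least squares of the observed logit on one covariate."""
    sw = swx = swz = swxx = swxz = 0.0
    for x, n, r in groups:
        p_hat = r / n                           # observed proportion
        z = math.log(p_hat / (1.0 - p_hat))     # observed logit
        w = n * p_hat * (1.0 - p_hat)           # weight, proportional to 1 / variance
        sw += w
        swx += w * x
        swz += w * z
        swxx += w * x * x
        swxz += w * x * z
    x_bar, z_bar = swx / sw, swz / sw           # weighted means
    b1 = (swxz - sw * x_bar * z_bar) / (swxx - sw * x_bar * x_bar)
    b0 = z_bar - b1 * x_bar
    return b0, b1

b0_wls, b1_wls = wls_logit(groups)
```

Groups with proportions near one half and large $n_i$ get the largest weights, since their observed logits are the most precisely estimated.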

Modeling of Forecasting Choices - GLM
• Model for observation of the dependent variable: a probability distribution
• Link function (model for independent variables): a mathematical function

Forecasting Choices
• Number of choices = 2: binomial distribution
• Number of choices > 2: multinomial distribution
  • Unordered
  • Ordered

Multinomial Logit Regression
• Multinomial choice ($m = 3$), ungrouped data:
  • $Y_1 = 1$: choose category 1, with probability $p_1$
  • $Y_1 = 0$: choose category 2 or 3, with probability $1 - p_1$
  • $Y_2 = 1$: choose category 2, with probability $p_2$
  • $Y_2 = 0$: choose category 1 or 3, with probability $1 - p_2$
  • $Y_3 = 1$: choose category 3, with probability $p_3$
  • $Y_3 = 0$: choose category 1 or 2, with probability $1 - p_3$

Log Likelihood Function
• Log-likelihood function of the $i$-th ungrouped observation: $L_i = Y_{1i} \ln p_{1i} + Y_{2i} \ln p_{2i} + Y_{3i} \ln p_{3i}$
• MLE: Maximize $\sum_i L_i$

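$Y_3$ and $p_3$ can be omitted
• Because the categories are mutually exclusive and collectively exhaustive, $Y_3 = 1 - Y_1 - Y_2$ and $p_3 = 1 - p_1 - p_2$.
• Multinomial choice ($m = 3$), ungrouped data:
  • $Y_1 = 1$: choose category 1, with probability $p_1$
  • $Y_1 = 0$: choose category 2 or 3, with probability $1 - p_1$
  • $Y_2 = 1$: choose category 2, with probability $p_2$
  • $Y_2 = 0$: choose category 1 or 3, with probability $1 - p_2$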

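Log Likelihood Function
• Log-likelihood function of the $i$-th (ungrouped) observation: $L_i = Y_{1i} \ln p_{1i} + Y_{2i} \ln p_{2i} + (1 - Y_{1i} - Y_{2i}) \ln(1 - p_{1i} - p_{2i})$
• MLE: Maximize $\sum_i L_i$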

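1: Formulating “Link” Functions: Unordered Choice Categories
• Category 3 as the baseline category:
  • $g_1 = \ln\frac{p_1}{p_3} = \beta_{01} + \beta_{11} X_1 + \dots + \beta_{K1} X_K$
  • $g_2 = \ln\frac{p_2}{p_3} = \beta_{02} + \beta_{12} X_1 + \dots + \beta_{K2} X_K$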

From Link Functions to Probabilities
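With category 3 as the baseline, the usual reading of the link functions is $g_j = \ln(p_j / p_3)$ for $j = 1, 2$, which gives $p_1 = \frac{\exp(g_1)}{1 + \exp(g_1) + \exp(g_2)}$, $p_2 = \frac{\exp(g_2)}{1 + \exp(g_1) + \exp(g_2)}$ and $p_3 = \frac{1}{1 + \exp(g_1) + \exp(g_2)}$. That reading is an assumption here, since the slide's own formulas were not preserved. A minimal Python sketch, with hypothetical function names:

```python
import math

def multinomial_probabilities(g1, g2):
    """Probabilities (p1, p2, p3) from link functions g1, g2 with category 3 as baseline."""
    denom = 1.0 + math.exp(g1) + math.exp(g2)
    return math.exp(g1) / denom, math.exp(g2) / denom, 1.0 / denom

def log_likelihood_i(y1, y2, g1, g2):
    """Log-likelihood of one ungrouped observation; Y3 is implied by 1 - y1 - y2."""
    p1, p2, p3 = multinomial_probabilities(g1, g2)
    return y1 * math.log(p1) + y2 * math.log(p2) + (1 - y1 - y2) * math.log(p3)
```

When both link functions are zero, the three categories are equally likely, which is a convenient sanity check for a worksheet or script.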

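Test of Significance
• Hypotheses:
  • $H_0$: $\beta_{11} = \beta_{21} = \dots = \beta_{K1} = \beta_{12} = \beta_{22} = \dots = \beta_{K2} = 0$
  • $H_1$: at least one $\beta_{ij} \neq 0$
• Test statistic: the difference in $-2 \sum_i L_i$ between the intercept-only model and the full model
• Distribution under $H_0$: $\chi^2$ with $DF = 2K$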

Interpreting Coefficients
• Not easy, because a change in the probability of one category affects the probabilities of the other (two) categories.

2: Formulating Link Functions: Ordered Choice Categories
• An underlying variable $U$ defines the categories through two cut points $g_1 < g_2$:
  • Category 1: $U \le g_1$
  • Category 2: $g_1 < U \le g_2$
  • Category 3: $U > g_2$

Choices for Probability Distribution of $U$
• a. Ordered probit model for the $i$-th DM: $U_i$ follows $N(\mu_i, \sigma = 1)$
• b. Ordered logit model for the $i$-th DM: $U_i$ follows Logistic Distribution($\mu_i$)
• $\mu_i = \beta_1 X_{1i} + \beta_2 X_{2i}$ (no constant)

a. Ordered Probit Model

b. Ordered Logit Model
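Both ordered models turn the location $\mu_i$ and two cut points into three category probabilities by differencing a cumulative distribution function $F$: $P(\text{category 1}) = F(g_1 - \mu_i)$, $P(\text{category 2}) = F(g_2 - \mu_i) - F(g_1 - \mu_i)$ and $P(\text{category 3}) = 1 - F(g_2 - \mu_i)$. The sketch below assumes that category 1 lies below the first cut point and category 3 above the second, and uses hypothetical function names; $F$ is the standard normal CDF for the probit model and the standard logistic CDF for the logit model.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, for the ordered probit model."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def logistic_cdf(z):
    """Standard logistic CDF, for the ordered logit model."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_probabilities(mu, cut1, cut2, cdf):
    """Probabilities of categories 1, 2, 3 for location mu and cut points cut1 < cut2."""
    f1 = cdf(cut1 - mu)
    f2 = cdf(cut2 - mu)
    return f1, f2 - f1, 1.0 - f2

probit_probs = ordered_probabilities(0.3, -1.0, 1.0, normal_cdf)
logit_probs = ordered_probabilities(0.3, -1.0, 1.0, logistic_cdf)
```

Raising $\mu_i$ shifts probability from category 1 toward category 3 in both models; the two differ only in the shape of the tails of $F$.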

Types of Variable
• Quantitative
  • Continuous
  • Discrete (counting)
• Qualitative
  • Ordinal
  • Nominal

Poisson Regression for Counting
• Model of observations for $Y$
• Link function
• Log-likelihood function
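The three ingredients can be sketched as follows, under assumptions that the transcript does not confirm: $Y_i$ is taken to be Poisson with mean $\lambda_i$, the link is taken to be the log link $\ln \lambda_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_K X_{Ki}$, and the per-observation log-likelihood is then $L_i = Y_i \ln \lambda_i - \lambda_i - \ln(Y_i!)$. The function name and argument layout are hypothetical.

```python
import math

def poisson_log_likelihood(beta, xs, ys):
    """Sum of L_i = y_i * ln(lambda_i) - lambda_i - ln(y_i!) under a log link.

    beta = [b0, b1, ..., bK]; xs is a list of covariate lists; ys is a list of counts.
    """
    total = 0.0
    for x, y in zip(xs, ys):
        log_lam = beta[0] + sum(b * xk for b, xk in zip(beta[1:], x))  # link function
        lam = math.exp(log_lam)                                        # Poisson mean
        total += y * log_lam - lam - math.lgamma(y + 1)                # L_i
    return total
```

As with the logistic worksheet, the parameters would be found by maximizing this sum, for instance with a spreadsheet solver or a numerical optimizer.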
