Marketing Research

Marketing Research Aaker, Kumar, Day and Leone Tenth Edition Instructor’s Presentation Slides

Chapter Twenty: Discriminant and Canonical Analysis

Discriminant Analysis: Major Uses
• Prediction: used to classify individuals into one of two or more alternative groups on the basis of a set of measurements
• Description: used to identify variables that discriminate between naturally occurring groups

Objectives of Discriminant Analysis
• Determining linear combinations of the predictor variables that separate groups by measuring between-group variation relative to within-group variation
• Developing procedures for assigning new objects, firms, or individuals, whose profiles but not group identities are known, to one of the two groups
• Testing whether significant differences exist between the two groups based on the group centroids
• Determining which variables count most in explaining inter-group differences

Basic Concept
If we can assume that the two populations have the same variance, then the usual value of C is
C = (X̄_I + X̄_II) / 2
where X̄_I and X̄_II are the mean values for the two groups, respectively.
[Figure: Distribution of two populations]

Discriminant Function
Zi = b1 X1 + b2 X2 + b3 X3 + ... + bn Xn
where
Z = discriminant score
b = discriminant weights
X = predictor (independent) variables
• In a particular group, each individual i has a discriminant score (Zi)
• Centroid (group mean) = Σ Zi / N, where the sum runs over the N individuals i in the group
• The centroid indicates the most typical location of an individual from a particular group
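A minimal sketch of how discriminant scores and group centroids follow from the function above. The weights and predictor values are hypothetical, chosen only for illustration; they are not taken from the slides or from the export data set.

```python
import numpy as np

# Hypothetical discriminant weights b1, b2 (illustrative, not estimated from data).
b = np.array([0.5, -0.2])

# Hypothetical predictor profiles: rows are individuals, columns are X1 and X2.
group_a = np.array([[4.0, 1.0], [5.0, 2.0], [6.0, 3.0]])
group_b = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])


def discriminant_scores(X, weights):
    """Return Zi = b1*X1 + b2*X2 + ... for every row (individual) of X."""
    return X @ weights


z_a = discriminant_scores(group_a, b)
z_b = discriminant_scores(group_b, b)

# The centroid of a group is the mean discriminant score of its members.
centroid_a = z_a.mean()
centroid_b = z_b.mean()

print("Group A scores:", z_a, "centroid:", centroid_a)
print("Group B scores:", z_b, "centroid:", centroid_b)
```

The further apart the two centroids are relative to the spread of scores within each group, the better the function separates the groups.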

Discriminant Function – A Graphical Illustration

Cut-off Score
Criterion against which each individual’s discriminant score is judged to determine into which group the individual should be classified
• For equal group sizes (formula not preserved)
• For unequal group sizes (formula not preserved)
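As a hedged illustration of how such cut-off scores are often computed: with equal group sizes the cut-off is commonly taken as the midpoint of the two group centroids, and with unequal group sizes one commonly cited form weights each centroid by the other group's size. Textbooks differ on the weighted form, so the second function below is an assumption rather than the formula from this slide; the function names, centroids, and group sizes are hypothetical.

```python
def cutoff_equal_sizes(centroid_a, centroid_b):
    """Midpoint between the two group centroids."""
    return (centroid_a + centroid_b) / 2.0


def cutoff_unequal_sizes(centroid_a, centroid_b, n_a, n_b):
    """Size-weighted cut-off (assumed form): each centroid is weighted
    by the size of the other group."""
    return (n_a * centroid_b + n_b * centroid_a) / (n_a + n_b)


# Hypothetical centroids and group sizes, for illustration only.
print(cutoff_equal_sizes(2.1, 0.0))
print(cutoff_unequal_sizes(2.1, 0.0, n_a=30, n_b=90))
```

Under this weighting the cut-off moves toward the centroid of the smaller group, so a larger share of the score range is assigned to the larger group. When the two groups are the same size it reduces to the midpoint.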

Determination of Significance
• Null hypothesis: in the population, the group means of the discriminant function are equal
H0: μA = μB
• Generally, predictors with relatively large standardized coefficients contribute more to the discriminating power of the function
• Canonical or discriminant loadings show the variance that the predictor shares with the function

Classification and Validation
Holdout Method
• Uses part of the sample to construct the classification rule; the other subsample is used for validation
• Uses the discriminant weights to generate discriminant scores for cases in the holdout subsample
• Uses the classification matrix and hit ratio to evaluate group classification
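A small sketch of the evaluation step of the holdout method: building a classification matrix and computing the hit ratio. The holdout scores, actual group labels, and cut-off value are hypothetical, and the helper names are introduced only for this example.

```python
import numpy as np


def classification_matrix(actual, predicted, labels):
    """Count cases by actual group (rows) and predicted group (columns)."""
    index = {label: k for k, label in enumerate(labels)}
    matrix = np.zeros((len(labels), len(labels)), dtype=int)
    for a, p in zip(actual, predicted):
        matrix[index[a], index[p]] += 1
    return matrix


def hit_ratio(matrix):
    """Proportion of cases classified correctly (diagonal total / all cases)."""
    return np.trace(matrix) / matrix.sum()


# Hypothetical holdout cases, scored with weights estimated on the analysis sample.
holdout_scores = np.array([2.3, 1.9, 0.4, -0.2, 1.2, 0.1])
holdout_actual = ["A", "A", "A", "B", "B", "B"]
cutoff = 1.05  # hypothetical cut-off score

# Classify each holdout case by comparing its score with the cut-off.
holdout_predicted = ["A" if z > cutoff else "B" for z in holdout_scores]

cm = classification_matrix(holdout_actual, holdout_predicted, ["A", "B"])
print(cm)
print("Hit ratio:", hit_ratio(cm))
```

Because the holdout cases played no part in estimating the weights, the hit ratio computed this way is a less optimistic estimate than one computed on the analysis sample itself.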

Classification and Validation (Contd.)
U-method or Cross Validation
• Uses all available data without serious bias in estimating error rates
• Estimated classification error rates:
P1 = m1 / n1
P2 = m2 / n2
where m1 and m2 = number of sample observations misclassified in groups G1 and G2, and n1 and n2 = number of sample observations in G1 and G2
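The U-method can be sketched as a leave-one-out loop: each observation is held out in turn, the classification rule is rebuilt from the remaining cases, and the held-out case is then classified. To keep the example short, the rule here is a simplification (a single predictor, assigned to the nearer group mean) rather than a full discriminant function, and the data values are hypothetical.

```python
import numpy as np


def u_method_error_rates(g1, g2):
    """Leave-one-out estimates P1 = m1/n1 and P2 = m2/n2 for a
    nearest-group-mean rule on one predictor (a simplified stand-in
    for a discriminant function)."""
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)

    def misclassified(own, other):
        count = 0
        for i in range(len(own)):
            # Rebuild the rule without the held-out observation.
            own_mean = np.delete(own, i).mean()
            other_mean = other.mean()
            # The held-out case is misclassified if it is not strictly
            # closer to its own group's mean.
            if abs(own[i] - own_mean) >= abs(own[i] - other_mean):
                count += 1
        return count

    m1 = misclassified(g1, g2)
    m2 = misclassified(g2, g1)
    return m1 / len(g1), m2 / len(g2)


# Hypothetical values of one predictor in groups G1 and G2.
p1, p2 = u_method_error_rates([1.0, 2.0, 3.0, 6.0], [5.0, 7.0, 8.0, 9.0])
print("P1 =", p1, "P2 =", p2)
```

Every observation is used both for estimation and, once, for validation, which is why no separate holdout sample is needed.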

Steps in Discriminant Analysis
1. Form groups
2. Estimate discriminant function
3. Determine significance of function and variables
4. Interpret the discriminant function
5. Perform classification and validation

Export Data Set
Respid  Will(y1)  Govt(y2)  Train(x5)  Size(x1)  Exp(x6)  Rev(x2)  Years(x3)  Prod(x4)
1       4         5         1          49        1        1000     5.5        6
2       3         4         1          46        1        1000     6.5        4
3       5         4         1          54        1        1000     6.0        7
4       2         3         1          31        0        3000     6.0        5
5       4         3         1          50        1        2000     6.5        7
6       5         4         1          69        1        1000     5.5        9
.       .         .         .          .         .        .        .          .
115     4         3         1          45        1        2000     6.0        6
116     5         4         1          44        1        2000     5.8        11
117     3         4         1          46        0        1000     7.0        3
118     3         4         1          54        1        1000     7.0        4
119     4         3         1          49        1        1000     6.5        7
120     4         5         1          54        1        4000     6.5        7

Description of Variables

Export Data Set – Discriminant Analysis Results

Discriminant Analysis Results (Contd.)

Multiple Discriminant Analysis
Number of possible discriminant functions = min(p, m - 1)
where
m = number of groups
p = number of predictor variables
Assumptions Underlying the Discriminant Function
1. The p independent variables must have a multivariate normal distribution
2. The p x p variance–covariance matrix of the independent variables in each of the two groups must be the same

Canonical Correlation Analysis
• Canonical correlation analysis is a multivariate statistical model that facilitates the study of interrelationships among sets of multiple dependent variables and multiple independent variables.
• The variables in each set are combined to form linear composites such that the correlation between these linear composites (canonical variates) is maximized.
Y1 + Y2 + ... + Yn = X1 + X2 + ... + Xn

Objectives of Canonical Correlation Analysis
• To determine whether two sets of variables are independent of one another and to estimate the magnitude of the relationship between the two sets
• To derive a set of weights for each set (dependent and independent) of variables so that the linear combinations of each set are maximally correlated
• To explain the nature of the relationships among the sets of variables by measuring the relative importance of each variable to the canonical functions (relationships)
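One way to see what "maximally correlated linear combinations" means is to compute the canonical correlations directly. The sketch below uses a standard numerical route (QR decomposition of each centered variable set, then a singular value decomposition); it is not the CANCORR procedure itself, it assumes each set has full column rank, and the data are simulated rather than taken from the export data set.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between the column sets X and Y.

    Assumes more observations than variables and full column rank."""
    Xc = X - X.mean(axis=0)  # center each variable
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)  # orthonormal basis for the X set
    Qy, _ = np.linalg.qr(Yc)  # orthonormal basis for the Y set
    # Singular values of Qx'Qy are the canonical correlations, largest first.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against rounding just above 1


# Simulated data: 120 observations, 6 independent and 2 dependent variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))
Y = np.column_stack([
    X[:, 0] + 0.5 * rng.normal(size=120),  # related to the first X variable
    rng.normal(size=120),                  # unrelated noise
])

r = canonical_correlations(X, Y)
print(r)
```

The number of correlations returned equals the number of variables in the smaller set, here two. With a single variable in each set the result reduces to the absolute value of the ordinary correlation.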

Canonical Loadings and Roots
• Canonical loadings, or canonical structure coefficients, measure the simple correlation between an original observed variable in the dependent or independent set and that set’s canonical variate (linear composite). A loading reflects the variance that the original variable shares with the canonical variate, or the relative contribution of each variable to the canonical function.
• Canonical roots, or eigenvalues, are the squared canonical correlations (i.e., the squared correlation between the dependent and independent canonical variates). A root reflects the percentage of variance in the dependent canonical variate that can be explained by the independent canonical variate.

Interpreting Canonical Functions
• The sign and magnitude of the canonical weights (standardized coefficients) on each canonical function help to identify the relative importance of each variable in deriving the canonical relationships.
• The maximum number of canonical functions that can be extracted equals the number of variables in the smaller variable set (independent or dependent).
• The redundancy index is the amount of variance in a canonical variate explained by the other canonical variate in the canonical function. It is obtained by multiplying the shared variance of the variate by the squared canonical correlation, and it helps to overcome the bias and uncertainty in using canonical roots as a measure of shared variance.
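A short worked sketch of the redundancy index for the first dependent variate. The loadings of y1 and y2 and the first squared canonical correlation are copied from the CANCORR output reported in this deck; the shared variance is taken here as the average squared loading, which is the usual definition but is an assumption as far as these slides go. The function name is hypothetical.

```python
# Loadings of y1 and y2 on the first dependent variate (attitude1),
# copied from the canonical correlation results in this deck.
loadings_attitude1 = [0.9891, 0.7798]

# Squared canonical correlation of the first canonical function (same output).
squared_canonical_corr_1 = 0.735649


def redundancy_index(loadings, squared_canonical_corr):
    """Average squared loading (shared variance) times the squared
    canonical correlation."""
    shared_variance = sum(l ** 2 for l in loadings) / len(loadings)
    return shared_variance * squared_canonical_corr


redundancy = redundancy_index(loadings_attitude1, squared_canonical_corr_1)
print(round(redundancy, 4))
```

For these values the index comes out at roughly 0.58, noticeably lower than the canonical root of about 0.74, which illustrates why the root alone can overstate the variance shared between the two sets.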

Limitations of Canonical Correlation Analysis
• Procedures that maximize the correlation do not necessarily maximize the interpretability of the pairs of canonical variates; canonical solutions are therefore not easily interpretable.
• Rotation of canonical variates (as in factor analysis) to improve interpretability is not common practice and is not available in most computer programs.
• If a non-linear relationship between the dimensions in a pair is suspected, canonical correlation may be inappropriate unless the variables are transformed or combined to capture the non-linear relationship.
• Only an orthogonal solution is normally available.
• Changing a variable in one set significantly alters the composition of the canonical variate in the other set.
• It is only a correlational technique; it does not establish a causal relationship.

Export Data Set – Canonical Correlation Results
The CANCORR Procedure
Attitude to Exporting    2
Firm characteristics     6
Observations             120
     Canonical     Adjusted Canonical   Approximate      Squared Canonical
     Correlation   Correlation          Standard Error   Correlation
1    0.857700      0.850646             0.024233         0.735649
2    0.434392      0.405915             0.074372         0.188697

Canonical Correlation Results (Contd.)
Raw Canonical Coefficients for the Attitude to Exporting
      attitude1       attitude2
y1    0.663025751     -0.825828605
y2    0.1747547312    1.1757781282
Raw Canonical Coefficients for the Firm characteristics
      demographics1   demographics2
x1    0.0590789526    0.03138617
x2    0.0001734106    0.0009537723
x3    -0.372885396    0.1278689212
x4    0.1427469498    -0.150119835
x5    0.1194923096    0.4450507388
x6    0.0015015543    -0.164606455

Canonical Correlation Results (Contd.)
Standardized Canonical Coefficients for the Attitude to Exporting
      attitude1       attitude2
y1    0.8531          -1.0625
y2    0.2003          1.3478
Standardized Canonical Coefficients for the Firm characteristics
      demographics1   demographics2
x1    0.6122          0.3252
x2    0.1641          0.9028
x3    -0.3222         0.1105
x4    0.3605          -0.3791
x5    0.0550          0.2048
x6    0.0007          -0.0738

Canonical Correlation Results (Contd.)
Correlations Between the Attitude to Exporting and Their Canonical Variables
      attitude1       attitude2
y1    0.9891          -0.1470
y2    0.7798          0.6261
Correlations Between the Firm characteristics and Their Canonical Variables
      demographics1   demographics2
x1    0.8771          0.0208
x2    0.0223          0.9038
x3    -0.4618         0.4067
x4    0.7944          -0.1369
x5    0.4331          0.3525
x6    0.5672          -0.1114
Correlations Between the Attitude to Exporting and the Canonical Variables of the Firm characteristics
      demographics1   demographics2
y1    0.8484          -0.0639
y2    0.6688          0.2720
Correlations Between the Firm characteristics and the Canonical Variables of the Attitude to Exporting
      attitude1       attitude2
x1    0.7523          0.0090
x2    0.0191          0.3926
x3    -0.3961         0.1767
x4    0.6814          -0.0595
x5    0.3714          0.1531
x6    0.4865          -0.04
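The correlation tables above are linked by a simple identity: the correlation of a variable with the other set's canonical variate equals its loading on its own set's variate multiplied by the canonical correlation of that function. The sketch below checks this for y1 and y2, using only values copied from the output in this deck; the helper name is hypothetical.

```python
# Canonical correlations of functions 1 and 2, copied from the CANCORR output.
canonical_corr = [0.857700, 0.434392]

# Loadings of y1 and y2 on their own variates (attitude1, attitude2).
own_loadings = {"y1": [0.9891, -0.1470], "y2": [0.7798, 0.6261]}

# Reported correlations of y1 and y2 with the firm-characteristics variates.
reported_cross = {"y1": [0.8484, -0.0639], "y2": [0.6688, 0.2720]}


def cross_loadings(loadings, correlations):
    """Own-set loading times the canonical correlation, per function."""
    return [l * r for l, r in zip(loadings, correlations)]


computed_cross = {
    name: cross_loadings(values, canonical_corr)
    for name, values in own_loadings.items()
}

for name, values in computed_cross.items():
    print(name, [round(v, 4) for v in values], "reported:", reported_cross[name])
```

The computed values agree with the reported cross-correlations to within rounding, which is a quick consistency check when reading output of this kind.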
