
Discriminant Analysis

Discriminant analysis is useful for classifying a sampling unit into one group or another. The discriminant function is a linear combination of several predictor variables, chosen to maximize the between-group variation relative to the within-group variation.


Presentation Transcript


  1. Discriminant Analysis • Useful for classifying a sampling unit into one group or another. • The discriminant function is a linear combination of several predictor variables. • The discriminant function maximizes the between-group variation relative to the within-group variation.
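
To make the "linear combination" concrete, here is a minimal sketch of Fisher's two-group discriminant for two predictors. The data, the function names, and the midpoint cutoff (which assumes equal priors and equal misclassification costs) are illustrative assumptions, not part of the slides.

```python
# Two-group, two-predictor Fisher discriminant on made-up data.

def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, m):
    # 2x2 sum of squares and cross-products about the group mean
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - m[0], r[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def score(w, x):
    # the discriminant function: a linear combination of the predictors
    return w[0] * x[0] + w[1] * x[1]

def fisher_weights(group_a, group_b):
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    # pooled within-group covariance matrix
    sw = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = (ma[0] - mb[0], ma[1] - mb[1])
    # weights = inverse(pooled covariance) times the difference in means
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # cutoff halfway between the two group mean scores (assumed rule)
    cutoff = 0.5 * (score(w, ma) + score(w, mb))
    return w, cutoff

def classify(w, cutoff, x):
    return "A" if score(w, x) > cutoff else "B"

group_a = [(6.0, 3.0), (7.0, 4.5), (8.0, 4.0), (7.5, 3.5)]
group_b = [(2.0, 1.0), (3.0, 2.5), (4.0, 2.0), (3.5, 1.5)]
w, cutoff = fisher_weights(group_a, group_b)
print("weights:", w, "cutoff:", cutoff)
print(classify(w, cutoff, (7.0, 4.0)), classify(w, cutoff, (3.0, 1.5)))
```

The weights point in the direction that separates the group means most strongly relative to the pooled within-group spread.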

  2. A test on the discriminant function • When the test statistic of the discriminant model is significant, we reject H0: the group means are equal.
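
One common way to carry out such a test is Wilks' lambda with Bartlett's chi-square approximation. The slide does not say which statistic is meant, so the sketch below is an assumption; it is restricted to two predictors so that the determinants and the chi-square tail probability can be written without external libraries. The data are made up.

```python
import math

def mean(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]

def sscp(rows, center):
    # 2x2 sum of squares and cross-products about `center`
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - center[0], r[1] - center[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def wilks_lambda_test(groups):
    """Return (lambda, chi2, df, p_value) for two predictors and g groups."""
    all_rows = [r for rows in groups for r in rows]
    n, p, g = len(all_rows), 2, len(groups)
    total = sscp(all_rows, mean(all_rows))
    within = [[0.0, 0.0], [0.0, 0.0]]
    for rows in groups:
        s = sscp(rows, mean(rows))
        for i in range(2):
            for j in range(2):
                within[i][j] += s[i][j]
    lam = det2(within) / det2(total)          # small lambda = well-separated groups
    chi2 = -(n - 1 - (p + g) / 2) * math.log(lam)  # Bartlett's approximation
    df = p * (g - 1)
    # The chi-square upper tail has a closed form when df is even (always so for p = 2).
    half = chi2 / 2
    p_value = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))
    return lam, chi2, df, p_value

group_a = [(6.0, 3.0), (7.0, 4.5), (8.0, 4.0), (7.5, 3.5)]
group_b = [(2.0, 1.0), (3.0, 2.5), (4.0, 2.0), (3.5, 1.5)]
lam, chi2, df, p_value = wilks_lambda_test([group_a, group_b])
print(lam, chi2, df, p_value)
```

A small p-value here leads to rejecting H0 that the group mean vectors are equal.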

  3. What else do we get out of discriminant analysis? • Which predictors differ between the groups? • Which groups differ from one another?

  4. When is Discriminant Analysis useful? • When you want to profile the sampling units. • When you want to predict “bank failures”. • When you want to assess which customers are credit risks. • When you want to screen for women susceptible to breast cancer, for example. • In each case, the aim is to increase the number of correct classifications.

  5. How does it work? • In seven stages.

  6. Stage 1: Objectives • Determine which groups are statistically different. • Identify the predictors that contribute most to the group differences. • Establish classification rules. • Develop the discriminant functions.

  7. Stage 2: Designing the analysis • The “response variable” must be categorical. • Categories must be mutually exclusive. • Decide on: • the number of categories, • the predictor variables, • the sample size [rule of thumb: sample size > 20 times the number of predictors, with at least 20 observations in each category]. • Divide the data into two segments: build the discriminant model on one segment and validate it on the other.
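
The design checks above can be sketched as follows. The function names, the 50/50 split, and the choice to split within each group are assumptions for illustration; the thresholds of 20 come from the slide's rule of thumb.

```python
import random

def check_sample_size(labels, n_predictors, per_predictor=20, per_group=20):
    """Apply the rule of thumb; return a list of warnings (empty if all is well)."""
    warnings = []
    if len(labels) <= per_predictor * n_predictors:
        warnings.append("total sample size is not above 20 x number of predictors")
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    for y, c in sorted(counts.items()):
        if c < per_group:
            warnings.append(f"group {y!r} has only {c} observations")
    return warnings

def split_by_group(rows, labels, holdout_fraction=0.5, seed=0):
    """Randomly split each group into an analysis part and a holdout part."""
    rng = random.Random(seed)
    by_group = {}
    for row, y in zip(rows, labels):
        by_group.setdefault(y, []).append(row)
    analysis, holdout = [], []
    for y, members in by_group.items():
        members = members[:]              # do not shuffle the caller's list
        rng.shuffle(members)
        k = int(round(len(members) * holdout_fraction))
        holdout += [(m, y) for m in members[:k]]
        analysis += [(m, y) for m in members[k:]]
    return analysis, holdout

# Hypothetical data set: 100 units, 3 predictors, two groups of 30 and 70.
rows = list(range(100))
labels = ["fail"] * 30 + ["ok"] * 70
print(check_sample_size(labels, n_predictors=3))
analysis, holdout = split_by_group(rows, labels)
print(len(analysis), len(holdout))
```

Splitting within each group keeps the group proportions similar in both segments, which matters when one category is rare.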

  8. Stage 3: Check the validity of the assumptions • The “response variable” has at least two groups. • The predictors follow a multivariate normal distribution. • The unknown covariance matrices of the groups should be equal [check with Box’s M test]. • Apply remedies for any violation of the assumptions: • increase the sample size, • transform the data, • consider a quadratic rather than a linear discriminant function, • eliminate multicollinearity among the predictors, • remove outliers or overly influential observations.
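
As a rough illustration of the equal-covariance check, the sketch below computes Box's M and its usual chi-square approximation for two predictors. The data and function names are made up, and the test is known to be sensitive to departures from normality, so treat the result as a guide rather than a verdict.

```python
import math

def cov2(rows):
    # 2x2 sample covariance matrix (divisor n - 1)
    n = len(rows)
    m = [sum(r[j] for r in rows) / n for j in range(2)]
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - m[0], r[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j] / (n - 1)
    return s

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def box_m(groups):
    """Return (M, chi2, df) for two predictors and g groups."""
    p, g = 2, len(groups)
    n_total = sum(len(rows) for rows in groups)
    covs = [cov2(rows) for rows in groups]
    pooled = [[sum((len(rows) - 1) * c[i][j] for rows, c in zip(groups, covs))
               / (n_total - g) for j in range(2)] for i in range(2)]
    m_stat = (n_total - g) * math.log(det2(pooled)) - sum(
        (len(rows) - 1) * math.log(det2(c)) for rows, c in zip(groups, covs))
    # small-sample correction factor for the chi-square approximation
    c1 = (sum(1 / (len(rows) - 1) for rows in groups) - 1 / (n_total - g)) \
        * (2 * p * p + 3 * p - 1) / (6 * (p + 1) * (g - 1))
    chi2 = m_stat * (1 - c1)
    df = p * (p + 1) * (g - 1) // 2
    return m_stat, chi2, df

group_a = [(6.0, 3.0), (7.0, 4.5), (8.0, 4.0), (7.5, 3.5)]
group_c = [(2.0, 1.0), (5.0, 2.5), (8.0, 2.0), (3.5, 6.0)]  # visibly wider spread
m_stat, chi2, df = box_m([group_a, group_c])
# Compare chi2 with the chi-square critical value for df = 3 (7.815 at alpha = 0.05).
print(m_stat, chi2, df)
```

If the test rejects equality, the quadratic discriminant function listed among the remedies is the usual alternative.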

  9. Stage 4: Estimate the model and assess its fit • Use either the simultaneous method [which uses all predictors, even weak ones] or the stepwise method [which uses only the best predictors] of estimation. • Use one of the following criteria to assess the fit: • Wilks’ lambda, • Hotelling’s trace, • Pillai’s trace, • Roy’s greatest eigenvalue, • Mahalanobis distance, • Rao’s V measure, • Press’s Q statistic, based on the number of correct classifications [works well for large samples].
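
Of the criteria listed, Press's Q is the simplest to compute by hand. A minimal sketch, using the usual form Q = (N - nK)^2 / (N(K - 1)) with N units, n correct classifications, and K groups; the counts below are hypothetical.

```python
def press_q(n_total, n_correct, n_groups):
    """Press's Q = (N - n*K)^2 / (N*(K - 1))."""
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))

CHI2_CRIT_1DF_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05

# Hypothetical result: 80 of 100 units classified correctly into 2 groups.
q = press_q(n_total=100, n_correct=80, n_groups=2)
print(q, q > CHI2_CRIT_1DF_05)
```

Q is compared with a chi-square critical value on 1 degree of freedom; a value above it suggests the classification beats chance.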

  10. Stage 5: Interpretation • Interpret the “best” predictors for classification using: • discriminant weights, • discriminant loadings [= correlation between the discriminant function and a predictor], • F values, with larger values signifying greater discriminating ability. • As in factor analysis, rotate the axes when there are several discriminant functions. • Use the “potency index” to identify which predictors discriminate best across several discriminant functions. • The longer the “vector” from the origin to a discriminant loading, the more important the predictor.
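
Discriminant loadings, as defined above, can be sketched as plain correlations between each predictor and the discriminant scores. The weights and data below are hypothetical, and the sketch uses total-sample correlations; statistical packages often report pooled within-group correlations instead, so the numbers need not match their output.

```python
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def discriminant_loadings(rows, weights):
    """Correlation of each predictor with the discriminant scores."""
    scores = [sum(w * v for w, v in zip(weights, r)) for r in rows]
    return [pearson([r[j] for r in rows], scores) for j in range(len(weights))]

rows = [(6.0, 3.0), (7.0, 4.5), (8.0, 4.0), (7.5, 3.5),
        (2.0, 1.0), (3.0, 2.5), (4.0, 2.0), (3.5, 1.5)]
weights = (1.0, 0.25)  # hypothetical discriminant weights
print(discriminant_loadings(rows, weights))
```

Loadings are often preferred to raw weights for interpretation because weights can be distorted when predictors are correlated.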

  11. Stage 6: Validation methods • Split the data into two segments. • Collect new data and see how the discriminant function performs on them. • Profile the groups on several new predictors that were not used in the model.
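
Whichever validation data are used, performance is usually summarized by a classification matrix and a hit ratio. The sketch below also compares the hit ratio with two chance benchmarks (the largest group share and the sum of squared group shares); those benchmarks and the labels are assumptions added for illustration, not something the slides specify.

```python
def hit_ratio(actual, predicted):
    """Share of units assigned to their true group."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def classification_matrix(actual, predicted):
    """Counts keyed by (actual group, predicted group)."""
    table = {}
    for a, p in zip(actual, predicted):
        table[(a, p)] = table.get((a, p), 0) + 1
    return table

def chance_criteria(actual):
    """Return (maximum chance, proportional chance) from the group shares."""
    n = len(actual)
    shares = [actual.count(g) / n for g in set(actual)]
    return max(shares), sum(s * s for s in shares)

# Hypothetical holdout results for ten units.
actual = ["A"] * 6 + ["B"] * 4
predicted = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "A"]
print(classification_matrix(actual, predicted))
print(hit_ratio(actual, predicted), chance_criteria(actual))
```

A hit ratio that does not clearly exceed both chance benchmarks on the holdout segment suggests the discriminant function adds little.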

  12. What follows? • An example, • Comments, • Questions. • Thank you!
