Flexible Discriminant Analysis by Optimal Scoring. 張志豪.
Outline • Introduction • Generalizing Linear Discriminant Analysis • Example
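Introduction • Main idea: • Linear discriminant analysis is equivalent to multi-response linear regression using optimal scorings to represent the groups. • In this way, any multi-response regression technique can be post-processed to improve its classification performance. • We obtain nonparametric versions of discriminant analysis by replacing linear regression with any nonparametric regression method. • Regression analysis: • Examines the relationships among variables and finds a suitable mathematical equation to express them, then uses that equation to make predictions. • Predicts one variable from another. Regression analysis builds on correlation analysis, because the reliability of any prediction depends on the strength of the relationship between the variables.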
Linear Discriminant Analysis • A test observation with predictor X is classified to the class whose centroid is closest to X (in Mahalanobis distance). • Assumptions: • Each class follows a multivariate Gaussian distribution. • The classes have different means but a common covariance matrix. • The class prior probabilities are equal.
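The nearest-centroid rule above can be sketched in a few lines of NumPy. This is only an illustration under the slide's assumptions (common covariance, equal priors). The function names `fit_lda` and `predict_lda` and the synthetic three-class data are hypothetical and are not taken from the slides.

```python
import numpy as np

def fit_lda(X, g, n_classes):
    """Estimate class centroids and the pooled (common) within-class covariance."""
    N, p = X.shape
    centroids = np.vstack([X[g == j].mean(axis=0) for j in range(n_classes)])
    resid = X - centroids[g]                  # deviation of each row from its own centroid
    cov = resid.T @ resid / (N - n_classes)   # pooled covariance estimate
    return centroids, cov

def predict_lda(X, centroids, cov):
    """Assign each row of X to the centroid closest in Mahalanobis distance."""
    cov_inv = np.linalg.inv(cov)
    diff = X[:, None, :] - centroids[None, :, :]            # shape (N, J, p)
    d2 = np.einsum("njp,pq,njq->nj", diff, cov_inv, diff)   # squared distances
    return d2.argmin(axis=1)

# Synthetic data: three Gaussian classes with a shared identity covariance.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
g = np.repeat(np.arange(3), 50)
X = means[g] + rng.normal(size=(150, 2))

centroids, cov = fit_lda(X, g, 3)
pred = predict_lda(X, centroids, cov)
accuracy = (pred == g).mean()
```

Because the priors are assumed equal, no log-prior term is added to the distances. With unequal priors the rule would need that correction.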
Linear Discriminant Analysis • Characteristics: • All the relevant distance information is contained in the at most (J-1)-dimensional subspace of $\mathbb{R}^p$ spanned by the J group centroids. • The decision boundaries are linear. • The dimension-reduced model can show better classification performance.
Linear Discriminant Analysis • Review: • The first few dimensions carry most of the classification information.
Linear Discriminant Analysis • Drawbacks: • In practice linear decision boundaries are often too crude, and nonlinear boundaries can be more effective. • The Gaussian assumptions are rarely met, and sometimes a group might even be disjoint. • Using different class-covariance matrices in the Bayesian procedure results in quadratic decision boundaries and adds $O(Jp^2)$ parameters to the model, far too many if p is large.
Linear Discriminant Analysis • The transformed variables $U^T x$ have identity covariance within groups, and the Mahalanobis distance from x to the jth centroid $\bar{x}_j$ in this space is simply the squared Euclidean distance $\|U^T x - U^T \bar{x}_j\|^2$.
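A small numerical check of this statement, as a sketch only: it assumes U is a full-rank matrix satisfying $U^T W U = I$, where W is the common within-class covariance. The symmetric inverse square root used here is one convenient choice, not necessarily the U on the slide. The matrix W, the point x, and the centroid are random illustrative values.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
W = A @ A.T + 3 * np.eye(3)      # symmetric positive-definite "within" covariance

# Build U = W^{-1/2} from the eigen-decomposition of W, so that U^T W U = I.
evals, evecs = np.linalg.eigh(W)
U = evecs @ np.diag(evals ** -0.5) @ evecs.T

x = rng.normal(size=3)           # a test observation
centroid = rng.normal(size=3)    # a class centroid

# Squared Mahalanobis distance in the original space ...
mahalanobis2 = (x - centroid) @ np.linalg.inv(W) @ (x - centroid)
# ... versus squared Euclidean distance after the transformation U^T.
euclidean2 = np.sum((U.T @ x - U.T @ centroid) ** 2)
```

The two quantities agree up to floating-point error, since $U U^T = W^{-1}$ for this choice of U.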
Linear Discriminant Analysis • Suppose $\theta$ is a function that assigns scores to the classes, such that the transformed class labels are optimally predicted by linear regression on X. This produces a one-dimensional separation between the classes. • More generally, we can find K sets of independent scorings for the class labels, $\theta_1, \ldots, \theta_K$, and K corresponding linear maps $\eta_k(X) = X^T \beta_k$, k=1,...,K, chosen to be optimal for multiple regression in $\mathbb{R}^K$. If the training sample has the form $(g_i, x_i)$, i=1,2,...,N, then the scores $\theta_k$ and the maps $\beta_k$ are chosen to minimize the average squared residual: $\mathrm{ASR} = \frac{1}{N} \sum_{k=1}^{K} \sum_{i=1}^{N} \left(\theta_k(g_i) - x_i^T \beta_k\right)^2$ (1)
Linear Discriminant Analysis • The scores are assumed to be mutually orthogonal and normalized with respect to an appropriate inner product to prevent trivial zero solutions. • It is well known that the sequence of LDA vectors $u_k$ is identical to the sequence $\beta_k$ up to a constant (Mardia, Kent and Bibby, 1979). • The standard way of carrying out a canonical correlation analysis is by way of a suitable singular value decomposition.
Linear Discriminant Analysis • Let Y be the N×J indicator matrix corresponding to the dummy-variable coding for the classes. • Let $\Theta$ be a J×K matrix of K score vectors for the J classes. • Let $\Theta^*$ be the N×K matrix of transformed values of the classes, with ik-th element $\theta_k(g_i)$; then $\Theta^* = Y\Theta$.
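The indicator coding and the product of Y with the score matrix can be made concrete with a tiny example. The labels and score values below are arbitrary illustrative numbers, and the variable names `Theta` and `Theta_star` are chosen here only for readability.

```python
import numpy as np

g = np.array([0, 2, 1, 2, 0])        # class labels g_i for N = 5 observations
J, K = 3, 2                          # J classes, K score vectors

Y = np.eye(J)[g]                     # N x J indicator (dummy-variable) matrix
Theta = np.array([[ 1.0,  0.5],
                  [ 0.0, -1.0],
                  [-1.0,  0.5]])     # J x K scores (arbitrary values)

Theta_star = Y @ Theta               # N x K: row i is the score row of class g_i
```

Multiplying by Y simply looks up, for each observation, the scores assigned to its class, so `Theta_star` equals `Theta[g]`.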
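Linear Discriminant Analysis • Looking at (1), it is clear that if the scores were fixed we could minimize ASR by regressing $\Theta^*$ on x. If we let $P_X$ project onto the column space of the predictors, this says $\mathrm{ASR}(\Theta) = \frac{1}{N}\,\mathrm{tr}\left(\Theta^{*T}(I - P_X)\Theta^*\right) = \frac{1}{N}\,\mathrm{tr}\left(\Theta^T Y^T (I - P_X) Y \Theta\right)$ (2) • If we assume the scores have mean zero, unit variance, and are uncorrelated for the N observations ($\Theta^{*T}\Theta^*/N = I_K$), minimizing (2) amounts to finding the eigenvectors $\Theta$ of $Y^T P_X Y$ with the K largest eigenvalues, normalized so that $\Theta^T D_p \Theta = I_K$, where $D_p = Y^T Y / N$ is a diagonal matrix of the sample class proportions $N_j/N$.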
Linear Discriminant Analysis: Algorithm • The final coefficient matrix B is, up to a diagonal scale matrix, the same as the discriminant analysis coefficient matrix (the LDA transformation matrix). • $\alpha_k^2$ is the kth largest eigenvalue computed in step 3 above.
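A minimal sketch of LDA by optimal scoring, assuming the steps are: (1) form the indicator matrix Y, (2) regress Y on the predictors, (3) take the eigenvectors of $Y^T P_X Y$ under the normalization $\Theta^T D_p \Theta = I$, and (4) update the coefficients by multiplying with the scores. The function name `optimal_scoring`, the added intercept column, and the synthetic data are assumptions made for this illustration, not details from the slides.

```python
import numpy as np

def optimal_scoring(X, g, J):
    """Return scores Theta (J x K), coefficients B (p+1 x K), eigenvalues (K,), K = J-1."""
    N = X.shape[0]
    Y = np.eye(J)[g]                              # step 1: N x J indicator matrix
    X1 = np.column_stack([np.ones(N), X])         # predictors plus intercept
    B, *_ = np.linalg.lstsq(X1, Y, rcond=None)    # step 2: multi-response regression
    Y_hat = X1 @ B                                # fitted values, equal to P_X Y
    d = Y.sum(axis=0) / N                         # class proportions N_j / N
    M = Y.T @ Y_hat / N                           # Y^T P_X Y / N
    # Step 3: solve M theta = lambda D_p theta through the symmetric
    # matrix D_p^{-1/2} M D_p^{-1/2}.
    S = M / np.sqrt(np.outer(d, d))
    S = (S + S.T) / 2                             # symmetrize against round-off
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]               # largest eigenvalue first
    evals, evecs = evals[order], evecs[:, order]
    Theta = evecs / np.sqrt(d)[:, None]           # now Theta^T D_p Theta = I
    # The leading eigenvector is the trivial constant score (eigenvalue 1); drop it.
    Theta, evals = Theta[:, 1:], evals[1:]
    return Theta, B @ Theta, evals                # step 4: B <- B Theta

rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
g = np.repeat(np.arange(3), 50)
X = means[g] + rng.normal(size=(150, 2))

Theta, B, evals = optimal_scoring(X, g, 3)
d = np.bincount(g) / len(g)                       # class proportions for checking
```

Because the intercept is included in the regression, the constant score vector always solves the eigenproblem with eigenvalue 1. Discarding it leaves scores with zero weighted mean, which matches the mean-zero assumption on the scores.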
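Linear Discriminant Analysis: Example • Observations with the same class label receive the same mean. • A basis of the feature space. • Diagonal matrix whose values are $N_j$. • Symmetric matrix describing the relationships between observations; each entry is the inner product of a pair of observations.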
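Linear Discriminant Analysis: Example • Tallies the relationship between classes and dimensions; averaging gives the class means. • Symmetric matrix describing the relationships between classes; each entry is the inner product of a pair of class means. • The third dimension of the second class accumulates to 12 over its observations. • Take the K eigenvectors with the largest eigenvalues. This seems to favor only a large between-class scatter and to ignore the within-class scatter.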
Classification: Flexible Discriminant Analysis • Nonparametric version • We replace the linear-projection operator $P_X$ by a nonparametric regression procedure, which we denote by the linear operator S. • One simple and effective approach to this end is to expand X into a larger set of basis variables h(X), and then simply use $S = P_{h(X)}$ in place of $P_X$. • Any computation expressed through inner products can use a kernel function.
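The basis-expansion idea can be sketched as follows. This is a hedged illustration only: it assumes a quadratic basis h(X), two classes with equal priors, and classification to the nearest class centroid in the fitted score space. The helper names `quad_basis`, `fit_scores`, and `nearest_centroid` and the disk-versus-ring data are hypothetical.

```python
import numpy as np

def quad_basis(X):
    """h(X): intercept, linear terms, and all degree-2 terms of a two-column X."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x1 * x2, x2**2])

def fit_scores(H, g, J):
    """Optimal scoring with S = P_H; returns coefficients mapping H to J-1 fitted scores."""
    N = H.shape[0]
    Y = np.eye(J)[g]                                  # indicator matrix
    B, *_ = np.linalg.lstsq(H, Y, rcond=None)         # regress Y on the basis H
    d = Y.mean(axis=0)                                # class proportions
    S = (Y.T @ (H @ B) / N) / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh((S + S.T) / 2)
    V = evecs[:, np.argsort(evals)[::-1][1:]]         # drop the trivial constant score
    return B @ (V / np.sqrt(d)[:, None])

def nearest_centroid(eta, g, J):
    """Classify each row of eta to the closest class centroid (Euclidean)."""
    cents = np.vstack([eta[g == j].mean(axis=0) for j in range(J)])
    d2 = ((eta[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Class 0 fills an inner disk, class 1 a surrounding ring: not linearly separable.
rng = np.random.default_rng(0)
n = 200
radius = np.concatenate([rng.uniform(0.0, 1.0, n), rng.uniform(2.5, 3.5, n)])
angle = rng.uniform(0.0, 2 * np.pi, 2 * n)
X = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])
g = np.repeat([0, 1], n)

H_lin = np.column_stack([np.ones(2 * n), X])          # plain linear predictors
H_quad = quad_basis(X)                                # expanded basis h(X)
acc_linear = (nearest_centroid(H_lin @ fit_scores(H_lin, g, 2), g, 2) == g).mean()
acc_quad = (nearest_centroid(H_quad @ fit_scores(H_quad, g, 2), g, 2) == g).mean()
```

On this data the linear fit cannot separate the ring from the disk, while the quadratic basis can, because the boundary is roughly a circle in the original coordinates. The only change between the two fits is the matrix handed to the regression step.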
Example: vowel recognition data • The eleven vowel sounds were uttered once by each of the fifteen speakers. Four male and four female speakers were used to train the networks, and the other four male and three female speakers were used for testing the performance. • 12th-order LPC