Review Alexandros Potamianos Dept of ECE, Tech. Univ. of Crete Fall 2004-2005
PatReco: Introduction Alexandros Potamianos Dept of ECE, Tech. Univ. of Crete Fall 2004-2005
PatReco: Applications • Speech/audio/music/sounds • Speech recognition, Speaker verification/ID • Image/video • OCR, AVASR, Face ID, Fingerprint ID, Video segmentation • Text/Language • Machine translation, document classification, language modeling, text understanding • Medical/Biology • Disease diagnosis, DNA sequencing, Gene-disease models • Other Data • User modeling (books/music), Linguistic analysis (web), Games
Basic Concepts • Why statistical modeling? • Variability: differences between two examples of the same class in training • Mismatch: differences between two examples of the same class (one in training, one in testing) • Learning modes: • Supervised learning: class labels known • Unsupervised learning: class labels unknown • Reinforcement learning: only positive/negative feedback available
Basic Concepts • Feature selection • Good features separate the classes and have low correlation with each other • Model selection • Model type, model order • Prior knowledge • E.g., a priori class probabilities • Missing features/observations • Modeling of time series • Correlation in time (which model?), segmentation
PatReco: Algorithms • Parametric vs Non-Parametric • Supervised vs Unsupervised • Basic Algorithms: • Bayesian • Non-parametric • Discriminant Functions • Non-Metric Methods
PatReco: Algorithms • Bayesian methods • Formulation (describe class characteristics) • Bayes classifier • Maximum likelihood estimation • Bayesian learning • Expectation-Maximization (EM) • Markov models, hidden Markov models • Bayesian Nets • Non-parametric • Parzen windows • Nearest Neighbour
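To make the Bayesian pipeline above concrete, here is a minimal sketch of a two-class Bayes classifier with 1-D Gaussian class-conditional densities fit by maximum likelihood; the toy data, priors, and function names are illustrative assumptions, not taken from the course.

```python
import numpy as np

def fit_gaussian_ml(x):
    """ML estimates for a 1-D Gaussian: sample mean and (biased) sample variance."""
    return x.mean(), x.var()

def gaussian_log_pdf(x, mu, var):
    return -0.5 * np.log(2 * np.pi * var) - (x - mu) ** 2 / (2 * var)

# Toy training data for two classes (illustrative values).
x1 = np.array([0.9, 1.1, 1.0, 0.8])   # class 1
x2 = np.array([2.9, 3.2, 3.1, 2.8])   # class 2
prior1, prior2 = 0.5, 0.5             # assumed equal a priori probabilities

mu1, var1 = fit_gaussian_ml(x1)
mu2, var2 = fit_gaussian_ml(x2)

def classify(x):
    # Bayes classifier: pick the class maximizing log p(x|class) + log P(class).
    g1 = gaussian_log_pdf(x, mu1, var1) + np.log(prior1)
    g2 = gaussian_log_pdf(x, mu2, var2) + np.log(prior2)
    return 1 if g1 > g2 else 2

print(classify(1.2), classify(2.7))   # -> 1 2
```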
PatReco: Algorithms • Discriminant Functions • Formulation (describe boundary) • Learning: Gradient descent • Perceptron • MSE=minimum squared error • LMS=least mean squares • Neural Net generalizations • Support vector machines • Non-Metric Methods • Classification and Regression Trees • String Matching
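As a small illustration of the gradient-descent / perceptron learning listed above, here is a minimal perceptron sketch on a linearly separable toy problem; the data and the learning rate are assumptions made for the example.

```python
import numpy as np

# Toy linearly separable data with labels +1 / -1 (illustrative values).
X = np.array([[1.0, 1.0], [2.0, 1.5], [-1.0, -0.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

w = np.zeros(2)   # weight vector
b = 0.0           # bias
eta = 0.1         # learning rate (assumed)

# Perceptron rule: update only on misclassified samples; each update is a
# stochastic-gradient step on the perceptron criterion.
for epoch in range(100):
    errors = 0
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:     # misclassified (or on the boundary)
            w += eta * yi * xi
            b += eta * yi
            errors += 1
    if errors == 0:                    # converged: all samples correct
        break

print(w, b)
```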
PatReco: Algorithms • Unsupervised Learning: • Mixture of Gaussians • K-means • Other not-covered • Multi-layered Neural Nets • Stochastic Learning (Simulated Annealing) • Genetic Algorithms • Fuzzy Algorithms • Etc…
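Since k-means appears in the unsupervised list above, a minimal sketch of the algorithm follows; the toy data, k = 2, and the convergence test are assumptions made for the example (empty clusters are not handled).

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Minimal k-means: alternate the assignment and mean-update steps."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: nearest center for each sample.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each center becomes the mean of its assigned samples.
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):   # converged
            break
        centers = new_centers
    return centers, labels

X = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]])
centers, labels = kmeans(X, k=2)
print(centers, labels)
```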
PatReco: Problem Solving • Data Collection • Data Analysis • Feature Selection • Model Selection • Model Training • Classification • Classifier Evaluation
Evaluation • Training Data Set • 1234 examples of class 1 and class 2 • Testing/Evaluation Data Set • 134 examples of class 1 and class 2 • Misclassification Error Rate • Training: 11.61% (150 errors) • Testing: 13.43% (18 errors) • Correct for chance (Training 22%, Testing 26%) • Why?
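The slide reports both raw and chance-corrected error rates but does not say which correction was used. One common choice is Cohen's kappa, sketched below on a hypothetical confusion matrix (not the course's actual data):

```python
import numpy as np

def cohens_kappa(conf):
    """Chance-corrected agreement from a confusion matrix (rows: true, cols: predicted)."""
    conf = np.asarray(conf, dtype=float)
    n = conf.sum()
    p_observed = np.trace(conf) / n
    # Expected agreement if predictions were drawn independently from the marginals.
    p_chance = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / n**2
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical 2-class confusion matrix (illustrative counts only).
print(cohens_kappa([[60, 10], [8, 56]]))
```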
PatReco: Discriminant Functions for Gaussians Alexandros Potamianos Dept of ECE, Tech. Univ. of Crete Fall 2004-2005
Discriminant Functions • Define class boundaries (instead of class characteristics) • Duality: a parametric class description plugged into the Bayes classifier induces a decision boundary, i.e., a parametric discriminant function
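In symbols (standard notation, reconstructed from context rather than copied from the slides), deciding by maximum posterior is the same as using the log-likelihood plus log-prior as a discriminant function:

```latex
g_i(\mathbf{x}) = \ln p(\mathbf{x} \mid \omega_i) + \ln P(\omega_i),
\qquad
\text{decide } \omega_i \text{ if } g_i(\mathbf{x}) > g_j(\mathbf{x}) \;\; \forall j \neq i
```

The decision boundary between classes ω_i and ω_j is the set of points where g_i(x) = g_j(x).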
Normal Density • 1D • Multi-D • Full covariance • Diagonal covariance • Diagonal covariance + univariate • Mixture of Gaussians • Usually diagonal covariance
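The formulas behind the bullets above, reconstructed in standard notation (the slide's own equation images are not in the extracted text):

```latex
% 1-D normal density
p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,
       \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% Multivariate normal density, full covariance \Sigma (dimension d)
p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\,
       \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\mathsf{T}}
                    \Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)

% Diagonal covariance: \Sigma = \mathrm{diag}(\sigma_1^2,\ldots,\sigma_d^2),
% so the density factors into a product of univariate normals.

% Mixture of K Gaussians, weights w_k \ge 0 with \sum_k w_k = 1
p(\mathbf{x}) = \sum_{k=1}^{K} w_k\,
       \mathcal{N}(\mathbf{x};\,\boldsymbol{\mu}_k,\,\Sigma_k)
```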
Gaussian Discriminant Functions • Same covariance matrix for ALL classes: hyper-planes • Different covariances among classes: hyper-quadratics (hyper-parabolas, hyper-ellipses, etc.)
Hyper-Planes • When the covariance matrix is common across Gaussian classes, the decision boundary is a hyper-plane; for an isotropic covariance (Σ = σ²I) it is perpendicular to the line connecting the means of the Gaussian distributions • If the a priori class probabilities are also equal, the hyper-plane cuts the line connecting the Gaussian means in the middle (Euclidean, i.e., nearest-mean, classifier)
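A compact statement in standard notation (reconstructed, not copied from the slides): with a covariance matrix Σ shared by all classes, the quadratic terms cancel and the discriminants become linear,

```latex
g_i(\mathbf{x}) = \mathbf{w}_i^{\mathsf{T}}\mathbf{x} + w_{i0},
\qquad
\mathbf{w}_i = \Sigma^{-1}\boldsymbol{\mu}_i,
\qquad
w_{i0} = -\tfrac{1}{2}\,\boldsymbol{\mu}_i^{\mathsf{T}}\Sigma^{-1}\boldsymbol{\mu}_i
         + \ln P(\omega_i)
```

With equal priors the boundary g_i(x) = g_j(x) passes through the midpoint of μ_i and μ_j; in the isotropic case Σ = σ²I it is also perpendicular to μ_i - μ_j, which gives the Euclidean (nearest-mean) classifier.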
Hyper-Quadratics • When the Gaussian class covariances differ, the boundary can be a hyper-plane, multiple hyper-planes, a hyper-sphere, a hyper-parabola, a hyper-ellipsoid, etc. • In general the boundary is NOT perpendicular to the line connecting the Gaussian means • If the a priori class probabilities are equal, the resulting classifier is a Mahalanobis (minimum Mahalanobis distance) classifier
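In standard notation (reconstructed), class-specific covariances leave the discriminants quadratic in x, which is what produces the hyper-quadric boundaries:

```latex
g_i(\mathbf{x}) = -\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{\mathsf{T}}
                  \Sigma_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)
                  - \tfrac{1}{2}\ln|\Sigma_i|
                  + \ln P(\omega_i)
```

The first term is minus half the squared Mahalanobis distance to the class mean, so with equal priors classification reduces to minimum Mahalanobis distance, up to the ln|Σ_i| term.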
Conclusions • Parametric statistical models describe class characteristics by modeling the observation probabilities p(x|class) • Discriminant functions describe class boundaries parametrically • Each parametric statistical model has an equivalent parametric discriminant function • For Gaussian p(x|class) distributions the decision boundaries are hyper-planes or hyper-quadratics
PatReco: Detection Alexandros Potamianos Dept of ECE, Tech. Univ. of Crete Fall 2004-2005
Detection • Goal: Detect an Event • Possible outcomes: • Hit (success) • Miss (failure), also called false reject in verification tasks • False alarm, also called false accept
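A minimal sketch of counting these outcomes for a score-threshold detector; the scores, labels, and threshold are illustrative assumptions:

```python
import numpy as np

# Detector scores and ground truth (1 = event present), illustrative values.
scores = np.array([0.9, 0.8, 0.3, 0.7, 0.2, 0.1])
truth  = np.array([1,   1,   1,   0,   0,   0])
theta  = 0.5                      # decision threshold (assumed)

detected = scores >= theta
hits         = np.sum( detected & (truth == 1))   # event present, detected
misses       = np.sum(~detected & (truth == 1))   # event present, not detected (false reject)
false_alarms = np.sum( detected & (truth == 0))   # no event, detected anyway (false accept)

hit_rate = hits / np.sum(truth == 1)              # a.k.a. detection probability
fa_rate  = false_alarms / np.sum(truth == 0)      # a.k.a. false-alarm probability
print(hit_rate, fa_rate)                          # sweeping theta traces an ROC curve
```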
PatReco: Estimation/Training Alexandros Potamianos Dept of ECE, Tech. Univ. of Crete Fall 2004-2005
Estimation/Training • Goal: Given observed data, (re-)estimate the parameters of the model, e.g., for a Gaussian model estimate the mean and variance of each class
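For the Gaussian example, the ML estimates are the familiar sample statistics (standard results, stated here for completeness):

```latex
\hat{\mu} = \frac{1}{N}\sum_{n=1}^{N} x_n,
\qquad
\hat{\sigma}^2 = \frac{1}{N}\sum_{n=1}^{N} \left(x_n - \hat{\mu}\right)^2
```

The variance estimate is the biased ML version; the unbiased estimate divides by N - 1.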
Supervised-Unsupervised • Supervised training: All data has been (manually) labeled, i.e., assigned to classes • Unsupervised training: Data is not assigned a class label
Observable data • Fully observed data: all information necessary for training is available (features, class labels etc.) • Partially observed data: some of the features or some of the class labels are missing
Supervised Training (fully observed data) • Maximum likelihood estimation (ML) • Maximum a posteriori estimation (MAP) • Bayesian estimation (BE)
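The three estimators differ in how they treat the parameter θ; in standard notation (reconstructed, since the slide formulas are not in the extracted text):

```latex
\hat{\theta}_{\mathrm{ML}}  = \arg\max_{\theta}\; p(D \mid \theta)
\qquad
\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\; p(D \mid \theta)\, p(\theta)
```

Bayesian estimation keeps the entire posterior p(θ|D) ∝ p(D|θ) p(θ) and integrates over it when computing observation likelihoods:

```latex
p(x \mid D) = \int p(x \mid \theta)\, p(\theta \mid D)\, d\theta
```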
Training process • The training data consists of the examples D = {x1, x2, …, xN} • Step 1: Label each example with the corresponding class label ω1, ω2, …, ωK • Step 2: For each class separately, estimate the model parameters using ML, MAP, or BE and the corresponding training examples D1, D2, …, DK
Training Process: Step 1 • D = {x1, x2, x3, x4, x5, …, xN} is labeled manually with ω1, ω2, …, ωK and split per class: • D1 = {x1,1, x1,2, x1,3, …, x1,N1} • D2 = {x2,1, x2,2, x2,3, …, x2,N2} • … • DK = {xK,1, xK,2, xK,3, …, xK,NK}
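A minimal sketch of the two steps in code, with ML estimation of a 1-D Gaussian per class in Step 2; the data values and class names are illustrative assumptions:

```python
import numpy as np

# Labeled training data: (feature, class label) pairs, illustrative values.
D = [(1.0, 'w1'), (0.8, 'w1'), (3.1, 'w2'), (2.9, 'w2'), (1.2, 'w1')]

# Step 1: partition D into per-class subsets D_k.
subsets = {}
for x, label in D:
    subsets.setdefault(label, []).append(x)

# Step 2: for each class separately, ML-estimate the Gaussian parameters.
params = {}
for label, xs in subsets.items():
    xs = np.array(xs)
    params[label] = (xs.mean(), xs.var())   # ML mean and (biased) variance

print(params)   # e.g. {'w1': (1.0, ...), 'w2': (3.0, ...)}
```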