Computacion inteligente Introduction to Classification
Outline • Introduction • Feature Vectors and Feature Spaces • Decision boundaries • The nearest-neighbor classifier • Classification Accuracy • Accuracy Assessment • Linear Separability • Linear Classifiers • Classification learning
Classification • Classification: • A task of induction: finding patterns in data • Inferring knowledge from data
Learning from data • You may find different names for learning from data: • identification • estimation • regression • classification • pattern recognition • function approximation, curve or surface fitting • etc.
Sample data Composition of mammalian milk
Classification • Classification is an important component of intelligent systems • We have a special discrete-valued variable called the Class, C • C takes values in {c1, c2, …, cm}
Classification • The problem is to decide what class an object belongs to • i.e., what value the class variable C takes for a given object • given measurements on the object, e.g., x1, x2, … • These measurements are called "features" We wish to learn a mapping from Features -> Class
Classification Functions • Notation • Input space: X (X ⊆ Rn) • Output domain: Y • binary classification: Y = {-1, 1} • m-class classification: Y = {c1, c2, c3, …, cm} • Training set: S Each class is described by a label
Classification Functions • We want a mapping or function which: • takes any combination of feature values x = (a, b, d, …, z) and • produces a prediction of the class C • i.e., a function C = f(a, b, d, …, z) which produces a value c1 or c2, etc. The problem is that we don't know this mapping: we have to learn it from data!
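The idea above can be sketched in code: a classifier is just a function from feature values to a class label. A minimal Python sketch (MATLAB is the deck's language, but Python is used here for illustration; the features, thresholds, and class names are hypothetical, borrowed from the flu example later in the deck):

```python
# A classifier is a function C = f(features) returning a class label.
# Hypothetical hand-written rule: classify "flu" vs "not-flu" from two
# features (age, body temperature). The threshold is made up.
def classify(age, body_temperature):
    # High body temperature suggests flu (illustrative rule only).
    if body_temperature >= 38.0:
        return "flu"
    return "not-flu"

print(classify(30, 39.2))  # -> flu
print(classify(30, 36.8))  # -> not-flu
```

Learning, as the slide says, means finding such a function automatically from data instead of writing it by hand.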
Classification Functions [Diagram: feature values a, b, d, …, z (known, measured) enter the Classifier, which outputs the predicted class value C; the true class is unknown to the classifier]
Classification: Model Construction [Diagram: Training Data feed a classification algorithm, which produces a Classifier (Model), e.g. the rule IF rank = 'professor' OR years > 6 THEN tenured = 'yes']
Classification: Prediction Using the Model [Diagram: the Classifier is applied to Testing Data and to Unseen Data, e.g. (Jeff, Professor, 4) → Tenured?]
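The two slides above can be sketched together in Python (the rule and the example record are from the slides; the function name is a hypothetical choice):

```python
# Learned model from the slide:
#   IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
def predict_tenured(name, rank, years):
    return "yes" if rank == "professor" or years > 6 else "no"

# Unseen example from the slide: (Jeff, Professor, 4)
print(predict_tenured("Jeff", "professor", 4))  # -> yes
```

In practice the rule itself would be induced from the training data, not written by hand; only the prediction step is shown here.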
Applications of Classification • Medical Diagnosis • classification of cancerous cells • Credit card and Loan approval • Most major banks • Speech recognition • IBM, Dragon Systems, AT&T, Microsoft, etc • Optical Character/Handwriting Recognition • Post Offices, Banks, Gateway, Motorola, Microsoft, Xerox, etc • Email classification • classify email as “junk” or “non-junk” • Many other applications • one of the most successful applications of AI technology
Examples: Classification of Galaxies [Images: galaxy photographs labeled Class 1 and Class 2]
Examples: Image Classification • Remote sensing [Images: original image and classified image]
Feature Vectors and Feature Spaces • Feature Vector: Say we have 2 features: • we can think of the features as a 2-component vector (i.e., a 2-dimensional vector, [a b]) • So the features correspond to a 2-dimensional space • We can generalize to d-dimensional space. This is called the “feature space”
Feature Vectors and Feature Spaces • Each feature vector represents the “coordinates” of a particular object in feature space • If the feature-space is 2-dimensional (for example), and the features a and b are real-valued • we can visually examine and plot the locations of the feature vectors
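A feature vector is simply a d-component list of measurements, so each object is a point in a d-dimensional feature space. A tiny Python sketch with made-up values (d = 2, so the vectors could be plotted directly):

```python
# Each object with d = 2 features [a, b] is a point ("coordinates")
# in a 2-dimensional feature space. Values are illustrative only.
feature_vectors = [
    [1.0, 2.0],   # object 1
    [1.2, 1.8],   # object 2
    [4.0, 5.0],   # object 3
]

d = len(feature_vectors[0])   # dimension of the feature space
print(d)  # -> 2
```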
Data with 3 Features • Given a set of balls • Classify them by "color" [Images: additive RGB color model; HSV color space]
Data from Two Classes • Classes: data sets D1 and D2: • sets of points from classes 1 and 2 • data are of dimension d • i.e., d-dimensional vectors If d = 2 (2 features), we can plot the data
Data from Multiple Classes • Now consider that we have data from m classes • e.g., m = 5 • We can imagine the data from each class forming a "cloud" in feature space
Example of Data from 5 Classes [Scatter plot: composition of mammalian milk, Fat (%) vs. Proteins (%), with points grouped by class]
Decision Boundaries • What is a Classifier? • A classifier is a mapping from feature space to the class labels {1, 2, …, m} • Thus, a classifier partitions the feature space into m decision regions • The line or surface separating any 2 classes is the decision boundary
Decision Boundaries • Linear Classifiers • a linear classifier is a mapping which partitions feature space using a linear function • it is one of the simplest classifiers we can imagine In 2 dimensions the decision boundary is a straight line
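A linear classifier in 2-D can be sketched in a few lines of Python: it labels a point by the sign of a linear function w·x + b, so the decision boundary w·x + b = 0 is a straight line. The weights below are hypothetical, chosen only to show the form:

```python
# Linear classifier: label = sign(w . x + b).
# The boundary w . x + b = 0 is a straight line in 2-D.
# Weights w and bias b are illustrative, not learned.
def linear_classify(x, w=(1.0, -1.0), b=0.0):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

print(linear_classify((3.0, 1.0)))  # -> 1
print(linear_classify((1.0, 3.0)))  # -> -1
```

With these weights the boundary is the line x1 = x2; points on one side get label 1, the other side -1.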
Class Overlap • Consider two class case • data from D1 and D2 may overlap • features = {age, body temperature}, • classes = {flu, not-flu} • features = {income, savings}, • classes = {good/bad risk}
Class Overlap • it is common in practice that the classes naturally overlap • this means that our features are usually not able to perfectly discriminate between the classes • note: with more expensive/more detailed additional features (e.g., a specific test for the flu) we might be able to get perfect separation If there is overlap => the classes are not linearly separable
Solution: A More Complex Decision Boundary [Plot: two-class data in a two-dimensional feature space, Feature 1 vs. Feature 2, showing Decision Region 1, Decision Region 2, and a nonlinear Decision Boundary]
Training Data and Test Data • Training data • examples with class values for learning; used to build a classifier • Test data • new data, not used in the training process, to evaluate how well a classifier does on new data
Some Notation • Feature Vectors • x(i) is the ith training data feature vector • in MATLAB this could be the ith column of a d×N matrix • Class Labels • c(i) is the class label of the ith feature vector • in general, c(i) can take m different class values (e.g., c = 1, c = 2, …)
Some Notation • Training Data • Dtrain = {[x(1), c(1)], [x(2), c(2)], …, [x(N), c(N)]} • N pairs of feature vectors and class labels • Test Data • Dtest = {[x(1), c(1)], [x(2), c(2)], …, [x(M), c(M)]} • M pairs of feature vectors and class labels Let y be a new feature vector whose class label we do not know, i.e., we wish to classify it.
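This notation maps directly onto a simple data structure: a list of [x(i), c(i)] pairs. A Python sketch with toy values (the numbers and labels are illustrative only):

```python
# D_train: N pairs [x(i), c(i)];  D_test: M pairs [x(i), c(i)].
# Feature vectors are 2-D here; values are made up for illustration.
D_train = [([1.0, 2.0], 1), ([1.5, 1.8], 1), ([5.0, 8.0], 2)]
D_test  = [([1.1, 2.1], 1), ([4.8, 7.9], 2)]

N, M = len(D_train), len(D_test)
print(N, M)  # -> 3 2
```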
Nearest Neighbor Classifier • y is a new feature vector whose class label is unknown • Search Dtrain for the closest feature vector to y • let this "closest feature vector" be x(j) • Classify y with the same label as x(j), i.e. • y is assigned label c(j)
Nearest Neighbor Classifier • How are the "closest" x vectors determined? We have to define a distance • Euclidean distance • Manhattan distance
Nearest Neighbor Classifier • How are the "closest" x vectors determined? We have to define a distance • Mahalanobis distance
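The three distances mentioned above can be sketched in plain Python (the Mahalanobis version takes the inverse covariance matrix as an explicit argument; all data values are illustrative):

```python
import math

def euclidean(x, y):
    # sqrt of the sum of squared coordinate differences
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def manhattan(x, y):
    # sum of absolute coordinate differences
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

def mahalanobis(x, y, S_inv):
    # S_inv: inverse covariance matrix as nested lists; computes
    # sqrt( d^T S_inv d ) for the difference vector d = x - y.
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    q = sum(d[i] * S_inv[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(q)

x, y = (0.0, 0.0), (3.0, 4.0)
print(euclidean(x, y))   # -> 5.0
print(manhattan(x, y))   # -> 7.0
# With S_inv = identity, Mahalanobis reduces to Euclidean distance:
print(mahalanobis(x, y, [[1.0, 0.0], [0.0, 1.0]]))  # -> 5.0
```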
Nearest Neighbor Classifier • How are the "closest" x vectors determined? • We have to define a distance • typically the minimum Euclidean distance dE(x, y) = sqrt(Σi (xi − yi)²) • Side note: this produces a so-called "Voronoi tessellation" of the d-dimensional space • each point "claims" a cell surrounding it • cell boundaries are polygons Analogous to "memory-based" reasoning in humans
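The nearest-neighbor rule described above fits in a few lines of Python: scan Dtrain for the training vector closest to y (Euclidean distance) and return its label. The training points below are made-up toy data:

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def nearest_neighbor(D_train, y):
    # Find the pair [x(j), c(j)] whose x(j) is closest to y,
    # then assign y the label c(j).
    x_j, c_j = min(D_train, key=lambda pair: euclidean(pair[0], y))
    return c_j

D_train = [([1.0, 1.0], "class1"), ([1.2, 0.8], "class1"),
           ([5.0, 5.0], "class2"), ([5.5, 4.5], "class2")]

print(nearest_neighbor(D_train, [0.9, 1.1]))  # -> class1
print(nearest_neighbor(D_train, [5.2, 5.1]))  # -> class2
```

Note that no model is built in advance: all the work happens at prediction time, which is why this is sometimes called memory-based reasoning.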
Geometric Interpretation of Nearest Neighbor [Plot: Feature 1 vs. Feature 2 with training points labeled 1 and 2; each point "claims" the Voronoi cell surrounding it]