Computacion inteligente

Computacion inteligente Introduction to Classification

Outline • Introduction • Feature Vectors and Feature Spaces • Decision Boundaries • The Nearest-Neighbor Classifier • Classification Accuracy • Accuracy Assessment • Linear Separability • Linear Classifiers • Classification Learning

Introduction

Classification • Classification is a task of induction: finding patterns in data • It infers knowledge from data

Learning from data • You may find different names for learning from data: identification, estimation, regression, classification, pattern recognition, function approximation, curve or surface fitting, etc.

Sample data: Composition of mammalian milk

Classification • Classification is an important component of intelligent systems • We have a special discrete-valued variable called the Class, C • C takes values in {c1, c2, ......, cm}

Classification • The problem is to decide which class an object belongs to • i.e., what value the class variable C takes for a given object • given measurements on the object, e.g., x1, x2, … • These measurements are called “features” • We wish to learn a mapping from Features -> Class

Classification Functions • Notation • Input space: X (X ⊆ R^n) • Output domain: Y • binary classification: Y = {-1, 1} • m-class classification: Y = {c1, c2, c3, …, cm} • Training set: S • Each class is described by a label

Classification Functions • We want a mapping or function which: • takes any combination of feature values x = (a, b, d, ..., z) and • produces a prediction C, • i.e., a function C = f(a, b, d, ..., z) which produces a value c1 or c2, etc. • The problem is that we don’t know this mapping: we have to learn it from data!

Classification Functions • [Figure: the feature values a, b, d, ..., z (which are known, measured) enter the classifier, which outputs the predicted class value C (the true class is unknown to the classifier)]

Classification: Model Construction • [Figure: training data is passed to a classification algorithm, which produces a classifier (model)] • Example of a learned model: IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’

Classification: Prediction Using the Model • [Figure: the classifier is applied to testing data and to unseen data] • Unseen data: (Jeff, Professor, 4) • Tenured?
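The prediction step can be illustrated with a minimal Python sketch that applies the example rule from the model-construction slide to the unseen record (Jeff, Professor, 4). The function name `predict_tenured` is hypothetical, and returning ‘no’ when the rule does not fire is an assumption: the slide only states when the answer is ‘yes’.

```python
def predict_tenured(rank: str, years: int) -> str:
    """Apply the example rule: IF rank = 'professor' OR years > 6 THEN 'yes'."""
    if rank.lower() == "professor" or years > 6:
        return "yes"
    # Assumption: any record not covered by the rule is labeled 'no'.
    return "no"


# Unseen data from the slide: (name, rank, years)
name, rank, years = ("Jeff", "Professor", 4)
print(name, "tenured?", predict_tenured(rank, years))
```

The name is not used by the rule; only the features rank and years enter the classifier.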

Applications of Classification • Medical Diagnosis • classification of cancerous cells • Credit card and Loan approval • Most major banks • Speech recognition • IBM, Dragon Systems, AT&T, Microsoft, etc • Optical Character/Handwriting Recognition • Post Offices, Banks, Gateway, Motorola, Microsoft, Xerox, etc • Email classification • classify email as “junk” or “non-junk” • Many other applications • one of the most successful applications of AI technology

Examples of Features and Classes

Examples: Classification of Galaxies • [Figure: galaxy images labeled Class 1 and Class 2]

Examples: Image Classification • Remote sensing • [Figure: original image and classified image]

Feature Vectors and Feature Spaces

Feature Vectors and Feature Spaces • Feature Vector: Say we have 2 features: • we can think of the features as a 2-component vector (i.e., a 2-dimensional vector, [a b]) • So the features correspond to a 2-dimensional space • We can generalize to d-dimensional space. This is called the “feature space”

Feature Vectors and Feature Spaces • Each feature vector represents the “coordinates” of a particular object in feature space • If the feature-space is 2-dimensional (for example), and the features a and b are real-valued • we can visually examine and plot the locations of the feature vectors

Data with 2 Features

Data with 3 Features • Given a set of balls • Classify them by “color” • [Figure: additive RGB color model and HSV color space]

Data from Two Classes • Classes: data sets D1 and D2 • sets of points from classes 1 and 2 • data are of dimension d, i.e., d-dimensional vectors • If d = 2 (2 features), we can plot the data

Example of Data from 2 Classes

Another Example: Red Blood Cells


Data from Multiple Classes • Now consider that we have data from m classes • e.g., m = 5 • We can imagine the data from each class forming a “cloud” in feature space

Example of Data from 5 Classes • [Figure: composition of mammalian milk; axes Fat (%) and Proteins (%), points marked by class]

Decision Boundaries

Decision Boundaries • What is a Classifier? • A classifier is a mapping from feature space to the class labels {1, 2, …, m} • Thus, a classifier partitions the feature space into m decision regions • The line or surface separating any 2 classes is the decision boundary

Decision Boundaries • Linear Classifiers • a linear classifier is a mapping which partitions feature space using a linear function • it is one of the simplest classifiers we can imagine • In 2 dimensions the decision boundary is a straight line
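As a rough illustration of a linear classifier in 2 dimensions, the sketch below assigns a class according to which side of a straight line a point falls on. The function name `linear_classify`, the weights `w`, the offset `b`, and the convention that points exactly on the boundary go to class 2 are all assumptions chosen for the example; they do not come from the slides.

```python
def linear_classify(x, w, b):
    """Return class 1 if w . x + b > 0, otherwise class 2."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 2


# Hypothetical parameters: the decision boundary is the line x1 + x2 - 5 = 0.
w, b = (1.0, 1.0), -5.0
print(linear_classify((4.0, 3.0), w, b))  # point on the positive side of the line
print(linear_classify((1.0, 2.0), w, b))  # point on the negative side of the line
```

Learning a linear classifier amounts to choosing `w` and `b` from training data; here they are fixed by hand.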

2-Class Data with a Linear Decision Boundary

Class Overlap • Consider the two-class case • data from D1 and D2 may overlap • Example: features = {age, body temperature}, classes = {flu, not-flu} • Example: features = {income, savings}, classes = {good/bad risk}

Class Overlap • It is common in practice for the classes to overlap naturally • this means that our features are usually not able to discriminate perfectly between the classes • note: with more expensive or more detailed additional features (e.g., a specific test for the flu) we might be able to get perfect separation • If there is overlap => classes are not linearly separable

Classification Problem with Overlap

Solution: A More Complex Decision Boundary • [Figure: two-class data in a two-dimensional feature space (Feature 1 vs. Feature 2), showing Decision Region 1, Decision Region 2, and the decision boundary between them]

The Nearest Neighbor Classifier

Training Data and Test Data • Training data • examples with class values for learning; used to build a classifier • Test data • new data, not used in the training process, used to evaluate how well a classifier does on new data

Some Notation • Feature Vectors • x(i) is the ith training data feature vector • in MATLAB this could be the ith column of a d x N matrix • Class Labels • c(i) is the class label of the ith feature vector • in general, c(i) can take m different class values (e.g., c = 1, c = 2, ...)

Some Notation • Training Data • Dtrain = {[x(1), c(1)], [x(2), c(2)], …, [x(N), c(N)]} • N pairs of feature vectors and class labels • Test Data • Dtest = {[x(1), c(1)], [x(2), c(2)], …, [x(M), c(M)]} • M pairs of feature vectors and class labels • Let y be a new feature vector whose class label we do not know, i.e., we wish to classify it

Nearest Neighbor Classifier • y is a new feature vector whose class label is unknown • Search Dtrain for the closest feature vector to y • let this “closest feature vector” be x(j) • Classify y with the same label as x(j), i.e., • y is assigned label c(j)

Nearest Neighbor Classifier • How are the “closest x” vectors determined? We have to define a distance • Euclidean distance • Manhattan distance
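The two distances named above can be sketched in Python using their standard definitions. The function names `euclidean` and `manhattan` are chosen for this example, and feature vectors are assumed to be equal-length sequences of numbers.

```python
import math


def euclidean(x, y):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def manhattan(x, y):
    """Sum of absolute coordinate differences."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


x, y = (0.0, 0.0), (3.0, 4.0)
print(euclidean(x, y))  # straight-line distance
print(manhattan(x, y))  # distance along the coordinate axes
```

The choice of distance can change which training vector counts as “closest”, and therefore the predicted class.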

Nearest Neighbor Classifier • How are the “closest x” vectors determined? We have to define a distance • Mahalanobis distance
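For reference, the standard definition of the Mahalanobis distance is given below. The symbol \Sigma is an assumption of this note: it denotes a covariance matrix of the features, typically estimated from the training data, and it does not appear in the slides.

```latex
d_M(x, y) = \sqrt{(x - y)^{\top} \, \Sigma^{-1} \, (x - y)}
```

When \Sigma is the identity matrix, this reduces to the Euclidean distance; otherwise each feature is rescaled by its variance and correlations between features are taken into account.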

Nearest Neighbor Classifier • How are the “closest x” vectors determined? • We have to define a distance • typically we use the minimum Euclidean distance $d_E(x, y) = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2}$ • Side note: this produces a so-called “Voronoi tessellation” of the d-space • each point “claims” a cell surrounding it • cell boundaries are polygons • Analogous to “memory-based” reasoning in humans
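The nearest-neighbor rule with Euclidean distance can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `nearest_neighbor`, the representation of Dtrain as a list of (feature vector, label) pairs, and the sample points are all assumptions made for the example.

```python
import math


def nearest_neighbor(y, d_train):
    """Return the label c(j) of the training vector x(j) closest to y."""
    def dist(x):
        # Euclidean distance between a training vector x and the query y
        return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

    # Pick the (x, c) pair whose x minimizes the distance to y.
    _, label = min(d_train, key=lambda pair: dist(pair[0]))
    return label


# Hypothetical training set: pairs [x(i), c(i)] with 2 features and 2 classes
d_train = [((1.0, 1.0), 1), ((2.0, 1.5), 1), ((6.0, 5.0), 2), ((7.0, 6.5), 2)]
print(nearest_neighbor((1.5, 2.0), d_train))
print(nearest_neighbor((6.5, 5.0), d_train))
```

There is no separate training step: the classifier simply stores Dtrain and searches it at prediction time, so each query costs one pass over all N training vectors. In case of a tie, this sketch returns the label of the first closest vector in the list.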

Geometric Interpretation of Nearest Neighbor • [Figure: training points labeled 1 and 2 in a two-dimensional feature space (Feature 1 vs. Feature 2)]
