Investigating the Role of Individual Neurons as Outlier Detectors

Investigating the role of individual neurons as outlier detectors Carlos López-Vázquez Laboratorio LatinGEO SGM+Universidad ORT del Uruguay September 15th, 2015 carloslopez@uni.ort.edu.uy

Agenda • Motivation for an outlier detection stage • ANN as a regression tool • Formulation of the rule • Application case 1: small dataset • Application case 2: large dataset

Why worry about outliers? • Outliers are unusual (in some sense) events • Might adversely affect further calculations OR • Might be the most valuable result! • Usually an ANN produces an output given an input • Always! • What about the consequences? • We might want to detect spurious inputs

Example #1: Medical (from Lucila Ohno-Machado, 2004) • Given some inputs, detect/classify a possible coronary disease

Myocardial Infarction Network • Inputs: pain intensity, pain duration, ECG: ST elevation, smoker, age, male • Answer: just a number, y = "Probability" of MI = 0.8 • No room for I DON'T KNOW!

Example #2: Autonomous Land Vehicle (ALVINN System) • NN learns to steer an autonomous vehicle • 960 input units, 4 hidden units, 30 output units • Driving at speeds up to 70 miles per hour • [Figure: image from a forward-mounted camera; weight values for one of the hidden units]

Coming soon?

Goal • Identify unlikely incoming events • And thus (maybe) refuse to estimate outputs! • Supplement the ANN answer (numerical, categorical) with some credibility flag • How? • Showing unlikely events during training (supervised) • Relying on an already trained ANN (unsupervised)

Multi Layer Perceptron (MLP) • [Figure: inputs X1 to X5, hidden units v1, v2, v3, output y, adjustable weights] • y=18.4*v1-22.1*v2+10.2*v3 • y=10.4*v1+5.12*v2+8.9*v3 • y=20.2*v1+0.18*v2-9.1*v3

Why are the weights so different? • Conjecture: • It might denote a specific role for the neuron • Such a role can be connected to outliers • Wow! Which ones are candidates? • Large weights? Small weights? • Preliminary analysis suggested that Large Weights → Outlier detectors • But... convince me!
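One way to read the conjecture, as a minimal sketch: rank the hidden neurons by the magnitude of their weights and treat the largest as candidates. The weights reuse the first set of coefficients from the MLP slide. The function name `rank_by_weight_magnitude` and the use of the absolute outgoing weight as the score are assumptions for illustration, not the rule formulated in the talk.

```python
def rank_by_weight_magnitude(weights):
    """Return (neuron_name, weight) pairs sorted by decreasing |weight|."""
    return sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Coefficients of y = 18.4*v1 - 22.1*v2 + 10.2*v3 from the MLP slide.
output_weights = {"v1": 18.4, "v2": -22.1, "v3": 10.2}

ranking = rank_by_weight_magnitude(output_weights)
for name, w in ranking:
    # Hypothetical score: the larger |w|, the stronger the candidate.
    print(f"{name}: |w| = {abs(w):.1f}")
```

Whether a large weight actually marks an outlier detector is exactly what the two case studies are meant to test.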

Two different problems • 1) Does the rule indeed work? If so: • 2) How does it perform when compared with other outlier detection procedures?

Example #3: Iris Flower Classification • 3 species of Iris: SETOSA, VERSICOLOR, VIRGINICA • Each flower has parts called sepal & petal • Length and width of sepal & petal can be used to determine the iris type • Data collected on a large number of iris flowers • For example, in one flower petal length=6.7mm and width=4.3mm, sepal length=22.4mm and sepal width=62.4mm. Iris type was SETOSA • An ANN can be trained to determine the species of iris for a given set of petal and sepal widths and lengths

Iris training and testing data

Classification using regression (from Benítez et al., 1997) • [Figure: inputs sepal length, sepal width, petal length, petal width; hidden units v1, v2, v3; output y] • Somewhat unusual paper: used regression with a single output instead of the common three binary outputs! • Unusual paper: internal weights of the ANN were published!

Classification using regression (from Benítez et al., 1997) • [Figure: inputs sepal length, sepal width, petal length, petal width; hidden units v1, v2, v3; output y] • The ANN can be simplified...

Pruned ANN • The classification is still good, although not exact • [Figure: inputs sepal length, sepal width, petal length, petal width] • Which role did the other two hidden neurons have?

Modified version • z="credibility flag" • y=2.143v3 • All misclassifications now announced by z=1!
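The modified network can be sketched as a predictor that returns the answer y together with the flag z. Only the relation y=2.143v3 comes from the slide. Everything else here is an assumption made for illustration: the logistic activation, the input weights (the published weights are not reproduced in this transcript), the absence of bias terms, and the rule that z=1 whenever one of the two remaining hidden neurons exceeds a threshold.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def activation(weights, inputs):
    """Logistic activation of one hidden neuron (no bias, by assumption)."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))

def predict_with_flag(inputs, w_main, w_detectors, threshold=0.9):
    """Return (y, z): y from the retained neuron, z=1 if any detector fires."""
    v3 = activation(w_main, inputs)
    y = 2.143 * v3  # output of the pruned network, as on the slide
    fired = any(activation(w, inputs) > threshold for w in w_detectors)
    return y, 1 if fired else 0

# Hypothetical input weights: NOT the values published by Benítez et al.
W_MAIN = [0.1, -0.2, 0.3, 0.4]
W_DETECTORS = [[2.0, -1.0, 0.0, 0.0], [0.0, 0.0, -3.0, 3.0]]

typical = [0.5, 0.6, 0.4, 0.5]   # hypothetical scaled measurements
unusual = [3.0, 0.1, 0.4, 0.5]   # first input far outside the typical range

y_typ, z_typ = predict_with_flag(typical, W_MAIN, W_DETECTORS)
y_unu, z_unu = predict_with_flag(unusual, W_MAIN, W_DETECTORS)
print(z_typ, z_unu)
```

The point of the design is that y is always produced, as with any ANN, while z lets the caller decide whether to trust it.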

Example #4: daily rain dataset • Weather records typically have missing values • Many applications require complete databases • Well-established linear methods for interpolating spatial observations exist • Their performance is poor for daily rain records • Why not ANN?

Data and test area description • 30 years of daily records for 10 stations were available • 30% of the events have missing values • More than 80% of the readings are of zero rain, evenly distributed over the year • Annual averages range from 500 to 1600 mm/year; time correlation is low

Non-linear interpolants: ANN • We used ANNs as interpolators, with 9 inputs and 1 output • Training was performed with one third of the dataset, using backpropagation and minimizing the RMSE • Several architectures were considered (one and two hidden layers, different numbers of neurons, etc.) as well as some transformations of the data

Skipping other details… • We applied our rule to each of the 10 ANNs • Ran a Monte Carlo experiment, seeding known outliers at random and locating them afterwards • Thorough comparison against state-of-the-art alternatives (details in the paper) • The ANN-based outlier detection tool performed very well • Best, when outlier size (Mozilla effect) was ignored • Satisfactory otherwise
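The Monte Carlo procedure (seed known outliers at random, then try to locate them) can be sketched as follows. This is only an outline of the experimental loop under stated assumptions: the data are synthetic Gaussian values rather than rain records, and `mad_detector` is a generic median/MAD stand-in, not the ANN-based rule from the talk. All function names and parameter values are hypothetical.

```python
import random
import statistics

def seed_outliers(values, n_outliers, size, rng):
    """Copy `values` and add +/- `size` at `n_outliers` random positions."""
    seeded = list(values)
    positions = rng.sample(range(len(values)), n_outliers)
    for i in positions:
        seeded[i] += rng.choice((-1.0, 1.0)) * size
    return seeded, set(positions)

def mad_detector(values, cutoff=3.5):
    """Stand-in detector: flag points far from the median in MAD units."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return set()
    return {i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > cutoff}

def monte_carlo(n_trials=200, n_points=100, n_outliers=5, size=10.0, seed=0):
    """Average fraction of seeded outliers that the detector locates."""
    rng = random.Random(seed)
    hit_rates = []
    for _ in range(n_trials):
        clean = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        seeded, truth = seed_outliers(clean, n_outliers, size, rng)
        found = mad_detector(seeded)
        hit_rates.append(len(found & truth) / n_outliers)
    return statistics.mean(hit_rates)

hit_rate = monte_carlo()
print(f"mean hit rate: {hit_rate:.2f}")
```

Because the seeded positions are known, any detector can be plugged in place of `mad_detector` and scored with the same loop, which is what makes a comparison between methods possible.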

Pros… • Training stage is as usual; no special routine is required • We inspect the internal weights; no need for retraining • Unsupervised classification: outliers are not declared as such in advance • Might offer an objective criterion to suspect underfitting

Cons… • Weights might be sensitive to outliers (masking effect), which in turn might prevent their detection • Which outliers are located? Only some suitable ones?

Questions? Carlos López-Vázquez Laboratorio LatinGEO SGM+Universidad ORT del Uruguay September 15th, 2015 carloslopez@uni.ort.edu.uy
