
ERROR ENTROPY, CORRENTROPY AND M-ESTIMATION






Presentation Transcript


  1. ERROR ENTROPY, CORRENTROPY AND M-ESTIMATION Weifeng Liu, P. P. Pokharel, J. C. Principe CNEL, University of Florida weifeng@cnel.ufl.edu Acknowledgment: This work was partially supported by NSF grant ECS-0300340 and ECS-0601271.

  2. Outline • Maximization of correntropy criterion (MCC) • Minimization of error entropy (MEE) • Relation between MEE and MCC • Minimization of error entropy with fiducial points • Experiments

  3. Supervised learning • Desired signal D • System output Y • Error signal E = D − Y

  4. Supervised learning • The goal in supervised training is to bring the system output ‘close’ to the desired signal. • The concept of ‘close’ implicitly or explicitly employs a distance function or similarity measure. • Equivalently, the goal is to minimize the error in some sense. • For instance, the mean squared error (MSE), sketched below.
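For reference, a minimal sketch of the MSE cost over N error samples:

```latex
e_i = d_i - y_i, \qquad
\mathrm{MSE}(E) = \frac{1}{N}\sum_{i=1}^{N} e_i^{2}
```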

  5. Maximization of Correntropy Criterion • The correntropy of the desired signal and the system output, V(D,Y), is estimated from the samples with a kernel of size σ (the estimator is sketched below).
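A sketch of the sample estimator, assuming the standard Gaussian kernel of size σ used throughout the correntropy literature:

```latex
\hat{V}_{N,\sigma}(D,Y) = \frac{1}{N}\sum_{i=1}^{N}\kappa_\sigma(d_i - y_i),
\qquad
\kappa_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{x^{2}}{2\sigma^{2}}\right)
```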

  6. Correntropy induced metric • Define the correntropy induced metric CIM(D,Y) (a reconstruction is sketched below). • CIM satisfies the following properties: • Non-negativity • Identity of indiscernibles • Symmetry • Triangle inequality
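A sketch of the CIM definition, consistent with the sample correntropy estimator above:

```latex
\mathrm{CIM}(D,Y) = \Big(\kappa_\sigma(0) - \hat{V}_{N,\sigma}(D,Y)\Big)^{1/2}
```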

  7. CIM contours • Contours of CIM(E,0) in the 2D sample space: • close to the origin, CIM behaves like the L2 norm • at intermediate distances, like the L1 norm • far apart, it saturates for large-valued components (and is direction sensitive)

  8. MCC is minimization of CIM • Maximizing the correntropy criterion is equivalent to minimizing CIM(D,Y), as sketched below.
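Since κ_σ(0) is a constant, the equivalence follows directly from the CIM sketch above:

```latex
\max_{Y}\,\hat{V}_{N,\sigma}(D,Y)
\;\Longleftrightarrow\;
\min_{Y}\,\mathrm{CIM}(D,Y)
```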

  9. MCC is M-estimation • Maximizing correntropy is equivalent to an M-estimation problem; the induced loss function ρ is sketched below.
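One way to write the equivalent M-estimation loss, assuming the Gaussian kernel above (a Welsch-type loss; a reconstruction, not necessarily the slide's exact notation):

```latex
\max_{Y}\hat{V}_{N,\sigma}(D,Y)
\;\Longleftrightarrow\;
\min_{Y}\sum_{i=1}^{N}\rho(e_i),
\qquad
\rho(e) = \frac{1}{\sqrt{2\pi}\,\sigma}\Big(1 - \exp\!\big(-\tfrac{e^{2}}{2\sigma^{2}}\big)\Big)
```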

  10. Minimization of Error Entropy • Renyi's quadratic error entropy is estimated from the error samples (sketched below). • The argument of the logarithm is the Information Potential (IP).
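A sketch of the entropy estimator and the information potential (the kernel-size bookkeeping that comes from the Parzen convolution is omitted for brevity):

```latex
H_2(E) = -\log\!\int p_E^{2}(e)\,de,
\qquad
\widehat{H}_2(E) = -\log \mathrm{IP}(E),
\qquad
\mathrm{IP}(E) = \frac{1}{N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}\kappa_\sigma(e_i - e_j)
```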

  11. Relation between MEE and MCC • Define and construct the quantities that link the two estimators (see the sketch after the next slide).

  12. Relation between MEE and MCC
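The two estimators above make the relation visible: the information potential is the sample correntropy evaluated on all pairwise error differences, so MEE on E behaves like MCC applied to the vector of pairwise differences (a reconstruction of the missing slide equations):

```latex
\mathrm{IP}(E) = \frac{1}{N^{2}}\sum_{i,j}\kappa_\sigma(e_i - e_j)
= \hat{V}_{N^{2},\sigma}\big(\tilde{E}, 0\big),
\qquad
\tilde{E} = \{\,e_i - e_j : i,j = 1,\dots,N\,\}
```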

  13. IP induced metric • Define the IP induced metric IPM(E,0) (a reconstruction is sketched below). • IPM is only a pseudo-metric: it does not satisfy the identity of indiscernibles.
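A plausible reconstruction of the IPM definition, built analogously to CIM; note that it vanishes whenever all errors are equal, not only when they are all zero, which is exactly the missing identity of indiscernibles:

```latex
\mathrm{IPM}(E,0) = \Big(\kappa_\sigma(0) - \mathrm{IP}(E)\Big)^{1/2}
```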

  14. IPM contours • Contours of IPM(E,0) in the 2D sample space: • a valley along e1 = e2, so IPM is not sensitive to the error mean • it saturates for points far from the valley

  15. MEE and its equivalences • Minimizing the error entropy is equivalent to maximizing the information potential, which in turn is equivalent to minimizing IPM(E,0).

  16. MEE is M-estimation • Assume the error PDF is estimated by Parzen windowing with the kernel κ_σ; then MEE can be written as an M-estimation problem (sketched below).
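A hedged reconstruction: writing the information potential as the average of the Parzen PDF estimate evaluated at the error samples turns MEE into an M-estimation problem whose loss is shaped by the error distribution itself:

```latex
\mathrm{IP}(E) = \frac{1}{N}\sum_{i=1}^{N}\hat{p}_E(e_i),
\qquad
\hat{p}_E(e) = \frac{1}{N}\sum_{j=1}^{N}\kappa_\sigma(e - e_j),
\qquad
\max\,\mathrm{IP}(E) \;\Longleftrightarrow\; \min\sum_{i=1}^{N}\rho(e_i),
\;\; \rho(e) = \kappa_\sigma(0) - \hat{p}_E(e)
```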

  17. Nuisance of conventional MEE • How should the location of the error PDF be determined, given that the entropy cost is shift-invariant? • Conventionally, by making the error mean equal to zero. • When the error PDF is non-symmetric or has heavy tails, the estimation of the error mean is problematic. • Fixing the error peak at the origin is better than the conventional method of shifting the error to zero mean.

  18. ERROR ENTROPY WITH FIDUCIAL POINTS • In supervised training we want most of the errors to equal zero, • so we minimize the error entropy with respect to 0. • Denote by E the error vector; e0 = 0 serves as a point of reference (a fiducial point at the origin).

  19. ERROR ENTROPY WITH FIDUCIAL POINTS • In general, we have a cost that weights the MCC and MEE terms by a constant λ (see the sketch after the next slide).

  20. ERROR ENTROPY WITH FIDUCIAL POINTS • λ is a weighting constant between 0 and 1; it controls how much weight the fiducial points at the origin receive. • λ = 0 gives MEE. • λ = 1 gives MCC. • 0 < λ < 1 gives Minimization of Error Entropy with Fiducial points (MEEF).
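A sketch of the combined cost, reconstructed so that the limits match the slide (λ = 0 recovers MEE, λ = 1 recovers MCC); the exact normalization used by the authors may differ:

```latex
J_{\mathrm{MEEF}}(E) \;=\;
\lambda\,\frac{1}{N}\sum_{i=1}^{N}\kappa_\sigma(e_i)
\;+\;(1-\lambda)\,\frac{1}{N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}\kappa_\sigma(e_i - e_j)
```

The cost is maximized: the first term is the MCC (fiducial point) term, the second is the information potential.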

  21. ERROR ENTROPY WITH FIDUCIAL POINTS • The MCC term locates the main peak of the error PDF and fixes it at the origin, even in cases where estimating the error mean is not robust. • Unifying the two cost functions retains the merits of both: outlier resistance and resilience to the choice of kernel size.

  22. Metric induced by MEEF • The MEEF cost induces a well-defined metric • which is direction sensitive: • it favors errors with the same sign • and penalizes errors with different signs

  23. Experiment 1: Robust regression • X: input variable • f: unknown function • N: noise • Y: observation • Noise PDF
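A minimal, illustrative sketch of MCC-based robust regression in the spirit of this experiment. The slide's actual noise PDF, model class, and training settings are not given here, so the toy data, kernel size, and step size below are assumptions:

```python
import numpy as np

def gaussian_kernel(e, sigma):
    """Gaussian kernel used by the sample correntropy estimator."""
    return np.exp(-e ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def fit_linear_mcc(x, d, sigma=1.0, lr=0.2, n_iters=2000):
    """Fit d ~ w*x + b by gradient ascent on V(D, Y) = mean(kappa_sigma(d - y))."""
    # Initialize from ordinary least squares, then refine with the MCC cost.
    A = np.column_stack([x, np.ones_like(x)])
    w, b = np.linalg.lstsq(A, d, rcond=None)[0]
    for _ in range(n_iters):
        e = d - (w * x + b)                              # error samples
        g = gaussian_kernel(e, sigma) * e / sigma ** 2   # per-sample factor; outliers get exponentially small weight
        w += lr * np.mean(g * x)                         # ascend the sample correntropy
        b += lr * np.mean(g)
    return w, b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, 200)
    d = 2.0 * x + 1.0 + 0.1 * rng.normal(size=200)       # assumed toy model
    d[:20] += 8.0                                        # a few large outliers
    print(fit_linear_mcc(x, d))                          # close to (2.0, 1.0) despite the outliers
```

The same gradient-ascent scheme extends to the MEEF cost by adding the pairwise (information potential) term sketched earlier.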

  24. Regression results

  25. Experiment 2: Chaotic signal prediction • Mackey-Glass chaotic time series with delay parameter τ = 30 • time delayed neural network (TDNN) • 7 inputs • 14 hidden PEs • tanh nonlinearity • 1 linear output

  26. Training error PDF

  27. Conclusions • Established connections between MEE, distance functions and M-estimation. • Theoretically explained the robustness of this family of cost functions. • Unified MEE and MCC in the framework of information theoretic models. • Proposed a new cost function, minimization of error entropy with fiducial points (MEEF), which solves the shift-invariance problem of MEE in an elegant and robust way.
