Explore the theoretical analysis and solution of the hierarchical variational Bayes approach to linear inverse problems, comparing it to the James-Stein estimator, with an application in magnetoencephalography (MEG) and implications for relevance determination and future work.
Analytic Solution of Hierarchical Variational Bayes Approach in Linear Inverse Problem
Shinichi Nakajima (Nikon Corporation), Sumio Watanabe (Tokyo Institute of Technology)
Contents
• Introduction
  • Linear inverse problem
  • Hierarchical variational Bayes [Sato et al. 04]
  • James-Stein estimator
  • Purpose
• Theoretical analysis
  • Setting
  • Solution
• Discussion
• Conclusions
Linear inverse problem
Example: Magnetoencephalography (MEG). Model: $y = X a + \varepsilon$, where $y$ is the observable (the magnetic field detected by $N$ detectors), $X$ is the lead field matrix (a constant matrix), $a$ is the parameter to be estimated (the electric current at $M$ sites), and $\varepsilon$ is the observation noise. Since typically $M \gg N$, the problem is ill-posed!
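To make the ill-posedness concrete, here is a minimal NumPy sketch of the model above; the sizes ($N = 20$, $M = 100$) and all variable names are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 20, 100                      # hypothetical sizes: M sites >> N detectors
X = rng.standard_normal((N, M))     # stand-in for the lead field matrix
a_true = np.zeros(M)
a_true[:5] = rng.standard_normal(5)            # only a few sites carry current
y = X @ a_true + 0.1 * rng.standard_normal(N)  # noisy magnetic-field observation

# M > N: the system y = X a is underdetermined, so infinitely many
# currents a reproduce y exactly -- this is the ill-posedness.
```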
Methods for the ill-posed problem
Model: $y = X a + \varepsilon$. Prior: $a \sim \mathcal{N}(0, B^{-2})$.
1. Minimum-norm maximum likelihood.
2. Maximum a posteriori (MAP), where the prior covariance $B^{-2}$ is constant.
3. Hierarchical Bayes: $B^{-2}$ is also a parameter to be estimated!
Methods 1 and 2 behave similarly; method 3 is very different from 1 and 2.
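Continuing the sketch above, methods 1 and 2 in code; the noise variance and prior precision are assumed values for illustration only.

```python
# Method 1: minimum-norm maximum likelihood via the Moore-Penrose inverse.
a_mn = np.linalg.pinv(X) @ y

# Method 2: MAP with a fixed isotropic Gaussian prior a ~ N(0, I / b2);
# sigma2 (noise variance) and b2 (prior precision) are assumed values.
sigma2, b2 = 0.01, 1.0
a_map = np.linalg.solve(X.T @ X + sigma2 * b2 * np.eye(M), X.T @ y)
# As b2 -> 0, a_map approaches a_mn: methods 1 and 2 behave similarly.
```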
Hierarchical Bayes, a.k.a. Automatic Relevance Determination (ARD) [MacKay 94, Neal 96]
Model: $y = X a + \varepsilon$. Prior: $a \sim \mathcal{N}(0, B^{-2})$. Estimate $B^{-2}$ from the observation, introducing a hyperprior $p(B^{-2})$. If $a$ and $B^{-2}$ are estimated by Bayesian methods, many small elements become zero (relevance determination). See [9] if interested. Why? Singularities and hierarchy.
Hierarchical variational Bayes
But exact Bayes estimation requires huge computational costs, so apply variational Bayes (VB) [Sato et al. 04]. Variational method: minimize the free energy $F(q) = \int q(a, B^{-2}) \log \frac{q(a, B^{-2})}{p(y, a, B^{-2})} \, da \, dB^{-2}$ over a trial posterior $q$. Without restriction, the optimum is the Bayes posterior; VB imposes the restriction that the trial posterior factorizes, $q(a, B^{-2}) = q_a(a)\, q_B(B^{-2})$.
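A schematic fixed-point loop in the ARD spirit, continuing the earlier sketch. This is NOT Sato et al.'s exact update rule (which is not reproduced here), only a generic illustration of the alternating structure that the VB restriction induces.

```python
# Generic ARD-style alternating loop (NOT the paper's exact updates):
# alternate the Gaussian posterior of a (given per-element precisions
# beta) with a moment-matching update of beta.
beta = np.ones(M)
for _ in range(200):
    S = np.linalg.inv(X.T @ X / sigma2 + np.diag(beta))  # posterior covariance
    mu = S @ (X.T @ y) / sigma2                          # posterior mean
    beta = 1.0 / (mu ** 2 + np.diag(S))                  # precision update
# Irrelevant elements acquire huge beta (tiny prior variance), driving
# their posterior means to ~0: automatic relevance determination.
```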
James-Stein (JS) estimator [James & Stein 61]
$K$-dimensional mean estimation (regular model): samples $y_1, \dots, y_n \sim \mathcal{N}(\mu, I_K)$; ML estimator (arithmetic mean): $\hat\mu^{\mathrm{ML}} = \bar{y}$. Domination of estimator $a$ over estimator $b$: the risk of $a$ is no larger than that of $b$ for any true $\mu$, and strictly smaller for a certain true $\mu$. ML is efficient (never dominated by any unbiased estimator), but is inadmissible (dominated by a biased estimator) when $K \ge 3$ [Stein 56]. The JS estimator shrinks the sample mean toward the origin: $\hat\mu^{\mathrm{JS}} = \left(1 - \frac{K-2}{n \|\bar{y}\|^2}\right) \bar{y}$.
[Figure: ML vs. JS estimates for $K = 3$, showing the shrinkage factor pulling the estimate toward the origin relative to the true mean.]
A certain relation between empirical Bayes (EB) and JS was discussed in [Efron & Morris 73].
Purpose
[Sato et al. 04] derived a simple iterative algorithm based on HVB for the MEG application and experimentally showed good performance. We theoretically analyze HVB, derive its solution, and discuss a relation between HVB and the positive-part JS estimator, focusing on a simplified version of Sato's approach. Positive-part JS: $\hat\mu^{\mathrm{JS+}} = \max\!\left(0,\, 1 - \frac{K-2}{n \|\bar{y}\|^2}\right) \bar{y}$, where $\frac{K-2}{n \|\bar{y}\|^2}$ is the degree of shrinkage.
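A self-contained numerical sketch of the positive-part JS estimator above; the true mean and sample count below are arbitrary test values.

```python
import numpy as np

def positive_part_js(samples):
    """Positive-part James-Stein estimate of a K-dimensional mean (K >= 3)."""
    n, K = samples.shape
    ybar = samples.mean(axis=0)                    # ML estimator (arithmetic mean)
    shrinkage = (K - 2) / (n * np.sum(ybar ** 2))  # degree of shrinkage
    return max(0.0, 1.0 - shrinkage) * ybar        # shrink toward the origin

rng = np.random.default_rng(1)
mu_true = np.array([0.3, -0.2, 0.1])               # K = 3, as in the figure
samples = rng.normal(mu_true, 1.0, size=(50, 3))   # unit-variance observations
print(samples.mean(axis=0), positive_part_js(samples))
```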
Setting
Consider time-series data. ARD model: $y(u) = X a(u) + \varepsilon(u)$ for $u = 1, \dots, U$, with the ARD prior on $a(u)$ as above. Use a hyperparameter that is constant during the window $U$ [Sato et al. 04].
[Figure: the parameter $a'$ varies with time $u$, while the hyperparameter $b$ is held constant over each window of length $U$.]
Summary of setting
Observable: $y(u)$, $u = 1, \dots, U$. Parameter: $a$. Hyperparameter (constant during $U$): $B^{-2}$. $n$: number of samples. Constant matrix: $X$. Model: $y = X a + \varepsilon$. Priors: the $m$-th element $a_m$ is $d$-dimensional normal, $a_m \sim \mathcal{N}(0,\, b_m^{-2} I_d)$, where $I_d$ is the $d \times d$ identity matrix.
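Putting the pieces together, a compact restatement of the hierarchical model; $b_m$ is used here as the per-element hyperparameter symbol, an assumed notation consistent with $B^{-2}$ above.

```latex
\begin{align*}
  \text{Model:}      \quad & y(u) = X a(u) + \varepsilon(u), \qquad
                             \varepsilon(u) \sim \mathcal{N}(0, \sigma^2 I_N),
                             \quad u = 1, \dots, U, \\
  \text{Prior:}      \quad & a_m(u) \sim \mathcal{N}(0,\; b_m^{-2} I_d), \qquad
                             m = 1, \dots, M, \\
  \text{Hyperprior:} \quad & b_m^{-2} \sim p(b_m^{-2}),
                             \quad \text{constant during the window } U.
\end{align*}
```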
Variational condition
Restriction: the trial posterior factorizes, $q(a, B^{-2}) = q_a(a)\, q_B(B^{-2})$. Applying the variational method (stationarity of the free energy under this restriction) yields the variational condition that determines $q_a$ and $q_B$, each in terms of expectations under the other.
Theorem 1
Theorem 1: the VB estimator of the $m$-th element is given in an implicit form; it is not explicit! The HVB solution is similar to the positive-part JS estimator, with a degree of shrinkage proportional to $U$.
Proposition
Simply use the positive-part JS estimator, applied to each element of the least-squares solution. This only requires calculation of the Moore-Penrose inverse. (HVB needs an iterative calculation.)
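A sketch of the Proposition's recipe under the earlier setup: one pseudoinverse, then blockwise positive-part shrinkage. The block dimension `d` and the shrinkage constant `c` are placeholders ($d = 3$ for MEG dipoles); the paper's exact shrinkage constant is not reproduced here.

```python
def js_inverse(X, y, d=1, c=1.0):
    """Non-iterative estimate: Moore-Penrose inverse + blockwise shrinkage."""
    a_ls = np.linalg.pinv(X) @ y               # single pseudoinverse, no iteration
    blocks = a_ls.reshape(-1, d)               # one d-dimensional block per element
    factors = np.maximum(0.0, 1.0 - c / np.sum(blocks ** 2, axis=1))
    return (factors[:, None] * blocks).ravel() # positive-part shrinkage per block

a_js = js_inverse(X, y)  # reuses X, y from the first sketch
```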
Difference between VB and JS
• When the $x_m$'s (columns of $X$) are orthogonal: asymptotically equivalent.
• When all the $x_m$'s are parallel or orthogonal: JS suppresses overfitting more than HVB (enhances relevance determination).
• Otherwise: future work.
Conclusions & future work
• Conclusions
  • HVB provides results similar to JS estimation in the linear inverse problem.
  • The time duration $U$ affects learning (a large $U$ enhances relevance determination).
• Future work
  • Difference from JS in the general case.
  • Bounds on the generalization error.