A Constrained Optimization Approach For Function Approximations and System Identification
Gianluca Di Muro and Silvia Ferrari
Lab for Intelligent Systems and Controls (LISC), Duke University
WCCI 2008, June 4, 2008
Motivation • Sigmoidal Neural Networks provide excellent universal function approximation for multivariate input/output spaces on a compact set • They suffer from a serious limitation, known as interference, when they learn multivariate mappings sequentially: new knowledge may completely erase previous knowledge • Interference may jeopardize the use of ANNs
LTM and STM • Long-Term Memory (LTM) refers to prior knowledge that must be preserved by the ANN at all times • Short-Term Memory (STM) refers to new information that is acquired incrementally and need not be consolidated into LTM • CPROP enforces the LTM constraints, which constitute the previous knowledge, by dedicating as many LTM connections as there are constraints • The constraints are satisfied analytically, while STM knowledge is acquired incrementally
Problem Assumptions • Suppose we want to approximate a multi-dimensional function, known through a training set of input/output samples • Assume that the LTM can be expressed by a training set of input/output samples and, possibly, derivative information (see the sketch below)
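A minimal sketch of the setting in notation; the symbols h, q, n, T_STM, and T_LTM are illustrative assumptions, not taken from the original slide:

h : \mathbb{R}^q \to \mathbb{R}^n
T_{\mathrm{STM}} = \{ (\mathbf{p}_k, \mathbf{z}_k) : \mathbf{z}_k = h(\mathbf{p}_k),\; k = 1, \dots, r \}
T_{\mathrm{LTM}} = \{ (\mathbf{x}_l, \mathbf{y}_l) : \mathbf{y}_l = h(\mathbf{x}_l),\; l = 1, \dots, m \}, \quad \text{possibly with } \partial h / \partial \mathbf{x} \,\big|_{\mathbf{x}_l}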
Problem Formulation: Constrained Optimization • The network connections are partitioned into LTM and STM synaptic connections, with LTM and STM weights, respectively • Let the error vectors relative to the new data (STM) that we want to assimilate be indexed by j = 1, 2, …, n and k = 1, 2, …, r • CPROP training may then be stated as: minimize the STM error, subject to the LTM constraints (see the sketch below) • S. Ferrari and M. Jensenius, "A Constrained Optimization Approach to Preserving Prior Knowledge During Incremental Training," IEEE Trans. Neural Netw., vol. 19, no. 6, pp. 996-1009, June 2008
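A hedged sketch of the constrained training problem, where w_S and w_L stand for the STM and LTM weights and c(·) for the LTM equality constraints; these symbol names are assumptions for illustration, not necessarily the paper's notation:

\min_{\mathbf{w}_S} \; V(\mathbf{w}_S) = \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{r} e_{jk}^{2}, \qquad e_{jk} = z_{jk} - \hat{z}_{jk}(\mathbf{w}_S, \mathbf{w}_L)
\text{subject to} \quad \mathbf{c}(\mathbf{w}_L, \mathbf{w}_S) = \mathbf{0} \quad \text{(LTM memory equations)}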
Derivation of the Adjoined Gradient • Output equation • Memory equation • Output with memory • Adjoined gradient (see the sketch below)
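One illustrative way to read the steps above, using placeholder symbols rather than the paper's exact notation: the memory equation is solved analytically for the LTM output weights, which are substituted into the output equation; the gradient with respect to the STM weights then picks up an extra chain-rule term, the adjoined gradient.

\hat{\mathbf{z}} = \mathbf{N}_S \mathbf{v}_S + \mathbf{N}_L \mathbf{v}_L \quad \text{(output equation)}
\mathbf{C}\, \mathbf{v}_L = \mathbf{y} - \mathbf{D}\, \mathbf{v}_S \;\Rightarrow\; \mathbf{v}_L = \mathbf{C}^{-1} (\mathbf{y} - \mathbf{D}\, \mathbf{v}_S) \quad \text{(memory equation, output with memory)}
\frac{dV}{d\mathbf{w}_S} = \frac{\partial V}{\partial \mathbf{w}_S} + \frac{\partial V}{\partial \mathbf{v}_L} \, \frac{\partial \mathbf{v}_L}{\partial \mathbf{w}_S} \quad \text{(adjoined gradient)}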
NN Application Examples • NN architecture: single-hidden-layer sigmoidal network with inputs p1, …, pq, hidden nodes n1, …, ns with input weights w11, …, wsq and biases d1, …, ds, output weights v1, …, vs, and output z (see the sketch below) • The LM (Levenberg-Marquardt) algorithm has been implemented for training • Examples: function approximation, solution of ODEs, system identification
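A minimal, runnable Python (NumPy) sketch of the slide's architecture; the names W, d, v, b are illustrative and the weights are random, not trained:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nn_output(p, W, d, v, b):
    # q inputs p, s sigmoidal hidden nodes with input weights W (s x q) and
    # biases d, linear output weights v and output bias b, scalar output z
    n = sigmoid(W @ p + d)
    return v @ n + b

# Example with q = 2 inputs and s = 5 hidden nodes (random, untrained weights)
rng = np.random.default_rng(0)
q, s = 2, 5
W, d, v = rng.normal(size=(s, q)), rng.normal(size=s), rng.normal(size=s)
print(nn_output(np.array([0.3, -0.7]), W, d, v, 0.0))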
NN Function Approximation Results*
*J. Mandziuk and L. Shastri, "Incremental class learning - an approach to longlife and scalable learning," in Proc. IJCNN, 1999
NN Function Approximation, CPROP Results
NN Solution of ODEs on a given domain • The memory equation is given by the boundary/initial conditions • Define a grid on the domain in order to approximate the RHS • Obtain training data samples from the ODE • Minimize the MSE on the RHS, subject to the memory equation (see the sketch below) • Relevant application: when an analytic approximation is needed
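A rough Python sketch of the recipe above for an illustrative first-order ODE y' = -y on [0, 1] with y(0) = 1; note that, unlike CPROP, this sketch treats the memory (boundary-condition) equation as just another least-squares residual rather than enforcing it analytically:

import numpy as np
from scipy.optimize import least_squares

sig = lambda x: 1.0 / (1.0 + np.exp(-x))

def nn(t, W, d, v):
    return v @ sig(W * t + d)              # scalar NN output y(t)

def nn_t(t, W, d, v):
    a = sig(W * t + d)
    return v @ (a * (1.0 - a) * W)         # dy/dt from the sigmoid derivative

grid = np.linspace(0.0, 1.0, 21)           # grid on the domain for the residual

def residuals(params, s=6):
    W, d, v = params[:s], params[s:2*s], params[2*s:]
    ode = np.array([nn_t(t, W, d, v) + nn(t, W, d, v) for t in grid])  # y' + y = 0
    bc = nn(0.0, W, d, v) - 1.0            # memory equation y(0) = 1
    return np.append(ode, bc)

sol = least_squares(residuals, np.random.default_rng(1).normal(size=18))
print("max |residual|:", np.abs(residuals(sol.x)).max())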
NN System ID Application • Pendulum with friction, with parameters a > 0, b > 0; the states are the angular displacement and the angular velocity • An NN is used to approximate the system dynamics • LTM: linearized equations near the equilibria • STM: state-space equation 'far' from the equilibria • To obtain the state vector, the NN output is integrated using RK4 (see the sketch below)
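A short Python sketch of the integration step; here a damped-pendulum equation with a, b > 0 stands in for the trained network (on the slide, f would be the NN approximation of the state-space dynamics):

import numpy as np

def rk4_step(f, x, dt):
    # one classical Runge-Kutta 4 step for x' = f(x)
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

a, b = 0.2, 1.0                            # friction and restoring parameters, a, b > 0
f = lambda x: np.array([x[1], -a * x[1] - b * np.sin(x[0])])

# state = (angular displacement, angular velocity); integrate to get a trajectory
x, dt, traj = np.array([2.5, 0.0]), 0.01, []
for _ in range(2000):
    traj.append(x)
    x = rk4_step(f, x, dt)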
NN System ID showing interference Unconstrained NN, showing interference near unstable equilibria
NN System ID trained using CPROP NN trained using CPROP preserves its LTM virtually intact
NN System ID Accuracy of the Solution The accuracy of the LTM preserved by CPROP, shown by zooming in on the phase portraits about the equilibria
Conclusions • CPROP is capable of suppressing interference, which jeopardizes the use of NNs in many applications, through the adoption of the adjoined gradient • Numerical results show excellent generalization and extrapolation properties • Future work: extension to PDEs • Acknowledgment: National Science Foundation (ECS 0300236, CAREER '05)