
Mutation Operator Evolution for EA-Based Neural Networks


Presentation Transcript


  1. Mutation Operator Evolution for EA-Based Neural Networks By Ryan Meuth

  2. Reinforcement Learning [Diagram: the agent-environment loop. The Agent receives a State and a Reward from the Environment, maintains a State Value Estimate and an Action Policy, and returns an Action to the Environment.]

  3. Reinforcement Learning • Good for on-line learning where little is known about the environment • Easy to implement in discrete environments • A value estimate can be stored for each state • Given infinite time, convergence to the optimal policy is guaranteed • Hard to implement in continuous environments • Infinite states! The value function must be estimated • Neural networks can be used for function approximation
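
To make the discrete case concrete: the value estimate stored for each state really is just a table, updated from experience. A minimal sketch, assuming a TD(0)-style update; the state count, learning rate, and discount are illustrative values, not from the talk:

    # Tabular TD(0) value estimation for a small discrete environment.
    # NUM_STATES, alpha, and gamma are illustrative assumptions.
    NUM_STATES = 10
    alpha, gamma = 0.1, 0.9
    V = [0.0] * NUM_STATES  # one value estimate stored per state

    def td0_update(state, reward, next_state):
        """Move V[state] toward the bootstrapped target r + gamma * V[s']."""
        V[state] += alpha * (reward + gamma * V[next_state] - V[state])

    td0_update(3, 1.0, 4)  # e.g. a transition from state 3 to state 4, reward 1.0

    # A continuous state (say, a joint angle of 0.7183 rad) cannot index this
    # table, which is why a function approximator such as a neural network
    # is needed in continuous environments.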

  4. Neural Network Overview • Feed-forward neural network • Based on biological theories of neuron operation

  5. Feed-Forward Neural Network
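
A feed-forward network of the kind pictured can be sketched in a few lines; the layer sizes and the logistic-sigmoid activation are common textbook choices assumed here, since the slide does not specify them:

    import numpy as np

    def forward(x, layers):
        """Forward pass through a fully connected feed-forward network.
        `layers` is a list of (W, b) pairs; sigmoid activation assumed."""
        a = x
        for W, b in layers:
            a = 1.0 / (1.0 + np.exp(-(W @ a + b)))  # weighted sum, then squash
        return a

    # Example: a 2-3-1 network with random weights.
    rng = np.random.default_rng(0)
    layers = [(rng.standard_normal((3, 2)), rng.standard_normal(3)),
              (rng.standard_normal((1, 3)), rng.standard_normal(1))]
    print(forward(np.array([0.5, -0.2]), layers))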

  6. Recurrent Neural Network
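
The recurrent variant feeds the hidden state back in on the next time step, which is what lets it model the time-series problems used later in the testbed. A generic Elman-style step, not necessarily the exact architecture from the slides:

    import numpy as np

    def rnn_step(x, h, W_in, W_rec, b):
        """One step of a simple Elman-style recurrent network: the new hidden
        state depends on the current input and the previous hidden state."""
        return np.tanh(W_in @ x + W_rec @ h + b)

    # Example: feed a short time series through the recurrence.
    rng = np.random.default_rng(1)
    W_in, W_rec, b = rng.standard_normal((4, 1)), rng.standard_normal((4, 4)), np.zeros(4)
    h = np.zeros(4)
    for x_t in [0.1, 0.4, -0.3]:
        h = rnn_step(np.array([x_t]), h, W_in, W_rec, b)
    print(h)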

  7. Neural Network Overview • Traditionally trained with error back-propagation • BP uses samples to generalize to the problem • Few "unsupervised" learning methods exist • Problems with no samples: on-line learning • One example: Conjugate Reinforcement Back-Propagation
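
The sample-driven nature of back-propagation shows up directly in the update for a single sigmoid unit: every weight change is computed from one (input, target) pair. This is the plain delta rule, not the Conjugate Reinforcement Back-Propagation variant named above, and the learning rate is an assumed value:

    import numpy as np

    def bp_step(w, x, target, lr=0.5):
        """One gradient step for a single sigmoid unit on one sample."""
        y = 1.0 / (1.0 + np.exp(-w @ x))    # forward pass
        delta = (y - target) * y * (1 - y)  # d(MSE)/d(net) for a sigmoid
        return w - lr * delta * x           # weight update from this sample

    w = np.zeros(2)
    for _ in range(1000):
        w = bp_step(w, np.array([1.0, 1.0]), 1.0)  # learn to output 1 here
    print(w)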

  8. EA-NN • Works as both a supervised and an unsupervised learning method • Uses the network's weight set as the genome of an individual • The fitness function is mean-squared error over the target function • The mutation operator draws a sample from a Gaussian distribution • It is possible that this mutation operator is not the best choice
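
A minimal sketch of the EA-NN loop just described: the genome is the flattened weight vector, fitness is MSE over a target function, and mutation adds Gaussian noise. The tiny 1-3-1 network, the sine target, and the mutation sigma are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(2)
    xs = np.linspace(-1, 1, 20)
    target = np.sin(np.pi * xs)  # assumed target function for illustration

    def predict(genome, x):
        """Tiny 1-3-1 network whose 10 weights form the flat genome."""
        w1, b1, w2, b2 = genome[:3], genome[3:6], genome[6:9], genome[9]
        h = np.tanh(np.outer(x, w1) + b1)  # hidden layer: one row per input
        return h @ w2 + b2

    def mse(genome):
        return np.mean((predict(genome, xs) - target) ** 2)

    # Genome = weight set, fitness = MSE (minimized), mutation = additive
    # Gaussian noise -- the scheme the slide describes.
    pop = [rng.standard_normal(10) for _ in range(10)]
    for _ in range(200):
        children = [p + rng.normal(0.0, 0.1, size=10) for p in pop]  # Gaussian mutation
        pop = sorted(pop + children, key=mse)[:10]                   # truncation survival
    print(f"best MSE: {mse(pop[0]):.4f}")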

  9. Uh… Why? • Could improve EA-NN efficiency • Faster on-line learning • A revamped tool for reinforcement learning • Smarter robots • Why use an EA? • Knowledge-independent

  10. Experimental Implementation • First tier: genetic programming • Each individual is a parse tree representing a mutation operator • Fitness is the inverse of the sum of MSEs from the EA testbed • Second tier: EA testbed • 4 EAs, spanning 2 classes of problems • 2 feed-forward non-linear approximations • 1 high-order, 1 low-order • 2 recurrent time-series predictions • 1 time-delayed, 1 not time-delayed
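
The two tiers fit together as a simple outer loop: each GP individual (a candidate mutation operator) is scored by how well the EA testbed performs when using it. A scaffold sketch; run_ea_nn is a hypothetical stand-in for one testbed run, and the problem names are invented labels for the four EAs listed above:

    # Outer tier: score one evolved mutation operator by running it through
    # the four-EA testbed described above. `run_ea_nn` is a hypothetical
    # stand-in that runs one EA-NN and returns its final MSE.
    TESTBED = ["ff_low_order", "ff_high_order",    # feed-forward approximations
               "ts_time_delayed", "ts_undelayed"]  # recurrent time series

    def gp_fitness(mutation_operator, num_nodes, run_ea_nn):
        avg_mse = sum(run_ea_nn(problem, mutation_operator)
                      for problem in TESTBED) / len(TESTBED)
        # Matches the fitness on the next slide: 1000/(AvgMSE) - num_nodes,
        # i.e. lower testbed error and smaller trees are both rewarded.
        return 1000.0 / avg_mse - num_nodes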

  11. GP Implementation • Functional set: {+, -, *, /} • Terminal set: • the weight to be modified • a random constant • a uniform random variable • Over-selection: 80% of parents from the top 32% • Rank-based survival • Initialized by the grow method (max depth of 8) • Fitness: 1000/(AvgMSE) - num_nodes • P(recombination) = 0.5; P(mutation) = 0.5 • Repair function • 5 runs, 100 generations each • Steady state: population of 1000 individuals, 20 children per generation
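
The parse-tree representation maps naturally onto nested tuples: interior nodes drawn from the functional set {+, -, *, /}, leaves from the terminal set (the weight being modified, a random constant, or a uniform random variable). A sketch with grow initialization to depth 8 as on the slide; the terminal probability, constant range, and protected division are standard GP conventions assumed here:

    import random

    FUNCS = ['+', '-', '*', '/']  # functional set from the slide

    def grow(depth=8):
        """Grow-method initialization (max depth 8): interior nodes are
        functions, leaves are terminals."""
        if depth == 0 or random.random() < 0.3:  # terminal probability assumed
            # Terminal set: weight to modify, random constant, U(0,1) variable
            return random.choice(['w', ('const', random.uniform(-1, 1)), 'rand'])
        return (random.choice(FUNCS), grow(depth - 1), grow(depth - 1))

    def evaluate(node, w):
        """Apply an evolved mutation-operator tree to weight `w`."""
        if node == 'w':
            return w
        if node == 'rand':
            return random.random()
        if isinstance(node, tuple) and node[0] == 'const':
            return node[1]
        op, left, right = node
        a, b = evaluate(left, w), evaluate(right, w)
        if op == '+': return a + b
        if op == '-': return a - b
        if op == '*': return a * b
        return a / b if b != 0 else 1.0  # protected division (standard GP practice)

    tree = grow()
    print(evaluate(tree, 0.25))  # mutated value for a weight of 0.25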

  12. EA-NN Implementation • Recombination: multi-point crossover • Mutation: provided by the GP • Fitness: MSE over the test function (minimized) • P(recombination) = 0.5; P(mutation) = 0.5 • Non-generational: population of 10 individuals, 10 children per generation • 50 runs of 50 generations
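
Multi-point crossover over two weight-vector genomes can be sketched as below; the number of crossover points is an assumption, since the slide does not specify it:

    import numpy as np

    rng = np.random.default_rng(3)

    def multipoint_crossover(p1, p2, n_points=3):
        """Cut both parents at the same random points and alternate which
        parent each segment of the child's weights is copied from."""
        cuts = sorted(rng.choice(np.arange(1, len(p1)), size=n_points, replace=False))
        child, take_first, prev = [], True, 0
        for cut in cuts + [len(p1)]:
            child.extend((p1 if take_first else p2)[prev:cut])
            take_first, prev = not take_first, cut
        return np.array(child)

    a, b = np.zeros(10), np.ones(10)
    print(multipoint_crossover(a, b))  # alternating segments of 0s and 1s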

  13. Results (preliminary) • Runs are still in progress; this is where results will go • Single uniform random variable (baseline fitness): ~380 • Observed evolved individuals: ~600 • An improvement! Just have to wait and see…

  14. Conclusions • I don’t know anything yet.

  15. Questions? Thank You!
