
Deep Learning


Presentation Transcript


  1. Deep Learning Roger S. Gaborski

  2. Visual System The visual cortex is defined in terms of hierarchical regions: V1 → V2 → V3 → V4 → V5 → MST. Some regions may be bypassed, depending on the features being extracted. The visual input becomes more abstract as the signals are processed by individual regions. Roger S. Gaborski

  3. Multiple Stages of Processing [Diagram: INPUT DATA → Layer1 → Layer2 → Layer3 → OUTPUT DATA, with extracted features at each layer and an increasing level of abstraction] Roger S. Gaborski

  4. Traditional Neural Networks • Typically 2 layers: one hidden layer and one output layer • Uncommon to have more than 3 layers • INPUT and TARGET training data • Backward Error Propagation (BEP) becomes ineffective with more than 3 layers [Diagram: network mapping INPUT to TARGET VALUE] REF: www.nd.com Roger S. Gaborski

  5. Multilayer Neural Network • Build a 6-layer feed-forward neural network • Train with a common training algorithm • RESULT: Failure [Diagram: INPUT DATA → Layer1 → Layer2 → Layer3 → OUTPUT DATA, with question marks over the intermediate layers] Roger S. Gaborski

  6. Deep Belief Networks • Need an approach that will allow training layer by layer – BUT I don't know the output of each layer • Hinton (2006), "A fast learning algorithm for deep belief networks" • Restricted Boltzmann Machine – a single layer of hidden neurons, not connected to each other • A fast algorithm that can find parameters even for deep networks (Contrastive Divergence Learning) • → Can Evolutionary Algorithms be used to evolve a network? Roger S. Gaborski
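For reference, a rough sketch of one Contrastive Divergence (CD-1) weight update for a Restricted Boltzmann Machine, as commonly described following Hinton (2006). Bias terms are omitted, the learning rate lr is an assumed value, and this is not the evolutionary approach developed in the rest of the deck:

    % One CD-1 update for an RBM (illustrative sketch, biases omitted)
    v  = double(rand(1, 600) > 0.5);        % example binary input vector (1 x 600)
    W  = 0.1 * randn(600, 400);             % weight matrix (600 x 400)
    lr = 0.1;                               % assumed learning rate
    sigmoid = @(x) 1 ./ (1 + exp(-x));

    h_prob   = sigmoid(v * W);                       % hidden activation probabilities
    h_sample = double(h_prob > rand(size(h_prob)));  % stochastic hidden states
    v_recon  = sigmoid(h_sample * W');               % reconstruction of the input
    h_recon  = sigmoid(v_recon * W);                 % hidden probabilities from the reconstruction

    W = W + lr * (v' * h_prob - v_recon' * h_recon); % positive phase minus negative phase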

  7. One-Layer Example of Neural Network Architecture [Diagram: input vector v (600 input neurons) connected through weight matrix W (600 x 400) to feature vector h (400 hidden neurons)] v[1x600] * W[600x400] = h[1x400] Roger S. Gaborski
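A minimal MATLAB sketch of this single-layer mapping; the random v and W below are only placeholders for a real input vector and an evolved weight matrix:

    v = double(rand(1, 600) > 0.5);   % example binary input vector (1 x 600)
    W = 0.1 * randn(600, 400);        % candidate weight matrix (600 x 400)
    h = v * W;                        % feature vector: size(h) is [1 400]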

  8. How Do We Find Weight Matrix W? • We need a 'measure of error' • One approach is to reconstruct the input vector v using the following equation: v_reconstruct[1x600] = h[1x400] * W^T[400x600] • The difference between the reconstructed v and the original v is a measure of error: Err = Σ (v_reconstruct – v)^2 Roger S. Gaborski
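The reconstruction and its squared error can be packaged as a small fitness function; this is a minimal sketch with an illustrative function name:

    function err = reconstruction_error(v, W)
    % Reconstruction error for one input vector v (1 x 600) and candidate W (600 x 400)
        h             = v * W;                       % extract features (1 x 400)
        v_reconstruct = h * W';                      % map features back to the input space (1 x 600)
        err           = sum((v_reconstruct - v).^2); % summed squared reconstruction error
    end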

  9. SUMMARY: Goal • Propagate input vector to hidden units (Feed forward) • Propagate features extracted by hidden layer back to input neurons (Feed backwards) • Goal: Input vector and reconstructed input vector equivalent (Input = Reconstructed Input) • Use an evolutionary strategy approach to find W • This approach allows for any type of activation function and any network topology Roger S. Gaborski

  10. Use an Evolutionary Algorithm to Find W [Diagram: 600 input neurons connected through weight matrix W (600 x 400) to 400 hidden neurons] Roger S. Gaborski

  11. Evolutionary Strategy ES(lambda+mu) • Lambda: size of the population • Mu: number of fittest individuals selected from the population to create the new population • Let lambda = 20, mu = 5 • Each selected fittest individual will create lambda/mu children (20/5 = 4) • The size of the new population will remain at 25 Roger S. Gaborski
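A minimal sketch of the selection arithmetic with the values from this slide; the placeholder err vector stands in for the reconstruction errors of the current candidate weight matrices:

    lambda = 20;  mu = 5;                    % values from the slide
    children_per_parent = lambda / mu;       % each selected parent creates 20/5 = 4 children

    err = rand(1, lambda + mu);              % placeholder reconstruction errors for the candidates
    [~, order] = sort(err, 'ascend');        % smaller reconstruction error = fitter
    parent_idx = order(1:mu);                % indices of the mu fittest weight matrices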

  12. Population • Randomly create the first population of potential W solutions: Current_population(:,:,k) = .1*randn([num_v,num_h]) • Evaluate each weight matrix W in the population and rank the W matrices • Select the mu fittest weight matrices. These will be used to create children (new potential solutions) • Create a population of the mu fittest weight matrices plus lambda/mu children for each of them • The population increases from lambda to lambda+mu, but the best error is a monotonically non-increasing function (the fittest W is always retained) • Keep track of the fittest W matrix Roger S. Gaborski
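Putting the slide's steps together, one possible MATLAB sketch of the evolutionary loop is shown below. It reuses the hypothetical reconstruction_error helper from slide 8, and the mutation rule (Gaussian perturbation with an assumed step size sigma) is not specified on the slides:

    num_v = 600;  num_h = 400;
    lambda = 20;  mu = 5;
    sigma = 0.05;                                     % assumed mutation step size
    v = double(rand(1, num_v) > 0.5);                 % example binary training vector

    % Randomly create the first population of potential W solutions
    Current_population = zeros(num_v, num_h, lambda);
    for k = 1:lambda
        Current_population(:,:,k) = 0.1 * randn(num_v, num_h);
    end

    best_err = inf;  best_W = [];
    for epoch = 1:50                                  % e.g. 50 epochs
        % Evaluate and rank each weight matrix in the population
        pop_size = size(Current_population, 3);
        err = zeros(1, pop_size);
        for k = 1:pop_size
            err(k) = reconstruction_error(v, Current_population(:,:,k));
        end
        [~, order] = sort(err);                       % ascending: smallest error first

        % Keep track of the fittest W matrix
        if err(order(1)) < best_err
            best_err = err(order(1));
            best_W   = Current_population(:,:,order(1));
        end

        % New population: mu fittest parents plus lambda/mu mutated children each
        New_population = zeros(num_v, num_h, mu + lambda);
        idx = 1;
        for p = order(1:mu)
            parent = Current_population(:,:,p);
            New_population(:,:,idx) = parent;  idx = idx + 1;                    % retain the parent
            for c = 1:(lambda/mu)
                New_population(:,:,idx) = parent + sigma * randn(num_v, num_h);  % mutated child
                idx = idx + 1;
            end
        end
        Current_population = New_population;          % grows from lambda to lambda + mu
    end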

  13. Final Selection of the W Matrix [Diagram: 600 input neurons (data) connected through the best weight matrix W, in terms of smallest reconstruction error, to 400 hidden neurons (h)] Roger S. Gaborski

  14. Examples from the Simple Digit Problem

  15. Results for Binary Digit Problem Epochs = 50 BEST W AFTER 50 EPOCHS Roger S. Gaborski

  16. 50 Epochs Roger S. Gaborski

  17. Sample Results for Digit Problem Epochs = 500 BEST W AFTER 500 EPOCHS Roger S. Gaborski

  18. Sample Results for Digit Problem Epochs = 5000 BEST W AFTER 5000 EPOCHS Roger S. Gaborski

  19. Results for Digit Problem Epochs = 50,000 BEST W AFTER 50000 EPOCHS Roger S. Gaborski

  20. Results for Digit Problem Epochs = 50,000 BEST W AFTER 50000 EPOCHS Roger S. Gaborski

  21. Repeat Process with a Second W, Using the 400 Features as Input [Diagram: 600 input neurons → best weight matrix W (smallest reconstruction error) → 400 features → weight matrix W2 → 300 features] Evolve W2 using the 400 features as input Roger S. Gaborski
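A minimal sketch of stacking the second layer: the 400-dimensional feature vectors from the first layer become the training data for W2, and candidate W2 matrices are ranked by the same reconstruction-error fitness. The random v, W and W2 below are placeholders for real data and evolved matrices:

    % Layer 1 (already evolved): 600 inputs -> 400 features
    v  = double(rand(1, 600) > 0.5);     % example input vector
    W  = 0.1 * randn(600, 400);          % first-layer weights (in practice the evolved W)
    h1 = v * W;                          % 1 x 400 feature vector

    % Layer 2: evolve W2 (400 x 300) so that h1 can be reconstructed from its own features
    W2       = 0.1 * randn(400, 300);    % one candidate W2 from the new population
    h2       = h1 * W2;                  % 1 x 300 second-layer features
    h1_recon = h2 * W2';                 % reconstruction of the layer-1 features
    err2     = sum((h1_recon - h1).^2);  % fitness used to rank candidate W2 matrices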

  22. Face Recognition The same approach is used to recognize faces. [Diagram: 625 input neurons connected through weight matrix W (625 x 400) to 400 hidden neurons] Roger S. Gaborski

  23. 5000 Epochs, lambda = 20, mu = 5; 20 Faces in Training Data; Grayscale, 25x25 Pixels Roger S. Gaborski

  24. 5000 Epochs, Typical Results Roger S. Gaborski

  25. Successfully Reconstruct Images from Features Roger S. Gaborski

  26. Random Data Roger S. Gaborski

  27. Face Classifier TRAIN ON FACES and NON-FACES [Diagram: 625 input neurons → hidden neurons (feature matrices W1, W2 … W∞) → matrix V, R → two output neurons: FACE and NON-FACE] Roger S. Gaborski
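A hedged sketch of the classifier's forward pass: stacked feature matrices followed by an output matrix with two output neurons. The layer sizes, the name V for the output matrix, and the softmax readout are illustrative assumptions, not details given on the slide:

    % Feature extraction through stacked weight matrices, then a 2-neuron readout
    v  = double(rand(1, 625) > 0.5);      % example 25x25 image flattened to 625 inputs
    W1 = 0.1 * randn(625, 400);           % first feature matrix (in practice, evolved)
    W2 = 0.1 * randn(400, 300);           % second feature matrix (in practice, evolved; size assumed)
    V  = 0.1 * randn(300, 2);             % output matrix mapping features to the two output neurons

    h1 = v * W1;                          % first-layer features
    h2 = h1 * W2;                         % second-layer features
    scores = h2 * V;                      % activations of the FACE / NON-FACE output neurons

    p = exp(scores) ./ sum(exp(scores));  % softmax over the two outputs (assumed readout)
    [~, label] = max(p);                  % label 1 = FACE, label 2 = NON-FACE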

  28. Face Detection Note: the face is not in the original training data. The red mark indicates the upper left-hand corner of the face. Roger S. Gaborski
