
Exploring the Principle of Presence for Lifelong Learning in Neural Networks

This work investigates the "Presence" heuristic in knowledge-structured neural networks, focusing on how concepts are memorized and learned. The Principle of Presence holds that memorization draws only on the concepts currently active in mind, leading to faster generalization and efficient lifelong learning. By addressing issues such as catastrophic forgetting and the need for locality, this approach lets neural networks adapt and grow their knowledge over time. The presentation includes a practical implementation and worked examples to illustrate the learning process and the implications of this principle.


Presentation Transcript


  1. The Principle of Presence: A Heuristic for Growing Knowledge Structured Neural Networks Laurent Orseau, INSA/IRISA, Rennes, France

  2. Neural Networks • Efficient at learning single problems • Fully connected • Convergence in W³ • Lifelong learning: • Specific cases can be important • More knowledge, more weights • Catastrophic forgetting -> Full connectivity not suitable -> Need locality

  3. How can people learn so fast? • Focus, attention • Raw table storing? • e.g. a frog, a car, a running woman • With generalization

  4. What do people memorize? (1) • 1 memory: a set of "things" • Things are made of other, simpler things • Thing = concept • Basic concept = perceptual event

  5. What do people memorize? (2) • Remember only what is present in mind at the time of memorization: • What is seen • What is heard • What is thought • Etc.

  6. What do people memorize? (3) • Not what is not in mind! • Too many concepts are known • What is present: • Few things • Probably important • What is absent: • Many things • Probably irrelevant • Good but not always true -> heuristic

  7. Presence in everyday life • Easy to see what is present, harder to tell what is missing • Infants lose attention to balls that have just disappeared • The number zero was invented long after the other digits • Etc.

  8. The principle of presence • Memorization = create a new concept from only the active concepts • Independent of the number of known concepts • Few active concepts -> few variables -> fast generalization

  9. Implications • A concept can be active or inactive. • Activity must reflect importance, be rare ~ event (programming) • New concept = conjunction of active ones • Concepts must be re-usable (lifelong): • Re-use = create a link from this concept • 2 independent concepts = 2 units -> More symbolic than MLP: a neuron can represent too many things
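
A minimal sketch of what memorization under this principle could look like in code; the class and function names are illustrative, not from the original work. The point is that a new concept is built only from the concepts currently active, so the cost is independent of how many concepts are known in total.

```python
# Illustrative sketch of the principle of presence (names are hypothetical).
# A new concept is formed as a conjunction of the concepts active "in mind"
# at the moment of memorization, never of the (many) inactive ones.

class Concept:
    def __init__(self, name, parts=()):
        self.name = name
        self.parts = tuple(parts)   # simpler concepts this one is built from
        self.active = False         # activity should be rare and meaningful

def memorize(concepts, name):
    """Create a new concept from only the currently active concepts.

    The cost depends on how many concepts are active (few), not on how
    many concepts are known in total (many) -- hence fast generalization.
    """
    return Concept(name, parts=[c for c in concepts if c.active])

# Example: seeing a frog while hearing a croak creates one joint memory.
frog, croak, car = Concept("frog"), Concept("croak"), Concept("car")
frog.active, croak.active = True, True          # present in mind; car is not
memory = memorize([frog, croak, car], "frog-croak scene")
print([p.name for p in memory.parts])            # ['frog', 'croak']
```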

  10. Implementation: NN • Nonlinearity • Graph properties: local or global connectivity • Weights: • Smooth on-line generalization • Resistant to noise • But more symbolic: • Inactivity: piecewise continuous activation function • Knowledge not too much distributed • Concepts not too much overlapping
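
One way to read the "piecewise continuous activation function" bullet is a rectified activation that is exactly zero below a threshold (so most neurons stay cleanly inactive) and varies smoothly above it. The sketch below is an assumption about the intended shape, not the paper's exact function:

```python
# Hypothetical piecewise-continuous activation: exactly 0 (inactive) below a
# threshold, then rising linearly and saturating at 1 (fully active).
def presence_activation(weighted_sum, threshold=0.5):
    if weighted_sum <= threshold:
        return 0.0                 # hard inactivity: the concept is simply absent
    return min(1.0, (weighted_sum - threshold) / (1.0 - threshold))

print(presence_activation(0.3))    # 0.0  -> stays silent, no interference
print(presence_activation(0.75))   # 0.5  -> partially active
print(presence_activation(1.2))    # 1.0  -> fully active
```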

  11. First implementation • Inputs: basic events • Output: target concept • No macro-concepts -> 3-layer network • Neuron = conjunction, unless explicit (supervised learning) -> DNF • Output weights simulate priority
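
A rough sketch of this three-layer structure, assuming hidden neurons act as (soft) conjunctions of basic events and the target concept combines them as a disjunction; the formulas below are simplified illustrations, not the paper's exact ones.

```python
# Sketch of the 3-layer "DNF" structure (simplified, illustrative formulas):
# hidden neurons are conjunctions of basic input events, and the target
# concept is a disjunction of those hidden neurons.

def conjunction(inputs, weights):
    # fully active only when every weighted input contributes its full share
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 if s >= 1.0 - 1e-9 else s

def disjunction(activities, weights):
    # the target fires as soon as one conjunction neuron fires strongly enough
    return min(1.0, max(w * a for w, a in zip(weights, activities)))

# Target AB represented by one hidden neuron N1 = (A and B and C), as in example (1):
x = {"A": 1, "B": 1, "C": 1, "D": 0, "E": 0}
n1 = conjunction([x["A"], x["B"], x["C"]], [1/3, 1/3, 1/3])
ab = disjunction([n1], [1.0])
print(n1, ab)    # 1.0 1.0: AB is recognized when A, B and C are all present
```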

  12. Locality in learning • Only one neuron modified at a time: • Nearest = most activated • If the target concept is not activated when it should be: • Generalize the nearest connected neuron, or • Add a neuron for that specific case • If the target is active, but not enough or too much: • Generalize the most activating neuron
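
The procedure above can be sketched as follows: for each example, find the neuron most activated by the present inputs (the "nearest"), then either generalize it or, if it is too far from matching, allocate a fresh neuron for the specific case. This is a hypothetical reading with invented thresholds and step sizes:

```python
# Hypothetical sketch of the locality rule: only one neuron is modified per example.
def activation(neuron, present):
    """neuron: {input_name: weight}; present: set of currently active input names."""
    return sum(w for name, w in neuron.items() if name in present)

def learn_example(neurons, present):
    """Called when the target concept should be active for this example."""
    nearest = max(neurons, key=lambda n: activation(n, present), default=None)
    if nearest is None or activation(nearest, present) < 0.5:
        # too different from anything known: store the specific case as a new neuron
        neurons.append({name: 1.0 / len(present) for name in present})
    else:
        # close enough: generalize only this nearest (most activated) neuron
        for name in list(nearest):
            delta = 0.1 if name in present else -0.1
            nearest[name] = max(0.0, nearest[name] + delta)

neurons = []
learn_example(neurons, {"A", "B", "C"})   # creates a neuron for the specific case ABC
learn_example(neurons, {"A", "B", "D"})   # generalizes it: A and B strengthened, C weakened
print(neurons)
```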

  13. Learning: example (0) • Must learn AB. • Examples: ABC, ABD, ABE, but never AB alone. [Diagram: inputs A, B, C, D, E; the target concept AB already exists.]

  14. Learning: example (1) • ABC: [Diagram: a new neuron N1 is created as a conjunction of A, B and C (weight 1/3 from each, threshold 1 - 1/Ns), connected to AB as a disjunction with weight 1; N1 is active only when A, B and C are all active.]

  15. Learning: example (2) • ABD: [Diagram: a second neuron N2 is created for A, B and D (weights 1/3); on N1, the weights from A and B rise above 1/3 while the weight from C falls below 1/3.]

  16. Learning: example (3) • ABE: N1 slightly active for AB. [Diagram: the weights from A and B have grown well above 1/3 while those from the absent inputs have dropped well below 1/3.]

  17. Learning: example (4) • Final: N1 has generalized and is active for AB. [Diagram: N1's weights from A and B have reached 1/2 and its weight from C has dropped to 0; N2 is now a useless neuron and is deleted by a criterion.]
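
To make the worked example concrete, here is a toy re-run with an invented, much-simplified update rule; the paper's actual weight updates and deletion criterion differ, and the specific-case neuron N2 is omitted. It only reproduces the qualitative outcome that N1's weights move toward 1/2, 1/2, 0, so that N1 ends up firing for AB alone.

```python
# Toy re-run of the example: learn the target AB from ABC, ABD, ABE,
# never seeing AB alone (simplified, hypothetical update rule).

def generalize(weights, present, step=0.5):
    # move weight mass from absent inputs to present ones, keeping the total at 1
    for name in list(weights):
        if name not in present:
            moved = weights[name] * step
            weights[name] -= moved
            receivers = [n for n in weights if n in present]
            for p in receivers:
                weights[p] += moved / len(receivers)

n1 = {"A": 1/3, "B": 1/3, "C": 1/3}      # N1 created on the first example, ABC
for present in [{"A", "B", "D"}, {"A", "B", "E"}]:
    generalize(n1, present)

n1 = {k: round(v, 2) for k, v in n1.items() if v > 0.1}   # prune near-useless weights
print(n1)                              # {'A': 0.46, 'B': 0.46}: close to the final 1/2, 1/2, 0
print(sum(n1.get(k, 0) for k in ("A", "B")))              # ~0.92: N1 now fires for AB alone
```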

  18. NETtalk task • TDNN: 120 neurons, 25,200 connections, 90% • Presence: 753 neurons, 6,024 connections, 74% • Then learns by heart • If the inputs' activity is reversed -> catastrophic! • Many cognitive tasks heavily biased toward the principle of presence?

  19. Advantages w.r.t. NNs • As many inputs as wanted; only the active ones are used • Lifelong learning: • Large-scale networks • Learns specific cases and generalizes, both quickly • Can lower weights without wrong predictions -> imitation

  20. But… • With few data and a limited number of neurons: not as good as backprop • Creates many neurons (but they can be deleted) • No negative weights

  21. Work in progress • Negative case, must stay rare • Inhibitory links • Re-use of concepts • Macro-concepts: each concept can become an input
