
Feed-Forward Neural Networks




Presentation Transcript


  1. Feed-Forward Neural Networks Presenter: 虞台文

  2. Content • Introduction • Single-Layer Perceptron Networks • Learning Rules for Single-Layer Perceptron Networks • Perceptron Learning Rule • Adaline Learning Rule • δ-Learning Rule • Multilayer Perceptron • Back-Propagation Learning Algorithm

  3. Feed-Forward Neural Networks Introduction

  4. Historical Background • 1943 McCulloch and Pitts proposed the first computational model of the neuron. • 1949 Hebb proposed the first learning rule. • 1958 Rosenblatt's work on perceptrons. • 1969 Minsky and Papert exposed the limitations of the theory. • 1970s Decade of dormancy for neural networks. • 1980-90s Neural networks return (self-organization, back-propagation algorithms, etc.)

  5. Nervous Systems • The human brain contains ~10^11 neurons. • Each neuron is connected to ~10^4 others. • Some scientists have compared the brain to a "complex, nonlinear, parallel computer". • The largest modern neural networks achieve complexity comparable to the nervous system of a fly.

  6. Neurons • The main purpose of neurons is to receive, analyze and transmit information in the form of signals (electric pulses). • When a neuron sends information, we say that the neuron "fires".

  7. Neurons Acting through specialized projections known as dendrites and axons, neurons carry information throughout the neural network. The slide's animation demonstrates the firing of a synapse from the pre-synaptic terminal of one neuron to the soma (cell body) of another neuron.

  8. A Model of Artificial Neuron [Figure: inputs x1, ..., xm with weights wi1, ..., wim feed a summation unit; the net input passes through f(.) and the activation a(.) to produce the output yi. The bias is folded in as wim = θi with xm = -1.]

  9. A Model of Artificial Neuron [Same figure as the previous slide.]
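The neuron diagram on slides 8-9 does not survive as text, so here is a minimal sketch of what it shows: the inputs are weighted and summed (with the bias folded in as wim = θi, xm = -1) and the result goes through an activation a(.). The tanh activation is only an illustrative choice, not taken from the slides.

```python
import numpy as np

def neuron_output(x, w, activation=np.tanh):
    """One artificial neuron: y_i = a(sum_j w_ij * x_j).

    The last input component is fixed to -1 and its weight plays
    the role of the threshold (bias), as on the slide.
    """
    net = np.dot(w, x)          # weighted sum (net input)
    return activation(net)      # a(.): activation function

# Illustrative values only (not from the slides).
x = np.array([0.5, -1.2, -1.0])   # x_m = -1 is the bias input
w = np.array([0.8,  0.3,  0.1])   # w_m corresponds to the threshold
print(neuron_output(x, w))
```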

  10. Feed-Forward Neural Networks [Figure: a layered network with inputs x1, ..., xm at the bottom and outputs y1, ..., yn at the top.] • Graph representation: • nodes: neurons • arrows: signal flow directions • A neural network that does not contain cycles (feedback loops) is called a feed-forward network (or perceptron).

  11. Layered Structure [Figure: the input layer x1, ..., xm at the bottom, one or more hidden layers in the middle, and the output layer y1, ..., yn at the top.]
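To make the layered, cycle-free structure concrete, the sketch below pushes one input vector through a hidden layer and an output layer; the layer sizes and random weights are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, W, b, activation=np.tanh):
    """One feed-forward layer: activation(W x + b)."""
    return activation(W @ x + b)

# Made-up sizes: m = 3 inputs, 4 hidden units, n = 2 outputs.
W_hidden, b_hidden = rng.normal(size=(4, 3)), np.zeros(4)
W_out,    b_out    = rng.normal(size=(2, 4)), np.zeros(2)

x = np.array([0.2, -0.7, 1.0])          # input pattern
h = layer(x, W_hidden, b_hidden)        # hidden layer
y = layer(h, W_out, b_out)              # output layer
print(y)                                # signals flow strictly forward
```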

  12. Knowledge and Memory [Figure: the same layered network.] • The output behavior of a network is determined by the weights. • The weights act as the memory of an NN. • Knowledge is distributed across the network. • A large number of nodes • increases the storage "capacity"; • ensures that the knowledge is robust; • provides fault tolerance. • New information is stored by changing weights.

  13. Pattern Classification [Figure: the input pattern x enters at the bottom and the output pattern y leaves at the top.] • Function: x → y • The NN's output is used to distinguish between and recognize different input patterns. • Different output patterns correspond to particular classes of input patterns. • Networks with hidden layers can be used to solve more complex problems than just linear pattern classification.
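The slide does not say how the output pattern y is mapped to a class label; a common convention (assumed here, not stated on the slide) is winner-take-all over the output units.

```python
import numpy as np

def classify(y, class_names):
    """Interpret the output pattern: the most active output unit wins."""
    return class_names[int(np.argmax(y))]

# Hypothetical output pattern for a 3-class problem.
y = np.array([0.1, 0.7, 0.2])
print(classify(y, ["class A", "class B", "class C"]))   # -> class B
```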

  14. Training [Figure: a training pattern (xi1, ..., xim) is fed into the network, and the outputs (yi1, ..., yin) are compared with the desired outputs (di1, ..., din).] Training Set: pairs of input patterns and their desired output patterns. Goal: make the network's outputs match the desired outputs for every training pair.
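The goal formula on slide 14 was an image and is lost; one standard way to state it (an assumption here, not taken from the slide) is that training should drive a sum-of-squared-errors between the network outputs and the desired outputs toward zero.

```python
import numpy as np

def total_error(outputs, targets):
    """Sum of squared errors over the whole training set:
    E = 1/2 * sum_k ||d^(k) - y^(k)||^2
    """
    outputs, targets = np.asarray(outputs), np.asarray(targets)
    return 0.5 * np.sum((targets - outputs) ** 2)

# Hypothetical network outputs vs. desired outputs for 3 patterns.
y = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
d = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(total_error(y, d))   # training aims to make this small
```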

  15. Generalization • A properly trained neural network may produce reasonable answers for input patterns not seen during training (generalization). • Generalization is particularly useful for the analysis of noisy data (e.g. time series).

  16. Generalization [Figure: the same network fitting the data with noise and without noise.] • A properly trained neural network may produce reasonable answers for input patterns not seen during training (generalization). • Generalization is particularly useful for the analysis of noisy data (e.g. time series).

  17. Applications • Pattern classification • Object recognition • Function approximation • Data compression • Time series analysis and forecast • . . .

  18. Feed-Forward Neural Networks Single-Layer Perceptron Networks

  19. The Single-Layered Perceptron [Figure: inputs x1, ..., xm-1 plus the bias input xm = -1 are connected through weights wij (i = 1..n, j = 1..m) directly to the output units y1, ..., yn.]

  20. Training a Single-Layered Perceptron [Figure: the same single-layer network; each output yi is compared with its desired value di.] Training Set: pairs of input patterns and desired outputs. Goal: adjust the weights wij so that every training pattern produces its desired output.

  21. Learning Rules [Figure: the same single-layer network with desired outputs d1, ..., dn.] • Linear Threshold Units (LTUs): Perceptron Learning Rule • Linearly Graded Units (LGUs): Widrow-Hoff Learning Rule Training Set and Goal: as on the previous slide.

  22. Feed-Forward Neural Networks Learning Rules for Single-Layered Perceptron Networks • Perceptron Learning Rule • Adaline Learning Rule • δ-Learning Rule

  23. Perceptron: Linear Threshold Unit [Figure: inputs x1, ..., xm with weights wi1, ..., wim feed a summation unit followed by the sgn function, so the output is +1 or -1; the bias enters as wim = θi with xm = -1.]

  24. Perceptron: Linear Threshold Unit [Same figure.] Goal: find weights so that sgn(wTx) matches the desired class label (+1 or -1) for every training pattern.
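A minimal sketch of the unit on slides 23-24, assuming the usual convention that sgn returns +1 on a zero net input; the numbers are illustrative only.

```python
import numpy as np

def ltu(x, w):
    """Linear threshold unit: y = sgn(w^T x).

    x is the augmented input (last component -1), so the last weight
    plays the role of the threshold. Ties (net = 0) map to +1 here.
    """
    return 1 if np.dot(w, x) >= 0 else -1

w = np.array([1.0, 1.0, 0.5])    # last weight = threshold
x = np.array([0.3, 0.4, -1.0])   # augmented input, x_m = -1
print(ltu(x, w))                 # -> 1, since 0.3 + 0.4 - 0.5 > 0
```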

  25. Example [Figure: Class 1 (+1) and Class 2 (-1) samples in the (x1, x2) plane, separated by the line g(x) = 2x1 + x2 + 2 = 0; a small network with inputs x1, x2 and bias input x3 = -1 realizes this boundary.] Goal: output +1 for Class 1 and -1 for Class 2.
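The figure's training samples did not survive the conversion, so the quick check below simply evaluates the discriminant, with the coefficients as printed on the slide, at a few hypothetical points; the sign of g(x) decides the class.

```python
def g(x1, x2):
    """Discriminant as printed on the slide: g(x) = 2*x1 + x2 + 2."""
    return 2 * x1 + x2 + 2

# Hypothetical test points (not the figure's training samples).
for x1, x2 in [(0.0, 0.0), (-2.0, 0.0), (-1.0, -3.0)]:
    label = "+1 (Class 1)" if g(x1, x2) > 0 else "-1 (Class 2)"
    print((x1, x2), "g =", g(x1, x2), "->", label)
```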

  26. Augmented input vector [Figure: the same small network, now with the augmented input (x1, x2, x3 = -1).] Class 1 is labeled +1 and Class 2 is labeled -1. Goal: find w such that wTx > 0 for the Class 1 samples and wTx < 0 for the Class 2 samples.

  27. Augmented input vector [Figure: the Class 1 and Class 2 samples plotted in the augmented input space (x1, x2, x3 = -1).]

  28. Augmented input vector [Same figure.] In the augmented input space, the separating boundary is a plane that passes through the origin.
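The algebra behind the augmentation (notation mine, consistent with the slides' convention wim = θi, xm = -1) is a one-line identity: folding the threshold into the weight vector turns the affine boundary into a linear one through the origin.

```latex
% With w = (w_1, \dots, w_{m-1}, \theta)^T and x = (x_1, \dots, x_{m-1}, -1)^T:
\[
  w_1 x_1 + \cdots + w_{m-1} x_{m-1} - \theta \;=\; \mathbf{w}^{\mathsf T}\mathbf{x},
\]
% so the decision boundary w^T x = 0 passes through the origin of the
% augmented input space.
```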

  29. Linearly Separable vs. Linearly Non-Separable [Figure: the truth tables of AND, OR and XOR plotted as points in the (x1, x2) plane.] AND: Linearly Separable. OR: Linearly Separable. XOR: Linearly Non-Separable.
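To make the separability claim concrete, the sketch below shows one choice of threshold-unit weights (my own, not from the slide) that reproduces AND and OR on binary inputs; no such weight vector exists for XOR, which is why a single-layer perceptron cannot realize it.

```python
import numpy as np

def ltu(x1, x2, w):
    """Threshold unit on the augmented input (x1, x2, -1)."""
    return 1 if np.dot(w, [x1, x2, -1.0]) > 0 else 0

AND_W = np.array([1.0, 1.0, 1.5])   # fires only when x1 + x2 > 1.5
OR_W  = np.array([1.0, 1.0, 0.5])   # fires when x1 + x2 > 0.5

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), "AND:", ltu(x1, x2, AND_W), "OR:", ltu(x1, x2, OR_W))
# XOR: no single (w1, w2, theta) separates {(0,1),(1,0)} from {(0,0),(1,1)}.
```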

  30. Goal • Given training sets T1 ⊂ C1 and T2 ⊂ C2 with elements in the form x = (x1, x2, ..., xm-1, xm)T, where x1, x2, ..., xm-1 ∈ R and xm = -1. • Assume T1 and T2 are linearly separable. • Find w = (w1, w2, ..., wm)T such that wTx > 0 for every x ∈ T1 and wTx < 0 for every x ∈ T2.

  31. Goal wTx = 0 is a hyperplane that passes through the origin of the augmented input space. [The rest of the slide repeats the goal from the previous slide.]

  32. Observation [Figure: a sample x (classes marked d = +1 and d = -1) together with several candidate weight vectors w1, ..., w6 in the (x1, x2) plane.] Which w's correctly classify x? What trick can be used?

  33. Observation [Figure: a candidate w and its decision boundary w1x1 + w2x2 = 0, drawn together with the sample x.] Is this w ok?

  34. Observation [Figure: another candidate w and its boundary w1x1 + w2x2 = 0.] Is this w ok?

  35. Observation [Figure: a w whose boundary w1x1 + w2x2 = 0 puts x on the wrong side.] Is this w ok? How to adjust w? Δw = ?

  36. Observation [Figure: the same setting, with the regions wTx > 0 and wTx < 0 marked on either side of the boundary.] Is this w ok? How to adjust w? Is Δw = ηx reasonable?

  37. Observation [Figure: the same setting as the previous slide.] Is this w ok? How to adjust w? Is Δw = ηx reasonable?

  38. Observation [Figure: the same construction for a sample of the other class (d = -1).] Is this w ok? Δw = ? +ηx or -ηx?

  39. Perceptron Learning Rule Define the error as (d - y). It is nonzero only upon misclassification: +2 when d = +1 (the unit outputs -1), -2 when d = -1 (the unit outputs +1), and 0 when there is no error.

  40. Perceptron Learning Rule [Figure: the same error definition as on the previous slide.]

  41. Perceptron Learning Rule Δw = η (d - y) x, where η is the learning rate, (d - y) is the error, and x is the input.

  42. Summary: Perceptron Learning Rule Based on the general weight learning rule: Δw = η (d - y) x. The weights are unchanged when x is classified correctly and are adjusted by η (d - y) x when it is classified incorrectly.
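In code, the rule summarized above is a one-liner; the sketch below (variable names mine) applies a single update step.

```python
import numpy as np

def perceptron_update(w, x, d, eta=0.1):
    """One step of the perceptron learning rule: w <- w + eta*(d - y)*x.

    w : current weight vector (augmented; last component is the threshold)
    x : augmented input vector (last component = -1)
    d : desired output, +1 or -1
    The weights are unchanged when the sample is classified correctly.
    """
    y = 1 if np.dot(w, x) >= 0 else -1
    return w + eta * (d - y) * x

# Illustrative step: the zero weight vector outputs +1, but d = -1,
# so the weights move by eta*(-2)*x.
w = np.array([0.0, 0.0, 0.0])
x = np.array([1.0, 2.0, -1.0])
print(perceptron_update(w, x, d=-1))   # -> [-0.2 -0.4  0.2]
```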

  43. Summary: Perceptron Learning Rule [Figure: block diagram of the perceptron; the output y is compared with the desired output d and the error d - y drives the weight update.] Does the procedure converge?

  44. Perceptron Convergence Theorem If the given training set is linearly separable, the learning process will converge in a finite number of steps. • Exercise: consult papers or textbooks for a proof of the theorem.
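The proof is left as an exercise on the slide, but the theorem is easy to check empirically: the self-contained sketch below runs the rule on a small linearly separable set (the AND truth table with ±1 targets, my choice) and stops after the first error-free pass.

```python
import numpy as np

def train_perceptron(samples, eta=0.2, max_epochs=100):
    """Run the perceptron learning rule until an error-free epoch."""
    w = np.zeros(len(samples[0][0]))
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, d in samples:
            y = 1 if np.dot(w, x) >= 0 else -1
            if y != d:
                w = w + eta * (d - y) * x
                errors += 1
        if errors == 0:
            return w, epoch          # converged
    return w, None                   # no error-free epoch reached

# AND with augmented inputs (x3 = -1) and +/-1 targets: linearly separable.
data = [(np.array([0., 0., -1.]), -1),
        (np.array([0., 1., -1.]), -1),
        (np.array([1., 0., -1.]), -1),
        (np.array([1., 1., -1.]), +1)]
w, epoch = train_perceptron(data)
print("weights:", w, "error-free at epoch:", epoch)
```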

  45. The Learning Scenario [Figure: a linearly separable training set: x(1), x(2) in the positive class (+) and x(3), x(4) in the negative class (-).] Linearly separable.

  46. The Learning Scenario [Figure: the initial weight vector w0 and its decision boundary.]

  47. The Learning Scenario [Figure: after a misclassified sample, w0 is updated to w1.]

  48. The Learning Scenario [Figure: w1 is updated to w2.]

  49. The Learning Scenario [Figure: w2 is updated to w3.]

  50. The Learning Scenario [Figure: w4 = w3; no sample triggers a further update, so learning has converged.]
