
Single Layer Perceptron = McCulloch – Pitts Type



Presentation Transcript


  1. Ch 2. Concept Map — Single Layer Perceptron (McCulloch–Pitts type). Learning starts in Ch 2: architecture and learning. • Perceptron: nonlinear learning, binary output. Objective: wᵀx > 0 for x ∈ ω₁, wᵀx < 0 for x ∈ ω₂. No correction when the sign is already correct; error patterns push and pull the hyperplanes; always finds a perfect hyperplane for linearly separable cases. • Adaline: linear learning, real-valued output (binary output after thresholding). Objective: wᵀx = any desired real value (usable for both classification and regression), or wᵀx = ±1 for classification. Correction even when the sign is correct; all patterns x move the hyperplanes by an amount proportional to x; finds an optimum hyperplane even for linearly nonseparable cases.

  2. Chapter 2. Single Layer Perceptron — 1. Perceptron (Rosenblatt's perceptron learning). Ref.: Widrow, "Perceptron, Madaline and Backpropagation," Proc. IEEE, 1990. [Architecture diagram: inputs x₁(n) … x_D(n) plus a bias input x_{D+1}(n) = 1 feed adjustable weights into an adder producing s(n); a hard limiter gives the output y(n); the error signal e(n) = d(n) − y(n), formed from the desired response d(n) over the adjusted training set, drives the weight adjustment.]

  3. Perceptron Learning – Error-Correction Learning. 1) Initialize w to zero or to some random vector. 2) Input x(n) and adjust the weights as follows (α = learning rate that scales x; Rosenblatt set α = 1; the choice does not affect stability, and it affects convergence time only if w(0) ≠ 0): A. Case 1: d(n) − y(n) = 1, i.e. d(n) = 1 and y(n) = 0 → w(n+1) = w(n) + α x(n). B. Case 2: d(n) − y(n) = −1, i.e. d(n) = 0 and y(n) = 1 → w(n+1) = w(n) − α x(n). C. Case 3: d(n) = y(n) → do nothing. 3) Repeat 1)–2) until there is no classification error, for a linearly separable case.
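A minimal Python sketch of this error-correction procedure (NumPy assumed; the function name perceptron_train, the 0/1 target coding, and the epoch limit are illustrative choices, not from the slides):

```python
import numpy as np

def perceptron_train(X, d, alpha=1.0, max_epochs=100):
    """Rosenblatt's error-correction learning for a single-layer perceptron.

    X : (N, D) array of input patterns; a bias component of 1 is appended here.
    d : (N,) array of desired outputs coded as 0/1.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])    # bias input x_{D+1}(n) = 1
    w = np.zeros(Xb.shape[1])                         # initialize to zero (or a random vector)
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(Xb, d):
            y = 1 if w @ x > 0 else 0                 # hard-limit (threshold) output
            if target != y:                           # Cases 1 and 2: wrong output
                w = w + alpha * (target - y) * x      # +alpha*x if d=1,y=0; -alpha*x if d=0,y=1
                errors += 1
            # Case 3: d(n) == y(n) -> do nothing
        if errors == 0:                               # no classification error -> stop
            break
    return w
```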

  4. (2) Patterns of classification error push and pull the decision hyperplanes. [Figure: a pattern x(1) in error and the decision hyperplanes w(1)ᵀx = 0 and w(2)ᵀx = 0; case (2) is the same as case (1) except for the direction of the correction.]

  5. (3) Example – OR Learning. [Figure: successive decision boundaries w(0)ᵀx = 0, w(1)ᵀx = 0, and w(2)ᵀx = 0. An error at x(1) pulls the hyperplane down to w(1); an error at x(2) pushes it up to w(2), an overcorrection. When there is no error for any pattern, learning terminates.]
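Reusing the perceptron_train sketch above, a tiny run on the OR patterns illustrates the same behaviour: corrections at mistaken patterns move the boundary until no pattern is in error. The 0/1 coding and presentation order here are assumptions, so the intermediate hyperplanes need not match the slide's w(0), w(1), w(2) exactly:

```python
import numpy as np

# The four OR patterns and their desired outputs (0/1 coding, an assumption).
X_or = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
d_or = np.array([0, 1, 1, 1])

w = perceptron_train(X_or, d_or, alpha=1.0)
Xb = np.hstack([X_or, np.ones((4, 1))])
print(w, (Xb @ w > 0).astype(int))   # the outputs reproduce d_or once learning terminates
```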

  6. Example of the AND Function. [The slide's figure/table is not recoverable from the transcript.]

  7. 2. Adaptive Linear Element – ADALINE (Linear Neuron). If d(n) is real-valued: regression. If d(n) is an integer (0/1 or −1/+1): classification. [Diagram: Perceptron vs. Adaline. Both form s(n) = wᵀx(n) in an adder and threshold it to obtain y(n); the Perceptron error is taken after the threshold, e(n) = d(n) − y(n), while the Adaline error is taken at the linear output, e(n) = d(n) − s(n).]

  8. (1) α-LMS Learning Rule – normalized: w(n+1) = w(n) + α e(n) x(n) / ‖x(n)‖², with e(n) = d(n) − wᵀ(n)x(n); α controls stability and speed of convergence. In practice 0.1 < α < 1. • In general, the data set is not completely specified in advance. • LMS is well suited to a training data stream drawn from a stationary distribution, at least for x(n). cf. μ-LMS: instantaneous gradient descent, w(n+1) = w(n) + μ e(n) x(n).
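One-step sketches of the two updates, assuming the standard Widrow–Hoff forms stated above (α-LMS normalizes by ‖x‖², μ-LMS does not); the function names and default step sizes are illustrative:

```python
import numpy as np

def alpha_lms_step(w, x, d, alpha=0.5):
    """alpha-LMS (normalized): w <- w + alpha * e(n) * x(n) / ||x(n)||^2, e(n) = d(n) - w.x(n)."""
    e = d - w @ x
    return w + alpha * e * x / (x @ x)

def mu_lms_step(w, x, d, mu=0.01):
    """mu-LMS (instantaneous gradient descent on e^2): w <- w + mu * e(n) * x(n)."""
    e = d - w @ x
    return w + mu * e * x
```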

  9. (2) Minimum Disturbance Principle – adapt to reduce the output error for the current training pattern, with minimal disturbance to the responses already learned. [Figure: the weight change Δw = w(n+1) − w(n) is parallel to the input x(n), so it is the smallest change that yields the required reduction of the current error.]
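A quick numerical check of this principle, reusing the alpha_lms_step sketch above (the specific numbers are arbitrary): because Δw is parallel to x(n), one α-LMS step shrinks the error on the current pattern by exactly the factor (1 − α), while patterns orthogonal to x(n) are left undisturbed.

```python
import numpy as np

# Arbitrary numbers, only to illustrate the property.
w = np.array([0.2, -0.4, 0.1])
x = np.array([1.0, 2.0, -1.0])
d, alpha = 1.0, 0.5

w_new = alpha_lms_step(w, x, d, alpha)     # from the alpha-LMS sketch above
print(d - w @ x, d - w_new @ x)            # second error = (1 - alpha) * first error
```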

  10. (3) Learning Rate Scheduling (Annealing). α, μ = learning rate / step size. Typical schedules: μ(n+1) = μ(n) − β, or μ(n) = c/n, or μ(n) = μ₀ / (1 + n/τ). The Perceptron learning rule can also be derived from the LMS rule with a suitable choice of the criterion J.
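A hedged sketch of the three schedules listed above (the symbols μ₀, β, c, τ follow the slide; the function itself, its mode names, and its default values are illustrative assumptions):

```python
def mu_schedule(n, kind="search-then-converge", mu0=0.1, beta=1e-4, c=1.0, tau=100.0):
    """Illustrative step-size schedules mu(n) for annealing the learning rate."""
    if kind == "linear":                      # mu(n+1) = mu(n) - beta
        return max(mu0 - beta * n, 0.0)
    if kind == "inverse":                     # mu(n) = c / n
        return c / max(n, 1)
    return mu0 / (1.0 + n / tau)              # mu(n) = mu0 / (1 + n / tau)
```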

  11. Perceptron Learning vs. LMS Learning
  • Perceptron: works with binary (bipolar) outputs. LMS: works with both binary and analog outputs.
  • Perceptron: always converges for linearly separable cases (convergence theorem). LMS: may not converge even for linearly separable cases.
  • Perceptron: may be unstable for linearly nonseparable cases – use the pocket algorithm. LMS: also works well for linearly nonseparable cases – finds the minimum-error solution.
  • Perceptron: no correction for a correct classification. LMS: always corrects.
  • Perceptron objective: wᵀx > 0 or < 0. LMS objective: wᵀx = 1 or −1.
  • Perceptron: nonlinear rule. LMS: linear rule.

  12. Graphical Representation of Learning. For classification with the Adaline, one could simply stop learning when the signs are all correct – a way to speed up Adaline learning. [Figure: decision regions for classes ω₁ (target +1) and ω₂ (target −1), shown for the Perceptron and for the Adaline after learning.]
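A sketch of that speed-up, assuming bipolar targets in {−1, +1} (the function name, step size, and epoch limit are illustrative; the update is the μ-LMS rule from the earlier slide):

```python
import numpy as np

def adaline_classify_train(X, d, mu=0.05, max_epochs=500):
    """mu-LMS training with the sign-based early stop mentioned on the slide.
    d holds bipolar targets in {-1, +1}; training stops once sign(w.x) matches every target."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(max_epochs):
        for x, target in zip(Xb, d):
            w = w + mu * (target - w @ x) * x         # linear correction on every pattern
        if np.all(np.sign(Xb @ w) == d):              # all signs correct -> stop early
            break
    return w
```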

  13. (4) Application of the ADALINE to Noise Cancellation, Equalization, and Echo Cancellation.
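A minimal sketch of one of these applications, adaptive noise cancellation with μ-LMS: an adaptive linear filter predicts the noise in the primary input from a correlated reference input, and the error is the cleaned output. The signal layout, filter length, and step size below are assumptions for illustration, not from the slides.

```python
import numpy as np

def lms_noise_canceller(primary, reference, taps=8, mu=0.01):
    """Adaptive noise cancelling with mu-LMS: predict the noise in `primary`
    from the correlated `reference`; the error e(n) is the cleaned output."""
    w = np.zeros(taps)
    cleaned = np.zeros(len(primary))
    for n in range(taps, len(primary)):
        x = reference[n - taps:n][::-1]     # tapped delay line of the reference input
        y = w @ x                           # current estimate of the noise component
        e = primary[n] - y                  # cleaned sample = primary minus noise estimate
        w = w + mu * e * x                  # mu-LMS weight update
        cleaned[n] = e
    return cleaned
```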

  14. Student Questions – 2005
  • Adaline learning takes longer than Perceptron learning in general; does it still have an edge over the Perceptron in some respect?
  • What is the difference between α-LMS and μ-LMS? The former is self-normalizing, the latter uses a constant step size; the latter always converges in the mean to the minimum-MSE solution.
  • Is there any way to learn without desired outputs?
  • What does it mean that Perceptron learning can be derived from LMS using a criterion J = …?
  • Not clear on regression.
  • How can we guarantee convergence of a whole system?
  • Which of the Perceptron and the Adaline performs better and is more efficient?
