
CS344: Introduction to Artificial Intelligence

Pushpak Bhattacharyya, CSE Dept., IIT Bombay. Lecture 17: Feedforward network: non-linear separability. 9th Feb, 2012.



Presentation Transcript


  1. CS344: Introduction to Artificial Intelligence. Pushpak Bhattacharyya, CSE Dept., IIT Bombay. Lecture 17: Feedforward network: non-linear separability. 9th Feb, 2012

  2. # of Threshold Functions

  n    #Boolean functions (2^(2^n))    #Threshold functions (~2^(n^2))
  1    4                               4
  2    16                              14
  3    256                             128
  4    64K                             1008

  • The functions computable by perceptrons are the threshold functions.
  • #TF becomes negligibly small compared to #BF for larger values of n.
  • For n = 2, all functions except XOR and XNOR are computable.
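The threshold functions in the table are exactly what a single perceptron unit computes: output 1 iff the weighted input sum reaches the threshold. A minimal sketch, where the AND/OR weight choices are illustrative examples, not values from the slides:

```python
def threshold_unit(weights, theta):
    """Return a perceptron: outputs 1 iff sum_i w_i * x_i >= theta."""
    return lambda xs: 1 if sum(w * x for w, x in zip(weights, xs)) >= theta else 0

# Two of the 14 threshold functions for n = 2 (weights/thresholds are
# illustrative choices; many other settings realize the same functions).
AND = threshold_unit([1, 1], 2)   # fires only when both inputs are 1
OR  = threshold_unit([1, 1], 1)   # fires when at least one input is 1

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", AND((x1, x2)), "OR:", OR((x1, x2)))
```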

  3. The number of regions produced by n hyperplanes in d dimensions determines the computing power of the perceptron.

  4. Feedforward Network

  5. Limitations of the perceptron
  • Non-linear separability is all-pervading.
  • A single perceptron does not have enough computing power.
  • E.g., XOR cannot be computed by a perceptron.
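The XOR limitation can be checked by brute force: over a grid of candidate values (the grid range here is an assumption for illustration; the impossibility itself holds for all real weights), no setting of (w1, w2, θ) makes a single threshold unit reproduce XOR:

```python
import itertools

def computes_xor(w1, w2, theta):
    """Check whether the unit 1[w1*x1 + w2*x2 >= theta] equals XOR everywhere."""
    for x1, x2 in itertools.product((0, 1), repeat=2):
        out = 1 if w1 * x1 + w2 * x2 >= theta else 0
        if out != (x1 ^ x2):
            return False
    return True

# Illustrative finite search: weights and threshold from -4.0 to 4.0 in 0.5 steps.
grid = [v / 2 for v in range(-8, 9)]
found = any(computes_xor(w1, w2, t)
            for w1 in grid for w2 in grid for t in grid)
print("single threshold unit computing XOR found:", found)   # False
```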

  6. Solutions
  • Tolerate error (e.g., the pocket algorithm used by connectionist expert systems): try to get the best possible hyperplane using only perceptrons.
  • Use higher-dimension surfaces, e.g., degree-2 surfaces such as a parabola.
  • Use a layered network.

  7. Pocket Algorithm
  • Algorithm evolved in 1985; essentially uses the perceptron training algorithm (PTA).
  • Basic idea: always preserve the best weights obtained so far in the "pocket".
  • Change the pocketed weights only when better ones are found (i.e., when the changed weights give lower error).
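The pocket idea on top of PTA can be sketched as follows; the XOR dataset, learning-rule details, and epoch count are illustrative choices, not from the slide:

```python
import random

def errors(w, theta, data):
    """Count points misclassified by the threshold unit (w, theta)."""
    return sum(1 for xs, t in data
               if (1 if sum(wi * xi for wi, xi in zip(w, xs)) >= theta else 0) != t)

def pocket_train(data, epochs=100, seed=0):
    rng = random.Random(seed)
    w, theta = [0.0, 0.0], 0.0
    pocket = (list(w), theta, errors(w, theta, data))   # best weights so far
    for _ in range(epochs):
        xs, t = rng.choice(data)                        # visit a random example
        out = 1 if sum(wi * xi for wi, xi in zip(w, xs)) >= theta else 0
        if out != t:                                    # standard PTA update
            w = [wi + (t - out) * xi for wi, xi in zip(w, xs)]
            theta -= (t - out)
        e = errors(w, theta, data)
        if e < pocket[2]:                               # better: put in pocket
            pocket = (list(w), theta, e)
    return pocket

# XOR is not linearly separable, so plain PTA never converges on it,
# but the pocket still retains the best hyperplane encountered.
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
w_best, theta_best, err = pocket_train(xor_data)
print("points misclassified by pocketed hyperplane:", err)
```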

  8. Linear Separability (Single Hyperplane)
  [Figure: the XOR points (0,1) +ve, (1,0) +ve, (0,0) -ve, (1,1) -ve with two candidate hyperplanes. Line 1 is better than Line 2, since it has less error.]

  9. Parabolic Surface Separation
  [Figure: a parabolic (degree-2) surface separating the XOR points (0,1) +ve and (1,0) +ve from (0,0) -ve and (1,1) -ve.]

  10. Linear Surface Separation (Multiple Hyperplanes)
  [Figure: two hyperplanes together separating the XOR points (0,1) +ve and (1,0) +ve from (0,0) -ve and (1,1) -ve.]

  11. Multilayer Perceptron for XOR (1/2)
  Required mapping from inputs x1, x2 to output y:
  • x1 = 0, x2 = 0 → y = 0
  • x1 = 1, x2 = 1 → y = 0
  • x1 = 0, x2 = 1 → y = 1
  • x1 = 1, x2 = 0 → y = 1

  12. Multilayer Perceptron for XOR (2/2)
  y = y1 AND y2, where y1 = x1 OR x2 (unit P1) and y2 = x1 NAND x2 (unit P2):
  • At (0,0): y1 = 0, y2 = 1 → y = 0
  • At (0,1): y1 = 1, y2 = 1 → y = 1
  • At (1,0): y1 = 1, y2 = 1 → y = 1
  • At (1,1): y1 = 1, y2 = 0 → y = 0
  [Figure: hidden units h1, h2 computing y1, y2 from x1, x2, feeding the output unit y.]
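The construction on this slide (y1 = x1 OR x2 from P1, y2 = x1 NAND x2 from P2, y = y1 AND y2) can be transcribed as three threshold units. The specific weights and thresholds below are standard choices for OR/NAND/AND, not values given on the slide:

```python
def unit(w1, w2, theta):
    """Threshold unit: outputs 1 iff w1*x1 + w2*x2 >= theta."""
    return lambda x1, x2: 1 if w1 * x1 + w2 * x2 >= theta else 0

OR   = unit(1, 1, 1)      # fires when at least one input is 1
NAND = unit(-1, -1, -1)   # fires unless both inputs are 1
AND  = unit(1, 1, 2)      # fires only when both inputs are 1

def xor(x1, x2):
    y1 = OR(x1, x2)       # hidden unit P1
    y2 = NAND(x1, x2)     # hidden unit P2
    return AND(y1, y2)    # output unit

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor(x1, x2))
```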

  13. XOR using 2 layers: another way
  • A non-linearly-separable function expressed as a linearly separable function of individual linearly separable functions.

  14. Example - XOR
  XOR(x1, x2) = (¬x1 · x2) OR (x1 · ¬x2)
  • Calculation of XOR (output unit, the OR of the two minterms): θ = 0.5, w1 = 1, w2 = 1
  • Calculation of ¬x1 · x2 (hidden unit): θ = 1, w1 = -1, w2 = 1.5

  15. Example - XOR
  [Figure: the complete two-layer XOR network; output unit with θ = 0.5 and weights 1, 1 over the two hidden minterm units; the hidden units take incoming weights (-1, 1.5) and (1.5, -1) from x1, x2 respectively.]
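Slides 14-15 give concrete weights for the minterm construction: each hidden unit uses θ = 1 with weights -1 and 1.5, and the output OR unit uses θ = 0.5 with weights 1 and 1. A direct transcription (the symmetric weights for the second hidden unit are read off the figure, so treat them as an assumption):

```python
def unit(w1, w2, theta):
    """Threshold unit: outputs 1 iff w1*x1 + w2*x2 >= theta."""
    return lambda x1, x2: 1 if w1 * x1 + w2 * x2 >= theta else 0

h1  = unit(-1, 1.5, 1)    # computes (not x1) and x2, theta = 1
h2  = unit(1.5, -1, 1)    # computes x1 and (not x2), symmetric weights assumed from the figure
out = unit(1, 1, 0.5)     # OR of the two minterms, theta = 0.5

def xor(x1, x2):
    return out(h1(x1, x2), h2(x1, x2))

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor(x1, x2))
```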

  16. Some Terminology
  A multilayer feedforward neural network has:
  • Input layer
  • Hidden layer(s)
  • Output layer
  Output units and hidden units are called computation units.

  17. Training of the Multilayer Perceptron (MLP)
  • Question: how to find weights for the hidden layers when no target output is available for them?
  • This is the credit assignment problem, to be solved by gradient descent.
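Gradient descent itself can be sketched in a few lines. The 1-D quadratic loss here is only an illustrative stand-in for the network error; backpropagation (not shown) is what supplies the hidden-layer gradients in an MLP:

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient of the loss."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Illustrative loss L(w) = (w - 3)^2, with gradient dL/dw = 2*(w - 3);
# gradient descent converges to the minimizer w = 3.
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_star)   # ≈ 3.0
```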
