
Linear classification


Presentation Transcript


  1. Linear classification

  2. Biological inspirations
  • Some numbers…
    • The human brain contains tens of billions of nerve cells (neurons); recent estimates put the count near 86 billion
    • Each neuron is connected to other neurons through on the order of 10,000 synapses
  • Properties of the brain
    • It can learn and reorganize itself from experience
    • It adapts to the environment
    • It is robust and fault tolerant

  3. Biological neuron (simplified model)
  • A neuron has
    • A branching input (the dendrites)
    • A branching output (the axon)
  • The information circulates from the dendrites to the axon via the cell body
  • The cell body sums up the inputs in some way and fires, i.e. generates a signal through the axon, if the result is greater than some threshold

  4. An Artificial Neuron
  • Definition: a non-linear, parameterized function with a restricted output range
  • The inputs x1 … xn are combined with the weights w1 … wn into a weighted sum, which is passed through an activation function:
    y = f(w1·x1 + w2·x2 + … + wn·xn)
  • A threshold parameter is usually not pictured (we’ll see why), but you can imagine one here.
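
  A minimal sketch of such a neuron in Python, using a step activation; the weights, inputs, and threshold below are made-up illustrative values, not values from the slides:

      def artificial_neuron(inputs, weights, threshold=0.0):
          # Non-linear, parameterized function: weighted sum -> step activation.
          activation = sum(w * x for w, x in zip(weights, inputs))
          return 1 if activation >= threshold else 0

      # Fires only when the weighted sum reaches the threshold of 0.5.
      print(artificial_neuron([1, 0], [0.6, 0.3], threshold=0.5))   # 1
      print(artificial_neuron([0, 1], [0.6, 0.3], threshold=0.5))   # 0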

  5. Same Idea using the Notation in the Book

  6. The Output of a Neuron
  • As described so far…
  • This simplest form of a neuron is also called a perceptron.

  7. The Output of a Neuron
  • Other possibilities, such as the sigmoid function for continuous output:
    g(a) = 1 / (1 + e^(−k·a))
  • a is the activation of the neuron
  • k is a parameter which controls the shape of the curve (usually k = 1)
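
  The same function as runnable Python; the symbol names a and k follow the reconstruction above, since the slide's original notation was lost in extraction:

      import math

      def sigmoid(a, k=1.0):
          # Squashes any real-valued activation into the open interval (0, 1).
          return 1.0 / (1.0 + math.exp(-k * a))

      print(sigmoid(0.0))   # 0.5, the midpoint of the curve
      print(sigmoid(4.0))   # ~0.982, saturating toward 1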

  8. Linear Regression using a Perceptron
  • Linear regression: find a linear function (straight line) that best predicts the continuous-valued output.

  9. Linear Regression As an Optimization Problem
  • Finding the optimal weights could be solved through:
    • Gradient descent (a small sketch follows below)
    • Simulated annealing
    • Genetic algorithms
    • … and now neural networks
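
  As one concrete instance, a sketch of gradient descent on a one-variable linear regression; the toy data, learning rate, and iteration count are all invented for illustration:

      # Fit y = w*x + b to toy data by gradient descent on the squared error.
      data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
      w, b, lr = 0.0, 0.0, 0.01

      for _ in range(2000):
          grad_w = grad_b = 0.0
          for x, y in data:
              err = (w * x + b) - y      # prediction error on one sample
              grad_w += 2 * err * x      # d(err^2)/dw
              grad_b += 2 * err          # d(err^2)/db
          w -= lr * grad_w               # step against the gradient
          b -= lr * grad_b

      print(w, b)                        # roughly w ≈ 2, b ≈ 0 for this data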

  10. Linear Regression using a Perceptron

  11. The Bias Term
  • So far we have defined the output of a perceptron as controlled by a threshold:
    x1w1 + x2w2 + x3w3 + … + xnwn ≥ t
  • But just like the weights, this threshold is a parameter that needs to be adjusted
  • Solution: make it another weight, attached to a constant input of 1:
    x1w1 + x2w2 + x3w3 + … + xnwn + (1)(−t) ≥ 0
  • The term (1)(−t) is the bias term.
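
  In code the trick is just an extra weight on a constant input. A sketch; the helper name and example values are mine, not the slides':

      def neuron_with_bias(inputs, weights):
          # The last weight plays the role of -t; its input is always 1.
          total = sum(w * x for w, x in zip(weights, inputs + [1.0]))
          return 1 if total >= 0 else 0

      # Behaves like a threshold of 0.5 on the two real inputs.
      print(neuron_with_bias([1, 0], [0.6, 0.3, -0.5]))   # 1
      print(neuron_with_bias([0, 1], [0.6, 0.3, -0.5]))   # 0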

  12. A Neuron with a Bias Term

  13. Another Example • Assign weights to perform the logical OR operation.
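
  One well-known weight assignment (not necessarily the one intended on the slide): give each input a weight of 1 and use a bias of −0.5, so the sum clears 0 exactly when at least one input is 1:

      def or_neuron(a, b):
          # w1 = w2 = 1.0, bias weight = -0.5 on a constant input of 1.
          total = 1.0 * a + 1.0 * b + 1.0 * (-0.5)
          return 1 if total >= 0 else 0

      for a in (0, 1):
          for b in (0, 1):
              print(a, b, "->", or_neuron(a, b))
      # Only (0, 0) stays below the threshold, matching logical OR.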

  14. Artificial Neural Network (ANN)
  • A mathematical model to solve engineering problems
  • A group of highly connected neurons realizing compositions of non-linear functions
  • Tasks
    • Classification
    • Discrimination
    • Estimation

  15. Feed Forward Neural Networks
  • The information is propagated from the inputs to the outputs
  • There are no cycles between outputs and inputs: the state of the system is not preserved from one iteration to another
  [Figure: inputs x1, x2, …, xn feeding a 1st hidden layer, a 2nd hidden layer, and the output layer]
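
  A compact forward pass for such a layered network; the layer sizes and weights below are arbitrary placeholders, not values from the presentation:

      import math

      def forward(x, layers):
          # layers: list of (weight_matrix, bias_vector) pairs, input to output.
          for weights, biases in layers:
              x = [1 / (1 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
                   for row, b in zip(weights, biases)]
          return x

      # Toy network: 2 inputs -> 2 hidden units -> 1 output.
      layers = [([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2]),
                ([[1.0, -1.0]], [0.0])]
      print(forward([1.0, 0.0], layers))   # a single sigmoid output in (0, 1)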

  16. ANN Structure
  • Finite number of inputs
  • Zero or more hidden layers
  • One or more outputs
  • All nodes at the hidden and output layers contain a bias term.

  17. Examples
  • Handwriting character recognition
  • Control of a virtual agent

  18. ALVINN: a neural-network-controlled AGV (1994)

  19. http://blog.davidsingleton.org/nnrccar

  20. Learning
  • The procedure that consists of estimating the weight parameters so that the whole network can perform a specific task
  • The (supervised) learning process:
    • Present the network with a number of inputs and their corresponding outputs
    • See how closely the actual outputs match the desired ones
    • Modify the parameters to better approximate the desired outputs

  21. Perceptron Learning Rule
  • Initialize the weights to some random values (or 0)
  • For each sample (x, t) in the training set:
    • Calculate the current output of the perceptron, o
    • Update the weights: wi ← wi + η (t − o) xi
  • Repeat until the error is smaller than some predefined threshold
  • η is the learning rate, usually a small constant such as 0.1
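
  A runnable sketch of this rule, trained here on the OR truth table from the earlier slide; the fixed epoch count stands in for the error-threshold stopping test:

      # Perceptron rule: w_i <- w_i + eta * (t - o) * x_i, bias folded in.
      samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
      weights = [0.0, 0.0, 0.0]            # [w1, w2, bias]
      eta = 0.1

      for _ in range(20):
          for x, t in samples:
              xb = x + [1]                 # constant bias input
              o = 1 if sum(w * v for w, v in zip(weights, xb)) >= 0 else 0
              for i, v in enumerate(xb):
                  weights[i] += eta * (t - o) * v

      print(weights)                       # a weight vector that separates OR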

  22. Linear Separability
  • Perceptrons can classify any input that is linearly separable.
  • For more complex problems we need a more complex model.

  23. Different Non-Linearly Separable Problems

  Structure      Types of Decision Regions
  Single-Layer   Half plane bounded by a hyperplane
  Two-Layer      Convex open or closed regions
  Three-Layer    Arbitrary (complexity limited by the number of nodes)

  [Figure: for each structure, example decision regions on the Exclusive-OR problem and on classes with meshed regions, plus the most general region shapes, drawn as interleaved A/B regions]

  24. Calculating the Weights
  • The weights are a vector of parameters for which we need to find a global optimum
  • Could be solved by:
    • Simulated annealing
    • Gradient descent
    • Genetic algorithms
  • http://www.youtube.com/watch?v=0Str0Rdkxxo
  • The perceptron learning rule is pretty much gradient descent.

  25. Learning the Weights in a Neural Network
  • The perceptron learning rule (gradient descent) worked before, but it required us to know the correct output of the node.
  • How do we know the correct output of a given hidden node?

  26. Backpropagation Algorithm
  • Gradient descent over the entire network weight vector
  • Easily generalized to arbitrary directed graphs
  • Will find a local, not necessarily global, error minimum
    • In practice it often works well (it can be invoked multiple times with different initial weights)

  27. Backpropagation Algorithm
  • Initialize the weights to some random values (or 0)
  • For each sample (x, t) in the training set:
    • Propagate the input forward and calculate the current output o of every node
    • For each output node k, compute the error term δk = ok (1 − ok) (tk − ok)
    • For each hidden node h, compute δh = oh (1 − oh) Σk whk δk
    • For all network weights, update wij ← wij + η δj xij
  • Repeat until the weights converge or the desired accuracy is achieved
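
  A self-contained sketch of these updates for a single hidden layer, trained on the XOR problem from slide 23. The network size, learning rate, seed, and epoch count are my choices, and a different seed may be needed if the descent lands in a local minimum (cf. slide 26):

      import math, random

      def sig(a):
          return 1.0 / (1.0 + math.exp(-a))

      random.seed(0)
      H = 3                                    # hidden units (arbitrary choice)
      w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(H)]
      w_o = [random.uniform(-1, 1) for _ in range(H + 1)]
      eta = 0.5
      data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]   # XOR

      for _ in range(10000):
          for x, t in data:
              xb = x + [1.0]                   # bias input
              h = [sig(sum(w * v for w, v in zip(row, xb))) for row in w_h]
              hb = h + [1.0]
              o = sig(sum(w * v for w, v in zip(w_o, hb)))
              d_o = o * (1 - o) * (t - o)      # output error term
              d_h = [h[j] * (1 - h[j]) * w_o[j] * d_o for j in range(H)]
              for j in range(H + 1):
                  w_o[j] += eta * d_o * hb[j]            # output-layer update
              for j in range(H):
                  for i in range(3):
                      w_h[j][i] += eta * d_h[j] * xb[i]  # hidden-layer update

      for x, t in data:
          h = [sig(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_h]
          print(x, t, round(sig(sum(w * v for w, v in zip(w_o, h + [1.0]))), 3))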

  28. Intuition
  • General idea: hidden nodes are “responsible” for some of the error at the output nodes they connect to
  • The change in the hidden weights is proportional to the strength (magnitude) of the connection between the hidden node and the output node
  • This is the same as the perceptron learning rule, but for a sigmoid decision function instead of a step decision function (full derivation on p. 726)

  29. Intuition
  • General idea: hidden nodes are “responsible” for some of the error at the output nodes they connect to
  • The change in the hidden weights is proportional to the strength (magnitude) of the connection between the hidden node and the output node

  30. Intuition
  • When expanded, the update to the output nodes is almost the same as the perceptron rule
  • The slight difference is that the algorithm uses a sigmoid function instead of a step function
  • (Full derivation on p. 726)

  31. Questions
