Explore the impact of hidden layers and thresholded units in backpropagation networks, including the use of the delta rule for weight adjustments. Discover the importance of error functions and how inputs can be re-represented to make mappings learnable.
[Figure: a single output unit 3 receives connections from input units 1 and 2 via weights w1 and w2; the inputs take values 0 or 1.]
[Figure: inputs 1 and 2 feed hidden units 3 and 4 via weights w1-w4, which feed output unit 5 via weights w5 and w6.] With linear units, this is the same as a linear network without any hidden layer!
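The claim holds because composing two linear maps gives a single linear map. A minimal sketch in Python (the weight values are made up; only the network structure follows the figure):

```python
import numpy as np

# Made-up weight values for illustration.
w_in_hidden = np.array([[0.4, -0.2],   # w1-w4: inputs 1, 2 -> hidden 3, 4
                        [0.7,  0.5]])
w_hidden_out = np.array([0.3, -0.8])   # w5, w6: hidden 3, 4 -> output 5

x = np.array([1.0, 0.0])               # an input pattern

hidden = x @ w_in_hidden               # linear hidden activations
output = hidden @ w_hidden_out         # output of the two-layer network

# A single set of direct input -> output weights gives the identical output:
w_direct = w_in_hidden @ w_hidden_out
print(output, x @ w_direct)
```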
[Figure: the same network, now with thresholded units: if netj > thresh, aj = 1; else aj = 0.]
[Figure: a hand-set solution with thresholded units (if netj > 9.9, aj = 1; else aj = 0). Inputs 1 and 2 connect to hidden unit 3 with weights 10 and 10, and to hidden unit 4 with weights 5 and 5; units 3 and 4 connect to output unit 5 with weights 10 and -10. Tables show the resulting activations of units 3 and 4 for each input pattern.]
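A minimal sketch checking the hand-set solution; the unit numbers, weights, and threshold come from the figure, and with them the output unit computes XOR of the two inputs:

```python
def step(net, thresh=9.9):
    """Thresholded unit: aj = 1 if netj > thresh, else 0."""
    return 1 if net > thresh else 0

def hand_set_network(x1, x2):
    a3 = step(10 * x1 + 10 * x2)   # unit 3 fires if either input is on
    a4 = step(5 * x1 + 5 * x2)     # unit 4 fires only if both inputs are on
    a5 = step(10 * a3 - 10 * a4)   # unit 5: on for exactly one active input
    return a5

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", hand_set_network(x1, x2))
# prints 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0 (XOR)
```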
• So with thresholded units and a hidden layer, solutions exist…
• …and solutions can be viewed as “re-representing” the inputs, so as to make the mapping to the output unit learnable.
• BUT, how can we learn the correct weights instead of just setting them by hand?
But what if we apply the simple delta rule, Δwij = ε (tj − aj) ai, to thresholded units? …What function should we use for aj?
[Figure: activation and change in activation (the derivative) of the logistic function plotted against net input from -10 to +10; activation values run from 0 to 1.]
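The plot suggests the logistic (sigmoid) function. A minimal sketch of that function, its slope, and how a delta-rule update can use them once the activation function is differentiable (the learning rate, target, and activation values are made up):

```python
import math

def logistic(net):
    # squashes any net input into the (0, 1) range
    return 1.0 / (1.0 + math.exp(-net))

def logistic_slope(a):
    # derivative of the logistic, written in terms of the activation itself;
    # largest near a = 0.5, close to zero for extreme net inputs
    return a * (1.0 - a)

# One weight update for an output unit j fed by unit i (values are made up):
epsilon = 0.5          # learning rate
t_j, a_i = 1.0, 1.0    # target and sending-unit activation
net_j = 0.2
a_j = logistic(net_j)
delta_j = (t_j - a_j) * logistic_slope(a_j)
dw_ij = epsilon * delta_j * a_i
print(a_j, delta_j, dw_ij)
```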
[Figure: the two-layer network again: inputs 1 and 2, hidden units 3 and 4 (weights w1-w4), output unit 5 (weights w5 and w6).]
[Figure: network with input units 1 and 2, hidden units 3 and 4, output units 5 and 6, and targets provided for the outputs.]
• For output units, the delta is computed directly from the error (target minus activation). The delta is stored at each unit and used directly to adjust each of its incoming weights.
• For hidden units there are no targets; the “error” signal is instead the sum of the output-unit deltas, each weighted by the connection to that output. These are used to compute deltas for the hidden units, which are again stored with the unit and used directly to change its incoming weights.
• Deltas, and hence the error signal at the output, can propagate backward through the network through many layers until they reach the input.
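A minimal sketch of one forward-and-backward pass for the pictured network (two inputs, two hidden units, two outputs), assuming logistic units; the weight values, learning rate, and training pattern are made up:

```python
import numpy as np

def logistic(net):
    return 1.0 / (1.0 + np.exp(-net))

rng = np.random.default_rng(0)
epsilon = 0.5
w_ih = rng.uniform(-0.5, 0.5, (2, 2))   # input units 1, 2 -> hidden units 3, 4
w_ho = rng.uniform(-0.5, 0.5, (2, 2))   # hidden units 3, 4 -> output units 5, 6

x = np.array([1.0, 0.0])                # input activations (made up)
t = np.array([0.0, 1.0])                # targets for the output units (made up)

# Forward pass
a_hidden = logistic(x @ w_ih)
a_out = logistic(a_hidden @ w_ho)

# Output deltas: computed directly from the error, scaled by the activation slope
delta_out = (t - a_out) * a_out * (1 - a_out)

# Hidden deltas: no targets, so the error signal is the output deltas
# passed back through the hidden-to-output weights
delta_hidden = (w_ho @ delta_out) * a_hidden * (1 - a_hidden)

# Each stored delta adjusts that unit's incoming weights
w_ho += epsilon * np.outer(a_hidden, delta_out)
w_ih += epsilon * np.outer(x, delta_hidden)
```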
Sum-squared error: E = ½ Σj (tj − aj)²
Cross-entropy error: E = −Σj [ tj log aj + (1 − tj) log(1 − aj) ]
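A minimal sketch of the two measures for a single training pattern (the example targets and activations are made up; the ½ factor matches the definition above):

```python
import numpy as np

def sum_squared_error(t, a):
    return 0.5 * np.sum((t - a) ** 2)

def cross_entropy_error(t, a):
    return -np.sum(t * np.log(a) + (1 - t) * np.log(1 - a))

t = np.array([0.0, 1.0])   # targets
a = np.array([0.1, 0.8])   # output activations
print(sum_squared_error(t, a), cross_entropy_error(t, a))
```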
[Figure: a single-layer example: output unit 3 receives inputs from units 1 and 2 via weights w1 and w2.]