
Understanding Neural Networks Through C++ Implementation

This project covers feedforward neural networks, the perceptron, the sigmoid function, backpropagation, optimization, and more. The implementation is in C++, with CUDA used for some of the operations. Training results on the MNIST database and a performance analysis are included.

wbeamer

Presentation Transcript


  1. Neural Networks Vladimir Pleskonjić 3188/2015

  2. General • Feedforward neural networks • Inputs are numeric features • Outputs are in range (0, 1) (Figure: Neural network)

  3. Perceptron • Building blocks • Numeric inputs • Output in (0, 1) • Bias (always equal to +1) (Figure: Perceptron)

  4. Perceptron • Inputs (bias included) are multiplied by their weights Θ, which express perceived importance, and then summed • Output is the sigmoid function of this sum (Figure: Sigmoid function)
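
The perceptron described above can be sketched in a few lines of C++. This is a minimal illustration, not the presentation's own code: the names `sigmoid` and `perceptronOutput` and the convention that `theta[0]` holds the bias weight are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic sigmoid: squashes any real number towards the interval (0, 1).
double sigmoid(double z) {
    return 1.0 / (1.0 + std::exp(-z));
}

// Output of a single perceptron.
// Assumed layout: theta[0] is the bias weight (the bias input is fixed at +1),
// and theta[i + 1] is the weight of inputs[i].
double perceptronOutput(const std::vector<double>& theta,
                        const std::vector<double>& inputs) {
    assert(theta.size() == inputs.size() + 1);
    double sum = theta[0] * 1.0;  // bias contribution
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        sum += theta[i + 1] * inputs[i];  // weighted input
    }
    return sigmoid(sum);
}
```

For example, with weights `{-1.0, 2.0}` and the single input `0.5`, the weighted sum is zero and the output sits exactly in the middle of the range, at 0.5.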

  5. Architecture • Parallel perceptrons form one neural layer • Serial layers form the neural network • Multiple hidden layers are allowed (Figure: Neural layers)

  6. Feedforward • Calculating output is called feedforward • Layer by layer • Outputs are confidences for their class • Interpreted by the user
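
A layer-by-layer feedforward pass might look like the sketch below. It uses plain `std::vector` instead of the presentation's Matrix class, whose interface is not shown in the transcript. The type aliases `Vector` and `Layer` and the function name `feedforward` are assumptions for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
// One layer: layer[j] holds the weights of perceptron j,
// with layer[j][0] being the bias weight (assumed layout).
using Layer = std::vector<Vector>;

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Propagates the inputs through every layer in order and returns the
// activations of the last layer (one confidence per class).
Vector feedforward(const std::vector<Layer>& network, Vector activations) {
    for (const Layer& layer : network) {
        Vector next;
        next.reserve(layer.size());
        for (const Vector& theta : layer) {
            assert(theta.size() == activations.size() + 1);
            double sum = theta[0];  // bias input is always +1
            for (std::size_t i = 0; i < activations.size(); ++i) {
                sum += theta[i + 1] * activations[i];
            }
            next.push_back(sigmoid(sum));
        }
        activations = next;  // outputs of this layer feed the next one
    }
    return activations;
}
```

The returned vector has one entry per output perceptron. How those confidences are turned into a decision (for instance, picking the largest) is left to the caller, as the slide notes.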

  7. Training • Weights Θ are the soul of our network • Training examples • Weights are configured to minimize the error function

  8. Error function • Error per output • Error per training example • Total error (Figure: Error for single output)
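
The transcript does not preserve the formulas from this slide. One common choice for sigmoid outputs, assumed here and not confirmed by the source, is the cross-entropy error. In the sketch below, $a_k^{(i)}$ is the $k$-th network output for training example $i$, $y_k^{(i)}$ is its 0/1 target, $K$ is the number of outputs, and $m$ is the number of training examples. All of these symbols are illustrative.

```latex
\begin{align*}
e_k^{(i)} &= -\left[ y_k^{(i)} \log a_k^{(i)}
             + \left(1 - y_k^{(i)}\right) \log\left(1 - a_k^{(i)}\right) \right]
          && \text{error for a single output} \\
E^{(i)}   &= \sum_{k=1}^{K} e_k^{(i)}
          && \text{error per training example} \\
J(\Theta) &= \frac{1}{m} \sum_{i=1}^{m} E^{(i)}
          && \text{total error}
\end{align*}
```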

  9. Backpropagation • Partial derivatives of error function with respect to each weight in Θ • Derivative rules and formulae
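
To make the chain of partial derivatives concrete, here is a sketch of backpropagation for one training example in a network with a single hidden layer. It assumes a cross-entropy error with sigmoid outputs, which the transcript does not confirm. The names `layerOutput`, `exampleError`, `Gradients`, and `backpropagate` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // row j = weights of perceptron j, bias weight first

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Activations of one layer for the given inputs.
Vector layerOutput(const Matrix& theta, const Vector& in) {
    Vector out;
    for (const Vector& row : theta) {
        assert(row.size() == in.size() + 1);
        double sum = row[0];  // bias input is +1
        for (std::size_t i = 0; i < in.size(); ++i) sum += row[i + 1] * in[i];
        out.push_back(sigmoid(sum));
    }
    return out;
}

// Cross-entropy error of one example (assumed error function).
double exampleError(const Matrix& theta1, const Matrix& theta2,
                    const Vector& x, const Vector& y) {
    const Vector a2 = layerOutput(theta2, layerOutput(theta1, x));
    double error = 0.0;
    for (std::size_t k = 0; k < y.size(); ++k) {
        error -= y[k] * std::log(a2[k]) + (1.0 - y[k]) * std::log(1.0 - a2[k]);
    }
    return error;
}

struct Gradients {
    Matrix hidden;  // dE/dTheta for the hidden layer
    Matrix output;  // dE/dTheta for the output layer
};

Gradients backpropagate(const Matrix& theta1, const Matrix& theta2,
                        const Vector& x, const Vector& y) {
    const Vector a1 = layerOutput(theta1, x);
    const Vector a2 = layerOutput(theta2, a1);
    assert(a2.size() == y.size());

    Gradients g{Matrix(theta1.size(), Vector(x.size() + 1, 0.0)),
                Matrix(theta2.size(), Vector(a1.size() + 1, 0.0))};
    Vector dErrorDa1(a1.size(), 0.0);  // error propagated back to hidden outputs

    for (std::size_t k = 0; k < a2.size(); ++k) {
        // For sigmoid + cross-entropy, dE/d(weighted sum) simplifies to a - y.
        const double delta = a2[k] - y[k];
        g.output[k][0] = delta;
        for (std::size_t j = 0; j < a1.size(); ++j) {
            g.output[k][j + 1] = delta * a1[j];
            dErrorDa1[j] += theta2[k][j + 1] * delta;
        }
    }
    for (std::size_t j = 0; j < a1.size(); ++j) {
        // Chain rule through the hidden sigmoid: s'(z) = s(z) * (1 - s(z)).
        const double delta = dErrorDa1[j] * a1[j] * (1.0 - a1[j]);
        g.hidden[j][0] = delta;
        for (std::size_t i = 0; i < x.size(); ++i) g.hidden[j][i + 1] = delta * x[i];
    }
    return g;
}
```

A useful sanity check for any backpropagation code is to compare each analytic derivative with a finite-difference estimate of `exampleError`. The two should agree to several decimal places.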

  10. Optimization • Gradient descent • Learning rate • Weights should be initialized with small random values (Figure: Descent)
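
The two ingredients on this slide, small random initial weights and the gradient descent update, can be sketched as follows. The function names, the `epsilon` range parameter, and the use of `std::mt19937` are assumptions for this example and are not taken from the presentation's code.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Vector = std::vector<double>;

// Returns `count` weights drawn uniformly from [-epsilon, epsilon).
Vector randomWeights(std::size_t count, double epsilon, unsigned seed) {
    std::mt19937 generator(seed);
    std::uniform_real_distribution<double> dist(-epsilon, epsilon);
    Vector theta(count);
    for (double& w : theta) w = dist(generator);
    return theta;
}

// One gradient descent step: theta <- theta - learningRate * gradient.
void gradientDescentStep(Vector& theta, const Vector& gradient,
                         double learningRate) {
    assert(theta.size() == gradient.size());
    for (std::size_t i = 0; i < theta.size(); ++i) {
        theta[i] -= learningRate * gradient[i];
    }
}
```

The learning rate scales each step. If it is too large the weights can overshoot the minimum, and if it is too small training needs many more cycles.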

  11. Regularization • Overfitting • Addition to error function • Sum of squares of weights • Regularization coefficient • Not applied to weights on links from bias units
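
A sum-of-squares penalty that skips the bias weights could be written as below. The scaling by `lambda / 2` is an assumption chosen so that the derivative is simply `lambda * weight`. The transcript does not show the exact form the author used, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // row j = weights of perceptron j, bias weight first

// Penalty added to the error: (lambda / 2) * sum of squared non-bias weights.
double regularizationTerm(const Matrix& theta, double lambda) {
    double sumOfSquares = 0.0;
    for (const Vector& row : theta) {
        for (std::size_t i = 1; i < row.size(); ++i) {  // i = 0 is the bias weight
            sumOfSquares += row[i] * row[i];
        }
    }
    return 0.5 * lambda * sumOfSquares;
}

// Adds the derivative of the penalty to an existing gradient.
// Bias weights (column 0) are left untouched.
void addRegularizationGradient(const Matrix& theta, double lambda,
                               Matrix& gradient) {
    assert(theta.size() == gradient.size());
    for (std::size_t j = 0; j < theta.size(); ++j) {
        assert(theta[j].size() == gradient[j].size());
        for (std::size_t i = 1; i < theta[j].size(); ++i) {
            gradient[j][i] += lambda * theta[j][i];
        }
    }
}
```

A larger regularization coefficient pushes the weights towards zero more strongly, trading some fit on the training examples for less overfitting.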

  12. Stochastic gradient descent • Training cycles • Batch (all at once) • Stochastic (one by one) • Mini-batch (in groups)
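
The three variants differ only in how many examples contribute to each weight update. The sketch below groups example indices accordingly. The helper name `makeBatches` is hypothetical, and shuffling the examples between cycles, which is common in practice, is omitted for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Splits the example indices 0 .. exampleCount - 1 into consecutive groups.
//   batchSize == exampleCount -> batch gradient descent (all at once)
//   batchSize == 1            -> stochastic gradient descent (one by one)
//   anything in between       -> mini-batch gradient descent (in groups)
std::vector<std::vector<std::size_t>> makeBatches(std::size_t exampleCount,
                                                  std::size_t batchSize) {
    assert(batchSize > 0);
    std::vector<std::vector<std::size_t>> batches;
    for (std::size_t start = 0; start < exampleCount; start += batchSize) {
        const std::size_t end = std::min(start + batchSize, exampleCount);
        std::vector<std::size_t> batch;
        for (std::size_t i = start; i < end; ++i) batch.push_back(i);
        batches.push_back(batch);  // the last group may be smaller
    }
    return batches;
}
```

One weight update is then performed per group, using the gradient summed or averaged over the examples in that group.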

  13. Putting it together • Design architecture • Randomize weights • Training cycle: backpropagation + gradient descent • Repeat training cycles a number of times • Feedforward to use • Interpret output confidences
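
The steps above can be sketched end to end on the smallest possible "network", a single sigmoid perceptron trained with batch gradient descent. This is a deliberately reduced illustration and not the presentation's NeuralNetwork class. The `Example` struct, the function names, the initial weight range, and the cross-entropy gradient are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

using Vector = std::vector<double>;

struct Example {
    Vector inputs;
    double target;  // 0 or 1
};

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Feedforward for one perceptron; theta[0] is the bias weight.
double predict(const Vector& theta, const Vector& inputs) {
    assert(theta.size() == inputs.size() + 1);
    double sum = theta[0];
    for (std::size_t i = 0; i < inputs.size(); ++i) sum += theta[i + 1] * inputs[i];
    return sigmoid(sum);
}

Vector train(const std::vector<Example>& examples, std::size_t inputCount,
             std::size_t cycles, double learningRate, unsigned seed) {
    assert(!examples.empty());

    // Randomize weights with small values.
    std::mt19937 generator(seed);
    std::uniform_real_distribution<double> dist(-0.1, 0.1);
    Vector theta(inputCount + 1);
    for (double& w : theta) w = dist(generator);

    for (std::size_t cycle = 0; cycle < cycles; ++cycle) {
        // Gradient of the (assumed) cross-entropy error over all examples.
        Vector gradient(theta.size(), 0.0);
        for (const Example& ex : examples) {
            const double delta = predict(theta, ex.inputs) - ex.target;
            gradient[0] += delta;  // bias input is +1
            for (std::size_t i = 0; i < inputCount; ++i) {
                gradient[i + 1] += delta * ex.inputs[i];
            }
        }
        // Gradient descent step on the averaged gradient.
        for (std::size_t i = 0; i < theta.size(); ++i) {
            theta[i] -= learningRate * gradient[i] / examples.size();
        }
    }
    return theta;
}
```

Trained on the four rows of the logical OR truth table, `predict` should return a confidence below 0.5 for the input (0, 0) and above 0.5 for the other three inputs.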

  14. Implementation • C++ • Matrix class • CUDA used for some of the operations • NeuralNetwork class

  15. Code: feedforward

  16. Code: gradient descent

  17. Pseudo code: training

  18. Results • MNIST database • My configuration • Train accuracy: 95.3% • Test accuracy: 95.1%
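
Accuracy figures like these are typically obtained by taking the output with the highest confidence as the predicted class and counting how often it matches the label. The sketch below shows that calculation. It is an assumption about the evaluation method, which the transcript does not describe, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

using Vector = std::vector<double>;

// Index of the output with the highest confidence.
std::size_t predictedClass(const Vector& confidences) {
    assert(!confidences.empty());
    return static_cast<std::size_t>(std::distance(
        confidences.begin(),
        std::max_element(confidences.begin(), confidences.end())));
}

// Fraction of examples whose predicted class equals the label.
double accuracy(const std::vector<Vector>& outputs,
                const std::vector<std::size_t>& labels) {
    assert(outputs.size() == labels.size());
    assert(!outputs.empty());
    std::size_t correct = 0;
    for (std::size_t i = 0; i < outputs.size(); ++i) {
        if (predictedClass(outputs[i]) == labels[i]) ++correct;
    }
    return static_cast<double>(correct) / static_cast<double>(outputs.size());
}
```

For MNIST, each output vector would hold ten confidences, one per digit, and the labels would be the digits 0 to 9.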

  19. Performance • Execution time comparison • The aim of this project was not to achieve optimal performance, but to reach a better understanding of the algorithm and its implementation.

  20. Questions? • Thank you!
