
Deep Learning - Evolution and Future Trends

Artificial Intelligence is transforming the world. Deep Learning, an integral part of this new Artificial Intelligence paradigm, is becoming one of the most sought-after skills. Learn more about Deep Learning and its evolution.





Presentation Transcript


  1. Lecture Series: AI is the New Electricity Deep Learning - SCOPING, EVOLUTION & FUTURE TRENDS Dr. Chiranjit Acharya AILABS Academy J-3, GP Block, Sector V, Salt Lake City, Kolkata, West Bengal 700091 Presented at AILABS Academy, Kolkata on April 18th 2018 Confidential, unpublished property of aiLabs. Do not duplicate or distribute. Use and distribution limited solely to authorized personnel. (c) Copyright 2018

  2. A Journey into Deep Learning ▪Cutting-edge technology ▪Garnered traction in both industry and academia ▪Achieves near-human-level performance in many pattern recognition tasks ▪Excels in ▪structured, relational data ▪unstructured rich-media data such as images, video, audio and text AILABS (c) Copyright 2018 2

  3. A Journey into Deep Learning ▪What is Deep Learning? Where is the “deepness”? ▪Where does Deep Learning come from? ▪What are the models and algorithms of Deep Learning? ▪What is the trajectory of evolution of Deep Learning? ▪What are the future trends of Deep Learning? AILABS (c) Copyright 2018 3

  4. A Journey into Deep Learning AILABS (c) Copyright 2018 4

  5. Artificial Intelligence Holy Grail of AI Research ▪Understanding the neuro-biological and neuro-physical basis of human intelligence ▪science of intelligence ▪Building intelligent machines which can think and act like humans ▪engineering of intelligence AILABS (c) Copyright 2018 5

  6. Artificial Intelligence Facets of AI Research ▪knowledge representation ▪reasoning ▪natural language understanding ▪natural scene understanding AILABS (c) Copyright 2018 6

  7. Artificial Intelligence Facets of AI Research ▪natural speech understanding ▪problem solving ▪perception ▪learning ▪planning AILABS (c) Copyright 2018 7

  8. Machine Learning Basic Doctrine of Learning ▪learning from examples Outcome of Learning ▪rules of inference for some predictive task ▪embodiment of the rules = model ▪model is an abstract computing device •kernel machine, decision tree, neural network AILABS (c) Copyright 2018 8

  9. Machine Learning Connotations of Learning ▪process of generalization ▪discovering nature/traits of data ▪unraveling patterns and anti-patterns in data AILABS (c) Copyright 2018 9

  10. Machine Learning Connotations of Learning: ▪knowing distributional characteristics of data ▪identifying causal effects and their propagation ▪identifying non-causal co-variations & correlations AILABS (c) Copyright 2018 10

  11. Machine Learning Design Aspects of Learning System ▪ Choose the training experience ▪ Choose exactly what is to be learned, i.e. the target function / machine ▪ Choose objective function & optimality criteria ▪ Choose a learning algorithm to infer the target function from the experience. AILABS (c) Copyright 2018 11

  12. Learning Work Flow ▪Stage 1: Feature Extraction, Feature subset selection, Feature Vector Representation ▪Stage 2: Training / Testing Set Creation and Augmentation ▪Stage 3: Training the Inference Machine ▪Stage 4: Running the Inference Machine on Test Set ▪Stage 5: Stratified Sampling and Validation AILABS (c) Copyright 2018 12
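
A minimal sketch of this five-stage workflow, assuming Python with scikit-learn and its bundled digits dataset; the feature selector, classifier and split sizes are illustrative choices, not prescribed by the slides:

```python
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

# Stage 1: feature extraction / subset selection / vector representation
X, y = load_digits(return_X_y=True)                     # each image is already a 64-d pixel vector
pipeline = make_pipeline(SelectKBest(f_classif, k=32),  # keep the 32 most informative features
                         LogisticRegression(max_iter=1000))

# Stage 2: training / testing set creation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Stage 3: training the inference machine
pipeline.fit(X_train, y_train)

# Stage 4: running the inference machine on the test set
print(classification_report(y_test, pipeline.predict(X_test)))

# Stage 5: stratified sampling and validation
print(cross_val_score(pipeline, X, y, cv=StratifiedKFold(n_splits=5)))
```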

  13. Feature Extraction / Selection (flow diagram: a Corpus is processed by a Domain Expert and a Knowledge Engineer into cognitive elements (low-level, mid-level and high-level parts, plus additional descriptors), which a Sparse Coder turns into a sparse representation) AILABS (c) Copyright 2018 13

  14. Training Set Augmentation (flow diagram: the existing training set and a sparse representation feed a Random Sampler; the sampled items are checked by a Reviewer and added to form the augmented training set) AILABS (c) Copyright 2018 14

  15. Training and Prediction / Recognition (flow diagram: the training set is fed to an Adaptive Learner, which produces the prediction/recognition model; the model is then run on the unlabelled residual corpus to produce the predicted/recognized corpus) AILABS (c) Copyright 2018 15

  16. Sampling, Validation & Convergence (flow diagram: the predicted corpus goes through a Stratified Sampler; the stratified sub-samples are human-reviewed by a Reviewer and passed to a Precision & Recall Calculator; if not converged, go back to training set augmentation, otherwise relevance scoring ends) AILABS (c) Copyright 2018 16

  17. Evolution of Connectionist Models 1943: Artificial neuron model (McCulloch & Pitts) ▪ "A logical calculus of the ideas immanent in nervous activity" ▪ simple artificial “neurons” could be made to perform basic logical operations such as AND, OR and NOT ▪ known as Linear Threshold Gate ▪ NO learning AILABS (c) Copyright 2018 17
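
A minimal sketch of such a linear threshold gate in Python, with hand-set (not learned) weights realizing AND, OR and NOT, since the 1943 model had no learning rule:

```python
# McCulloch-Pitts style linear threshold gate: fixed weights, fixed threshold, no learning.
def threshold_gate(inputs, weights, threshold):
    """Fire (return 1) iff the weighted sum of inputs reaches the threshold."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s >= threshold else 0

def AND(a, b):  return threshold_gate([a, b], [1, 1], threshold=2)
def OR(a, b):   return threshold_gate([a, b], [1, 1], threshold=1)
def NOT(a):     return threshold_gate([a], [-1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
print("NOT:", NOT(0), NOT(1))   # -> 1 0
```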

  18. Evolution of Connectionist Models 1943: Artificial neuron model (McCulloch & Pitts) (neuron diagram: inputs x_1 … x_n with weights w_1j … w_nj and bias b_j) $s_j = \sum_{i=1}^{n} w_{ij} x_i + b_j$, $y_j = f(s_j)$ AILABS (c) Copyright 2018 18

  19. Evolution of Connectionist Models 1957: Perceptron model (Rosenblatt) ▪ invention of learning rules inspired by ideas from neuroscience: if $\sum_i \mathrm{input}_i \cdot \mathrm{weight}_i > \mathrm{threshold}$, output $= +1$; if $\sum_i \mathrm{input}_i \cdot \mathrm{weight}_i < \mathrm{threshold}$, output $= -1$ ▪ learns to classify input into two output classes ▪ Sigmoid transfer function: boundedness, graduality ($y \to 1$ as $x \to +\infty$, $y \to 0$ as $x \to -\infty$) AILABS (c) Copyright 2018 19
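
A short sketch of the perceptron update rule just described, with the threshold folded into a bias term; the learning rate and the toy data are illustrative assumptions:

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Rosenblatt-style perceptron: output +1/-1, weights nudged on every mistake."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            output = 1 if np.dot(w, xi) + b > 0 else -1
            if output != target:          # update only on misclassification
                w += lr * target * xi
                b += lr * target
    return w, b

# Linearly separable toy data: class +1 roughly above the line x1 + x2 = 1, class -1 below.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [0.9, 0.8]])
y = np.array([-1, 1, -1, 1])
w, b = train_perceptron(X, y)
print(w, b)
```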

  20. Evolution of Connectionist Models 1943: Artificial neuron model (McCulloch & Pitts) (neuron diagram: inputs x_1 … x_n with weights w_1j … w_nj and bias b_j) $s_j = \sum_{i=1}^{n} w_{ij} x_i + b_j$, $y_j = f(s_j)$, with sigmoid transfer function $f(s_j) = \frac{1}{1 + e^{-s_j}}$ AILABS (c) Copyright 2018 20

  21. Evolution of Connectionist Models 1960s: Delta Learning Rule (Widrow & Hoff) ▪ Define the error as the squared residuals summed over all training cases: $E = \frac{1}{2} \sum_n (y_n - \hat{y}_n)^2$ ▪ Now differentiate to get error derivatives for weights: $\frac{\partial E}{\partial w_i} = \frac{1}{2} \sum_n \frac{\partial \hat{y}_n}{\partial w_i} \frac{\partial E_n}{\partial \hat{y}_n} = -\sum_n x_{i,n} (y_n - \hat{y}_n)$ ▪ The batch delta rule changes the weights in proportion to their error derivatives summed over all training cases: $\Delta w_i = -\varepsilon \frac{\partial E}{\partial w_i}$ AILABS (c) Copyright 2018 21
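
A small numpy sketch of the batch delta rule exactly as written above: sum the error derivatives over all training cases, then change each weight in proportion. The toy data and learning rate are assumptions for illustration:

```python
import numpy as np

# Toy linear data: targets generated by y = 2*x1 - 1*x2 (weights to be recovered).
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([2.0, -1.0, 1.0, 3.0])

w = np.zeros(2)
lr = 0.05
for epoch in range(500):
    y_hat = X @ w                    # linear unit: y_hat_n = sum_i w_i * x_{i,n}
    grad = -(X.T @ (y - y_hat))      # dE/dw_i = -sum_n x_{i,n} (y_n - y_hat_n)
    w -= lr * grad                   # batch delta rule: delta w_i = -lr * dE/dw_i
print(w)                             # approaches [2., -1.]
```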

  22. Evolution of Connectionist Models 1969: Minsky's objection to Perceptrons ▪ Marvin Minsky & Seymour Papert: Perceptrons ▪ Unless input categories are linearly separable, a perceptron cannot learn to discriminate between them. ▪ Unfortunately, it appeared that many important categories were not linearly separable. AILABS (c) Copyright 2018 22

  23. Evolution of Connectionist Models 1969: Minsky's objection to Perceptrons Perceptrons are good at linear classification but ... (figure: points of two classes in the x1-x2 plane, separable by a straight decision boundary) AILABS (c) Copyright 2018 23

  24. Evolution of Connectionist Models 1969: Minsky's objection to Perceptrons Perceptrons are incapable of simple nonlinear classification like XOR (XOR operation):
  x1 x2 Output
  0  0  0
  0  1  1
  1  0  1
  1  1  0
  (figure: the four XOR points plotted in the x1-x2 plane cannot be separated by a single straight line) AILABS (c) Copyright 2018 24

  25. Universal Approximation Theorem Existential Version (Kolmogorov) ▪ There exists a finite combination of superpositions and additions of continuous functions of single variables which can approximate any continuous, multivariate function on compact subsets of R^d. Constructive Version (Cybenko) ▪ The standard multilayer feed-forward network with a single hidden layer, containing a finite number of hidden neurons, is a universal approximator among continuous functions on compact subsets of R^d, under mild assumptions on the activation function. AILABS (c) Copyright 2018 25

  26. Evolution of Connectionist Models 1986: Backpropagation for Multi-Layer Perceptrons (Rumelhart, Hinton & Williams) ▪ solution to Minsky's objection regarding perceptron's limitation ▪ nonlinear classification is achieved by fully connected, multilayer, feedforward networks of perceptrons (MLP) ▪ MLP can be trained by backpropagation ▪ Two-pass algorithm ▪ forward propagation of activation signals from input to output ▪ backward propagation of error derivatives from output to input AILABS (c) Copyright 2018 26
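
To make the two-pass idea concrete, here is a tiny numpy sketch of a one-hidden-layer MLP trained by backpropagation on the XOR problem from slide 24; the layer size, learning rate and random seed are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)      # XOR targets

sigmoid = lambda s: 1.0 / (1.0 + np.exp(-s))

# One hidden layer of 4 sigmoid units, one sigmoid output unit.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
lr = 1.0

for _ in range(5000):
    # Forward pass: propagate activation signals from input to output.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate error derivatives from output to input.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(2))   # typically close to [[0], [1], [1], [0]]
```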

  27. Evolution of Connectionist Models 1986: Backpropagation for Multi-Layer Perceptrons (Rumelhart, Hinton & Williams) (figure: a fully connected feedforward network with an input layer x_1 … x_N, a hidden layer, and an output layer y_1 … y_M) AILABS (c) Copyright 2018 27

  28. Evolution of Connectionist Models 1986: Backpropagation for Multi-Layer Perceptrons (Rumelhart, Hinton & Williams) ▪ solution to Minsky's objection regarding perceptron's limitation ▪ nonlinear classification is achieved by fully connected, multilayer, feedforward networks of perceptrons (MLP) ▪ MLP can be trained by backpropagation ▪ Two-pass algorithm ▪ forward propagation of activation signals from input to output ▪ backward propagation of error derivatives from output to input AILABS (c) Copyright 2018 28

  29. Machine Learning Example Handwriting Digit Recognition (figure: a handwritten image is fed to a machine, which outputs "2") AILABS (c) Copyright 2018 29

  30. Handwriting Digit Recognition (figure: a 16 x 16 = 256-pixel image (colored pixel → 1, no color → 0) is fed as inputs x_1 … x_256 to the network, which produces ten outputs y_1 … y_10, each representing the confidence of a digit, e.g. y_1 = 0.1 for "1", y_2 = 0.7 for "2", y_10 = 0.2 for "0"; the image is recognized as "2") AILABS (c) Copyright 2018 30

  31. Example Application Handwriting Digit Recognition (figure: the 256-dimensional pixel vector x_1 … x_256 goes into the machine, which outputs y_1 … y_10 and recognizes the digit "2") AILABS (c) Copyright 2018 31

  32. Evolution of Connectionist Models 1989: Convolutional Neural Network (LeCun) (figure: a deep feedforward network with input layer x_1 … x_N, hidden layers 1 … L, and output layer y_1 … y_M) Deep means many hidden layers AILABS (c) Copyright 2018 32

  33. Convolutional Neural Network ▪ Input can have very high dimension; using a fully-connected neural network would need a large number of parameters. ▪ CNNs are a special type of neural network whose hidden units are only connected to a local receptive field; the number of parameters needed by CNNs is much smaller. ▪ Example: 200x200 image a) fully connected: 40,000 hidden units => 1.6 billion parameters b) CNN: 5x5 kernel (filter), 100 feature maps => 2,500 parameters AILABS (c) Copyright 2018 33
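
The parameter counts on this slide can be checked with one-line arithmetic (bias terms ignored, as in the slide):

```python
# Fully connected: every one of the 200*200 input pixels connects to each of the 40,000 hidden units.
fc_params = 200 * 200 * 40_000
print(fc_params)        # 1600000000 -> 1.6 billion weights

# CNN: each of the 100 feature maps shares a single 5x5 kernel across the whole image.
cnn_params = 5 * 5 * 100
print(cnn_params)       # 2500 weights
```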

  34. Convolution Operation (figure: convolution applied to a patch of the input image) AILABS (c) Copyright 2018 34

  35. Convolution Operation in CNN ▪ Input: an image (2-D array): x ▪ Convolution kernel (2-D array of learnable parameters): w ▪ Feature map (2-D array of processed data): s ▪ Convolution operation in 2-D domains: see the formula below AILABS (c) Copyright 2018 35
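
The slide's own formula did not survive the transcript; in the notation just defined (input x, kernel w, feature map s), the standard discrete 2-D convolution it refers to can be written as:

```latex
% Discrete 2-D convolution of input x with kernel w, producing feature map s:
s(i, j) = (x * w)(i, j) = \sum_{m} \sum_{n} x(m, n)\, w(i - m,\; j - n)
% Most CNN libraries actually implement the closely related cross-correlation:
s(i, j) = \sum_{m} \sum_{n} x(i + m,\; j + n)\, w(m, n)
```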

  36. Convolution Filters AILABS (c) Copyright 2018 36

  37. Convolution Operation with Filters AILABS (c) Copyright 2018 37

  38. Convolution Layers (figure: a convolution layer mapping input channels to output feature maps) AILABS (c) Copyright 2018 38

  39. 3 Stages of a Convolutional Layer (typically: a convolution stage, a nonlinear activation / detector stage, and a pooling stage; see the following slides) AILABS (c) Copyright 2018 39

  40. Nonlinear Stage (figure: the tanh(x) and ReLU activation functions) AILABS (c) Copyright 2018 40
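
A minimal numpy definition of the two activations pictured, for illustration:

```python
import numpy as np

def relu(x):
    """ReLU: zero for negative inputs, identity for positive inputs."""
    return np.maximum(0.0, x)

def tanh(x):
    """Hyperbolic tangent: squashes inputs into the range (-1, 1)."""
    return np.tanh(x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))   # [0.  0.  0.  0.5 2. ]
print(tanh(z))   # approx [-0.96 -0.46  0.    0.46  0.96]
```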

  41. Evolution of Connectionist Models 2006: Deep Belief Networks (Hinton), Stacked Auto-Encoders (Bengio) (figure: a deep network with input layer x_1 … x_N, hidden layers 1 … L, and output layer y_1 … y_M) Deep means many hidden layers AILABS (c) Copyright 2018 41

  42. Deep Learning Traditional pattern recognition models use hand-crafted features and a relatively simple trainable classifier (pipeline: input → hand-crafted feature extractor → "simple" trainable classifier → output). This approach has the following limitations: ▪ It is very tedious and costly to develop hand-crafted features ▪ The hand-crafted features are usually highly dependent on one application, and cannot be transferred easily to other applications AILABS (c) Copyright 2018 42

  43. Deep Learning Deep learning = representation learning: it seeks to learn hierarchical representations (i.e. features) automatically through a multi-stage feature learning process (pipeline: input → low-level features → mid-level features → high-level features → trainable classifier → output). Feature visualization of a convolutional net trained on ImageNet (Zeiler and Fergus, 2013) AILABS (c) Copyright 2018 43

  44. Learning Hierarchical Representations (pipeline: input → low-level features → mid-level features → high-level features → trainable classifier → output, with increasing level of abstraction) Hierarchy of representations with increasing level of abstraction; each stage is a kind of trainable nonlinear feature transformation. Image recognition: pixel → edge → motif → part → object. Text: character → word → word group → clause → sentence → story AILABS (c) Copyright 2018 44

  45. Pooling Common pooling operations: ▪ Max pooling: report the maximum output within a rectangular neighborhood. ▪ Average pooling: report the average output of a rectangular neighborhood (possibly weighted by the distance from the central pixel). AILABS (c) Copyright 2018 45
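
A compact numpy sketch of non-overlapping 2x2 max and average pooling over a single feature map; the window size and input values are illustrative assumptions:

```python
import numpy as np

def pool2x2(feature_map, mode="max"):
    """Non-overlapping 2x2 pooling; 'max' keeps the maximum, 'avg' the mean of each window."""
    h, w = feature_map.shape
    blocks = feature_map[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 5, 3],
                 [1, 1, 2, 2]], dtype=float)
print(pool2x2(fmap, "max"))   # [[6. 2.] [2. 5.]]
print(pool2x2(fmap, "avg"))   # [[3.5 1. ] [1.  3. ]]
```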

  46. CIFAR-10 AILABS (c) Copyright 2018 46

  47. Deep CNN on CIFAR-10 AILABS (c) Copyright 2018 47

  48. Deep CNN on CIFAR-10 AILABS (c) Copyright 2018 48

  49. Deep CNN on CIFAR-10 AILABS (c) Copyright 2018 49
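
The results on these slides survive only as images. As a hedged illustration, a small PyTorch-style CNN of the kind typically run on CIFAR-10 (32x32 RGB images, 10 classes) might look like the sketch below; the layer sizes are assumptions, not the network from the slides:

```python
import torch
import torch.nn as nn

# Small CNN for 32x32x3 CIFAR-10 images: two conv/pool blocks, then a classifier head.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),   # 3 input channels -> 32 feature maps
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 32x32 -> 16x16
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # 32 -> 64 feature maps
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128),
    nn.ReLU(),
    nn.Linear(128, 10),                           # one logit per CIFAR-10 class
)

x = torch.randn(4, 3, 32, 32)                     # a dummy batch of 4 images
print(model(x).shape)                             # torch.Size([4, 10])
```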

  50. Future Trends ▪ A different and wider range of problems is being addressed ▪ natural language understanding ▪ natural scene understanding ▪ natural speech understanding ▪ Feature learning is being investigated at a deeper level ▪ Manifold learning ▪ Reinforcement learning ▪ Integration with other paradigms of machine learning AILABS (c) Copyright 2018 50
