
Neural Network Applications Using an Improved Performance Training Algorithm


Presentation Transcript


  1. Neural Network Applications Using an Improved Performance Training Algorithm. Annamária R. Várkonyi-Kóczy 1,2, Balázs Tusor 2. 1 Institute of Mechatronics and Vehicle Engineering, Óbuda University; 2 Integrated Intelligent Space Japanese-Hungarian Laboratory. e-mail: varkonyi-koczy@uni-obuda.hu

  2. Outline • Introduction, motivation for using SC techniques • Neural Networks, Fuzzy Neural Networks, Circular Fuzzy Neural Networks • The place and success of NNs • New training and clustering algorithms • Classification examples • A real-world application: fuzzy hand posture and gesture detection system • Inputs of the system • Fuzzy hand posture models • The NN based hand posture identification system • Results • Conclusions

  3. Motivation for using SC Techniques We need something ”non-classical”: Problems • Nonlinearity, never before seen spatial and temporal complexity of systems and tasks • Imprecise, uncertain, insufficient, ambiguous, contradictory information, lack of knowledge • Finite resources, strict time requirements (real-time processing) • Need for optimization • Need for user’s comfort. New challenges and more complex tasks to be solved mean that more sophisticated solutions are needed.

  4. Motivation for using SC Techniques We need something ”non-classical”: Intentions • We would like to build MACHINES able to do the same as humans do (e.g. autonomous cars driving in heavy traffic) • We would always like to find an algorithm leading to an OPTIMUM solution (even when facing too much uncertainty and lack of knowledge) • We would like to ensure MAXIMUM performance (usually impossible from every point of view, i.e. some kind of trade-off is needed, e.g. between performance and cost) • We prefer environmental COMFORT (user-friendly machines)

  5. Need for optimization • Traditionally: optimization = precision • New definition (L.A. Zadeh): optimization = cost optimization • But what is cost? Precision and certainty also carry a cost

  6. User’s comfort • Human language • Modularity, simplicity, hierarchical structures. Aims of the processing chain: preprocessing serves the processing. (New) aims of preprocessing: improving the performance of the algorithms and giving more support to the processing. In image processing / computer vision, preprocessing covers noise smoothing, feature extraction (edge, corner detection), pattern recognition, etc.; processing covers 3D modeling, medical diagnostics, etc.; with the new aims: automatic 3D modeling, automatic ...

  7. Motivation for using SC Techniques We need something ”non-classical”: Elements of the Solution • Low complexity, approximate modeling • Application of adaptive and robust techniques • Definition and application of the proper cost function, including the hierarchy and measure of importance of the elements • Trade-off between accuracy (granularity) and complexity (computational time and resource need) • Giving support for the further processing. Traditional and AI methods cannot meet all of these requirements; only Soft Computing Techniques and Computational Intelligence can.

  8. What is Computational Intelligence? Computer + Intelligence: the increased facilities of the computer, plus what is added by the new methods. L.A. Zadeh, Fuzzy Sets [1965]: “In traditional – hard – computing, the prime desiderata are precision, certainty, and rigor. By contrast, the point of departure of soft computing is the thesis that precision and certainty carry a cost and that computation, reasoning, and decision making should exploit – whenever possible – the tolerance for imprecision and uncertainty.”

  9. What is Computational Intelligence? • CI can be viewed as a consortium of methodologies which play an important role in the conception, design, and utilization of information/intelligent systems. • The principal members of the consortium are: fuzzy logic (FL), neuro computing (NC), evolutionary computing (EC), anytime computing (AC), probabilistic computing (PC), chaotic computing (CC), and (parts of) machine learning (ML). • The methodologies are complementary and synergistic, rather than competitive. • What is common: they exploit the tolerance for imprecision, uncertainty, and partial truth to achieve tractability, robustness, low solution cost, and better rapport with reality.

  10. Soft Computing Methods (Computational Intelligence) fulfill all five requirements: • Low complexity, approximate modeling • Application of adaptive and robust techniques • Definition and application of the proper cost function, including the hierarchy and measure of importance of the elements • Trade-off between accuracy (granularity) and complexity (computational time and resource need) • Giving support for the further processing

  11. Methods of Computational Intelligence • fuzzy logic – low complexity, easy incorporation of a priori knowledge into computers, tolerance for imprecision, interpretability • neuro computing – learning ability • evolutionary computing – optimization, optimum learning • anytime computing – robustness, flexibility, adaptivity, coping with the temporal circumstances • probabilistic reasoning – uncertainty, logic • chaotic computing – open mind • machine learning – intelligence

  12. Neural Networks • Mimic the human brain • McCulloch & Pitts, 1943; Hebb, 1949 • Rosenblatt, 1958 (Perceptron) • Widrow-Hoff, 1960 (Adaline) • …

  13. Neural Networks Neural nets are parallel, distributed information processing tools which are • highly connected systems composed of identical or similar operational units (processing elements, neurons) performing local processing, usually in a well-ordered topology • equipped with some kind of learning algorithm, which usually means learning from patterns and which also determines the mode of the information processing • equipped with an information recall algorithm that makes it possible to use the previously learned information

  14. Application areas where NNs are successfully used • One- and multi-dimensional signal processing (image processing, speech processing, etc.) • System identification and control • Robotics • Medical diagnostics • Estimation of economic features • Associative memory = content addressable memory

  15. Application areas where NNs are successfully used • Classification systems (e.g. pattern recognition, character recognition) • Optimization systems (the NN, usually a feedback one, approximates the cost function) (e.g. radio frequency distribution, A/D converters, the traveling salesman problem) • Approximation systems (any input-output mapping) • Nonlinear dynamic system models (e.g. solution of partial differential equation systems, prediction, rule learning)

  16. Main features • Complex, non-linear input-output mapping • Adaptivity, learning ability • Distributed architecture • Fault tolerant property • Possibility of parallel analog or digital VLSI implementations • Analogy with neurobiology

  17. Classical neural nets • Static nets (without memory, feedforward networks) • One layer • Multi layer • MLP (Multi Layer Perceptron) • RBF (Radial Basis Function) • CMAC (Cerebellar Model Articulation Controller) • Dynamic nets (with memory or feedback recall networks) • Feedforward (with memory elements) • Feedback • Local feedback • Global feedback

  18. Feedforward architectures One layer architectures: Rosenblatt perceptron

  19. Feedforward architectures One layer architectures. [Figure: input, output, and the tunable parameters (weighting factors) between them]

  20. Feedforward architectures Multilayer network (static MLP net)

  21. Approximation property • Universal approximation property holds for some kinds of NNs • Kolmogorov: any continuous real-valued N-variable function defined over the compact interval [0,1]^N can be represented with the help of appropriately chosen one-variable functions and the sum operation.
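In modern notation, the Kolmogorov superposition theorem cited above states that any continuous f on [0,1]^N can be written as

```latex
f(x_1,\dots,x_N) \;=\; \sum_{q=0}^{2N} \Phi_q\!\left( \sum_{p=1}^{N} \psi_{q,p}(x_p) \right)
```

where the Φ_q and ψ_{q,p} are appropriately chosen continuous one-variable functions; only one-variable functions and summation appear on the right-hand side, which is the classical backdrop of the universal approximation results for NNs.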

  22. Learning Learning = structure + parameter estimation • supervised learning • unsupervised learning • analytic learning • Convergence?? • Complexity??

  23. Supervised learning: estimation of the model parameters from x, y, and d. System: d = f(x, n), where x is the input and n is noise; NN model: y = fM(x, w); criterion: C = C(ε), with the error ε formed from the desired output d and the model output y; parameter tuning feeds the criterion value back into the model.

  24. Supervised learning • Criterion function • Quadratic (standard form below) • ...
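The formula image for the quadratic criterion is not reproduced in the transcript; in its standard sum-of-squared-errors form it reads

```latex
C \;=\; \frac{1}{2}\sum_{k} \varepsilon_k^{2} \;=\; \frac{1}{2}\sum_{k} \bigl(d_k - y_k\bigr)^{2}
```

where d_k is the desired output and y_k = fM(x_k, w) is the model output for the k-th training sample.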

  25. Minimization of the criterion • Analytic solution (only if it is very simple) • Iterative techniques • Gradient methods • Searching methods • Exhaustive • Random • Genetic search

  26. Parameter correction • Perceptron • Gradient methods • LMS (least mean squares algorithm) • ...
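As a concrete illustration of the corrections listed above, below is a minimal Python sketch of the LMS rule for a single linear neuron; the function name, step size, and toy data are illustrative and not from the deck.

```python
import numpy as np

def lms_step(w, x, d, mu=0.05):
    """One LMS (least mean squares) correction step for a linear neuron."""
    y = w @ x                  # neuron output
    eps = d - y                # error against the desired output
    return w + mu * eps * x    # gradient step on the squared error

# toy usage: recover w_true = [2.0, -1.0] from noisy samples
rng = np.random.default_rng(0)
w = np.zeros(2)
for _ in range(2000):
    x = rng.normal(size=2)
    d = np.array([2.0, -1.0]) @ x + 0.01 * rng.normal()
    w = lms_step(w, x, d)
print(w)  # close to [2.0, -1.0]
```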

  27. Fuzzy Neural Networks • Fuzzy Neural Networks (FNNs) • based on the concept of NNs • numerical inputs • weights, biases, outputs: fuzzy numbers

  28. Circular Fuzzy Neural Networks (CFNNs) • based on the concept of FNNs • topology realigned to a circular shape • connections between the input and hidden layers trimmed • the trimming depends on the input data • e.g., for 3D coordinates, each coordinate can be connected to only 3 neighboring hidden layer neurons • dramatic decrease in the required training time (see the sketch below)
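The deck gives the trimming only at the idea level; the Python sketch below shows one plausible way to realize it as a connectivity mask in which each input is wired to only k neighboring hidden-layer neurons on the circle. The function name and the exact mask construction are assumptions.

```python
import numpy as np

def circular_mask(n_inputs, n_hidden, k=3):
    """Boolean mask: input i connects only to the k (odd) hidden neurons
    nearest to its position on the circle, wrapping around the ends."""
    mask = np.zeros((n_hidden, n_inputs), dtype=bool)
    for i in range(n_inputs):
        center = round(i * n_hidden / n_inputs)  # hidden neuron aligned with input i
        for off in range(-(k // 2), k // 2 + 1):
            mask[(center + off) % n_hidden, i] = True
    return mask

# e.g. five 3D points -> 15 coordinates, each tied to 3 neighboring hidden neurons
print(circular_mask(n_inputs=15, n_hidden=15, k=3).sum(axis=0))  # 3 per input
```

Multiplying the hidden-layer weight matrix elementwise by such a mask removes most connections, which is what yields the reported drop in training time.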

  29. Classification • Clustering = the most important unsupervised learning problem: it deals with finding a structure in a collection of unlabeled data • Clustering = assigning a set of objects into groups whose members are similar in some way and are “dissimilar” to the objects belonging to other groups (clusters) • a (usually iterative) multi-objective optimization problem • Clustering is a main task of explorative data mining and of statistical data analysis, used in machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, etc. • Difficult problem: multi-dimensional spaces, time/data complexity, finding an adequate distance measure, non-unambiguous interpretation of the results, overlapping of the clusters, etc.

  30. The Training and Clustering Algorithms • Goal: to further increase the speed of the training of the ANNs used for classification • Idea: during the learning phase, instead of directly using the training data, the data are clustered and the ANNs are trained using the centers of the obtained clusters. Notation: u – input, u’ – centers of the appointed clusters, y – output of the model, d – desired output, c – value determined by the criteria function

  31. The Algorithm of the Clustering Step (modified K-means alg.)
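The flowchart on this slide is not reproduced in the transcript. The sketch below is a hedged reconstruction of one plausible reading of the modified K-means step, consistent with the per-experiment “clustering distance” parameter used later: a sample joins the nearest cluster if that center lies within distance d, otherwise it founds a new cluster.

```python
import numpy as np

def cluster_step(samples, d):
    """Distance-threshold clustering pass (assumed variant of K-means):
    each sample joins the nearest center within distance d, updating that
    center as the running mean of its members, or founds a new cluster."""
    centers, counts = [], []
    for x in samples:
        if centers:
            dist = np.linalg.norm(np.asarray(centers) - x, axis=1)
            j = int(dist.argmin())
            if dist[j] <= d:
                counts[j] += 1
                centers[j] = centers[j] + (x - centers[j]) / counts[j]
                continue
        centers.append(np.asarray(x, dtype=float))  # found a new cluster
        counts.append(1)
    return np.asarray(centers)
```

The ANN is then trained on the returned centers rather than on the raw samples, which is what reduces the training time in the experiments that follow.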

  32. The ANNs • Feedforward MLP, BP algorithm • Number of neurons: 2-10-2 • learning rate: 0.8 • momentum factor: 0.1 • Teaching set: 500 samples, randomly chosen from the clusters • Test set: 1000 samples, separately generated
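For concreteness, here is a minimal numpy sketch of the described network; the 2-10-2 layer sizes, the learning rate of 0.8, and the momentum factor of 0.1 come from the slide, while the sigmoid activations, the initialization, and the per-sample update are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(10, 2)); b1 = np.zeros(10)   # 2 -> 10
W2 = rng.normal(scale=0.5, size=(2, 10)); b2 = np.zeros(2)    # 10 -> 2
vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)              # momentum terms
vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)
lr, mom = 0.8, 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, d):
    """One backpropagation step on a single (input, target) pair."""
    global W1, b1, W2, b2, vW1, vb1, vW2, vb2
    h = sigmoid(W1 @ x + b1)                 # hidden layer (10 neurons)
    y = sigmoid(W2 @ h + b2)                 # output layer (2 neurons)
    delta2 = (y - d) * y * (1 - y)           # backpropagated quadratic error
    delta1 = (W2.T @ delta2) * h * (1 - h)
    vW2 = mom * vW2 - lr * np.outer(delta2, h); W2 += vW2      # momentum updates
    vb2 = mom * vb2 - lr * delta2;              b2 += vb2
    vW1 = mom * vW1 - lr * np.outer(delta1, x); W1 += vW1
    vb1 = mom * vb1 - lr * delta1;              b1 += vb1
    return float(((y - d) ** 2).sum())

# one epoch over the 500-sample teaching set would call train_step(x, d)
# for each sample drawn from the cluster centers
```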

  33. Examples: Problem #1 • Easily solvable problem • 4 classes, no overlapping

  34. The Resulting Clusters and Required Training Time in the First Experiment, with Clustering Distances A: 0.05, B: 0.1, and C: 0.25

  35. Comparison between the Results of the Training using the Clustered and the Cropped Datasets of the 1st Experiment

  36. Examples: Problem #2 Moderately hard problem 4 classes, slight overlapping

  37. The Resulting Clusters and Required Training Time in the Second Experiment with Clustering Distances A: 0.05, B: 0.1, and C: 0.25

  38. Comparison between the Results of the Training using the Clustered and Cropped Datasets of the 2nd Experiment

  39. Comparison of the Accuracy and Training Time Results of the Clustered and Cropped Cases of the 2nd Experiment

  40. Examples: Problem #3 Hard problem 4 classes, significant overlapping

  41. The Resulting Clusters and Required Training Time in the Third Experiment with Clustering Distances A: 0.05, B: 0.1, and C: 0.2

  42. Comparison between the Results of the Training using the Clustered and Cropped Datasets of the 3rd Experiment

  43. Comparison of the Accuracy Results of the Clustered and Cropped Cases of the 3rd Experiment

  44. Examples: Problem #4 • Easy problem • 4 classes, no overlapping. [Figures: the original dataset and the trained network’s classifying ability for d = 0.2, 0.1, and 0.05]

  45. Accuracy/training time

  46. Examples: Problem #5 • Moderately complex problem • 3 classes, with some overlapping • The network could not learn the original training data with the same options. [Figures: the original dataset and results for d = 0.2, 0.1, and 0.05]
