Neural Networks, Fuzzy Logic, and Statistical Methods


  1. Neural Networks, Fuzzy Logic, and Statistical Methods CITS4404 AI and Adaptive Systems

  2. Neural Networks (NNs)
     Reading:
     • S. Russell and P. Norvig, Section 20.5, Artificial Intelligence: A Modern Approach, Prentice Hall, 2002.
     • G. McNeil and D. Anderson, “Artificial Neural Networks Technology”, The Data & Analysis Center for Software Technical Report, 1992.

  3. The Nature-Inspired Metaphor
     • Inspired by the brain:
       • neurons are structurally simple cells that aggregate and disseminate electrical signals
       • computational power and intelligence emerge from the vast interconnected network of neurons
     • NNs act mainly as:
       • function approximators
       • pattern recognisers
     • They learn from observed data
     Diagrams taken from a report on neural networks by C. Stergiou and D. Siganos
     An Overview of Core CI Technologies

  4. The Neuron Model
     [Figure: a single neuron, labelled with input links, bias weight, input function, activation function, output, and output links]
     • A neuron combines its input values via its input function (typically a weighted sum) and passes the result through its activation function
     • The bias determines the threshold needed for a “positive” response
     • Single-layer neural networks (perceptrons) can represent only linearly-separable functions
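
The neuron model above can be sketched in a few lines of Python. This is a minimal illustration, assuming a weighted-sum input function and a step activation; the function names and the AND weights are hypothetical choices, not taken from the slides.

```python
import math


def step(x):
    """Threshold activation: a "positive" response (1) when x is non-negative."""
    return 1 if x >= 0 else 0


def sigmoid(x):
    """A smooth alternative to the step function, with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias, activation=step):
    """Input function (weighted sum plus bias) followed by the activation function."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return activation(total)


# Illustrative weights (an assumption): with a bias of -1.5 the unit responds
# only when both inputs are 1, i.e. it represents the linearly-separable AND.
AND_WEIGHTS, AND_BIAS = [1.0, 1.0], -1.5

for a in (0, 1):
    for b in (0, 1):
        print(a, b, neuron([a, b], AND_WEIGHTS, AND_BIAS))
```

Raising the bias to -0.5 with the same weights turns the unit into OR, which shows how the bias sets the threshold for a positive response.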

  5. Multi-Layered Neural Networks
     • A network is formed by the connections (links) of many nodes
       • inputs map to outputs through one or more hidden layers
     • Link-weights control the behaviour of the function represented by the NN
       • adjusting the weights changes the encoded function

  6. Multi-Layered Neural Networks
     • Hidden layers increase the “power” of the NN at the cost of extra complexity and training time:
       • perceptrons capture only linearly-separable functions
       • an NN with a single (sufficiently large) hidden layer can represent any continuous function with arbitrary accuracy
       • with two hidden layers, even discontinuous functions can be represented
     • There are two main types of multi-layered NNs:
       • feed-forward: simple acyclic structure; the stateless encoding can represent only functions of its current input
       • recurrent: cyclic feedback loops are allowed; the stateful encoding supports short-term memory
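
To make the role of a hidden layer concrete, here is a small feed-forward sketch of XOR, a function that is not linearly separable and so is out of reach for a single perceptron. The weights are hand-picked for illustration (an assumption, not a trained result), and the helper names are hypothetical.

```python
def step(x):
    """Threshold activation."""
    return 1 if x >= 0 else 0


def layer(inputs, weights, biases):
    """One fully connected layer of threshold units (one weight row per unit)."""
    return [
        step(bias + sum(w * x for w, x in zip(row, inputs)))
        for row, bias in zip(weights, biases)
    ]


def xor_net(a, b):
    """Feed-forward network: 2 inputs -> 2 hidden units -> 1 output."""
    # Hidden unit 0 computes OR, hidden unit 1 computes AND.
    hidden = layer([a, b], weights=[[1, 1], [1, 1]], biases=[-0.5, -1.5])
    # The output unit fires for "OR and not AND", which is XOR.
    return layer(hidden, weights=[[1, -2]], biases=[-0.5])[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

Because the structure is acyclic, each call depends only on the current input; a recurrent network would additionally feed earlier activations back in.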

  7. Training Neural Networks
     • Training means adjusting link-weights to minimise some measure of error (the cost function)
       • i.e. learning is an optimisation search in weight-space
     • Any search algorithm can be used, most commonly gradient descent (back propagation)
     • Common learning paradigms:
       • supervised learning: training is by comparison with known input/output examples (a training set)
       • unsupervised learning: no a priori training set is provided; the system discovers patterns in the input
       • reinforcement learning: training uses environmental feedback to assess the quality of actions
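
As a rough illustration of training as a search in weight-space, the sketch below runs supervised gradient descent on a single sigmoid neuron. It assumes a cross-entropy cost, zero initial weights, and a toy OR training set; all of these are illustrative choices. With only one neuron there are no hidden layers to propagate errors through, so this is the simplest special case of back propagation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, inputs):
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))


def cost(weights, bias, examples):
    """Cross-entropy error summed over the training set (the cost function)."""
    total = 0.0
    for inputs, target in examples:
        p = predict(weights, bias, inputs)
        total -= target * math.log(p) + (1 - target) * math.log(1 - p)
    return total


def train(examples, learning_rate=0.5, epochs=2000):
    """Batch gradient descent: step the weights against the cost gradient."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for inputs, target in examples:
            # Derivative of the cost with respect to the neuron's weighted sum.
            error = predict(weights, bias, inputs) - target
            for i, x in enumerate(inputs):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Hypothetical training set: known input/output examples for OR.
OR_EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(OR_EXAMPLES)
for inputs, target in OR_EXAMPLES:
    print(inputs, target, round(predict(weights, bias, inputs), 3))
```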

  8. Neuro-Evolution and Reinforcement Learning
     • Neuro-evolution uses a Neural Network to describe the phenotype of a solution; the genome encodes the weights on the edges (or even the topology of the network)
     • Methods such as particle swarm optimisation (PSO) or evolutionary algorithms (EAs) are then used to optimise the network weights, guided by feedback on the network's performance
     • These techniques are particularly useful for reinforcement learning, where fitness is easy to calculate but input-output pairs are hard to generate
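
One way to picture neuro-evolution is a (1+1) evolutionary algorithm over a fixed-topology network, sketched below. The genome layout, the Gaussian mutation, and the fitness function are all assumptions made for illustration; in a reinforcement-learning setting the fitness would come from environmental feedback, and the XOR score used here is only a stand-in that is easy to compute.

```python
import math
import random


def sigmoid(x):
    # Equivalent to 1 / (1 + exp(-x)), written with tanh to avoid overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def network(genome, a, b):
    """Phenotype: a 2-2-1 network whose nine weights are read from the genome."""
    h0 = sigmoid(genome[0] * a + genome[1] * b + genome[2])
    h1 = sigmoid(genome[3] * a + genome[4] * b + genome[5])
    return sigmoid(genome[6] * h0 + genome[7] * h1 + genome[8])


XOR_CASES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def fitness(genome):
    """Higher is better: the negated squared error over the stand-in task."""
    return -sum((network(genome, a, b) - t) ** 2 for (a, b), t in XOR_CASES)


def evolve(generations=2000, sigma=0.5, seed=0):
    """(1+1) EA: mutate the parent and keep the child if it is no worse."""
    rng = random.Random(seed)
    parent = [rng.gauss(0.0, 1.0) for _ in range(9)]
    history = [fitness(parent)]
    for _ in range(generations):
        child = [w + rng.gauss(0.0, sigma) for w in parent]
        if fitness(child) >= fitness(parent):
            parent = child
        history.append(fitness(parent))
    return parent, history


best, history = evolve()
print("initial fitness:", history[0])
print("final fitness:", history[-1])
```

No gradient is needed: the search uses only the fitness value, which is why this style of optimisation suits problems where target outputs are unavailable.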

  9. Neural Nets in Unsupervised Learning
     • Neural Networks can also be used for unsupervised learning
     • A smaller hidden layer sits between large input and output layers, and the error is calculated as the difference between the input layer and the output layer, so the network learns to reconstruct its own input
     • The distance between two elements is then the distance between their hidden-layer activations
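
The two quantities described above, reconstruction error and hidden-layer distance, can be sketched as follows. This assumes linear units and hand-picked encoder and decoder weights (4 inputs, 2 hidden units, 4 outputs); a real network would learn these weights by minimising the reconstruction error.

```python
import math


def matvec(matrix, vector):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Hypothetical weights: the hidden layer averages each pair of inputs,
# and the decoder copies each hidden value back to both positions.
ENCODER = [[0.5, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.5, 0.5]]
DECODER = [[1.0, 0.0],
           [1.0, 0.0],
           [0.0, 1.0],
           [0.0, 1.0]]


def encode(x):
    """Input layer -> smaller hidden layer."""
    return matvec(ENCODER, x)


def decode(h):
    """Hidden layer -> output layer (same size as the input)."""
    return matvec(DECODER, h)


def reconstruction_error(x):
    """Squared difference between the input layer and the output layer."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, decode(encode(x))))


def hidden_distance(x1, x2):
    """Distance between two elements, measured between their hidden layers."""
    return math.dist(encode(x1), encode(x2))


print(reconstruction_error([1, 1, 0, 0]))
print(reconstruction_error([1, 0, 0, 0]))
print(hidden_distance([1, 1, 0, 0], [0, 0, 1, 1]))
```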

  10. Fuzzy Systems
      Reading:
      • Lotfi Zadeh, “Fuzzy logic”, IEEE Computer, 21(4) (1988): 83-93.
      • G. Gerla, “Fuzzy logic programming and fuzzy control”, Studia Logica, 79 (2005): 231-254.
      • Jan Jantzen, “Design of fuzzy controllers”, Technical Report.

  11. Fuzzy systems
      • Fuzzy logic facilitates the definition of control systems that can make good decisions from noisy, imprecise, or partial information
      • There are two key concepts:
        • Graduation: everything is a matter of degree, e.g. it can be “not cold”, or “a bit cold”, or “a lot cold”, or …
        • Granulation: everything is “clumped”, e.g. age is young, middle-aged, or old
      [Figure: membership curves for young, middle-aged, and old, plotted against age with membership running from 0 to 1]

  12. Fuzzy Logic
      • The syntax of Fuzzy logic typically includes:
        • propositions (“It is raining”, “CITS4404 is difficult”, etc.)
        • Boolean connectives (and, not, etc.)
      • The semantics of Fuzzy logic differs from propositional logic; rather than assigning a True/False value to a proposition, we assign a degree of truth between 0 and 1, e.g. v(“CITS4404 is difficult”) = 0.8
      • Typical interpretations of the operators and and not are:
        • v(not p) = 1 - v(p)
        • v(p and q) = min{v(p), v(q)} (Gödel-Dummett norm)
      • Different semantics may be given by varying the interpretation of and (the T-norm). Anything commutative, associative, monotonic, continuous, and with 1 as an identity can be a T-norm. Other common T-norms are:
        • v(p and q) = v(p) * v(q) (product norm)
        • v(p and q) = max{v(p) + v(q) - 1, 0} (Łukasiewicz norm)
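
The operators above translate directly into code. The following is a minimal sketch with hypothetical function names; truth values are plain floats in [0, 1], and the second truth value (0.5) is an invented example.

```python
def fuzzy_not(p):
    """v(not p) = 1 - v(p)"""
    return 1.0 - p


def godel_and(p, q):
    """Gödel-Dummett norm: v(p and q) = min{v(p), v(q)}"""
    return min(p, q)


def product_and(p, q):
    """Product norm: v(p and q) = v(p) * v(q)"""
    return p * q


def lukasiewicz_and(p, q):
    """Łukasiewicz norm: v(p and q) = max{v(p) + v(q) - 1, 0}"""
    return max(p + q - 1.0, 0.0)


difficult = 0.8  # v("CITS4404 is difficult"), as on the slide
raining = 0.5    # v("It is raining"), an assumed value for illustration

for t_norm in (godel_and, product_and, lukasiewicz_and):
    print(t_norm.__name__, t_norm(difficult, raining))
```

All three agree with classical conjunction when the truth values are exactly 0 or 1; they differ only in how they treat intermediate degrees.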

  13. Vagueness and Uncertainty
      • The product norm captures our understanding of probability or uncertainty with a strong independence assumption
        • prob(Rain and Wind) = prob(Rain) * prob(Wind)
      • The Gödel-Dummett norm is a fair representation of vagueness:
        • if it’s a bit windy and very rainy, it’s a bit windy and rainy
      • Fuzzy logic provides a unifying logical framework for all CI techniques, as CI techniques are inherently vague
        • whether or not it is actually implemented is another question

  14. Fuzzy Controllers
      • A fuzzy control system is a collection of rules
        • IF X [AND Y] THEN Z
        • e.g. IF cold AND ¬warming-up THEN increase heating slightly
      • Such rules are usually derived empirically from experience, rather than from the system itself
        • attempt to mimic human-style logic
        • granulation means that the exact values of any constants (e.g. where does cold start/end?) are less important
      • The fuzzy rules typically take observations, and according to these observations’ membership of fuzzy sets, we get a fuzzy action
      • The fuzzy action then needs to be defuzzified to become a precise output
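
The pipeline described above (observations, fuzzy-set membership, rule firing, defuzzification) might look like the sketch below. Every number in it is an assumption: the membership breakpoints in degrees, the three rules, and the weighted-average defuzzifier are illustrative choices, not the design from the slides or the readings.

```python
def ramp_down(x, lo, hi):
    """Membership 1 at or below lo, falling linearly to 0 at hi."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def ramp_up(x, lo, hi):
    """Membership 0 at or below lo, rising linearly to 1 at hi."""
    return 1.0 - ramp_down(x, lo, hi)


def triangle(x, lo, peak, hi):
    """Membership rising from lo to peak, then falling to hi."""
    return min(ramp_up(x, lo, peak), ramp_down(x, peak, hi))


def control(temperature, rate):
    """Return a crisp heating adjustment in [-1, 1] (positive means heat)."""
    # Fuzzification: degrees of membership of each observation (assumed sets).
    cold = ramp_down(temperature, 15.0, 20.0)
    right = triangle(temperature, 15.0, 20.0, 25.0)
    hot = ramp_up(temperature, 20.0, 25.0)
    warming_up = ramp_up(rate, 0.0, 1.0)

    # Rules as (firing strength, action) pairs; AND is min, NOT is 1 - v.
    rules = [
        (min(cold, 1.0 - warming_up), +1.0),  # IF cold AND NOT warming-up THEN heat
        (right, 0.0),                         # IF right THEN no change
        (hot, -1.0),                          # IF hot THEN cool
    ]

    # Defuzzification: average of the actions, weighted by firing strength.
    total = sum(strength for strength, _ in rules)
    if total == 0.0:
        return 0.0  # no rule fires, so leave the heating unchanged
    return sum(strength * action for strength, action in rules) / total


print(control(10.0, 0.0))
print(control(17.5, 0.0))
print(control(30.0, 0.0))
```

Shifting a breakpoint by a degree or two changes the output only gradually, which is the sense in which granulation makes the exact constants less important.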

  15. Fuzzy Control
      • Applying Fuzzy Rules

                                   temperature
        d(temperature)/dt    Cold         Right        Hot
        -ve                  heat         heat         no change
        zero                 heat         no change    cool
        +ve                  no change    cool         cool

      Image from http://www.faqs.org/docs/fuzzy/

  16. Statistical Methods
      Reading:
      • S. Russell and P. Norvig, Section 20.1, Artificial Intelligence: A Modern Approach, Prentice Hall, 2002.
      • R. Barros, M. Basgalupp, A. de Carvalho, A. Freitas, “A Survey of Evolutionary Algorithms for Decision Tree Induction”, IEEE Transactions on Systems, Man and Cybernetics.

  17. Naïve Bayes Classifiers
      • Naïve Bayes Classifiers use a strong independence assumption when trying to determine the class of an entity, given observations of that entity
      • Bayes’ Rule: P(class | observations) = P(observations | class) * P(class) / P(observations)
      • Probabilities are easy to maintain from observations, and calculations are cheap
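
A count-based sketch shows why the probabilities are easy to maintain: training only increments counters, and classification multiplies a handful of ratios. Under the independence assumption, P(observations | class) is taken to be the product of the per-observation probabilities. The toy weather data and function names are invented for illustration, and no smoothing is applied, so an unseen value gives that class a score of zero.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count classes and, per class, the values seen at each observation position."""
    class_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)  # (label, position) -> Counter of values
    for observations, label in examples:
        for position, value in enumerate(observations):
            value_counts[(label, position)][value] += 1
    return class_counts, value_counts


def posterior(model, observations):
    """Return P(class | observations) for every class, via Bayes' rule."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for position, value in enumerate(observations):
            # Independence assumption: multiply in P(value | class).
            score *= value_counts[(label, position)][value] / count
        scores[label] = score
    evidence = sum(scores.values())  # P(observations)
    if evidence == 0.0:
        return {label: count / total for label, count in class_counts.items()}
    return {label: score / evidence for label, score in scores.items()}


# Hypothetical observations: (outlook, wind) -> class.
EXAMPLES = [
    (("sunny", "calm"), "play"), (("sunny", "calm"), "play"),
    (("sunny", "windy"), "play"), (("rainy", "calm"), "play"),
    (("rainy", "windy"), "stay"), (("rainy", "windy"), "stay"),
    (("rainy", "calm"), "stay"), (("sunny", "windy"), "stay"),
]

model = train(EXAMPLES)
print(posterior(model, ("sunny", "calm")))
```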

  18. Decision Tree Analysis
      • Decision trees are used for classification problems, where leaves represent classes and branches represent tests on the features that lead to those classes
      • Decision trees are easy to use and quite powerful
      • There are many statistical methods to build decision trees from observations
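
As one example of building a tree from observations, the sketch below greedily splits on the feature that leaves the least entropy in the class labels (an information-gain criterion). This is only one of the many possible methods; the data, the feature names, and the tree representation (a tuple for an internal node, a class label for a leaf) are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def remaining_entropy(rows, labels, feature):
    """Weighted entropy of the labels after splitting on one feature."""
    total = 0.0
    for value in set(row[feature] for row in rows):
        subset = [l for row, l in zip(rows, labels) if row[feature] == value]
        total += len(subset) / len(rows) * entropy(subset)
    return total


def build(rows, labels, features):
    """Return a leaf (class label) or a node (feature, {value: subtree})."""
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    best = min(features, key=lambda f: remaining_entropy(rows, labels, f))
    branches = {}
    for value in set(row[best] for row in rows):
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = build(
            [rows[i] for i in keep],
            [labels[i] for i in keep],
            [f for f in features if f != best],
        )
    return (best, branches)


def classify(tree, row):
    """Follow branches until a leaf is reached (unseen values raise KeyError)."""
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[row[feature]]
    return tree


# Hypothetical observations.
ROWS = [
    {"outlook": "sunny", "windy": "no"}, {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"}, {"outlook": "rainy", "windy": "yes"},
    {"outlook": "sunny", "windy": "yes"},
]
LABELS = ["play", "play", "play", "stay", "play"]

tree = build(ROWS, LABELS, ["outlook", "windy"])
print(tree)
print(classify(tree, {"outlook": "rainy", "windy": "yes"}))
```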
