INFSY540 Information Resources in Management Lesson 10 Chapter 10 Artificial Neural Networks and Genetic Algorithms
Learning from Observations • Learning can be viewed as trying to determine the representation of a function. • Examples of input/output pairs with two points and then with three points. • Ockham's razor: the most likely hypothesis is the simplest one that is consistent with all observations.
Cognitive vs Biological AI • Cognitive-based Artificial Intelligence • Top Down approach • Attempts to model psychological processes • Concentrates on what the brain gets done • Expert System approach • Biological-based Artificial Intelligence • Bottom Up approach • Attempts to model biological processes • Concentrates on how the brain works • Artificial Neural Network approach
Introduction to Neural Networks • As a biological model, Neural Nets seek to emulate how the human brain works. • How does the brain work? • The human receives input from independent nerves. • The brain receives these independent signals and interprets them based on past experiences. • Much brain reasoning is based on pattern recognition. • Patterns of impulses from the skin identify simple sensations such as pain or pressure • The brain decides how to react to these impulses and sends output signals to the muscles and other organs.
What are ANNs? • Rough Definition: • an adaptive information processing system designed to mimic the brain’s vast web of massively interconnected neurons. • Attributes: • system of highly interconnected processors, each operating independently and in parallel • trained (not programmed) for an application • learns by example • processing ability is stored in connection weights which are obtained by a process of adaptation or learning
A Model of an Artificial Neuron • (Figure: a single node (neuron) receives inputs X1, X2, ..., Xn on its dendrites and produces an output on its axon.) • Inputs are stimulation levels; the output is the response of the neuron.
A Model of an Artificial Neuron • Weights W1, W2, ..., Wn on the synapses represent synaptic strength (local memory stores previous computations and modifies the weights). • The single node computes the sum of its weighted inputs: S = W1X1 + W2X2 + ... + WnXn = ΣWiXi
A Model of an Artificial Neuron • A transfer function ƒ determines the output by comparing S to a threshold: Output = ƒ(S). • Common transfer functions are the step function and the sigmoid function.
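The neuron model in these three slides can be sketched directly in code. This is a minimal illustration (the input and weight values are made up for the example, not from the slides):

```python
import math

def step(s, threshold=0.0):
    # Step transfer function: fire (1) if the weighted sum reaches the threshold.
    return 1.0 if s >= threshold else 0.0

def sigmoid(s):
    # Smooth sigmoid transfer function; output lies in (0, 1).
    return 1.0 / (1.0 + math.exp(-s))

def neuron(inputs, weights, f=step):
    # S = W1*X1 + W2*X2 + ... + Wn*Xn, then the transfer function determines the output.
    s = sum(w * x for w, x in zip(weights, inputs))
    return f(s)

print(neuron([1.0, 0.5], [0.6, -0.2]))                     # step: S = 0.5 >= 0, so 1.0
print(round(neuron([1.0, 0.5], [0.6, -0.2], sigmoid), 3))  # sigmoid(0.5) ≈ 0.622
```

Swapping the transfer function is the only change needed to move from a hard threshold to the smooth sigmoid that backpropagation requires.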
Connection (Artificial Synapse) • Outputs continue to spread the signal: inj = wij · outi • Types of connections • Excitatory: positive weight • Inhibitory: negative weight • Lateral: within the same layer • Self: connection from a neuron back to itself • (Figure: a neuron's axon fans out through weighted synapses to the dendrites of downstream neurons, including lateral and self connections.)
Feedforward ANN • This is one example of how the nodes in a network can be connected. It is typically used with “backpropagation”. • Another example is for every node to be connected to every other node.
How ANNs Work • First, data must be obtained. • Second, the network architecture and training mechanism must be chosen. • Third, the network must be trained. • Fourth, the network must be tested.
How ANNs Work • Network Training: • Begin with a random set of weights. • The net is presented with a series of inputs and corresponding outputs (one input/output pair at a time). The net calculates its own solution and compares it to the correct one. • The net then adjusts the weights to reduce the error. • Training continues until the net is good enough or time runs out. • Network Testing (i.e. Validation): • The net is tested with cases not included in the training set. • The net's output and the desired output are compared. • If enough test cases are incorrect, the net must be retrained and retested.
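The training loop above can be sketched for a single step-function neuron. This is an illustrative perceptron-style update rule, not the specific algorithm covered in the lecture, and the logical-AND training set is a toy example:

```python
import random

def train(cases, epochs=100, rate=0.1, seed=0):
    """Train one step-function neuron on (inputs, target) pairs."""
    rng = random.Random(seed)
    n = len(cases[0][0])
    # 1. Begin with a random set of weights (w[0] is a bias weight).
    w = [rng.uniform(-0.5, 0.5) for _ in range(n + 1)]
    for _ in range(epochs):
        for x, target in cases:           # 2. one input/output pair at a time
            s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
            out = 1.0 if s >= 0 else 0.0  # the net's own solution
            err = target - out            # compared to the correct one
            # 3. adjust the weights to reduce error
            w[0] += rate * err
            for i, xi in enumerate(x):
                w[i + 1] += rate * err * xi
    return w

# Toy training set: logical AND.
cases = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train(cases)
for x, t in cases:
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    print(x, 1.0 if s >= 0 else 0.0)  # matches the targets after training
```

Testing on held-out cases (step four in the slides) would use the same forward pass on inputs the net never saw during training.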
Classification System • (Figure: the environment is observed by a sensing system, which produces feature values X1, X2, ...; a neural network maps them to a labeled pattern, e.g. class 1-4.) • Sensing System: imaging system, spectrometer, sensor array, etc. • Measurements (Features): wavelength, color, voltage, temperature, pressure, intensity, shape, etc.
Example: Tree Classifier • Two features (inputs): needle length and cone length. Sensor: a ruler. • Four classes (outputs): Black Spruce (BS), Western Hemlock (WH), Western Larch (WL), White Spruce (WS).
Tree Classifier: Data

Cone    Needle   Tree              BS  WH  WL  WS
25 mm   11 mm    Black Spruce      1   0   0   0
26 mm   11 mm    Black Spruce      1   0   0   0
26 mm   10 mm    Black Spruce      1   0   0   0
24 mm    9 mm    Black Spruce      1   0   0   0
20 mm   13 mm    Western Hemlock   0   1   0   0
21 mm   14 mm    Western Hemlock   0   1   0   0
19 mm    8 mm    Western Hemlock   0   1   0   0
21 mm   20 mm    Western Hemlock   0   1   0   0
28 mm   30 mm    Western Larch     0   0   1   0
37 mm   31 mm    Western Larch     0   0   1   0
33 mm   33 mm    Western Larch     0   0   1   0
32 mm   28 mm    Western Larch     0   0   1   0
51 mm   19 mm    White Spruce      0   0   0   1
50 mm   20 mm    White Spruce      0   0   0   1
52 mm   20 mm    White Spruce      0   0   0   1
51 mm   21 mm    White Spruce      0   0   0   1
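A row of this table becomes a feature vector plus a one-hot target, exactly as the four output columns suggest. A minimal encoding sketch (the function name is my own, for illustration):

```python
# The four output classes, in the table's column order.
CLASSES = ["Black Spruce", "Western Hemlock", "Western Larch", "White Spruce"]

def encode(cone_mm, needle_mm, tree):
    # Features: the two measurements; target: a one-hot vector over the 4 classes.
    features = [cone_mm, needle_mm]
    target = [1 if tree == c else 0 for c in CLASSES]
    return features, target

x, y = encode(25, 11, "Black Spruce")
print(x, y)  # [25, 11] [1, 0, 0, 0]
```

Each training presentation then pairs one feature vector with its one-hot target, matching the "one pair of inputs/outputs at a time" scheme described earlier.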
Tree Classifier: Training Process • (Figure: decision regions for the four classes in the cone-length vs. needle-length plane at four stages of training.) • Iterations = 0, MSE = 0.754 • Iterations = 1000, MSE = 0.235 • Iterations = 2000, MSE = 0.046 • Iterations = 3000, MSE = 0.009
Tree Classifier: Results • (Figure: the trained network. Cone length and needle length feed a layer of ƒ nodes whose four outputs correspond to Black Spruce, Western Hemlock, Western Larch, and White Spruce.)
Modeling Example • Function Approximation: On the interval [0, 1], f(x) = 0.02(12 + 3x − 3.5x² + 7.2x³)(1 + cos 4πx)(1 + 0.8 sin 3πx) • Data (several hundred points; the first few shown):

x     f(x)
0.0   0.480
0.1   0.529
0.2   0.084
0.3   0.061
0.4   0.181
0.5   0.108
0.6   0.195
0.7   0.071
0.8   0.107
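Assuming the trigonometric terms are cos 4πx and 0.8 sin 3πx (the π factors appear to have been lost in transcription), a direct evaluation reproduces the data table:

```python
import math

def f(x):
    # f(x) = 0.02(12 + 3x - 3.5x^2 + 7.2x^3)(1 + cos 4*pi*x)(1 + 0.8 sin 3*pi*x)
    poly = 12 + 3 * x - 3.5 * x ** 2 + 7.2 * x ** 3
    return (0.02 * poly
            * (1 + math.cos(4 * math.pi * x))
            * (1 + 0.8 * math.sin(3 * math.pi * x)))

for x in [0.0, 0.1, 0.2, 0.3, 0.5]:
    print(x, round(f(x), 3))  # 0.48, 0.529, 0.084, 0.061, 0.108
```

Sampling this function densely at many x values is how the "several hundred points" of training data would be generated.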
Modeling: Results Error RMS = 0.0117 MSE = 0.000137
Character Recognition Example • (Figure: a network with input nodes, hidden nodes, and output nodes; the three output nodes read 0, 1, 0, identifying the character.) • The green circles of the input nodes represent 1, the dark circles 0. • The green boxes of the output nodes represent 1, the white boxes 0.
Ways to Categorize ANNs • Architecture of Nodes and Arcs (i.e. How are nodes connected) • There are many different architectures • We will show the Feedforward and Recurrent • General Training Schemes • Supervised or Unsupervised (Will discuss later) • Specific Training Approaches • Many different types (Will discuss later)
Some Tasks Performed by ANNs • Prediction/Forecasting • Recognize Trends in Time Series Data • Decision • Recognize Key Components in a Given Situation • Classification • Recognize Objects and Assign to Appropriate Classes • Modeling • Recognize Similar Conditions to those in the Model
Neural Net Application: Diagnosis • Breast Cancer Diagnosis • Developed from Neural Nets trained to identify tanks • Headache Diagnostic System • There are over 130 different types of headaches (believe it or not), and each has separate causes or combinations of causes (dietary, environmental, etc.). • A neural net can help classify the headache based on the location, severity, and type (constant, throbbing, ...) of pain present.
Neural Net Applications: Diagnosis & Repair • Shock Absorber Testing • Determining which particular portion of a shock absorber is going to fail is a difficult task. • As with the TED* expert system, neural networks can be used to identify faults in mechanical equipment. Some researchers are examining neural network systems that analyze shock response patterns (force applied vs. displacement of the shock cylinder). * Work is being done on developing a neural network to improve the Turbine Engine Diagnostic expert system
Strengths of Neural Nets • Generally efficient, even for complex problems. • Remarkably consistent, given a good set of training cases. • Adaptability • Parallelism
Why Use Neural Networks? • Mature field -- widely accepted • Consistent • Efficient • Use existing historical data to make decisions
Limitations of Neural Nets • Amount of training data needed: • Training cases must be plentiful. • Training cases should be consistent. • Training cases must be sufficiently diverse. • Outcomes must be known in advance (for supervised training). • Scaling up the net is difficult given new outcomes: • No satisfactory mathematical model exists for this process -- yet. • The net must be retrained from scratch if the set of desired outcomes changes.
Some Good ANN References • A Practical Guide to Neural Nets, W. Illingsworth & M. Nelson, 1991 (A Very Easy Read!!) • Artificial Neural Systems, J. Zurada, 1992 Some ANN WWW Sites • Pacific Northwest Laboratory http://www.emsl.pnl.gov:2080/docs/cie/neural/ • Applets for Neural Networks and Artificial Life http://www.aist.go.jp/NIBH/~b0616/Lab/Links.html#BL • Function Approximation Applet http://neuron.eng.wayne.edu/bpFunctionApprox/bpFunctionApprox.html • Web Applets for Interactive Tutorials on ANNs http://home.cc.umanitoba.ca/~umcorbe9/anns.html#Applets
More ANN References • Function Approximation Using Neural Networks neuron.eng.wayne.edu/bpFunctionApprox/bpFunctionApprox.htm • Artificial Neural Networks Tutorial www.fee.vutbr.cz/UIVT/research/neurnet/bookmarks.html.iso-8859-1 • MINI-TUTORIAL ON ARTIFICIAL NEURAL NETWORKS http://www.imagination-engines.com/anntut.htm • Artificial Neural Networks Lab on the Web. www.dcs.napier.ac.uk/coil/rec_resources/Software_and_demos25.html • Software Examples. The Html Neural Net Consulter. nastol.astro.lu.se/~henrik/neuralnet1.html • MICI Neural Network Tutorials and Demos www.glue.umd.edu/~jbr/NeuralTut/tutor.html • Sites using neural network applets. http://www.aist.go.jp/NIBH/~b0616/Lab/Links.html
ANN Questions These are questions that only a technologist would need to know. Managers would not generally need to know the answers to these questions. • What will the inputs be? • What will the outputs be? • Will signals be discrete or continuous? • What if the inputs aren’t numeric? • How should you organize the network? • How many hidden layers should there be? • How many nodes per hidden layer? • Should weights be fixed, or is there any need to adapt as circumstances change? • Should you have a hardware or software ANN solution? (i.e. do you need a neural net chip?)
Questions about artificial neural networks? Did we cover the math of backpropagation?
Genetic Algorithms • Optimization and Search are difficult problems: • Domains are complex • They require heavy computation • Getting the best solution is nearly impossible • Ops Research has developed techniques for them • e.g. linear programming, goal programming • The AI community has developed alternative techniques
What is an Optimization Problem? • To optimize is to “make the most effective use of”, according to Webster’s Dictionary. • Optimization can mean: • Maximize effective use of resources • Minimize costs • Minimize risks • Maximize crop yield • Minimize casualties
Typical Optimization Problems • The VP of Operations wants to visit all company sites while minimizing transportation costs. • Find a series of moves in a chess game that guarantees a victory. • Find a maximum value for the function f(x, y) = 0.5 + (sin²(√(x² + y²)) − 0.5) / (1 + 0.001(x² + y²))² subject to −1 < x < 1, −1 < y < 1
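The third problem uses a form of Schaffer's F6 benchmark function, a common GA test case. A brute-force grid-search sketch (the grid resolution is my own illustrative choice) shows what exhaustive evaluation looks like, which is exactly the cost a GA tries to avoid:

```python
import math

def f(x, y):
    # f(x, y) = 0.5 + (sin^2(sqrt(x^2 + y^2)) - 0.5) / (1 + 0.001*(x^2 + y^2))^2
    r2 = x * x + y * y
    return 0.5 + (math.sin(math.sqrt(r2)) ** 2 - 0.5) / (1 + 0.001 * r2) ** 2

# Naive grid search over -1 < x, y < 1 at 0.01 resolution: ~40,000 evaluations.
best = max(
    (f(x / 100, y / 100), x / 100, y / 100)
    for x in range(-99, 100)
    for y in range(-99, 100)
)
print(best)  # best value found, with its (x, y) location
```

On this small domain the grid is feasible; as dimensions or resolution grow, the evaluation count explodes combinatorially, which motivates the GA search discussed below.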
Optimization Problems = Search Problems • Types of Search Problems • To find the top of Mount Everest • To find the South Pole • To find the deepest part of the ocean (aka the Mariana Trench) Which is the easiest? How do you know when to stop?
Difference between Prediction and Optimization • Prediction: What is the nutrition content of a McDonald's Happy Meal? • Optimization: What is the most nutritious meal at McDonald's? • Solving optimization problems typically requires solving many iterations of smaller prediction problems.
Problems with Searching • Domains are complex • They require heavy computation • Getting the best solution may be impossible • 10! = 3,628,800 possible combinations • if a computer can perform 1,000,000 evaluations per second • 3.6 seconds • 25! ≈ 15,500,000,000,000,000,000,000,000 • nearly 500 billion years to solve this problem
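The arithmetic above can be checked directly. At the slide's assumed rate of 1,000,000 evaluations per second, 25! evaluations take on the order of 10¹¹ years (the exact figure depends on the seconds-per-year convention used):

```python
import math

RATE = 1_000_000  # evaluations per second (the slide's assumption)

def brute_force_time(n):
    """Seconds needed to enumerate all n! orderings at RATE evaluations/second."""
    return math.factorial(n) / RATE

print(brute_force_time(10))  # 3.6288 seconds for 10! combinations
years = brute_force_time(25) / (3600 * 24 * 365.25)
print(f"{years:.3e} years")  # on the order of 5e11 years for 25! combinations
```

This is why exhaustive search is hopeless for even modest combinatorial problems, and why heuristic methods like genetic algorithms are attractive.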
Sample Optimization Problem • Think of each possible combination of characteristics as a "zebra" in the Serengeti. • The strength of the combination corresponds to how well the zebra evades the lions. Survival of the fittest.
The "Zebra Model" • A gene is a single characteristic of an individual zebra. • Some examples of zebra genes are listed below. • In GA terms, a gene is a parameter in the solution. Genes of Zebra #1. Heart Size #2. Leg Length #3. Forelimb Strength ... #n. Lung Capacity
The “Zebra Model” • The combination of genes is called a chromosome (genome): • The genetic makeup of a zebra • Think of each chromosome as a “potential alternative solution”. Genes of Zebra #1. Heart Size #2. Leg Length #3. Forelimb Strength ... #n. Lung Capacity Chromosome
The “Zebra Model” • The fitness describes how well a zebra evades lions. • In Genetic Algorithms, the Fitness Function is a function that calculates how well a chromosome performs. Genes of Zebra #1. Heart Size #2. Leg Length #3. Forelimb Strength ... #n. Lung Capacity Chromosome Fitness = 37
The "Zebra Model" • A generation describes a herd of zebras. • The GA evaluates a population of chromosomes at once rather than one solution at a time. • (Figure: a herd of 13 chromosomes with fitness values ranging from 30 to 77.)
The "Zebra Model" • In each generation, the weakest zebras are caught by the lions. • (Figure: the same herd, with the lowest-fitness members removed.)
The "Zebra Model" • To make up for lost comrades, the surviving zebras reproduce. Some offspring will be stronger than their parents, others weaker. • (Figure: the replenished herd, with fitness values now ranging from 38 to 83.)
The "Zebra Model" • Occasionally, a child has a mutation: • Usually these mutant children are weaker than their parents and die. • Occasionally these children have some new characteristic that makes them stronger than previous generations. • Mutation allows the GA to search new regions of the search space and examine new types of zebras.
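The zebra model maps directly onto code. The sketch below is illustrative: the population size, crossover scheme, mutation rate, and toy fitness function are my own choices, not values from the lecture.

```python
import random

rng = random.Random(42)
GENES = 5  # e.g. heart size, leg length, ..., lung capacity (per the zebra model)

def fitness(chrom):
    # Toy fitness function: a "zebra" is fitter the closer each gene is to 1.
    return sum(chrom)

def evolve(pop_size=12, generations=40):
    # One chromosome = one candidate solution: a list of gene values in [0, 1].
    pop = [[rng.random() for _ in range(GENES)] for _ in range(pop_size)]
    for _ in range(generations):
        # The weakest zebras are "caught by the lions": keep the fittest half.
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        # Survivors reproduce via single-point crossover between two parents.
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, GENES)
            child = a[:cut] + b[cut:]
            # Occasional mutation lets the GA explore new regions.
            if rng.random() < 0.2:
                child[rng.randrange(GENES)] = rng.random()
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(round(fitness(best), 2))  # best fitness found (the maximum possible is 5.0)
```

Selection, crossover, and mutation are the three operators every GA shares; everything else here (rates, sizes, the fitness function) is problem-specific tuning.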