Adaptive Robotics COM2110 Autumn Semester 2008 Lecturer: Amanda Sharkey


Researchers at Georgia Tech have built a biologically inspired robot to perform the tasks of service dogs • Users issue verbal commands to the robot and indicate an object with a laser pointer • E.g. fetching items, or closing doors or drawers • Worked with dog trainers • Could replicate 10 commands and tasks • Robots could reduce costs and waiting times for service dogs • Companionship?

Assignment (30%) • Due by Monday 17th Nov at 11 am • Write an essay (1500-2500 words) on one of the following topics. You should use the lectures as a starting point, but also research the topic yourself. Plan your answer. Include a reference section, with the references cited in full. 1. Identify the main characteristics of Behaviour-based robotics, and contrast the approach with that of “Good old-fashioned AI”. 2. To what extent did Grey Walter’s robots, Elsie and Elmer, differ from robots that preceded or followed them? 3. Explain how the concepts of “emergence” and “embodiment” are related to recent developments in robotics and artificial intelligence.

Assignment notes Essay – Try to structure it with introduction, body and conclusion. Plan out an argument, like pseudo-code. Include references - in the text, either numerical [1], or by name, Cao et al (1997) - In a Reference section at the end journal articles: [1] Cao, Fukunaga and Kahn (1997) Cooperative mobile robotics: antecedents and directions. Autonomous Robots, 4,1, 7-27. And books Gordon, D. (1999) Ants at work: How an insect society is organised. W.W.Norton and Co.,London. Secondary citation Sharkey, A.J.C. (1800) How to cite references, cited in Gordon, D. (1999) Ants at work: How an insect society is organised. W.W. Norton and Co. London.

Training weights with delta rule to produce target outputs • Testing – presenting new inputs to (already trained) weights, to see what the output is. Generalisation performance is the percentage of the test set the trained net gets right. • [Figure: network diagram labelled light sensor inputs, bias unit, motor outputs]
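The training and testing steps above can be sketched in Python. This is only an illustration: the linear mapping `hypothetical_target`, the learning rate, the number of epochs and the tolerance used to decide whether an output counts as "right" are all assumptions, not values from the lecture.

```python
import random

def predict(weights, inputs):
    """Output of a single linear unit; a bias input of 1.0 is appended."""
    x = list(inputs) + [1.0]
    return sum(w * xi for w, xi in zip(weights, x))

def train_delta_rule(samples, n_inputs, rate=0.1, epochs=500, seed=0):
    """Delta rule: move each weight in proportion to (target - output) * input."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, inputs)
            x = list(inputs) + [1.0]
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights

def generalisation(weights, test_set, tolerance=0.05):
    """Percentage of test patterns whose output is within tolerance of the target."""
    correct = sum(abs(predict(weights, x) - t) <= tolerance for x, t in test_set)
    return 100.0 * correct / len(test_set)

def hypothetical_target(left, right):
    """Made-up mapping from two light sensors to one motor output."""
    return 0.8 * left - 0.3 * right + 0.1

train = [((l, r), hypothetical_target(l, r)) for l in (0.0, 1.0) for r in (0.0, 1.0)]
test = [((l, r), hypothetical_target(l, r)) for l, r in [(0.5, 0.5), (0.2, 0.9), (1.0, 0.3)]]
weights = train_delta_rule(train, n_inputs=2)
print(generalisation(weights, test))
```

The test set contains inputs the net never saw during training, which is what makes the percentage a measure of generalisation rather than memorisation.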

Advantages of Neural Nets for robotics • They provide a straightforward mapping between sensors and motors • They are robust to noise (noisy sensors and environments) • They can provide a biologically plausible metaphor • They offer a relatively smooth search space (gradual changes in weights = gradual changes in behaviour)

Example: Hebbian learning for collision avoidance • From Pfeifer and Scheier (1999) • On-line learning • [Figure labels: motor action, collision layer, distance sensors] • Fixed weights between collision layer and motor actions. • When a collision occurs while a distance sensor is active, Hebbian learning is used to strengthen the weight between the two. • After learning, activating the distance sensor alone will activate the collision layer, so the robot learns to avoid objects.
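A minimal sketch of the Hebbian step described above, assuming two distance sensors, two collision units, a simple threshold unit and a learning rate of 0.2 (all illustrative choices, not details from Pfeifer and Scheier). For simplicity the update uses the bumper input directly as the collision unit's activity.

```python
def hebbian_update(weights, collision, distance, rate=0.2):
    """Strengthen weights[i][j] when collision unit i and distance sensor j are active together."""
    return [[w + rate * c * d for w, d in zip(row, distance)]
            for row, c in zip(weights, collision)]

def collision_layer(weights, collision_input, distance, threshold=0.5):
    """A collision unit fires from its own bumper or from learned distance-sensor input."""
    outputs = []
    for row, c in zip(weights, collision_input):
        net = c + sum(w * d for w, d in zip(row, distance))
        outputs.append(1.0 if net >= threshold else 0.0)
    return outputs

# Learned weights from distance sensors (columns) to collision units (rows).
weights = [[0.0, 0.0], [0.0, 0.0]]
before = collision_layer(weights, [0, 0], [1, 0])   # distance sensor alone: no response yet

# Three collisions on the left while the left distance sensor is active.
for _ in range(3):
    weights = hebbian_update(weights, collision=[1, 0], distance=[1, 0])

after = collision_layer(weights, [0, 0], [1, 0])    # distance sensor alone now triggers the unit
print(before, after)
```

Because the weights from the collision layer to the motors are fixed, a collision unit that fires early in this way would trigger the avoidance action before contact.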

Other methods of setting up ANN controller • Can use delta rule (for linearly separable patterns). • Or can use backpropagation learning rule to train multi-layer net • Needs training set • how can you translate the behaviour of not bumping into objects into a training set? • See Sharkey (1998) for one method – write simple code (innate controller) for avoiding obstacles, and collect examples of inputs and outputs for training • Or Genetic algorithms can be used to evolve neural net controller

Darwin …..

Genetic Algorithms- a form of evolutionary computation • A GA operates on a population of artificial chromosomes, selectively reproducing chromosomes of individuals with better performances with some random mutations. • Artificial chromosome (genotype) encodes characteristics of individual (phenotype) • E.g. could encode weights of artificial neural network

Fitness function: • Used to evaluate the performance of each phenotype. • Higher fitness values better • E.g. minimising difference between output of function and target • Smaller difference = higher fitness

Start with population of randomly generated chromosomes • Decode each chromosome and evaluate its fitness • Apply genetic operators to create new population • Crossover • Mutation • Selective reproduction • Repeat until desired individual found, or best fitness stops increasing.

Genetic operators: Crossover and Mutation • New population created by selective reproduction. • Offspring are randomly paired, crossed over and mutated. • One-point crossover: for a pair of chromosomes, select a random point at which to exchange material between the two individuals.
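One-point crossover can be sketched as follows. The function name and the choice to cut strictly inside the chromosome (so both parents always contribute) are assumptions made for the illustration.

```python
import random

def one_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have the same length, at least 2")
    point = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the chromosome
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

child_a, child_b = one_point_crossover([0] * 8, [1] * 8)
print(child_a, child_b)
```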

Genetic operators: Mutation • For binary representations, switch value of selected bits • For real value representations, increment by small number randomly extracted from distribution centred round zero. • Other representations: substitute selected location with symbol randomly chosen from same alphabet
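The three mutation schemes above might look like this in Python. The per-gene mutation rate and the use of a Gaussian for the "distribution centred round zero" are assumptions; any zero-centred distribution would fit the description.

```python
import random

def mutate_bits(chromosome, rate, rng=random):
    """Binary representation: flip each bit with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]

def mutate_reals(chromosome, rate, sigma=0.1, rng=random):
    """Real-valued representation: add a small zero-centred Gaussian increment."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g
            for g in chromosome]

def mutate_symbols(chromosome, rate, alphabet, rng=random):
    """Other representations: replace a symbol with one drawn from the same alphabet."""
    return [rng.choice(alphabet) if rng.random() < rate else s for s in chromosome]

print(mutate_bits([0, 1, 0, 1], rate=1.0))
print(mutate_reals([0.5, -0.2], rate=0.5))
print(mutate_symbols(list("ABCA"), rate=0.5, alphabet="ABC"))
```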

If evolving NN weights, crossover may not work – mutation often more effective.

Genetic operators: Selective reproduction • Making copies of best individuals – more copies in next generation. • Problem: method breaks down when all individuals have similar fitness values (becomes like random search) • or when one or two have higher fitness values than the rest of the population (they dominate)

Solutions • Scale fitness values to enhance or reduce individual differences • Rank-based selection: probability of making offspring proportional to rank • Truncation selection: rank individuals and select the top M • Tournament-based selection: randomly select 2 individuals and generate a random number between 0 and 1; if the number is smaller than a predefined parameter T, the fitter individual makes offspring, otherwise the other individual reproduces • Elitism: keep the best individual for the next population.
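Two of the schemes above, tournament-based and truncation selection, can be sketched as below. The function names and the default value of the parameter T are illustrative assumptions.

```python
import random

def tournament_select(population, fitness, t=0.8, rng=random):
    """Pick 2 individuals at random; the fitter one reproduces with probability t."""
    a, b = rng.sample(range(len(population)), 2)
    fitter, other = (a, b) if fitness[a] >= fitness[b] else (b, a)
    return population[fitter] if rng.random() < t else population[other]

def truncation_select(population, fitness, m):
    """Rank individuals by fitness and keep the top m."""
    ranked = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    return [population[i] for i in ranked[:m]]

population = ["a", "b", "c"]
fitness = [0.2, 0.9, 0.5]
print(tournament_select(population, fitness))
print(truncation_select(population, fitness, m=2))
```

Tournament selection depends only on which of the two individuals is fitter, not by how much, so it avoids both problems listed for fitness-proportional copying.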

Each generation: • fitness evaluation of all individuals in population • Selective reproduction • Crossover and mutation • Repeat for several generations – monitor average and best fitness: halt when fitness indicators stop increasing or satisfactory individual found.
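The generational cycle above can be put together as a small sketch. It assumes real-valued chromosomes, truncation selection of the top half, elitism, one-point crossover and Gaussian mutation; these are one possible combination of the operators discussed, and the target vector and fitness function `closeness` are made up for the example.

```python
import random

def evolve(fitness_fn, n_genes, pop_size=20, generations=40,
           mutation_rate=0.1, sigma=0.2, seed=0):
    """Return (best individual, its fitness, per-generation (best, average) history)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Fitness evaluation of all individuals.
        scores = [fitness_fn(ind) for ind in population]
        ranked = sorted(zip(scores, population), key=lambda p: p[0], reverse=True)
        history.append((ranked[0][0], sum(scores) / pop_size))
        # Selective reproduction: top half become parents, best is copied unchanged.
        parents = [ind for _, ind in ranked[:pop_size // 2]]
        next_pop = [list(ranked[0][1])]
        # Crossover and mutation fill the rest of the new population.
        while len(next_pop) < pop_size:
            a, b = rng.sample(parents, 2)
            point = rng.randint(1, n_genes - 1)
            child = a[:point] + b[point:]
            child = [g + rng.gauss(0, sigma) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        population = next_pop
    scores = [fitness_fn(ind) for ind in population]
    best_fit, best = max(zip(scores, population), key=lambda p: p[0])
    return best, best_fit, history

target = [0.5, -0.2, 0.9, 0.0]

def closeness(ind):
    """Smaller difference from the target = higher fitness."""
    return -sum((g - t) ** 2 for g, t in zip(ind, target))

best, best_fit, history = evolve(closeness, n_genes=4)
print(best_fit, history[0], history[-1])
```

The `history` list holds the best and average fitness per generation, which is what would be monitored to decide when to halt. This sketch simply runs a fixed number of generations.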

Aim of using GAs: emergence of complex abilities from interaction between agent and environment • Difficult to establish what abilities and achievements are needed • Other approaches: • Incremental methods – change fitness evaluation at different stages • Evolution and training – can often work to both evolve and to train.

Useful reference (extensive review) - Yao, X. (1999) Evolving artificial neural networks. Proceedings of the IEEE, 87, 9, 1423-1447 • Possible to evolve: • Weights and learning parameters • Architectures • Learning rules

Evolving weights and learning parameters • Advantage for evolutionary robotics: don’t need to specify network response to each pattern • Synaptic weights encoded on genotype • Strings of real values, or binary values • GAs used to evolve the weights • Combining evolution with NN learning – use GAs to find initial values of weights for nets subsequently trained with backpropagation

Evolving Architectures • Indirect coding of network on genotype – e.g. number of nodes, probability of connecting them, type of activation function • E.g. Harp, Samad and Guha (1989): genetic string encodes blueprint for net. Blueprint consists of several segments, each corresponding to a layer. • Segments have 2 parts: (i) node properties (no. of units, activation functions, geometric layout) (ii) outgoing properties (connection density, learning rate etc). • Blueprints decoded into networks, which are trained using backpropagation.

Evolving learning rules • Evolvable hardware • E.g. Maris and Boekhorst 1996 • Sensor position evolved

5 min break……

Example of evolving robots • Floreano, D. and Mondada, F. (1994) Automatic creation of an autonomous agent: genetic evolution of a neural network driven robot. In D. Cliff, P. Husbands, J. Meyer and S.W. Wilson (Eds) From Animals to Animats 3: Proceedings of Third Conference on Simulation of Adaptive Behaviour. Cambridge, MA: MIT Press/Bradford Books • Comparison of an evolved simple-navigation controller with a predesigned architecture

Predesigned architecture: • Braitenberg-type controller to perform straight motion and obstacle avoidance with Khepera robot • Positive connection between wheel and sensors on its own side: rotation speed of wheel proportional to activation of sensor • Negative connection between wheel and sensors on the opposite side: rotation speed of wheel is inversely proportional to sensor activation • Positive offset value to each wheel generates forward motion

Weighted sum of incoming signals steers robot away from objects • But design needs prior knowledge of sensors, motors and environments • E.g. if a sensor has a lower response profile than the other sensors, its outgoing connections require stronger weights.
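The predesigned controller can be sketched as a weighted sum per wheel. The gain and offset values, and the grouping of sensors into a left and a right list, are assumptions for illustration and not the weights used on the Khepera.

```python
def braitenberg_step(left_sensors, right_sensors, offset=0.5, gain=0.5):
    """Return (left_wheel, right_wheel) speeds from proximity sensor activations.

    Each wheel gets a positive connection from sensors on its own side,
    a negative connection from sensors on the opposite side, and a
    positive offset that produces forward motion.
    """
    left_act = sum(left_sensors)
    right_act = sum(right_sensors)
    left_wheel = offset + gain * left_act - gain * right_act
    right_wheel = offset + gain * right_act - gain * left_act
    return left_wheel, right_wheel

print(braitenberg_step([0.0], [0.0]))  # no obstacle: both wheels run at the offset speed
print(braitenberg_step([1.0], [0.0]))  # obstacle on the left: left wheel faster, robot turns right
print(braitenberg_step([0.6], [0.6]))  # equal activation on both sides: contributions cancel
```

The last call shows why a single gain for every sensor is fragile: the steering signal vanishes whenever the two sides are equally active, and a sensor with a weaker response would need a larger hand-tuned weight.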

Evolving a controller (Floreano and Mondada, 1994) • Goal: evolving a controller to maximise forward motion while avoiding obstacles • Fitness function based on 3 variables

Three components in fitness function to encourage • Motion • Straight displacement • Obstacle avoidance

Φ = V (1 − √Δv) (1 − i), where V is the sum of the rotation speeds of the two wheels, Δv is the absolute value of the algebraic difference between the signed speed values of the wheels, and i is the normalised activation value of the infrared sensor with the highest value.

First component V is computed by summing the rotation speeds of the 2 wheels (direction of rotation given by sign of read value, and speed by its absolute value).

Second component encourages the 2 wheels to rotate in the same direction. The higher the difference in rotation, the closer Δv will be to 1; e.g. if the left wheel rotates backwards at speed –0.4 and the right wheel rotates forward at speed 0.5, Δv will be 0.9. The square root gives stronger weight to smaller differences. Since the component is subtracted from 1, it is maximised by robots whose wheels move in the same direction, regardless of speed and overall direction.

Last component encourages obstacle avoidance Proximity sensors on Khepera emit a beam of infrared light – and measure quantity of reflected infrared light. Closer a robot is to an object, the higher the measured value. Value of i of most active sensor provides a measure of how close nearest object is. Value subtracted from 1, so this component selects robots that stay away from objects. Combined result of 3 components: selecting robots that move as straight as possible while avoiding obstacles in path.
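A sketch of the three-component fitness for a single sensory-motor loop, assuming the components are multiplied together as Φ = V (1 − √Δv) (1 − i). It also assumes that wheel speeds are normalised to the range −0.5 to 0.5 (consistent with the worked example above), that V sums the absolute speeds, and that sensor values lie between 0 and 1.

```python
import math

def navigation_fitness(left_speed, right_speed, sensor_values):
    """Fitness for one sensory-motor loop: V * (1 - sqrt(dv)) * (1 - i)."""
    v = abs(left_speed) + abs(right_speed)   # motion: sum of rotation speeds
    dv = abs(left_speed - right_speed)       # difference between signed speeds
    i = max(sensor_values)                   # activation of the most active sensor
    return v * (1.0 - math.sqrt(dv)) * (1.0 - i)

clear = [0.0] * 8
print(navigation_fitness(0.5, 0.5, clear))          # fast, straight, nothing nearby
print(navigation_fitness(-0.4, 0.5, clear))         # the slide's example, dv = 0.9
print(navigation_fitness(-0.5, 0.5, clear))         # spinning on the spot
print(navigation_fitness(0.5, 0.5, [1.0] + [0.0] * 7))  # touching an obstacle
```

Because the terms are multiplied, a robot scores well only if it satisfies all three at once: spinning fast in place or driving straight into a wall both give a fitness of zero.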

Control system for robot Fixed network architecture Weights between 8 proximity sensors and 2 motor units (also bias unit) Recurrent connections at output layer Synaptic connections (weights) encoded as floating point numbers on chromosome
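One way to picture the encoding is to decode a flat chromosome of floating point genes into the weights of the fixed architecture. The gene ordering, the sigmoid activation and the shift of outputs into the range −0.5 to 0.5 are assumptions made for this sketch.

```python
import math

N_SENSORS, N_MOTORS = 8, 2
GENES_PER_MOTOR = N_SENSORS + 1 + N_MOTORS       # sensors + bias + recurrent inputs
CHROMOSOME_LENGTH = N_MOTORS * GENES_PER_MOTOR   # 22 weights in total

def decode(chromosome):
    """Split a flat list of floats into one weight row per motor unit."""
    if len(chromosome) != CHROMOSOME_LENGTH:
        raise ValueError("expected %d genes" % CHROMOSOME_LENGTH)
    return [chromosome[m * GENES_PER_MOTOR:(m + 1) * GENES_PER_MOTOR]
            for m in range(N_MOTORS)]

def motor_outputs(weights, sensors, previous_outputs):
    """One update of the output layer, including its recurrent connections."""
    inputs = list(sensors) + [1.0] + list(previous_outputs)
    outputs = []
    for row in weights:
        net = sum(w * x for w, x in zip(row, inputs))
        outputs.append(1.0 / (1.0 + math.exp(-net)) - 0.5)  # sigmoid shifted to (-0.5, 0.5)
    return outputs

weights = decode([0.0] * CHROMOSOME_LENGTH)
print(motor_outputs(weights, sensors=[0.0] * N_SENSORS, previous_outputs=[0.0, 0.0]))
```

The recurrent inputs feed each motor unit the previous outputs of both motor units, so the response to a given sensor reading can depend on what the robot was just doing.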

Evolutionary experiments • Each generation: individuals allowed to move for 80 sensory motor loops • Each sensory motor loop lasted 30 ms • Each generation: 80 individuals • Evolved using roulette wheel selection, biased mutations, and one point crossover • After 50 generations: smooth navigation around the maze without bumping into walls.
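Roulette wheel selection, mentioned above, gives each individual a chance of reproducing proportional to its fitness. The sketch below assumes non-negative fitness values and falls back to a uniform random choice when every fitness is zero; both are assumptions of the example rather than details from the paper.

```python
import random

def roulette_select(population, fitness, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitness)
    if total <= 0:
        return rng.choice(population)  # no information: choose uniformly
    pick = rng.random() * total        # spin the wheel
    running = 0.0
    for individual, f in zip(population, fitness):
        running += f
        if pick < running:
            return individual
    return population[-1]              # guard against floating point round-off

rng = random.Random(1)
counts = {"a": 0, "b": 0, "c": 0}
for _ in range(1000):
    counts[roulette_select(["a", "b", "c"], [0.1, 0.6, 0.3], rng)] += 1
print(counts)
```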

Comparison • Comparison between handcrafted and evolved solutions • Handcrafted: stopped when in looping maze (when two contralateral sensors receive the same inputs, their signals cancel each other out, and wheels don’t move) • Evolved solution: strong asymmetrical weights on recurrent connections to avoid deadlock situations • Conclusions: evolved solution is competitive with handcrafted solution. • Evolved solution emerges from interaction with environment

Dario Floreano

2nd example: Modularity • Reasons for modularity: • Complexity of task • Improving performance • Design • Biological inspiration (and/or modelling) • Examples of modularity: • Behaviour-based robotics and subsumption architecture • Where do modules come from? • Explicit design – based on analysis/understanding of task • Evolving modularity: an example (Nolfi, 1997)

Evolving modularity • Nolfi (1997) Using emergent modularity to develop control systems for mobile robots. Adaptive Behaviour 5, 3-4, 343-364 • The usual approach to behaviour-based robotics depends on breaking the required behaviour down into subcomponents

Stefano Nolfi
