Bioinspired Computing, Lecture 6
Artificial Neural Networks: From Multilayer to Recurrent Neural Nets
Netta Cohen
Auto-associative Memory
[Figure: a net with input, hidden and output layers; input "bed+bath", output "…+mirror+wardrobe"; node labels s, m, w, b.]
Auto-associative nets are trained to reproduce their input activation across their output nodes. Once trained on a set of images, for example, the net can automatically repair noisy or damaged images that are presented to it. A net trained on bedrooms and bathrooms, presented with an input including a sink and a bed, might infer the presence of a mirror and a wardrobe: a bedsit.
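The repair of a damaged input can be illustrated with a very small pattern-completion sketch. Note the swap: instead of the layered, trained net on the slide, this uses a Hopfield-style memory with Hebbian outer-product weights, because it fits in a few lines. The function names and the two stored patterns are assumptions made for illustration.

```python
def train(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (no self-connections)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, steps=5):
    """Repeatedly replace each unit by the sign of its weighted input."""
    s = list(state)
    for _ in range(steps):
        s = [1 if sum(wij * sj for wij, sj in zip(row, s)) >= 0 else -1
             for row in w]
    return s


# Two hypothetical stored "scenes", each coded as eight +1/-1 features.
scene_a = [1, 1, 1, 1, -1, -1, -1, -1]
scene_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([scene_a, scene_b])

# Damage one feature of scene_a and let the memory repair it.
damaged = [-1] + scene_a[1:]
repaired = recall(weights, damaged)
```

Here `repaired` comes back equal to `scene_a`: the stored pattern acts as an attractor that pulls the damaged input back to the nearest memory.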
Problems
First, we have already noted that ANNs often depart from biological reality in several respects:
• Supervision: Real brains cannot rely on a supervisor to teach them, nor are they free to self-organise in any manner…
• Training vs. Testing: This distinction is an artificial one.
• Temporality: Real brains are continuously engaged with their environment, not exposed to a series of disconnected “trials”.
• Architecture: Real neurons and the networks that they form are far more complicated than the artificial neurons and simple connectivity that we have discussed so far.
Does this matter? If ANNs are just tools, no; but if they are to model mind or life-like systems, the answer is maybe.
Problems
Recall NETtalk...
Fodor & Pylyshyn raise a second, deeper problem, objecting to the fact that, unlike classical AI systems, distributed representations have no combinatorial syntactic structure.
Classical representations of Jo&Flo contain representations of Jo and Flo, allowing inferences (e.g., from Jo&Flo to Jo). F&P claim that the ANN representations of Jo&Flo, Jo and Flo are not related in the same way: Jo is not any part or consequence of the pattern of activation across hidden nodes that represents Jo&Flo. Jo&Flo is unrelated to Flo&Jo.
• Cognition requires a language of thought.
• Languages are structured syntactically.
• If ANNs cannot support syntactic representations, they cannot support cognition.
F&P’s critique is perhaps not a mortal blow, but it is a severe challenge to the naïve ANN researcher…
Today...
• Supervision: We will move away from supervised learning and introduce various methods of so-called unsupervised learning, both for multi-layer nets and for other architectures.
• Architecture: We will eliminate layers and feed-forward directionality and incorporate complex feedback and recurrence within the network.
• Temporality: We will introduce dynamical neural networks that continuously engage and interact with time-changing inputs.
• Learning: We will reduce learning to its bare essentials and try to pinpoint the minimal system that can exhibit learning. We will ask what learning is all about in biological systems and how it ties in with behaviour.
Supervised training
Artificial nets can be configured to perform set tasks. The training usually involves:
• External learning rules
• External supervisor
• Off-line training
The result is typically:
• a general-purpose net (that has been trained to do almost any sort of classification, recognition, fitting, association, and much more)
Natural learning
A natural learning experience typically implies:
• Internalised learning rules
• An ability to learn by one’s self
• … in the real world
The result is typically:
• a restricted learning capacity: We are primed to study a mother tongue, but less so to study math.
Can we design and train artificial neural nets to exhibit a more natural learning capacity?
Unsupervised Learning for multi-layer nets
• Autoassociative learning: While these nets are often trained with backpropagation, they present a move away from conventional supervision since the “desired” output is none other than the input, which can be stored internally.
• Reinforcement learning: In many real-life situations, we have no idea what the “desired” output of the network should be, but we do know what behaviour should result if the network is properly trained. For instance, a child riding a bike must be doing something right if she doesn’t fall off. The ability to condition behaviour with rewards and penalties could be used to feed back to the network even if the specific consequences at the neuronal level are hidden.
• Hebbian learning and self-organisation: In the complete absence of supervision or conditioning, the network can still self-organise and reach an appropriate and stable solution. Hebbian learning is based on the biological paradigm in which connections between pre- and post-synaptic neurons are enhanced whenever their activities are positively correlated.
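The Hebbian idea can be written down in one line. The sketch below assumes the plainest form of the rule, in which the weight change is proportional to the product of pre- and post-synaptic activity; the learning rate, the 0/1 activity coding and the function names are illustrative choices. There is no decay or normalisation here, so in this form the weight can grow without bound.

```python
def hebbian_update(weight, pre, post, rate=0.1):
    """Strengthen the connection in proportion to joint pre/post activity."""
    return weight + rate * pre * post


def train(pairs, weight=0.0):
    """Apply the update once for each (pre, post) activity pair."""
    for pre, post in pairs:
        weight = hebbian_update(weight, pre, post)
    return weight


# Activities are coded 0 (silent) or 1 (active).
co_active = [(1, 1), (0, 0), (1, 1), (1, 1)]      # the two neurons fire together
never_together = [(1, 0), (0, 1), (1, 0), (0, 1)]  # they never fire together

w_co_active = train(co_active)
w_never_together = train(never_together)
```

Only the pair that fires together ends up with a strengthened connection; no teacher or target output is involved.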
Recurrent Neural Networks
In recurrent neural nets, activation flows around the network, rather than feeding forward from input to output through sequential layers.
Randy Beer describes one scheme for recurrent nets:
• Each neuron is connected to each other neuron and to itself.
• These connections need not be symmetrical.
• Some neurons also receive external input.
• Some neurons produce the output of the network.
The activity of neurons in this network is virtually the same as in feed-forward nets, except that as the activity of the network is updated at each time step, inputs from all other nodes (including one’s self) must be counted.
[Figure: a fully connected recurrent net with input and output neurons marked.]
Adapted from Randy Beer (1995) “A dynamical systems perspective on agent-environment interaction”, Artificial Intelligence 72: 173-215.
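A single update of such a net might be sketched as follows. This is a discrete-time simplification with sigmoid units, not Beer's own formulation; the weights, the external input and the function names are made up for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def step(state, weights, external):
    """One synchronous update: each neuron sums input from every neuron,
    itself included, plus any external input, then applies the sigmoid."""
    n = len(state)
    return [
        sigmoid(sum(weights[i][j] * state[j] for j in range(n)) + external[i])
        for i in range(n)
    ]


# weights[i][j] is the connection from neuron j to neuron i.
# The matrix is not symmetric, and the diagonal holds self-connections.
weights = [
    [0.0, 1.0, -1.0],
    [-1.0, 0.5, 1.0],
    [1.0, -1.0, 0.0],
]
external = [1.0, 0.0, 0.0]  # only neuron 0 receives external input

state = [0.0, 0.0, 0.0]
for _ in range(5):
    state = step(state, weights, external)
output = state[2]  # treat neuron 2 as the output of the network
```

The only difference from a feed-forward layer is that `state` appears on both sides of the update, so activity from one time step feeds the next.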
The Net’s Dynamic Character
Consider the servile life of a feed-forward net:
• it is dormant until an input is provided
• this is mapped onto the output nodes via a hidden layer
• weights are changed by an auxiliary learning algorithm
• once again the net is dormant, awaiting input
Contrast the active life of a recurrent net:
• in the absence of input it may generate and reverberate its own activity
• this activity is free to flow around the net in any manner
• external input may modify these intrinsic dynamics
• the net may resist inputs or utilise them to guide its activity
• if embedded, the net’s activity may affect its environment, which may alter its sensory input, which may perturb its dynamics, and so on…
What does it do?
In the absence of input, or in the presence of a steady-state input, a recurrent network will usually approach a stable equilibrium (a so-called fixed-point attractor).
Other behaviours can be induced by training. For instance, a recurrent net can be trained to oscillate spontaneously (without any input), and in some cases even to generate chaotic behaviour. One of the big challenges in this area is finding the best algorithms and network architectures to induce such diverse forms of dynamics.
Once input is included, there is a fear that the abundance of internal stimulations and excitations will result in an explosion or saturation of activity. In fact, by including a balance of excitations and inhibitions, and by correctly choosing the activation functions, the activity is usually self-contained within reasonable bounds.
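The approach to a stable equilibrium can be watched directly in a toy net. The sketch below assumes two sigmoid neurons with hand-picked weights that mix excitation and inhibition. The weights are deliberately small enough that the update shrinks differences between states, so the net is guaranteed to settle; larger weights can give oscillation instead. All names are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def settle(weights, bias, state, max_steps=200, tol=1e-12):
    """Iterate the net with no external input until its activity stops changing."""
    for step_count in range(1, max_steps + 1):
        new_state = [
            sigmoid(sum(w * s for w, s in zip(row, state)) + b)
            for row, b in zip(weights, bias)
        ]
        change = max(abs(new - old) for new, old in zip(new_state, state))
        state = new_state
        if change < tol:
            break
    return state, step_count


# Neuron 0 excites itself and neuron 1; neuron 1 inhibits neuron 0.
weights = [[1.0, -1.5], [1.5, 1.0]]
bias = [0.0, -1.0]
final_state, steps_taken = settle(weights, bias, [0.9, 0.1])
```

Because the sigmoid squashes every activation into the range 0 to 1, the activity cannot explode regardless of the weights; the balance of excitation and inhibition decides where inside that range it comes to rest.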
Dynamical Neural Nets
In all neural net discussions so far, we have assumed all inputs to be presented simultaneously, and each trial to be separate. Time was somehow deemed irrelevant.
Recurrent nets can deal with inputs that are presented sequentially, as they would almost always be in real problems. The ability of the net to reverberate and sustain activity can serve as a working memory. Such nets are called Dynamical Neural Nets (DNN or DRNN).
Consider an XOR with only one input node. We provide the network with a time series consisting of a pair of high and low values. The output neuron is to become active when the input sequence is 01 or 10, but remain inactive when the input sequence is 00 or 11.
[Figure: input and target output (values between 0 and 1) plotted against time.]
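One way to see that a single input node suffices is to wire such a net by hand. The sketch below is an assumption-laden illustration, not a trained network: it uses threshold units, a memory unit that simply holds the previous input (the recurrent part), and weights chosen by inspection.

```python
def threshold(x):
    """Binary unit: active (1) when its net input is positive."""
    return 1 if x > 0 else 0


def temporal_xor(sequence):
    """Feed the inputs one at a time; return the output after the last one."""
    memory = 0  # recurrent unit holding the previous input
    output = 0
    for x in sequence:
        either = threshold(x + memory - 0.5)   # active if current or previous input is 1
        both = threshold(x + memory - 1.5)     # active only if both are 1
        output = threshold(either - both - 0.5)
        memory = x                             # remember this input for the next step
    return output


results = {seq: temporal_xor(seq) for seq in [(0, 0), (0, 1), (1, 0), (1, 1)]}
```

The output is meaningful only after the second input has arrived. Without the memory unit the net would see a lone 0 or 1 at each step and could not tell 01 from 11.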
Training Recurrent Neural Nets
As in multi-layer nets, recurrent nets can be trained with a variety of methods, both supervised and unsupervised. Supervised learning algorithms include:
• Backpropagation Through Time (BPTT):
  • Calculate errors at output nodes at time t
  • Backpropagate the error to all nodes at time t-1
  • Repeat for some fixed number of time steps (usually < 10)
  • Apply the usual weight-update formula
• Real Time Recurrent Learning (RTRL):
  • Calculate errors at output nodes
  • Numerically seek the steepest-descent solution to minimise the error at this time step (including the complete history of inputs & activity)
These methods suffer from fast deterioration as more history is included. The longer the history, the larger the virtual number of layers. However, slight variations of these learning algorithms have successfully been used for a variety of applications. Examples include grammar learning & distinguishing between spoken languages.
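The BPTT steps above can be made concrete on the smallest possible recurrent net. The sketch assumes a single tanh neuron with one recurrent weight `w` and one input weight `u`, a squared error measured only at the final time step, and a sequence short enough to unroll in full rather than truncating. These choices, and all names, are illustrative. A finite-difference estimate is included as a sanity check on the gradient.

```python
import math


def forward(w, u, xs):
    """Run the neuron over the sequence; hs[t] is its activity after t inputs."""
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs


def loss(w, u, xs, target):
    return 0.5 * (forward(w, u, xs)[-1] - target) ** 2


def bptt(w, u, xs, target):
    """Gradient of the loss with respect to w and u, walking back through time."""
    hs = forward(w, u, xs)
    grad_w = grad_u = 0.0
    delta = hs[-1] - target                  # error at the output, final time step
    for t in range(len(xs), 0, -1):
        dpre = delta * (1.0 - hs[t] ** 2)    # back through the tanh at time t
        grad_w += dpre * hs[t - 1]           # the same w is reused at every step
        grad_u += dpre * xs[t - 1]
        delta = dpre * w                     # pass the error to time t-1
    return grad_w, grad_u


xs = [1.0, 0.0, 1.0, 1.0]
w, u, target = 0.8, 0.5, 0.2
grad_w, grad_u = bptt(w, u, xs, target)

# Sanity check against a central finite difference.
eps = 1e-6
numeric_w = (loss(w + eps, u, xs, target) - loss(w - eps, u, xs, target)) / (2 * eps)
numeric_u = (loss(w, u + eps, xs, target) - loss(w, u - eps, xs, target)) / (2 * eps)

# The usual gradient-descent weight update.
rate = 0.1
w_new, u_new = w - rate * grad_w, u - rate * grad_u
```

Each pass through the loop multiplies `delta` by `w` and by a tanh slope of at most 1, which is one way to see why the error signal deteriorates as more history is included.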
RNNs for Time Series Prediction
Predicting the future is one of the biggest quests of humankind. What will the weather bring tomorrow? Is there global warming and how hot will it get? When will Wall Street crash again and how will oil prices fluctuate in the coming months? Can EEG recordings be used to predict the next epilepsy attack or ECG, the next heart attack?
Such daunting questions can be regarded as problems in time series prediction for dynamical systems. They have occupied mathematicians and scientists for centuries and computer scientists since time immemorial (i.e. Charles Babbage).
Another example dates back to Edmund Halley’s observation in 1676 that Jupiter’s orbit was directed slowly towards the sun. If that prediction turned out true, Jupiter would sweep the inner planets with it into the sun (that’s us). That hypothesis threw the entire mathematical elite into a frenzy. Euler, Lagrange and Lambert made heroic attacks on the problem without solving it. No wonder. The problem involved 75 simultaneous equations resulting in some 20 billion possible choices of parameters. It was finally solved (probabilistically) by Laplace in 1787. Jupiter, it turns out, was only oscillating. The first half cycle of oscillations will be completed at about 2012.
RNNs for Time Series Prediction: How does it work?
A time series is a sequence of numbers {x(t=0), x(t=1), x(t=2), … x(t-1)} that represent a certain process that has been discretised in time.
[Figure: x plotted against t = 0, 1, 2, 3, 4, 5, 6, 7... This sequence is used as input. The output represents the predicted value of x at the present time, x(t), based on information about its past values all the way from 0 to t-1.]
If the behaviour of x can be described by a dynamical system, then a recurrent neural net should be able to model it in principle. The question is how many data points are needed (i.e. how far into the past must we go) to predict the future.
Examples:
• If x(t) = A*x(t-1) + B, then one step back should suffice.
• If x(t) = A*x(t-1) + B + noise, more steps may be needed to effectively filter out some of the effects of the noise.
• If the data set is random, no recurrent neural net will make worthwhile predictions.
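The first example can be checked numerically. In the sketch below a plain least-squares line fit stands in for the recurrent net, which is a deliberate simplification: for x(t) = A*x(t-1) + B, looking one step back is enough to recover A and B and to predict the next value. The series parameters and function names are assumptions for illustration.

```python
def make_series(a, b, x0, length):
    """Generate x(t) = a * x(t-1) + b starting from x0."""
    xs = [x0]
    for _ in range(length - 1):
        xs.append(a * xs[-1] + b)
    return xs


def fit_one_step(xs):
    """Least-squares estimate of (a, b) from pairs (x(t-1), x(t))."""
    prev, nxt = xs[:-1], xs[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_next = sum(nxt) / n
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    var = sum((p - mean_prev) ** 2 for p in prev)
    a = cov / var
    b = mean_next - a * mean_prev
    return a, b


series = make_series(0.5, 1.0, 0.0, 10)

# Fit on everything except the last point, then predict that last point.
a_hat, b_hat = fit_one_step(series[:-1])
prediction = a_hat * series[-2] + b_hat
```

With noise added to the series, the same fit still works, but it needs more points before the estimates of A and B settle down.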
FFNNs and RNNs in BEAST (Bioinspired Evolutionary Agent Simulation Toolkit)
BEAST demos include several examples to demonstrate some fun features of bio-inspired tools in AI or bio-inspired applications.
The mouse demo is a simple example of what's possible using an unsupervised neural net. The learning is implemented through a “genetic algorithm” which facilitates a form of reinforcement learning. Mice are trained to collect pieces of cheese. To start with, they roam around the screen aimlessly, but as they are rewarded for successful collisions with cheese, they develop a strategy to find the nearest piece and aim for it.
The neural net is rather elementary. The current version implements a feed-forward architecture. However, the same program runs equally well on recurrent neural nets (with both trained by the same algorithm).
The strategy: The mice need to learn to convert the input signals they receive (corresponding to cheese location) into an output (corresponding to direction and velocity of motion). In a sense, the mice are learning to associate the input signals with the cheese and the expected reward.
Sensors: The mice would get nowhere without their sensors. The simulation is situated and embodied. It is also dynamic. Mice “seeing” that the piece of cheese they are racing after has been snatched will turn away. Adaptation is in-built.
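The flavour of the mouse demo can be conveyed by a toy that does not use BEAST at all. Everything here is an assumption made for illustration: a one-dimensional world, a "mouse" whose controller is a single neuron with two weights mapping the sensed offset to a clipped velocity, and a very simple evolutionary loop (keep the better half, refill with mutated copies, no crossover).

```python
import random


def fitness(weights, targets=(-3.0, 2.0, 5.0), steps=20):
    """Reward: negative total distance left to the cheese after a fixed time."""
    gain, bias = weights
    total = 0.0
    for cheese in targets:
        pos = 0.0
        for _ in range(steps):
            sensed = cheese - pos                                # sensor input
            velocity = max(-1.0, min(1.0, gain * sensed + bias))  # motor output
            pos += velocity
        total += abs(cheese - pos)
    return -total


def evolve(generations=50, pop_size=10, seed=0):
    rng = random.Random(seed)
    population = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # the better half survives unchanged
        children = [
            tuple(gene + rng.gauss(0, 0.1) for gene in rng.choice(parents))
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    population.sort(key=fitness, reverse=True)
    return population[0], history


best, history = evolve()
```

No target output is ever given to the controller. The only feedback is the reward, which is why this counts as a form of reinforcement learning, and because the best individual always survives, the best fitness can never get worse from one generation to the next.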
FFNNs and RNNs in BEAST: active, dynamic, natural...
Natural intelligences engage with the real world in a continuous interplay, attempting to cope with their environment as best they can.
Today, we are seeing how ANNs can go beyond disjointed series of discrete inputs and teaching signals that are arbitrarily divided into periods of training and testing.
The mouse demo shows how even a minimal neural network can exhibit forms of learning that mimic simple features in animal behaviour. Dynamic engagement with the environment and bio-inspired continuous reinforcement learning are two crucial components in achieving this behaviour.
An Example of Natural Learning
Animals are routinely faced with tasks that require them to combine reactive behaviour (such as reflexes, pattern recognition, etc.), sequential behaviour (such as stereotyped routines) and learning.
For example, a squirrel foraging for food must
• move around her territory without injuring herself
• distinguish dangerous or dull areas from those rich in food
• learn to be able to travel to and from particular areas
All of these different behaviours are carried out by the same nervous system: the squirrel’s brain. To some degree, different parts of the brain may specialise in different kinds of task. However, there is no sharp boundary between learned behaviours, instinctual behaviours and reflex behaviours.
Half of the Problem…
Imagine a one-dimensional world, call it “Summer”. The agent starts in the middle, can move left or right, and can only see what happens to be at her exact current location.
There are only two other objects in the world:
• Goal: at the left-hand edge of the world 50% of the time, and at the right-hand edge the other 50% of the time.
• Landmark: always on the goal side of the agent’s initial position.
The agent’s task is to reach the goal.
[Figure: the Summer world in its two configurations (50% each), showing the goal, landmark and agent.]
The Other Half…
But what if the world is slightly more complex?
Every ten trials, a coin is tossed: heads, the world stays the same; tails, it changes from “Summer” to “Winter”, or vice versa.
[Figure: the Summer and Winter worlds, each in two configurations (50% each), showing the goal, landmark and agent.]
The Need to Learn
If it is always Summer, the agent does not need to learn anything; she could just follow a fixed Summer Strategy:
    move left for a while
    if you encounter the landmark, carry on
    else change direction…
Similarly, if it were always Winter, a different fixed Winter Strategy would suffice:
    move left for a while
    if you encounter the landmark, change direction
    else carry on…
To do well at the whole problem, the agent must learn whether it is Summer or Winter, and use this fact to help it solve the next few trials.
A Solution
Randy Beer and Brian Yamauchi used artificial evolution to discover simple dynamical neural nets of the kind we encountered last time that could solve the whole problem…
The best solutions could find the goal 9.5 times out of 10. How did they achieve this? The behaviour of a typical agent might look something like this:
    move to the left-hand edge
    if the landmark and the goal, or neither, are seen
    then: for the next 9 trials, pursue the Summer Strategy
    else: for the next 9 trials, pursue the Winter Strategy
    loop
The ANN first learns which half of the problem it faces and then exploits this knowledge…
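The behaviour just described can be checked in a toy simulation. This is a sketch of the task logic only, not of the evolved network. It assumes, as the two strategies imply, that in Winter the landmark lies on the side away from the goal. The landmark distance and all function names are hypothetical.

```python
LANDMARK_AT = 3  # hypothetical distance of the landmark from the start


def landmark_position(goal_side, season):
    """goal_side is -1 (left edge) or +1 (right edge)."""
    side = goal_side if season == "summer" else -goal_side
    return side * LANDMARK_AT


def probe_left(goal_side, season):
    """First trial: walk to the left edge and report (saw_landmark, saw_goal)."""
    saw_landmark = landmark_position(goal_side, season) < 0
    saw_goal = goal_side == -1
    return saw_landmark, saw_goal


def infer_season(saw_landmark, saw_goal):
    """Both seen, or neither seen, means Summer; otherwise Winter."""
    return "summer" if saw_landmark == saw_goal else "winter"


def follow_strategy(goal_side, season, believed):
    """Later trials: move left for a while, then carry on or turn round.
    Returns True if the agent ends up at the goal."""
    saw_landmark = landmark_position(goal_side, season) < 0
    carry_on = saw_landmark if believed == "summer" else not saw_landmark
    reached_side = -1 if carry_on else +1
    return reached_side == goal_side


outcomes = []
for season in ("summer", "winter"):
    for goal_side in (-1, +1):
        believed = infer_season(*probe_left(goal_side, season))
        outcomes.append((season, believed, follow_strategy(goal_side, season, believed)))
```

In this sketch the probing trial reaches the goal only when the goal happens to be on the left, that is half the time, while the nine trials that follow all succeed. That is consistent with the figure of 9.5 out of 10.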
How Does It Work?
How is the dynamical neural network solving this task?
The network has developed…
• an ability to distinguish Summer from Winter
• two modes of behaviour: a Summer Strategy and a Winter Strategy
• a switch governing which mode of behaviour is exhibited
Through combining these abilities, the dynamical neural network is able to solve the whole problem.
The Summer and Winter strategies can be thought of as two attractors in the net’s dynamics. The “switch” is able to kick the system from one attractor to the other, and back again, learning which half of the problem the net is currently facing…
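Two attractors and a switch can be demonstrated with a single neuron. The sketch below is not the evolved network: it assumes one sigmoid unit with a strong self-excitatory connection and hand-picked weights, which gives it a low and a high resting state. A brief input pulse kicks it from one to the other, and it stays there after the pulse ends.

```python
import math


def run(x, inputs, w_self=10.0, bias=-5.0):
    """Update the self-exciting neuron once per entry in `inputs`."""
    for external in inputs:
        x = 1.0 / (1.0 + math.exp(-(w_self * x + bias + external)))
    return x


quiet = [0.0] * 20           # no external input
kick_up = [6.0] * 3          # brief excitatory pulse
kick_down = [-6.0] * 3       # brief inhibitory pulse

rest = run(0.0, quiet)                    # settles in the low state
switched = run(rest, kick_up + quiet)     # pulse moves it to the high state, where it stays
back = run(switched, kick_down + quiet)   # opposite pulse returns it to the low state
```

No weight changes anywhere in this example. What the neuron "remembers" about the last pulse is carried entirely by its activity.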
Is It Really Learning?
The network solves a task that requires learning by itself: there is no learning rule teaching the net. But is it really learning anything?
• As the net adapts to changes from Summer to Winter, no weights change, just the flow of activation around the network. But learning is changing weights, isn’t it?
• The net can only learn one thing. Is flicking a single switch back and forth too simple to be called learning?
• Learning is mixed up with behaving. Aren’t learning mechanisms different from the mechanisms they train?
Yamauchi and Beer’s work challenges all of these intuitions. How far can this approach be taken? Is this kind of learning different from the learning we carry out in school? If so, how?
Next time…
• More neural networks (self-organising nets, competitive nets, & more)
• Unsupervised learning algorithms
• Applications
Reading
• Randy Beer (1995) “A dynamical systems perspective on agent-environment interaction”, Artificial Intelligence 72: 173-215.
• In particular, much of today was based on www.inf.ethz.ch/~schraudo/NNcourse/intro.html