Stochastic Diffusion Processes: communication, search and cognition

“Nothing seems more possible to me than that people someday will come to the definite opinion that there is no copy in the ... nervous system which corresponds to a particular thought, or a particular idea, or memory.” • Ludwig Wittgenstein, 1948: Last writings on the Philosophy of Psychology, Volume 1. • Mark Bishop • Goldsmiths, University of London

Background • The talk is a synthesis of recent papers by Bishop (2009) and Nasuto, Bishop & De Meyer (2009): • Bishop, J.M., (2009), A Cognitive Computing fallacy? Cognition, computations and panpsychism, Cognitive Computation 1:3, pp. 221-233. • Nasuto, S.J., Bishop, J.M. & De Meyer (2009), Communicating neurons: a connectionist spiking neuron implementation of stochastic diffusion search, Neurocomputing 72, pp. 704-712. • Acknowledgements: • A number of RAs and graduate/project students worked with me to establish the foundations of SDP; in this talk in particular I draw on results from: Paul Beattie, Darren Myatt, Mohammad Majid, Daniel Jones, Tom Morey, Matt Warriner & Nicoletta Nicolaou.

Computations as cognition • In this talk I claim a ubiquitous computational metaphor lies at the heart of cognitive science in [at least] three modes: • (1) Explicitly: cognition is ‘computations on symbols’ • GOFAI (‘[physical] symbol systems’); functionalism (philosophy of mind); cognitivism (psychology); language of thought (philosophy; linguistics) • (2) Implicitly: cognition as ‘computations on sub-symbols’ • Connectionism (sub-symbolic AI; psychology; linguistics); the digital connectionist theory of mind (philosophy of mind). • (3) Descriptively: cognitive modelling via computational simulations • Hodgkin–Huxley mathematical models of neuron action potentials (computational neuroscience; psychology).

Overview • For each of the three identified modes of cognitive science I will highlight one or more well known critiques that motivate a change from the hegemony of the computational cognitive metaphor. • I will subsequently suggest a new cognitive metaphor; one grounded on ‘interactions and communication’. • And I will conclude by outlining NESTER - a novel connectionist architecture based on Stochastic Diffusion Processes - that may escape [at least some of the] classical critiques of computational cognition. • NB. This is not to suggest that we throw the baby (computational modelling) out with the bath water (the generic computational metaphor). • Simply that a new metaphor of communication may shed a new and useful light on areas of cognitive science hitherto obfuscated by the fog of mere computations.

1. Symbolic cognition • Cognition involves discrete, internal mental states (representations or symbols) whose manipulation can be described in terms of rules or algorithms: • Good old-fashioned cognitive psychology; computations on representations: • Cognitive states are computational relations to computational mental representations that have content. • Cognitive processes (changes in cognitive states) are computational operations on computational mental representations that have content. • Good old-fashioned AI; computation on symbols: • Newell & Simon’s Physical Symbol System (PSS) hypothesis: “Any intelligent machine is at its core a PSS ... a PSS has the necessary and sufficient means for general intelligent action”.

Some critiques of symbolic cognition • Gödelian: • Lucas - with [theoretical] knowledge of the Gödel formula of any mathematical system, a human is always greater than any given computational system. • Penrose - computations cannot capture all of human [mathematical] understanding. • Searlian: • The Chinese room argument - syntax is not sufficient for semantics. • Computation as an ‘observer relative’ phenomenon: • Searle - “For any program there is some sufficiently complex object such that there is some description of the object under which it is implementing the program”; e.g. Searle’s wall as an instantiation of the ‘Wordstar’ program. • Putnam - a rock implements every input-less FSA. • Bishop - a non-repeating digital counter (or, pace Putnam, any ‘open physical system’ such as a rock) implements any program with known input over a finite time period.

2. Sub-symbolic cognition • In connectionist systems, networks of learnt (tuned) feature detectors cause functionally specified cognitive effects; knowledge is defined as vectors in R^n. • A learning algorithm (e.g. back propagation) maps a spatial trajectory of network parameters in a Euclidean space, R^n. • Over time network parameters learn/evolve to perform desired mappings over pairs of real valued input/output vectors. • Strengths of classical connectionism include: • Its application to many engineering problems requiring flexible A.I. • Its use as a metaphor for both high level and low level cognitive processes.

Critiques of sub-symbolic cognition (a) Van de Velde: type / token knowledge • Standard connectionist models most naturally represent knowledge as ‘types’ or ‘classes’ (book, computer, chair etc). • A restriction Van de Velde recognised as, “... a fundamental cause of many problems when modelling symbolic processes by connectionist networks.”

Critiques of sub-symbolic cognition (b) Dinsmore: only arity zero predicates • Conventional connectionist networks can represent knowledge as tokens; however, such tokens are always materially and spatially defined as neuronal activations in the network. • Either each node represents a specific feature or knowledge is distributed across activations of groups of nodes. • Dinsmore suggests this form of representation is limited to ‘arity-zero predicates’ and that this is too strong a restriction to model general, real-world knowledge.

Critiques of sub-symbolic cognition (c) Abbott: implausible use of inhibition • For example, in many ANNs lateral inhibition has been extensively used to: • ... perform ‘winner take all’ (Grossberg); • ... normalise signals and/or prevent saturation (Douglas); • ... define topological structure (Kohonen). • However Abbott suggests there is a “lack of evidence for widespread inhibitory neuronal mechanisms in the cortex”.

3) Computational modelling • All matter - from the simplest particles to the most complex living organisms - undergoes physical processes which are not usually given any special computational interpretations. • For example, although we can describe the operation of a spring, as it extends under moderate force, by Hooke’s law, we don’t say that the spring computes, according to Hooke’s law, how much it should deform. • However, when it comes to nervous systems the situation changes abruptly. • Since the publication of the Hodgkin-Huxley equations in 1952 single neuron behaviour has been extensively modelled computationally; • Subsequently in neuroscience it has been assumed that neurons possess special computational capabilities (e.g. this neuron computes x; where x may be gradients, edges, motion etc) which are not attributed to other, more complex, biological substances (e.g. DNA).

3) Critiques [of the hegemony] of computational modelling (i) • The attribution of ‘computational capabilities’ to individual neurons is an anthropomorphic viewpoint, because computation is an intentional notion and assumes the existence of some ‘demon’ that is able to interpret it. • Thus, the very assumption of ‘computational capabilities’ of real neurons leads to a homuncular theory of mind.

3) Critiques of computational modelling (ii) • Discoveries in neuroscience since the development of the Hodgkin-Huxley model reveal complex neuronal behaviour and suggest that the mathematical characterisation of single neurons via non-linear ordinary differential equations does not capture the information processing complexity of real neurons: • In particular it has been hypothesised (e.g. by Koch and Barlow and Granger amongst others) that a neuron can select input contingent on its spatial location on the dendritic tree or its temporal structure. • Furthermore, there is strong evidence that real neurons operate on richer information than provided by a single real number (mean firing rate) and therefore that the full gamut of their operation cannot be adequately described in a standard Euclidean setting. • Instead of modelling the neuron as a logical or numerical function, perhaps it could be better described by an alternative metaphor?

An alternative metaphor: communication and interaction • Communication as process; two definitions from the dictionary: • “relating to the imparting or transmission of something”, (OED). • “something imparted, interchanged, or transmitted”, (Dictionary.com). • In this sense communication is a process of interaction that occurs between agent and umwelt; • Umwelt being the outer world, environment or reality, as it affects the agent; as such it may contain other agents. • Thus, contra computation, communication as process is: • an observer independent, objective property of agent-environment systems; • a potentially more powerful metaphor than algorithms.

Swarm Intelligence (SI) • In the last two decades there has been a shift in A.I. research away from the classical modes of equating intelligence with either mere symbol manipulation or simple connectionist systems ... • ... Moving away from the view that mind is merely equivalent to brain – a private internal process – hence de-emphasising the autonomy of the individual thinker and instead emphasising the collective nature of many intelligent processes. • Swarm Intelligence emphasises the social nature of some cognitive processes and draws inspiration from many natural collective systems that solve complex problems in search and optimisation.

Characteristics of swarm intelligence systems • Swarm Intelligence systems are typically made up of a population of simple agents interacting locally with one another and with their environment. • Swarm Intelligence agents typically follow very simple rules: • There is no centralised control structure dictating how individual agents should react and behave; • instead local interactions between agents lead to the emergence of [seemingly] intelligent global behaviour. • Natural examples of Swarm Intelligence include ant colonies, bird flocking, animal herding, bacterial growth, and fish schooling ... • ... even, as we shall see, workshop delegates seeking a good place to eat in an unfamiliar town!

The Restaurant Game • A group of delegates arrive in a foreign town for an extended workshop on the ‘Philosophy of the Information and Computer Sciences’ and need to find a good place to eat. • A ‘good’ place to eat is the restaurant where most delegates are likely to choose a meal they deem ‘GOOD’. • An individual delegate’s response to a randomly selected meal from a restaurant menu {GOOD or BAD} is termed a ‘partial hypothesis evaluation’; it provides partial evidence on the restaurant’s overall quality. • The ‘search space’ (each delegate’s hypothesis space) is the set of all restaurants in the town. • A naive exhaustive search by all the delegates for the best restaurant is impractical as there will be too many (restaurant : dish) combinations to evaluate over the duration of the workshop.

A simple metaphor* for a stochastic diffusion search to find a ‘good’ restaurant • EACH DELEGATE: • (1) Opens the ‘Yellow Pages’ and selects a restaurant to visit at random, so defining the delegate’s initial restaurant hypothesis. • (2) Partial hypothesis evaluation: at dinner the delegate selects a meal from the menu at random and subsequently decides if it was ‘GOOD’ or ‘BAD’. • (3) Utilising passive recruitment: the next morning at breakfast … • IF <last night’s meal was ‘GOOD’> • THEN maintain restaurant hypothesis and GOTO (2) • ELSE IF <last night’s meal was ‘BAD’> THEN communicate with a random colleague: • IF <colleague’s meal was ‘GOOD’> • THEN adopt colleague’s restaurant hypothesis and GOTO (2) • ELSE GOTO (1). • * The ‘Restaurant Game’ is offered as an illustration of SDS diffusion and partial evaluation mechanisms only; the restaurant game is not fully isomorphic to SDS in some pathological cases.
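The delegate procedure on this slide can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the implementation used in the cited papers: the names `restaurant_game` and `largest_cluster`, the modelling of each restaurant by a single probability that a randomly chosen meal is ‘GOOD’, and the synchronous update of all delegates at breakfast are hypothetical choices made only for this example.

```python
import random
from collections import Counter


def restaurant_game(quality, n_delegates=100, n_nights=50, seed=0):
    """Simulate the Restaurant Game.

    quality[r] is an ASSUMED model of restaurant r: the probability that a
    randomly selected meal there is judged 'GOOD'. Returns the final
    restaurant hypothesis held by each delegate.
    """
    rng = random.Random(seed)
    n_restaurants = len(quality)

    # Select an initial restaurant hypothesis at random ('Yellow Pages').
    hypothesis = [rng.randrange(n_restaurants) for _ in range(n_delegates)]

    for _ in range(n_nights):
        # Partial hypothesis evaluation: one random meal, GOOD or BAD.
        good = [rng.random() < quality[h] for h in hypothesis]

        # Passive recruitment at breakfast (assumed synchronous update).
        updated = list(hypothesis)
        for i in range(n_delegates):
            if good[i]:
                continue  # GOOD meal: maintain the current hypothesis
            colleague = rng.randrange(n_delegates)
            if good[colleague]:
                # Adopt the colleague's restaurant hypothesis.
                updated[i] = hypothesis[colleague]
            else:
                # Neither meal was GOOD: pick a new restaurant at random.
                updated[i] = rng.randrange(n_restaurants)
        hypothesis = updated

    return hypothesis


def largest_cluster(hypothesis):
    """Return (restaurant, number of delegates) for the most popular hypothesis."""
    return Counter(hypothesis).most_common(1)[0]


if __name__ == "__main__":
    # Hypothetical town: ten restaurants, the last one clearly the best.
    town = [0.2] * 9 + [0.8]
    print(largest_cluster(restaurant_game(town)))
```

No delegate ever samples a whole menu, yet the population as a whole tends to accumulate at the restaurant where a random meal is most likely to be ‘GOOD’; the size of the largest cluster serves as a rough indicator of convergence in this sketch.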

NESTER: a connectionist framework to perform SDS • Retina and Memory cells: • Correspond to search space and target. • Temporally encode what a feature is and where it is via Inter Spike Intervals (ISIs). • Matching cells: • Correspond to a population of SDS agents. • Periodically broadcast their hypothesis to other matching cells, encoded via an ISI. • All cell types operate independently and asynchronously.

NESTER implements SDS • In our 2009 paper (Nasuto, Bishop & De Meyer) we demonstrate that in its operation NESTER instantiates Stochastic Diffusion Search. • Hence, over time, the hypotheses maintained by a dynamic cluster of matching cells will converge on the best fit of the target on the retina [search space]. • Synchronisation of matching cell hypothesis-signals (encoded via ISIs) therefore indicates convergence onto the ‘best fit’ location of the target on the retina.

Knowledge representation in NESTER • Each NESTER matching cell processes bi-variate information as an ISI encoding a ‘feature’ value and an ‘identifier’ value. • A ‘feature’ value: • Temporal encoding of the value of a target ‘feature’. • An ‘identifier’ value: • Temporal encoding of the relative position of the feature (either on the retina or in the target). • Hence in NESTER knowledge is not restricted to arity zero predicates and knowledge is naturally processed as ‘tokens’ not ‘types’.

Non-spatial binding of semantic knowledge in NESTER • Unlike conventional connectionist systems, in NESTER knowledge is not physically bound to specific matching cells, as the activity of individual cells dynamically fluctuates over time. • Hence in an individual matching cell (or specific groups of cells), activity has no fixed semantic interpretation. • Instead, by a process of communication and interaction, a network of NESTER matching cells naturally self-organises in response to environmental stimuli. • On convergence, temporal stability in the search space is reflected by collective temporal stability in a pattern of activity across matching cells. • Such a cluster is dynamic in nature, yet stable, analogous to, “a forest whose contours do not change but whose individual trees do”, (Arthur).

Stochastic Diffusion Processes: cognition as communication • In this talk I have criticised the ubiquity of the computational metaphor in Cognitive Science. • I have introduced the ‘Restaurant Game’ as a metaphor for a simple Stochastic Diffusion Search (SDS) and subsequently described NESTER, a spiking neuron connectionist implementation of SDS. • In conclusion I suggest that NESTER is a potentially interesting cognitive architecture as it: • is not vulnerable to [at least some of] the standard critiques of computational connectionism; • and is most naturally understood in terms of [the metaphors of] interaction and communication. • For SDSdemo see: <http://doc.gold.ac.uk/~map01mm/SDSSim/>. • For SDP repository see <http://www.doc.gold.ac.uk/~mas02mb/sdp/index.htm>.

Some investigations employing Stochastic Diffusion Processes • A unification mechanism for Baars’ ‘global workspace’ and Dennett’s ‘multiple drafts’ (Nasuto); a solution to the binding problem (Nasuto); a model for multi-stable visual attention (Nasuto); models of visual attention (Summers); a novel metaphor for cognitive processing (Nasuto, Bishop et al.); parameter estimation / 3D computer vision (Bishop; Myatt); resource allocation (Majid); sequence detection (Jones); lip tracking (Grech-Cini & McKee); eye tracking (Bishop & Torr); mobile robot localisation (Beattie et al.); site selection for wireless networks (Hurley & Whitaker); speech recognition (Nicolaou); methods for automated object placement in virtual scenes (Cant & Langensiepen); feature tracking in Atmospheric Motion Vectors (Hernandez-Carrascal & Nasuto); system for hybridized efficient genetic algorithms to solve bi-objective optimization problems with application to network computing (US PATENT 60/941,600); automatic reconstruction of 3D dendritic structure from optical light microscopy serial stacks (Nasuto); physically inspired artificial learning models (Ruta & Gabrys); cellular automata and immunity amplified stochastic diffusion search (Coulter & Ehlers); hybrid control system for collectives of evolvable nanorobots and microrobots ([US PATENT AG06F1900FI] Solomon Research); individual customers influence on the operation of virtual power plants (Britta [MVV Energie]); stochastic diffusion search for real-time web search (Hameed); swarm intelligence systems for transportation engineering: principles and applications (Teodorovic); stochastic diffusion search and voting methods (Nircan); swarming behaviour in wagering gaming machines ([PATENT WO 2009005578 20090108]); noise, cost and speed-accuracy trade-offs: decision-making in a decentralized system (Marshall, Dornhaus, Franks & Kovacs); computational molecular biology (Jones); moon rover localisation (Hari & Thiyagarajan); testing and evaluation of the effectiveness of the stochastic search and optimization algorithms developed in a dynamic military systems environment ([US Military Research Call]); swarm intelligence stability based on stochastic diffusion search (Abbas, Mudathir, Rao & Rao); stochastic programming of computer agents and system of systems designs (US Military).
