Distributed representations in AI: Building the world model and analogical reasoning

Distributed representations in AI: Building the world model and analogical reasoning
Dr. Dmitri Rachkovskij
Dept. of Neural Information Processing Technologies, International Research and Training Center of Information Technologies and Systems, National Ukrainian Academy of Sciences, Kiev, Ukraine
dar@infrm.kiev.ua

Distributed representations in AI: Building the world model and analogical reasoning: Plan
1. Intro: The world model - content and organization
• The world model: models of attributes, objects, relations, episodes
• Part-whole hierarchy
• Classification hierarchy
2. The world model based on distributed representations
• Symbolic, local, and distributed representations
• General architecture of the world model
• Representation and processing of simple structures
• Representations of sequences and episodes
3. Analogical reasoning
• Analogy and its research and modeling by cognitive psychologists
• Modeling of analogy with distributed representations

The Agent's internal world model - the system of knowledge about the domain and the Agent itself. It is necessary for the organization of intelligent behavior.
The world model stores
• episodes and situations encountered by the Agent
• reactions to situations
• evaluations of results
• etc.
The world model is used for
• recognition
• analysis
• prediction
• reaction
• etc.

Models of attributes, objects, relations, situations, ...
Content of models - appearance, structure, behavior, etc. of objects
Attributes: black, furry, barking, big, four-legged, ...
Objects - real or ideal: physical bodies (table), animals (dog), unreal (centaur)
Relations: spatial (above), temporal (after), part-whole (part-of)
• A relation R(X,Y,…) requires several objects X,Y,…
• Undirected (X and Y are neighbors)
• Directed (X above Y, X sold Y a BOOK)
• Arguments of directed relations have roles (agent X, object Y, etc.)
Episodes and situations - many objects and relations (hunting, war, ...)

Compositional structure of the world model: part-whole relations and hierarchy
Models are interrelated. The model of an object (car) is associated with the models of its parts (body, motor, wheels) and of its attributes (color, etc.).
The division into objects and attributes is not absolute. The model of a part (e.g., wheel) can have
• its own models-parts (tire, rim, cap)
• attributes (shape - ring, texture - "tread", color - black, material - rubber)
[Figure labels: Structural attributes; Attributes of appearance]

Model of situation
The part-whole hierarchy is also called "meronymy-holonymy", "aggregate", "modular", "compositional", or "structural".
A model-whole may include (associate) models-parts of goals, actions, costs, evaluations, feelings, etc.

Examples of hierarchical structures • Logical or symbolic propositions • Patterns in structural or syntactic recognition • Complex chemical substances • Proteins in molecular biology • Computer programs • Knowledge bases • etc.

Classification structure of a world model: is-a relations and hierarchy
Models of classes - combinations of attribute models
The is-a relation (a cat is an animal) is also hierarchical: there exist more abstract (general) and less abstract (specific) classes

Operations with class models
Classes allow transferring experience to new objects and situations and making predictions. E.g., similar-looking objects often behave similarly.

Available "world models"
World models ~ Knowledge Bases ~ Ontologies are beginning to be used in diverse applications
• CYC - a general ontology for commonsense knowledge (Lenat and Guha)
• UMLS (Unified Medical Language System) - an ontology of medical concepts (Humphreys and Lindberg)
• WORDNET - one of the most comprehensive lexical ontologies (Miller)
• etc.

Representation of information in the world models
Representation schemes used in ontologies:
• Conceptual graphs
• Semantic networks
• Frames
• Prolog predicates
• Other special representation languages
All these representation schemes are based on traditional symbolic and local representations of information, which have drawbacks

Distributed representations
(1) immediately reflect the degree of similarity
(2) provide high information capacity
They allow
(3) the use of associative memory
(4) formation of part-whole hierarchies
(5) modeling of analogical reasoning

Local and symbolic implementations of models
Representation of information in computer. Pyramidal networks.
address   field1    field2   field3     field4     …
1         name_A    "A"      -          -          …
2         name_B    "B"      -          -          …
3         name_AB   "AB"     address1   address2   …
...       ...       ...      ...        ...        ...
N
Bracketed representations: ((A B) (A C D))...(...)
Symbolic representations:
entities (sun planet)
expressions (((mass sun) :name masssun) ((mass planet) :name mass-planet) ((greater masssun massplanet) :name

Problems with symbolic and local representations
• All-or-none similarity. Models are either identical or dissimilar:
A is identical to A
A is non-identical to B, C, …, Z, …
A is a pointer to ((B C) ((B D) (X Y Z))...
• Information capacity. With "1 of N" coding, N units hold up to N models

Problems with symbolic and local representations
• Comparing complex structured models and estimating their similarity is computationally hard (comparing graphs requires finding a partial isomorphism)

Problems with symbolic and local representations
• Graph isomorphism is not enough to explain how humans estimate similarity
(1) The fascists invaded France, causing people to flee France
(2) Rats infested the apartment, causing the people to leave the apartment
(3) The game show host kissed the contestant, inviting the audience to applaud the contestant
Abstract structure of the 3 sentences: Relation12 [Relation1(X,Y), Relation2(Z,Y)]
Correspondences due to the abstract scheme:
The fascists <-> the game show host (X) ??
France <-> contestant (Y) ??
invaded <-> kissed (Relation1) ??
People <-> audience (Z) ??
to flee <-> to applaud (Relation2) ??

Distributed representations
• In distributed representations, any object is represented by a distributed pattern of units' activity
Long binary sparse stochastic codevector:
• binary (elements 0 and 1)
• number of elements N ~ 100 000
• number of 1s M ~ 1000 << N
• 1s have (pseudo)random positions
Overlap of two independent random codevectors: O = p*p*N, where p = M/N, so O = M*M/N = 10, i.e. 1% of M
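The overlap estimate above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a codevector can be stored as the set of positions of its 1s, and the helper names random_codevector and overlap are hypothetical. N and M are the values from the slide.

```python
import random

N = 100_000  # number of elements (value from the slide)
M = 1_000    # number of 1s per codevector (value from the slide)

def random_codevector(rng, n=N, m=M):
    """Sparse binary codevector, stored as the set of positions of its 1s (an assumption)."""
    return frozenset(rng.sample(range(n), m))

def overlap(x, y):
    """Number of positions where both codevectors have a 1."""
    return len(x & y)

rng = random.Random(0)
pairs = [(random_codevector(rng), random_codevector(rng)) for _ in range(200)]
mean_overlap = sum(overlap(x, y) for x, y in pairs) / len(pairs)
expected_overlap = M * M / N  # p * p * N with p = M / N

print(f"expected {expected_overlap:.1f}, measured mean {mean_overlap:.2f}")
```

Unrelated codevectors therefore share only about 1% of their 1s, which leaves a wide range of overlap values available for expressing graded similarity.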

Advantages of distributed representations
Efficient use of resources for information representation: up to C(N,M) items, the number of ways to place M 1s in N positions (compare to N for local representations)
Natural representation and estimation of similarity. Similarity of X and Y is calculated by the dot product: S(X,Y) = |X & Y| = SUM X_i Y_i, i = 1,…,N
For binary vectors, the dot product equals the number of overlapping 1s
[Figure: codevector pairs labeled with similarity values 0.01, 0.5, and 0.5]
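Both claims on this slide are easy to illustrate. The sketch below is an illustration only: it compares the capacity bound C(N,M) with N on a log10 scale (the helper log10_binomial is hypothetical and uses lgamma to avoid building a huge integer), and it checks on two small made-up vectors that the dot product of binary vectors equals the number of shared 1s.

```python
import math

N = 100_000  # number of elements (from the slides)
M = 1_000    # number of 1s (from the slides)

def log10_binomial(n, m):
    """log10 of C(n, m), computed through lgamma."""
    ln_c = math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
    return ln_c / math.log(10)

log10_capacity = log10_binomial(N, M)  # distributed: up to C(N, M) codevectors
log10_local = math.log10(N)            # local "1 of N" coding: up to N models

def dot(x, y):
    """Dot product S(X,Y) = SUM X_i * Y_i."""
    return sum(xi * yi for xi, yi in zip(x, y))

# Two small illustrative binary vectors (made up for this example).
X = [1, 0, 1, 1, 0, 0, 1, 0]
Y = [1, 1, 0, 1, 0, 0, 0, 0]
similarity = dot(X, Y)
ones_x = {i for i, v in enumerate(X) if v}
ones_y = {i for i, v in enumerate(Y) if v}
shared_ones = len(ones_x & ones_y)

print(f"log10 C(N,M) ~ {log10_capacity:.0f}, log10 N = {log10_local:.0f}")
print(f"dot product = {similarity}, shared 1s = {shared_ones}")
```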

Some history
Academician N.M. Amosov founded the Dept. of BioCybernetics, Inst. of Cybernetics, Kiev, in the 1960s
Modeling of Thinking and the Mind, Spartan Books, USA, 1967.
M-networks: Amosov, N.M., Basilevsky E.B., Kasatkin A.M., Kasatkina L.M., Luk A.N., Kussul E.M., & Talayev S.A. (1972) M-network as possible basis for construction of heuristic models, Cybernetica 3, pp. 169-186.
1974 - the world's first autonomous vehicle controlled by neural networks. The vehicle could move in a natural environment
Amosov, N.M., Kasatkin, A.M., & Kasatkina L.M. (1975) Active semantic networks in robots with an autonomous control. Fourth Intern. Joint Conference on Artificial Intelligence, v.9, pp. 11-20
Amosov, N.M., Kussul, E.M., & Fomenko, V.D. (1975) Transport robot with a neural network control system. Advance papers of the Fourth Intern. Joint Conference on Artificial Intelligence, v.9, pp. 1-10

Associative-Projective Neural Networks
APNNs - proposed by Dr. Kussul in 1983
• E. Kussul (1992) Associative neuron-like structures
• T. Baidyk (2002) Neural networks and problems of Artificial Intelligence
• Amosov, Baidyk, Goltsev, Kasatkin, Kasatkina, Kussul, Rachkovskij. Neurocomputers and Intelligent Robots (Spanish translation is prepared)
Donald Hebb (1949) Organization of behavior

The APNN architecture of the world model based on distributed representations
Module M includes buffer fields BUF and an associative field ASC.
BUF - performs bitwise operations
ASC - returns the most similar codevector from the memory
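The role of the associative field can be sketched as a nearest-neighbor search over stored codevectors. This is a hedged illustration, assuming an exhaustive comparison against every stored item rather than a Hopfield-type distributed memory. The stored names and the helpers corrupt and most_similar are hypothetical.

```python
import random

N, M = 100_000, 1_000  # codevector parameters from the slides

def random_codevector(rng):
    """Sparse binary codevector as the set of positions of its 1s (an assumption)."""
    return frozenset(rng.sample(range(N), M))

def corrupt(x, rng, keep=0.6):
    """Noisy copy of x: keep a random part of its 1s, refill the rest at random."""
    ones = set(rng.sample(sorted(x), round(keep * len(x))))
    while len(ones) < len(x):
        ones.add(rng.randrange(N))
    return frozenset(ones)

def most_similar(probe, memory):
    """Return the name of the stored codevector with the largest overlap with the probe."""
    return max(memory, key=lambda name: len(probe & memory[name]))

rng = random.Random(1)
memory = {name: random_codevector(rng) for name in ("dog", "cat", "table", "car")}
probe = corrupt(memory["cat"], rng)   # distorted version of a stored model
retrieved = most_similar(probe, memory)

print(retrieved, len(probe & memory["cat"]))
```

Because unrelated codevectors overlap in only about 10 positions, a probe that keeps even a modest share of the original 1s still retrieves the right model.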

The APNN architecture: associative field
[Figure panels: Storage; Retrieval]

The APNN architecture: associative field
[Figure panels: local version (Storage, Retrieval); distributed version (Hopfield-type)]

A simple parallel scheme of construction and processing of part-whole structures
Construction by superposition (disjunction) & thinning: |<A v alpha> & A| = 0.5 |A| = 0.5 |alpha|, where <...> = thinning = grouping
Decoding of the model-whole through its parts
Searching for the most similar model: A' is similar to A, alpha' is similar to alpha
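Construction by superposition and thinning can be sketched as follows. This is one possible reading and not necessarily the exact APNN procedure: the thinning rule here is an assumption. It keeps those 1s of the superposition that also fall into a union of cyclically shifted copies of the same superposition, with the number of copies chosen so that about M 1s survive. The function names and the choice of shifts are hypothetical.

```python
import math
import random

N, M = 100_000, 1_000  # codevector parameters from the slides

def random_codevector(rng):
    """Sparse binary codevector as the set of positions of its 1s (an assumption)."""
    return frozenset(rng.sample(range(N), M))

def shift(x, s):
    """Cyclic shift of a codevector by s positions."""
    return frozenset((i + s) % N for i in x)

def thin(z, seed=0):
    """Thinning <z>: keep the 1s of z that are covered by shifted copies of z itself."""
    density = len(z) / N
    keep_fraction = M / len(z)  # aim for about M surviving 1s
    k = round(math.log(1 - keep_fraction) / math.log(1 - density))
    shifts = random.Random(seed).sample(range(1, N), k)  # fixed set of shifts
    mask = set()
    for s in shifts:
        mask |= shift(z, s)
    return frozenset(z & mask)

rng = random.Random(2)
a, alpha = random_codevector(rng), random_codevector(rng)
whole = thin(a | alpha)  # <A v alpha>

share_a = len(whole & a) / len(a)
share_alpha = len(whole & alpha) / len(alpha)
print(len(whole), round(share_a, 2), round(share_alpha, 2))
```

In this sketch the model-whole has roughly the same number of 1s as each part and retains about half of the 1s of each, so either part can serve as a cue for retrieving the whole from associative memory.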

A simple parallel scheme of construction and processing of part-whole structures Interpretations of A  Retrieval of the model-whole by its model-part Association between two models-parts

Architecture of the world model using distributed representations

Representation of sequences and hierarchical episodes
AB vs BA. X>>n (n is the position number). A>>1 B>>2 vs B>>1 A>>2.
( ,(),(,(,)) ) - labeled ordered acyclic graph
[Figure: graph of "Spot bit Jane, causing Jane to flee from Spot" with edges labeled >>1, >>2, >>3]
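The AB vs BA example can be sketched by marking the position of each element with a shift before superposition. The slide does not specify how X>>n is implemented, so this sketch assumes a cyclic shift of the codevector by n positions and omits thinning for brevity. The helper names are hypothetical.

```python
import random

N, M = 100_000, 1_000  # codevector parameters from the slides

def random_codevector(rng):
    """Sparse binary codevector as the set of positions of its 1s (an assumption)."""
    return frozenset(rng.sample(range(N), M))

def shift(x, n):
    """X>>n, assumed here to be a cyclic shift by n positions."""
    return frozenset((i + n) % N for i in x)

def sequence(items):
    """Superposition of the items, each shifted by its position number (1, 2, ...)."""
    code = set()
    for position, item in enumerate(items, start=1):
        code |= shift(item, position)
    return frozenset(code)

rng = random.Random(3)
A, B, C = (random_codevector(rng) for _ in range(3))

ab = sequence([A, B])  # A>>1 v B>>2
ba = sequence([B, A])  # B>>1 v A>>2
ac = sequence([A, C])  # shares A>>1 with ab

print(len(ab & ba), len(ab & ac))
```

The codevectors of AB and BA overlap only at chance level, while AB and AC share all the 1s of A>>1, so order is preserved and sequences with a common element in the same position remain similar.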

Properties of distributed representations of complex hierarchical models
• The codevectors of parts and wholes have the same dimensionality
• The codevector of a model-whole is similar to the codevectors of its models-parts
• Hierarchical models that are similar (with respect to objects and relations) have similar codevectors
• Associations (between parts and wholes, attributes and objects, attributes and classes, objects and situations, etc.) are made by similarity of codevectors (not by connections or pointers, as in local and symbolic representations)

Comparison of hierarchical structures and analogical reasoning
The ability to estimate the similarity of complex structured representations easily is essential for many AI problems
One interesting problem is the modeling of analogical reasoning: Gentner & Markman (1995, 1997, 2003); Hummel & Holyoak (1997); Eliasmith & Thagard (2001)
Analogy, metaphor - a comparison process that allows one domain to be considered from the point of view of a different domain (Gentner & Markman 1995, 2003)
Rutherford: solar system -> atom

Similarity of analogs
Analogs are hierarchically structured episodes or situations
Analogs are compared not only by "surface similarity" (common or similar elements - objects, relations, etc.)
"Structural similarity" is very important - how the elements are grouped in analogs ("structural consistency", "isomorphism")

3 stages of analogy processing
1. Access (retrieval, recall) - finding in memory the base analog most similar to a given target (input, cue, probe) episode
2. Mapping - finding correspondences between the elements of the two analogs
3. Inferences about the target analog based on the information from the base

Analogy in everyday life
• Solution of problems. The base analog is used as a source of ideas about the target problem
• Explanations. The base analog is used for understanding the target analog
• Formation and evaluation of hypotheses
• Justification of a point of view (in political, historical, etc. discussions)
• In literature
• Etc.

Analogy in solving problems
Target analog - heat flow. Base analog - water flow.

Study of analogy in humans. Various types of analog similarity.
Example - episodes with animals, adapted from Thagard, Holyoak, Nelson & Gochfeld (1990).
General scheme: R0 (R1(X,Y), R2(Y,X))
The episodes have the same relations as the Probe, but various types of similarity to it

Modeling analogical reasoning with distributed representations
Access to analogical episodes
Similar structures have similar codevectors. The total similarity of structures is evaluated by the overlap of their codevectors
Access to an analogical episode is done by finding in long-term memory the codevector most similar to the codevector of the input episode

Access to analogical episodes. Episodes with animals
Humans demonstrate the following pattern of retrieving analogs from long-term memory: LS > CM  SF > AN > FOR (Forbus, Gentner & Law 1995; Ross 1989; Wharton, Holyoak 1994)
Similarity values between codevectors of episodes (our model)

Mapping analogs with distributed representations
Mapping (interpretation) of analogy - finding the corresponding elements of the analogs
In our model, many analogs can be mapped by direct similarity of the codevectors of their elements
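Mapping by direct similarity can be sketched as follows: each element of the target is paired with the base element whose codevector overlaps with it most. This is an illustration under simplifying assumptions, not the model from the slides. The target episode ("Fido bit John, causing John to flee from Fido") and the prototype codevectors are made up, and the element codevectors here carry no information about roles or context.

```python
import random

N, M = 100_000, 1_000  # codevector parameters from the slides

def random_codevector(rng):
    """Sparse binary codevector as the set of positions of its 1s (an assumption)."""
    return frozenset(rng.sample(range(N), M))

def variant(prototype, rng, keep=0.7):
    """Codevector sharing about `keep` of the prototype's 1s; the rest are random."""
    ones = set(rng.sample(sorted(prototype), round(keep * len(prototype))))
    while len(ones) < len(prototype):
        ones.add(rng.randrange(N))
    return frozenset(ones)

def map_by_similarity(target, base):
    """For each target element, pick the base element with the largest overlap."""
    return {t: max(base, key=lambda b: len(target[t] & base[b])) for t in target}

rng = random.Random(4)
proto = {name: random_codevector(rng) for name in ("dog", "human", "bite", "flee")}

# Base episode: "Spot bit Jane, causing Jane to flee from Spot"
base = {"Spot": variant(proto["dog"], rng), "Jane": variant(proto["human"], rng),
        "bit": variant(proto["bite"], rng), "flee": variant(proto["flee"], rng)}
# Hypothetical target episode: "Fido bit John, causing John to flee from Fido"
target = {"Fido": variant(proto["dog"], rng), "John": variant(proto["human"], rng),
          "bit": variant(proto["bite"], rng), "flee": variant(proto["flee"], rng)}

mapping = map_by_similarity(target, base)
print(mapping)
```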

Mapping by similarity

Mapping analogs with distributed representations. More sequential scheme.

Distributed representations in APNNs
(1) immediately reflect the degree of similarity
(2) provide high information capacity
They allow
(3) the use of distributed associative memory
(4) formation of part-whole hierarchies
(5) modeling of analogical reasoning
