
Abstracting and Composing High-Fidelity Cognitive Models of Multi-Agent Interaction


Presentation Transcript


  1. Abstracting and Composing High-Fidelity Cognitive Models of Multi-Agent Interaction
     MURI Kick-Off Meeting, August 2008
     Christian Lebiere, David Reitter
     Psychology Department, Carnegie Mellon University

  2. Main Issues
     • Understand scaling properties of cognitive performance
       • Most experiments look at a single performance point rather than performance as a function of problem complexity, time pressure, etc.
       • Key component in abstracting performance at higher levels
     • Understand interaction between humans and machines
       • Most experiments study and model human performance under a fixed scenario that misses key dynamics of interaction
       • Key aspect of both system robustness and vulnerabilities
     • Understand generality and composability of behavior
       • Almost all models are developed for specific tasks rather than assembling larger pieces of functionality from basic pieces
       • Key enabler of scaling models and abstracting their properties

  3. Cognitive Architectures
     • What is a cognitive architecture?
       • Invariant mechanisms to capture generality of cognition (Newell)
       • Aims for both breadth (Newell Test) and depth (quantitative data)
     • How are they used?
       • Develop model of a task (declarative knowledge, procedural strategies, architectural parameters)
       • Limits of model fitting (learning mechanisms, architectural constraints, reuse of model and parameters)
     • ACT-R
       • Modular organization, communication bottlenecks, mapping to brain regions
       • Mix of symbolic production system and subsymbolic statistical mechanisms (equations below)
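For concreteness, the central subsymbolic quantity in ACT-R is chunk activation; the standard form of the equations (Anderson & Lebiere, 1998) is:

    A_i = B_i + \sum_j W_j S_{ji} + \epsilon
    B_i = \ln\left( \sum_{k=1}^{n} t_k^{-d} \right)

where B_i is the base-level activation accumulated over the n past uses of chunk i at lags t_k with decay rate d, W_j and S_ji are attentional source weights and associative strengths from the current context, and epsilon is transient noise.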

  4. ACT-R Cognitive Architecture
     [Diagram: ACT-R's modular organization - the goal, retrieval, visual and manual buffers link the production system (with utility learning) to the intentions, declarative memory (with activation learning and latency), vision and motor modules, and through them to the world; a sample aircraft stimulus with Size, Fuel and Turbulence attributes flows through the system.]
     Sample production:
     IF the goal is to categorize a new stimulus and visual holds stimulus info S, F, T
     THEN start retrieval of chunk S, F, T and start manual mouse movement
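As a loose illustration only: ACT-R productions are actually written in Lisp, but the slide's production can be read as a condition-action pair over buffers. A minimal Python sketch, with all names and structures hypothetical:

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Buffers:
        goal: str                                    # current goal type
        visual: dict = field(default_factory=dict)   # stimulus features from vision
        retrieval_request: Optional[dict] = None     # pending declarative query
        manual_action: Optional[str] = None          # pending motor command

    def categorize_stimulus(b: Buffers) -> None:
        # IF the goal is to categorize a new stimulus and visual holds S, F, T
        # THEN start retrieval of chunk S, F, T and start manual mouse movement.
        if b.goal == "categorize" and b.visual:
            b.retrieval_request = dict(b.visual)   # query declarative memory
            b.manual_action = "move-mouse"         # hand off to the motor module

    b = Buffers(goal="categorize", visual={"size": 3, "fuel": "L", "turb": 1})
    categorize_stimulus(b)
    print(b.retrieval_request, b.manual_action)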

  5. Sample Task: AMBR Synthetic ATC

  6. Model - Methodology
     • Model designed to solve the task simply and effectively
       • Not engineered to reproduce any specific effects
     • Reuse of common design patterns
       • Makes modeling easier and faster
       • Reduces degrees of freedom
     • No fine-tuning of parameters
       • Left at default values or roughly estimated from data (2)
     • Architecture provides automatic learning of the situation
       • Position & status of aircraft (AC) naturally learned from interaction

  7. Model - Methodology II
     • As many model runs as subject runs
       • Performance variability is an essential part of the task!
       • Model speed is essential (5 times real-time in this case)
     • Stochasticity is a fundamental feature of the architecture (sketched below)
       • Production selection
       • Declarative retrieval
       • Perception and actions
     • Stochasticity amplified by interaction with the environment
     • Model captures most of the variance of human performance
       • No individual variations factored into the model (W, efforts)
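A minimal sketch of one of these noise sources, production selection (the utilities and the scale value here are invented for illustration, not the model's actual parameters):

    import math
    import random

    def utility_noise(s: float = 0.25) -> float:
        # Sample from a logistic distribution with scale s (variance s^2 * pi^2 / 3),
        # the form of transient noise ACT-R adds to production utilities.
        p = min(max(random.random(), 1e-9), 1.0 - 1e-9)
        return s * math.log(p / (1.0 - p))

    def select_production(utilities: dict) -> str:
        # Conflict resolution: among matching productions, fire the one whose
        # noise-perturbed utility is highest; run-to-run variability follows.
        return max(utilities, key=lambda name: utilities[name] + utility_noise())

    # Hypothetical utilities: repeated calls yield a distribution of choices.
    print(select_production({"scan-text": 2.0, "scan-screen": 1.8, "process": 1.9}))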

  8. Model - Overview
     • 5 (simple) declarative chunks encoding instructions
       • Associate color to action and penalty
     • 36 (simple) productions organized in 5 unit tasks
       • Color-Goal (5): top-level goal to pick next color target
       • Text-Goal (4): top-level goal to pick next area to scan
       • Scan-Text (7): goal to scan text window for new messages
       • Scan-Screen (8): goal to scan screen area for exiting aircraft
       • Process (12): processes a target with 3 or 4 mouse clicks
     • Unit tasks map naturally to ACT-R goal types and production matching - a natural design pattern (sketched below)
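A rough rendering of that design pattern: each unit task is a goal type, and productions fire only when the current goal matches their type. The handler bodies below are placeholders, not the model's 36 actual productions:

    def color_goal(state):  return "pick next color target"
    def text_goal(state):   return "pick next area to scan"
    def scan_text(state):   return "scan text window for new messages"
    def scan_screen(state): return "scan screen area for exiting aircraft"
    def process(state):     return "process target with 3-4 mouse clicks"

    # Dispatch keyed on the goal type, as ACT-R production matching is keyed
    # on the chunk type held in the goal buffer.
    UNIT_TASKS = {"color-goal": color_goal, "text-goal": text_goal,
                  "scan-text": scan_text, "scan-screen": scan_screen,
                  "process": process}

    def step(goal_type: str, state: dict) -> str:
        return UNIT_TASKS[goal_type](state)

    print(step("scan-text", {}))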

  9. Flyoff - Performance
     • Performance is much better in the color than the text condition
     • Performance degrades sharply with time pressure for text
     • Good fit except for text-high: huge variation, with tuneup too

  10. Flyoff - Distribution
      • The model can yield a wide range of performances through retrieval and effort stochasticity and dynamic interaction
      • Model variability consistently tends to be lower than the subjects'

  11. Flyoff - Penalty Profile
      • Errors: no speed-change or click errors, but incorrect and duplicated messages occurring during the handling of holds
      • Delays: more holds for high but fewer welcome and speed

  12. Flyoff - Latency
      • Response times increase exponentially with the number of intervening events, and faster in the text than the color condition
      • Model is slightly faster in the color but slower in the text condition

  13. Flyoff - Selection
      • The number of selections decreases roughly exponentially, with text starting lower but trailing off longer, with a final spike
      • Ceiling effect in color condition (mid & high): see workload

  14. Flyoff - Workload
      • Workload is higher in the text condition and increases faster
      • Model reproduces both effects but misses the ceiling effect in the color condition even though it gets it for the selection measure!

  15. Learning Categories
      • Model learns responses through instance-based categorization
      • Learning curve and level of performance reflect the complexity of the function mapping aircraft characteristics to responses

  16. Transfer Errors
      • Transfer performance is defined by (linear) similarities between stimulus values along each dimension (size, fuel, turbulence) - sketched below
      • Excellent match to trained instances (better than trial 8!)
      • Extrapolated stimuli: syntactic priming or non-linear similarities?
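A minimal sketch of the instance-based categorization from slides 15-16, with linear similarities along each dimension. The instances, feature ranges and mismatch-penalty value are hypothetical, and base-level activation and noise are omitted:

    def linear_similarity(a: float, b: float, rng: float) -> float:
        # Linear mismatch along one dimension, scaled into [-1, 0].
        return -abs(a - b) / rng

    def categorize(instances, probe, ranges=(2.0, 2.0, 2.0), mp=1.0):
        # Retrieve the stored instance whose partial-match score (mismatch
        # penalty mp times the summed similarities) is highest, and reuse
        # its response; similar probes thus transfer trained responses.
        def score(inst):
            features, _ = inst
            return mp * sum(linear_similarity(f, p, r)
                            for f, p, r in zip(features, probe, ranges))
        _, response = max(instances, key=score)
        return response

    # Hypothetical instances: ((size, fuel, turbulence), response) pairs.
    memory = [((1, 1, 1), "accept"), ((3, 2, 1), "hold"), ((2, 3, 3), "reject")]
    print(categorize(memory, (3, 2, 2)))   # nearest trained instance -> "hold"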

  17. Individual Stimuli Predictions
      • Good match to the probability of accepting individual stimuli for each category
      • RMSE:
        • Cat. 1 = 14.1%
        • Cat. 3 = 13.4%
        • Cat. 6 = 12.5%

  18. Task Approach
      • Use a task similar to AMBR (AMBR variant, Team Argus, CMU-ASP (Aegis)) for exploration
        • Introduce the team aspect that is implicit in the task by interchangeably replacing controllers with humans, models or agents
        • Right properties: tractable and scalable, even though somewhat abstract
      • Scale model to other domains (UAV control, Urban Search and Rescue) and environments (DDD, NeoCities)
        • Force model generalization across environments
        • Explore fidelity/tractability tradeoffs

  19. Issue 1: Scaling Properties
      • Cognitive science is usually concerned with absolute performance (e.g. latency) at fixed complexity points
        • Often less discriminative than scaling properties
      • Study human performance at multiple complexity points to understand scaling and robustness issues
        • Scaling provides strong constraints on algorithms and representations
        • Robustness is a key issue in extrapolating individual performance to multi-agent interaction and overall network performance, reliability and fault-tolerance
      • Quantify impact on all measures of performance
        • Converging measures of performance provide stronger evidence than separate measures susceptible to parametric manipulation
      • Understanding of scaling is key to enabling abstraction

  20. Constraints and Analyses
      • AMBR illustrated the strong cognitive constraints put on the scaling of performance as a function of task complexity
      • Past analyses have shown the impact of:
        • Architectural component interactions (Wray et al., 2007)
        • Representational choices (Lebiere & Wallach, 2001)
        • Parameter settings on dynamic processes (Lebiere, 1998)

  21. Scaling Experiments
      • Study human performance at multiple complexity points to understand scaling and robustness issues
        • Vary task complexity (e.g. level of aircraft autonomy)
        • Vary problem complexity (e.g. number of aircraft)
        • Vary information complexity (e.g. aircraft characteristics)
        • Vary network topology (e.g. number of controllers)
        • Vary rate of change of environment (e.g. appearance or disappearance of aircraft, weather, network topology)
      • Quantify impact on all measures of performance
        • Direct performance (number of targets handled, etc.)
        • Situation awareness (levels, memory-based measures)
        • Workload (both self-reporting and physiological measures)

  22. Issue 2: Dynamic Interaction
      • The main problem in developing high-fidelity cognitive models of multi-agent interaction is the increased degrees of freedom (DOF) of open-ended agent interaction
      • A methodology has been developed to model multi-agent interactions in games and logistics (supply chain) problems (West & Lebiere, 2001; Martin et al., 2004)
        • Develop baseline model to capture first-order dynamics
        • Replace most human-in-the-loop (HITL) runs with baseline model(s) to reduce DOF
        • Refine model based on greater data accuracy and revalidate
      • Methodology can be extended to multiple levels of our hierarchy, each time abstracting to the next level
        • Also extends to heterogeneous simulations with mixed levels including HITL, models and agents

  23. Results: Model against Model
      • Performance resembles a random walk with widely varying outcomes
      • Distribution of streaks hints at fractal properties
      • The model with the larger lag will always win in the long run

  24. Results: Model against Human
      • Performance of humans against the lag1 model is similar to that against the lag2 model
      • Lag2 model takes time to get started because of longer chunks, whereas the lag1 model starts faster because it uses fewer, shorter chunks

  25. Results: Effects of Noise
      • Performance improves sharply with noise, then gradually decreases (see the sketch below)
      • Noise fundamentally alters the dynamic interaction between players
      • Noise is essential to adaptation in changing real-world environments
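A toy sketch of the lag-model setup behind these results, in the spirit of West & Lebiere (2001): chunks map the opponent's last N moves to their next move, and noise keeps play stochastic. The game here is paper-rock-scissors, and the count table and epsilon-style noise are simplified stand-ins for ACT-R chunks and activation noise:

    import random
    from collections import defaultdict

    BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

    class LagModel:
        def __init__(self, lag: int, noise: float = 0.1):
            self.lag, self.noise = lag, noise
            self.counts = defaultdict(lambda: defaultdict(int))
            self.history = []

        def play(self) -> str:
            context = tuple(self.history[-self.lag:])
            options = self.counts.get(context)
            if options and random.random() > self.noise:
                predicted = max(options, key=options.get)  # most active chunk
                return BEATS[predicted]                    # counter the prediction
            return random.choice(list(BEATS))              # no chunk yet, or noise

        def observe(self, opponent_move: str) -> None:
            context = tuple(self.history[-self.lag:])
            self.counts[context][opponent_move] += 1       # strengthen the chunk
            self.history.append(opponent_move)

A lag2 instance needs two prior moves per chunk, so it accumulates usable chunks more slowly but captures longer regularities, matching the startup difference noted on slide 24.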

  26. Interactive Alignment
      • Tendency of interacting agents to align communicative means at different levels (Pickering & Garrod, 2004)
      • Task success is correlated with alignment (Reitter & Moore, 2007)
      • More alignment if interlocutors are perceived to be non-human (Branigan et al., 2003)

  27. Micro-Evolution
      • Communities will evolve communicative standards
        • e.g., reference to landmarks, identification strategies for locations (e.g., Garrod & Doherty, 1994; Fay et al., in press)
      • Garrod & Doherty (1994): location identification strategies, counting boxes vs. connections

  28. Micro-Evolution
      • Evolutionary dynamics apply
      • How do cognitive agents enable and influence evolution? (Pressure? Heat?)

  29. Autonomous Agents
      • Can autonomous agents support alignment and communicative evolution?
      • Interaction of humanoid cognitive models with autonomous agents as a testbed before testing with humans
      • How can the communicative behavior of UAVs be adapted to take the limitations of human cognition into account?

  30. Interaction Experiments
      • Impact of evolving, interactive communication
        • Vary constraints on evolution of communication (e.g. fixed vs. adaptive communication channel)
        • Vary constraints on sharing of communication (e.g. pair-wise vs. community communication development)
      • Impact of fixed, flexible or emergent network organization
        • Vary network flexibility (e.g. communication beyond grid)
        • Vary level of information sharing (e.g. information filters)
      • Accurate cognitive models for human-machine interaction
        • Adaptive interfaces (e.g. to predicted model workload)
        • Model-based autonomy (e.g. handle monitoring, routine decision-making)

  31. Issue 3: Behavior Abstraction
      • First two issues build solutions toward this one
        • Study of scaling properties helps capture the response function for all aspects of target behavior
        • Abstraction methodology helps iterate and test models at various levels of abstraction to maximize retention
      • Issues:
        • Grain scale of components (generic tasks, unit tasks?)
        • Attainable degree of fidelity at each level?
        • Capture individual differences or average, normative behavior?
          • The latter may miss key interaction aspects (outliers)
          • Individual differences as architectural parameters (WM, speed)
        • Use cognitive model to generate data to train machine learning agent tailored to individual decision makers

  32. ACT-R vs. Neural Network Model
      [Diagram: network structure with Answer, Lag 1 and Lag 2 fields]
      • Neural network model based on the same principles (West, 1998; 1999)
        • Simple 2-layer neural network
        • Localist representation
        • Linear output units
        • Fixed lag of 1 or 2
        • Dynamics arise from the interaction of the two networks
      • Network structure (fields) can be mapped to chunk structure (slots)
      • ACT-R and network both store game instances (move sequences)
      • ACT-R and network are similarly sensitive to game statistics
      • Noise plays a more deliberate role in ACT-R than in the neural network

  33. Individual vs. Group Models
      • Model of sequence expectation applied to baseball batting
      • Key representations and procedures are general, not domain-specific
      • Cognitive architecture constrains performance to reproduce all main effects: recency, length of sequence and sequence ordering
      • Variation in performance between subjects can be captured using straightforward parameterization of perceptual-motor skills

  34. Markov Model (Gray, 2001)
      • 2 states: expecting fast or slow pitch
      • Probabilities of switching state (a_s, a_f) and temporal errors when expecting fast and slow pitch (T_f, T_s) need to be estimated (sketched below)
      • 2 more transition rules and associated parameters (a_k, a_b) to handle pitch count
      • Basic Markov assumption: the current state determines the future
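One plausible reading of the 2-state model as code. All parameter values below are invented, and the assumption that a temporal error is incurred only when the expectation mismatches the pitch is mine, not Gray's:

    import random

    def simulate_pitches(pitches, a_f=0.3, a_s=0.3, T_f=20.0, T_s=60.0):
        # a_f / a_s: probabilities of switching out of the fast / slow
        # expectation state; T_f / T_s: temporal errors (ms) when expecting
        # a fast / slow pitch and seeing the other kind (assumed values).
        state = "fast"
        errors = []
        for pitch in pitches:                     # pitch is "fast" or "slow"
            if pitch != state:                    # expectation violated
                errors.append(T_f if state == "fast" else T_s)
            else:
                errors.append(0.0)
            # Markov assumption: the next state depends only on the current one.
            switch_p = a_f if state == "fast" else a_s
            if random.random() < switch_p:
                state = "slow" if state == "fast" else "fast"
        return errors

    print(simulate_pitches(["fast", "slow", "fast", "slow"]))

Handling the pitch count would require the two extra transition rules and parameters (a_k, a_b) noted above; the contrast with ACT-R's single memory-based account follows on the next slide.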

  35. Markov vs. ACT-R
      • State representation
        • Markov has discrete states that represent decisions
        • ACT-R has graded states that reflect the state of memory
      • Transition probabilities
        • Markov needs to estimate state transition probabilities
        • ACT-R predicts state change based on a theory of memory
      • Pitch count
        • Markov has to adopt additional rules and parameters
        • ACT-R generalizes using established representations
      • ACT-R is more constrained than the Markov model
      • Similar results for the backgammon domain:
        • Comparable results to NN and TD-learning with orders of magnitude fewer training instances

  36. Abstraction Experiments
      • Impact of representation fidelity
        • Vary degree of model fidelity to determine impact on network dynamics (e.g. high- vs. low-fidelity nodes for specialists vs. generalists)
        • Determine which model aspects are critical to performance
      • Impact of skill compositionality
        • Enforce skill composition through a standard, common interface and determine impact on performance
        • Evaluate impact of architectural constructs including working memory support for multi-tasking
      • Relevant computer science concepts (sketched below)
        • Abstract Behavior Types
          • Generalization of abstract data types to temporal streams
        • Aspect-Oriented Programming
          • Generalization to allow more complex procedural interaction
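One speculative reading of "abstract behavior types" as code: where an abstract data type fixes a set of operations on a static value, a behavior type fixes a mapping from a temporal stream of observations to a temporal stream of actions, so components compose by piping streams. All names here are hypothetical:

    from typing import Iterator, Protocol

    class AbstractBehaviorType(Protocol):
        # The interface is a stream transformer, not a set of static operations.
        def behave(self, observations: Iterator[str]) -> Iterator[str]: ...

    class Controller:
        def behave(self, observations: Iterator[str]) -> Iterator[str]:
            for obs in observations:        # consume the observation stream
                yield f"handled:{obs}"      # emit the corresponding action

    class Monitor:
        def behave(self, actions: Iterator[str]) -> Iterator[str]:
            for act in actions:             # wrap another behavior's output
                yield f"logged:{act}"

    # Composition: the monitor consumes the controller's action stream.
    stream = Monitor().behave(Controller().behave(iter(["aircraft-enters", "hold"])))
    print(list(stream))   # ['logged:handled:aircraft-enters', 'logged:handled:hold']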
