A Decision-Theoretic Approach to Designing Proactive Communication in Multi-Agent Teamwork

Thomas R. Ioerger, Yu Zhang, Richard Volz, John Yen (PSU-IST). Dept. of Computer Science, Texas A&M University.

Motivation Team Agents share a large amount of knowledge about the teamwork. Hard coded Interactions among participants. High-frequency message exchange. Communication risk. Multi-Agent Agent

Challenging Issues in Designing Communication Protocols: Each agent has incomplete information, from which uncertainties arise. Each agent has different problem-solving capabilities. Data are decentralized and the system lacks global control. Excessive or unrestricted communication leads to a lack of scalability.

Proactive Communication OBPC: Reduction of communication load through OBservations. DIP: Dynamicestimation of the probability distribution of Information Production and need. DTPC: Decision-Theoretic determination of communication strategies. Our Approach and Its Contributions

Background: CAST (Collaborative Agents for Simulating Teamwork) and MALLET (Multi-Agent Logic-based Language for Encoding Teamwork). Example MALLET team plan and individual operator:
(team-plan killwumpus (?w)
  (process
    (seq
      (agent-bind ?ca (constraint (play-role ?ca scout)))
      (DO ?ca (findwumpus ?w))
      (agent-bind ?fi (constraint ((play-role ?fi fighter) (closest-to-wumpus ?fi ?w))))
      (DO ?fi (movetowumpus ?w))
      (DO ?fi (shootwumpus ?w)))))
(ioper shootwumpus (?w)
  (pre-cond (wumpus ?w) (location ?w ?x ?y) (dead ?w false))
  (effect (dead ?w true)))

Overview [Diagram; labels: CAST; Team Structure & Teamwork Procedure; agent knowledge bases (KB); Proactive Communication (OBPC, DIP, DTPC); Optimal Communication Strategy.]

Agent Execution Cycle [Diagram; labels: Sense; Observe; Predict information need and production; Decide Strategy; Communicate Information; Act; Effect.]

Syntax of Observability
<observability> ::= (CanSee <viewing>)* (BelieveCanSee <believer> <viewing>)*
<viewing> ::= <observer> <observable> <cond>
<believer> ::= <agent>
<observer> ::= <agent>
<observable> ::= <property> | <action>
<cond> ::= (<property>)*
<property> ::= (<property-name> <object> <args>)
<action> ::= (DO <doer> (<operator-name> <args>))
<object> ::= <agent> | <non-agent>
<doer> ::= <agent>

Example Observability Rules
// The carrier can see the location property of an object.
(CanSee ca (location ?o ?x ?y)
  (location ca ?xc ?yc) (location ?o ?x ?y) (inradius ?x ?y ?xc ?yc rca))
// The carrier can see the shootwumpus action of a fighter.
(CanSee ca (DO ?fi (shootwumpus ?w))
  (play-role fighter ?fi) (location ca ?xc ?yc) (location ?fi ?x ?y) (adjacent ?xc ?yc ?x ?y))
// The carrier believes the fighter is able to see the location property of an object.
(BelieveCanSee ca fi (location ?o ?x ?y)
  (location fi ?xi ?yi) (location ?o ?x ?y) (inradius ?x ?y ?xi ?yi rfi))
// The carrier believes the fighter is able to see the shootwumpus action of another fighter.
(BelieveCanSee ca fi (DO ?f (shootwumpus ?w))
  (play-role fighter ?f) (≠ ?f fi) (location ca ?xc ?yc) (location fi ?xi ?yi) (location ?f ?x ?y)
  (inradius ?xi ?yi ?xc ?yc rca) (inradius ?x ?y ?xc ?yc rca) (adjacent ?x ?y ?xi ?yi))
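
A CanSee rule can be read as a guarded predicate: the observation holds only when every listed condition is satisfied. The following is a minimal Python sketch of the first rule, not the CAST implementation. It assumes that inradius means "within Euclidean distance r", which the slides do not define; the names in_radius and can_see_location are hypothetical.

```python
import math

def in_radius(x, y, xc, yc, r):
    """Assumed reading of (inradius ?x ?y ?xc ?yc r): Euclidean distance <= r."""
    return math.hypot(x - xc, y - yc) <= r

def can_see_location(observer_pos, object_pos, radius):
    """Hypothetical check for the rule (CanSee ca (location ?o ?x ?y) ...).

    observer_pos and object_pos are (x, y) tuples; radius plays the role of rca.
    """
    xc, yc = observer_pos
    x, y = object_pos
    return in_radius(x, y, xc, yc, radius)
```

For example, can_see_location((0, 0), (3, 4), 5) holds under this assumed distance measure, while a radius of 4 would not cover the same object.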

Proactive Communication Based on Observation. ProactiveTell: a provider reasons about what information it will have, and about whether to deliver a piece of information once it has it. ActiveAsk: a needer reasons about what information it will need, and about whether to ask for a piece of information once it needs it.
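
One plausible reading of how observation reduces communication load is that a provider stays silent when it believes the needer can already observe the information, and a needer does not ask for what it can see itself. The slides do not spell out this rule, so the Python sketch below is an interpretation; both function names and all parameters are hypothetical.

```python
def should_proactive_tell(has_info, needer_needs_info, believes_needer_can_see):
    """Provider side (assumed rule): tell only if the needer needs the
    information and is not believed to be able to observe it directly."""
    return has_info and needer_needs_info and not believes_needer_can_see

def should_active_ask(needs_info, can_see_info_itself):
    """Needer side (assumed rule): ask only for information it needs
    and cannot observe on its own."""
    return needs_info and not can_see_info_itself
```

In this sketch the believes_needer_can_see flag would come from evaluating a BelieveCanSee rule, and can_see_info_itself from a CanSee rule.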

Evaluation: Multi-Agent Wumpus World. 20 wumpuses, 8 pits, and 20 piles of gold per world. A team consists of 1 carrier and 3 fighters. The team goal is to kill wumpuses and get the gold without being killed. 5 randomly generated worlds of 20×20 cells.

Decision-Theoretic Proactive Communication: Strategies. Utility Function. Cost Function. Value Function. Decision-Making.

Decision-Making on Situation PA. Situation PA: provider produces a new piece of information. [Decision tree; edge labels: a-b: ProactiveTell, b-a: Accept, a-b: Silence, b-a: Wait, b-a: Silence, b-a: ActiveAsk. Legend: a: provider, b: needer, e: end.]

Decision-Making on Situation PB. Situation PB: provider receives a request for a piece of information. [Decision tree; edge labels: a-b: Reply, a-b: WaitUntilNext. Legend: e: end.]

Decision-Making on Situation NA. Situation NA: needer needs a piece of information. [Decision tree; edge labels: b-a: ActiveAsk, a-b: Reply, a-b: WaitUntilNext, a-b: Silence, b-a: Wait, a-b: ProactiveTell, b-a: Silence. Legend: e: end, t: transfer.]

Decision-Making on Situation NB. Situation NB: needer receives a piece of information. [Decision tree; edge label: b-a: Accept. Legend: e: end, t: transfer.]

Utility Function. Parameters in the utility function:
I: information about which communication occurs
t: time of decision-making
t1: time at which I is needed
t2: time at which the value of I that is used is produced
SU: situation at t
S: strategy available at SU
M: the set of messages involved in obtaining I
E: environment state at t
U(I, t, t1, t2, SU, S, M, E) = V(I, t, t1, t2, SU, S) - C(M)

Value Function: V(I, t, t1, t2, SU, S) = T(I, t, t1, t2, SU, S) + R(I, t, t1, t2, SU, S), where T is the timeliness term and R is the relevance term.

Timeliness Function. Timeliness: whether agents use a value that can be produced in time for when they need I.
d(I, t, t1, t2, SU, S) = max(0, t2 - t1)
ft is a function such that ft(x) < ft(y) if y < x, that is, ft decreases as the delay grows.
T(I, t, t1, t2, SU, S) = ft(d(I, t, t1, t2, SU, S))
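
To make the timeliness term concrete, here is a small Python sketch. The slides only require ft to be strictly decreasing; the exponential form and the decay parameter below are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def delay(t1, t2):
    """d = max(0, t2 - t1): how long after the need (t1) the value is produced (t2)."""
    return max(0, t2 - t1)

def timeliness(t1, t2, decay=0.5):
    """T = ft(d), with ft(x) = exp(-decay * x) as one assumed strictly
    decreasing choice of ft (requires decay > 0)."""
    return math.exp(-decay * delay(t1, t2))
```

A value produced before it is needed has zero delay and therefore the maximum timeliness; each additional time step of delay lowers the score.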

Relevance Function. Relevance: unprocessed, most recent, important.
P(I, t, t1, t2, SU, S) = Pr(I, t, t1, t2, and no other value for I was produced in Int[t1, t2] | S, SU)
frI is a function such that frI(x) < frI(y) if x < y, that is, frI increases with the probability.
R(I, t, t1, t2, SU, S) = frI(P(I, t, t1, t2, SU, S))

Cost Function: C(Mi) = 0 if Mi = ∅; C(Mi) = k1 + k2 × len(Mi) otherwise.
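
The value and cost terms combine as U = V - C(M) with V = T + R. The Python sketch below shows one way to evaluate that and pick the strategy with the highest utility. It is illustrative only: summing C(Mi) over the message set to get C(M), the default constants k1 and k2, and the names message_cost, utility, and best_strategy are all assumptions not given in the slides.

```python
def message_cost(message, k1=1.0, k2=0.1):
    """C(Mi): 0 for an empty message, otherwise k1 + k2 * len(Mi)."""
    if not message:
        return 0.0
    return k1 + k2 * len(message)

def utility(timeliness, relevance, messages, k1=1.0, k2=0.1):
    """U = V - C(M), where V = T + R.

    Assumption: C(M) is the sum of C(Mi) over the messages in M.
    """
    value = timeliness + relevance
    cost = sum(message_cost(m, k1, k2) for m in messages)
    return value - cost

def best_strategy(options):
    """options maps a strategy name to (timeliness, relevance, messages);
    returns the name with the highest utility under the default k1, k2."""
    return max(options, key=lambda name: utility(*options[name]))
```

With hypothetical inputs such as {"ProactiveTell": (1.0, 0.9, ["loc 3 4"]), "Silence": (0.0, 0.1, [])}, the message cost of telling is outweighed by the gain in value, so best_strategy selects ProactiveTell.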

Expected Utility E(U) =

Strategies, Situation PA: provider produces I. ProactiveTell or Silence? [Timeline figure; labels: Last need aware of, Last sent, Last not sent, Current time, Unfulfilled need, Next production; the time axis t is divided into Known and Unknown.]

Strategies, Situation PB: provider receives a request for I. Reply or WaitUntilNext? [Timeline figure; labels: Last production, Current time, Next production; the time axis t is divided into Known and Unknown.]

Strategies, Situation NA: needer needs I. ActiveAsk, Wait, or Silence? [Timeline figure; labels: Last I received, Most recent production, Current time, Next production; the time axis t is divided into Known and Unknown.]

Strategies, Situation NB: needer receives I. Accept.

Summary • Advantages of the approach: it allows agents to make intelligent choices of communication policy based on: • frequencies of needs, of sensing, and of information change • costs of messages, plus penalties for delays in action or for acting with incorrect information

Criteria for Applicable Domains: There are information needs among the team. Agents can communicate. There is uncertainty in the environment. The teamwork process has stochastic properties. Agents have incomplete or disjoint knowledge about the world. The team acts under critical time constraints, so proactive assistance becomes important.
