
A Decision-Theoretic Approach to Designing Proactive Communication in Multi-Agent Teamwork

Thomas R. Ioerger, Yu Zhang, Richard Volz, John Yen (PSU-IST). Dept. of Computer Science, Texas A&M University.


Presentation Transcript


  1. A Decision-Theoretic Approach to Designing Proactive Communication in Multi-Agent Teamwork. Thomas R. Ioerger, Yu Zhang, Richard Volz, John Yen (PSU-IST). Dept. of Computer Science, Texas A&M University.

  2. Motivation. Agents share a large amount of knowledge about the teamwork. Interactions among participants are hard-coded. High-frequency message exchange. Communication risk. [Diagram labels: Agent, Multi-Agent, Team.]

  3. Challenging Issues in Designing Communication Protocols. Each agent has incomplete information, from which uncertainties arise. Each agent has different problem-solving capabilities. Data are decentralized and the system lacks global control. Excessive or unrestricted communication leads to a lack of scalability.

  4. Proactive Communication OBPC: Reduction of communication load through OBservations. DIP: Dynamicestimation of the probability distribution of Information Production and need. DTPC: Decision-Theoretic determination of communication strategies. Our Approach and Its Contributions

  5. Background. CAST (Collaborative Agents for Simulating Teamwork). MALLET (Multi-Agent Logic-based Language for Encoding Teamwork). Example team plan and individual operator:

      (team-plan killwumpus (?w)
        (process
          (seq
            (agent-bind ?ca (constraint (play-role ?ca scout)))
            (DO ?ca (findwumpus ?w))
            (agent-bind ?fi (constraint ((play-role ?fi fighter) (closest-to-wumpus ?fi ?w))))
            (DO ?fi (movetowumpus ?w))
            (DO ?fi (shootwumpus ?w)))))

      (ioper shootwumpus (?w)
        (pre-cond (wumpus ?w) (location ?w ?x ?y) (dead ?w false))
        (effect (dead ?w true)))

  6. Overview. [Diagram labels: CAST; Team Structure & Teamwork Procedure; KB (six instances); Proactive Communication (OBPC, DIP, DTPC); Optimal Communication Strategy.]

  7. Agent Execution Cycle. [Diagram labels: Sense; Observe; Predict information need and production; Decide strategy; Communicate information; Act; Effect.]

  8. Syntax of Observability

      <observability> ::= (CanSee <viewing>)* (BelieveCanSee <believer> <viewing>)*
      <viewing>       ::= <observer> <observable> <cond>
      <believer>      ::= <agent>
      <observer>      ::= <agent>
      <observable>    ::= <property> | <action>
      <cond>          ::= (<property>)*
      <property>      ::= (<property-name> <object> <args>)
      <action>        ::= (DO <doer> (<operator-name> <args>))
      <object>        ::= <agent> | <non-agent>
      <doer>          ::= <agent>

  9. Example Observability Rules

      // The carrier can see the location property of an object.
      (CanSee ca (location ?o ?x ?y)
        (location ca ?xc ?yc) (location ?o ?x ?y)
        (inradius ?x ?y ?xc ?yc rca))

      // The carrier can see the shootwumpus action of a fighter.
      (CanSee ca (DO ?fi (shootwumpus ?w))
        (play-role fighter ?fi)
        (location ca ?xc ?yc) (location ?fi ?x ?y)
        (adjacent ?xc ?yc ?x ?y))

      // The carrier believes the fighter is able to see the location property of an object.
      (BelieveCanSee ca fi (location ?o ?x ?y)
        (location fi ?xi ?yi) (location ?o ?x ?y)
        (inradius ?x ?y ?xi ?yi rfi))

      // The carrier believes the fighter is able to see the shootwumpus action of another fighter.
      (BelieveCanSee ca fi (DO ?f (shootwumpus ?w))
        (play-role fighter ?f) (≠ ?f fi)
        (location ca ?xc ?yc) (location fi ?xi ?yi) (location ?f ?x ?y)
        (inradius ?xi ?yi ?xc ?yc rca) (inradius ?x ?y ?xc ?yc rca)
        (adjacent ?x ?y ?xi ?yi))
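The observability conditions on this slide can be read as ordinary geometric predicates. The following is a minimal Python sketch of that reading, not the CAST implementation. The slides do not define `inradius` or `adjacent`, so the Chebyshev-distance interpretation used here is an assumption, and the names `Position`, `can_see_location`, and `believes_can_see_location` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Position:
    x: int
    y: int


def in_radius(target: Position, observer: Position, radius: int) -> bool:
    # Assumed reading of (inradius ?x ?y ?xc ?yc r): Chebyshev distance <= r.
    return max(abs(target.x - observer.x), abs(target.y - observer.y)) <= radius


def adjacent(a: Position, b: Position) -> bool:
    # Assumed reading of (adjacent ...): distinct cells at most one step apart.
    return a != b and in_radius(a, b, 1)


def can_see_location(observer: Position, target: Position, radius: int) -> bool:
    # Mirrors the first CanSee rule: the object must lie within the observer's radius.
    return in_radius(target, observer, radius)


def believes_can_see_location(
    beliefs: Dict[str, Position], other: str, obj: str, other_radius: int
) -> bool:
    # Mirrors the first BelieveCanSee rule: the check runs over the believer's
    # own beliefs about where the other agent and the object are.
    if other not in beliefs or obj not in beliefs:
        return False
    return in_radius(beliefs[obj], beliefs[other], other_radius)
```

A provider could use such a belief check to skip a ProactiveTell when it already believes the needer can observe the information itself, which is the intuition behind reducing communication load through observation.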

  10. Proactive Communication Based on Observation. ProactiveTell: a provider reasons about what information it will have, and about whether to deliver a piece of information once it has it. ActiveAsk: a needer reasons about what information it will need, and about whether to ask for a piece of information once it needs it.

  11. Evaluation: Multi-Agent Wumpus World. 20 wumpuses, 8 pits, and 20 piles of gold per world. 1 carrier and 3 fighters compose a team. The team goal is to kill wumpuses and get the gold without being killed. 5 randomly generated worlds with 20×20 cells.

  12. Decision-Theoretic Proactive Communication. Components: Strategies; Utility Function; Cost Function; Value Function; Decision-Making.

  13. Decision-Making on Situation PA. Situation PA: the provider produces a new piece of information. [Decision-tree diagram. Nodes: 0, 1, 2. Edge labels: a-b: ProactiveTell; a-b: Silence; b-a: Accept; b-a: Wait; b-a: Silence; b-a: ActiveAsk. Legend: a = provider, b = needer, e = end.]

  14. Decision-Making on Situation PB. Situation PB: the provider receives a request for a piece of information. [Decision-tree diagram. Node: 0. Edge labels: a-b: Reply; a-b: WaitUntilNext. Both branches lead to e (end).]

  15. Decision-Making on Situation NA. Situation NA: the needer needs a piece of information. [Decision-tree diagram. Nodes: 0, 1. Edge labels: b-a: ActiveAsk; b-a: Wait; b-a: Silence; a-b: Reply; a-b: WaitUntilNext; a-b: Silence; a-b: ProactiveTell. Legend: t = transfer, e = end.]

  16. Decision-Making on Situation NB. Situation NB: the needer receives a piece of information. [Decision-tree diagram. Node: 0. Edge label: b-a: Accept. Terminals: e (end), t (transfer).]

  17. Utility Function. Parameters:

      I: the information about which communication occurs
      t: time of decision-making
      $t_1$: time at which I is needed
      $t_2$: time at which the value of I that is used is produced
      SU: situation at t
      S: strategy available at SU
      M: the set of messages involved in obtaining I
      E: environment state at t

      $U(I, t, t_1, t_2, SU, S, M, E) = V(I, t, t_1, t_2, SU, S) - C(M)$

  18. Value Function. The value is the sum of a timeliness term T and a relevance term R:

      $V(I, t, t_1, t_2, SU, S) = T(I, t, t_1, t_2, SU, S) + R(I, t, t_1, t_2, SU, S)$

  19. Timeliness Function. Timeliness: whether agents use a value that can be produced in time when they need I.

      $d(I, t, t_1, t_2, SU, S) = \max(0, t_2 - t_1)$
      $f_t$ is a decreasing function: $f_t(x) < f_t(y)$ if $y < x$
      $T(I, t, t_1, t_2, SU, S) = f_t(d(I, t, t_1, t_2, SU, S))$
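As a rough illustration of the timeliness term, here is a minimal Python sketch. The slide only requires that f_t be decreasing in the delay; the exponential form and the `scale` and `decay` parameters below are assumptions chosen for illustration, not the authors' function.

```python
import math


def delay(t1: float, t2: float) -> float:
    # d = max(0, t2 - t1): how long after the need time t1 the value is produced.
    return max(0.0, t2 - t1)


def timeliness(t1: float, t2: float, scale: float = 1.0, decay: float = 0.5) -> float:
    # Hypothetical f_t: scale * exp(-decay * d), strictly decreasing in d
    # whenever scale > 0 and decay > 0, as the slide's constraint requires.
    if scale <= 0 or decay <= 0:
        raise ValueError("scale and decay must be positive")
    return scale * math.exp(-decay * delay(t1, t2))
```

Any other strictly decreasing function, for example a linear penalty clipped at zero delay, would satisfy the same constraint; the choice mainly controls how harshly late information is penalized.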

  20. Relevance Function. Relevance criteria: unprocessed, most recent, important.

      $P(I, t, t_1, t_2, SU, S) = \Pr(I \land t \land t_1 \land t_2 \land \text{no other value for } I \text{ was produced in } \mathrm{Int}[t_1, t_2] \mid S \land SU)$
      $f_{rI}$ is an increasing function: $f_{rI}(x) < f_{rI}(y)$ if $x < y$
      $R(I, t, t_1, t_2, SU, S) = f_{rI}(P(I, t, t_1, t_2, SU, S))$
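To make the relevance term concrete, the sketch below estimates the probability that no other value was produced in the interval and maps it through an increasing function. The slides do not say how this probability is computed, so the Poisson production model (`production_rate`) and the linear form of the mapping (`importance`) are assumptions made purely for illustration.

```python
import math


def prob_still_current(t1: float, t2: float, production_rate: float) -> float:
    # Assumption: new values of I arrive as a Poisson process with the given rate,
    # so P(no production in an interval of length L) = exp(-rate * L).
    if production_rate < 0:
        raise ValueError("production_rate must be non-negative")
    interval = abs(t1 - t2)
    return math.exp(-production_rate * interval)


def relevance(p: float, importance: float = 1.0) -> float:
    # Hypothetical increasing function: importance * p, increasing in p
    # for importance > 0, as the slide's constraint requires.
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if importance <= 0:
        raise ValueError("importance must be positive")
    return importance * p
```

Under this assumed model, information that changes rapidly loses relevance quickly as the gap between production and need grows, while slowly changing information stays relevant for longer.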

  21. Cost Function

      $C(M_i) = \begin{cases} 0 & \text{if } M_i = \emptyset \\ k_1 + k_2 \times \mathrm{len}(M_i) & \text{otherwise} \end{cases}$
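The cost function combines with the value function into the utility U = V - C(M) from the earlier slides, and a decision-theoretic agent then picks the strategy with the highest utility. The Python sketch below shows one way this could look. It assumes that the zero-cost case corresponds to an empty message, that the cost of a message set is the sum of per-message costs, and that values are already computed; the constants `k1` and `k2` and the function `best_strategy` are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def message_cost(message: str, k1: float = 1.0, k2: float = 0.1) -> float:
    # C(M_i) = 0 for an empty message, otherwise k1 + k2 * len(M_i).
    if not message:
        return 0.0
    return k1 + k2 * len(message)


def total_cost(messages: Sequence[str], k1: float = 1.0, k2: float = 0.1) -> float:
    # Assumption: the cost of a message set is the sum of its per-message costs.
    return sum(message_cost(m, k1, k2) for m in messages)


def utility(value: float, messages: Sequence[str], k1: float = 1.0, k2: float = 0.1) -> float:
    # U = V - C(M), with V supplied by the caller (timeliness plus relevance).
    return value - total_cost(messages, k1, k2)


def best_strategy(options: Dict[str, Tuple[float, Sequence[str]]]) -> str:
    # Each option maps a strategy name to (value, messages it would send).
    return max(options, key=lambda name: utility(*options[name]))
```

For example, with `{"ProactiveTell": (5.0, ["(location w1 4 7)"]), "Silence": (1.0, [])}` the provider would tell, because the value gained outweighs the message cost; lowering the value of telling to 3.0 makes Silence the better choice under these illustrative constants.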

  22. Expected Utility E(U) =

  23. Strategies, Situation PA: the provider produces I. Options: ProactiveTell or Silence. [Timeline diagram along t, divided into Known and Unknown. Labels: Last sent; Last not sent; Last need aware of; Current time; Unfulfilled need; Next production.]

  24. Strategies, Situation PB: the provider receives a request for I. Options: Reply or WaitUntilNext. [Timeline diagram along t, divided into Known and Unknown. Labels: Last production; Current time; Next production.]

  25. Strategies, Situation NA: the needer needs I. Options: ActiveAsk, Wait, or Silence. [Timeline diagram along t, divided into Known and Unknown. Labels: Last I received; Most recent production; Current time; Next production.]

  26. Strategies, Situation NB: the needer receives I. Option: Accept.

  27. Summary • Advantages of the approach: it allows agents to make intelligent choices of communication policy based on: • frequencies: of needs, of sensing, and of information change • costs: of messages, plus penalties for delays in action or for acting with incorrect information

  28. Criteria for Applicable Domains. There are information needs among the team. Agents can communicate. There is uncertainty in the environment. The teamwork process has stochastic properties. Agents have incomplete or disjoint knowledge about the world. The team acts under critical time constraints, so proactive assistance becomes important.
