Investigates negotiation strategies in incomplete-information settings where players can reveal private information, using opponent modeling and machine learning. The agent outperformed people, and people performed better when playing with the agent than with other people. Key focus areas were revealing preferences, exploiting revealed information, and extensive empirical evaluation.
 
                
A Study of Computational and Human Strategies in Revelation Games. Noam Peled (1), Kobi Gal (2), Sarit Kraus (1). (1) Bar-Ilan University, Israel. (2) Ben-Gurion University, Israel.
Outline What is this research about? An agent that negotiates with people in incomplete information settings. People can choose to reveal private information. Why is it difficult? Uncertainty over people’s preferences. People are affected by social/psychological issues.
Our approach Opponent modeling + machine learning. Results Agent outperformed people. Learned to exploit how people reveal information. People increased their performance when playing with agent (as compared to other people).
Revelation games Incomplete information over preferences: Players’ types are unknown. Players can reveal their type before negotiating for a finite number of rounds. Revelation is costless. No discount factor.
Our setting This is the minimal setting where each of the two players can make one proposal:
Roadmap Probabilistic model of people. The model explicitly represents social features. We built an agent that uses this model to negotiate with people. Extensive empirical evaluation over 400 subjects from different countries!
Colored Trails Open-source empirical test-bed for investigating decision making. Family of computer board games that involve negotiation over resources. Easy to design new games. Built-in functionality for conducting experiments with people. Over 30 publications.
Objective The objective is to maximize your score: Try to get as close as possible to the goal, using as few chips as possible. End up with as many chips as you can.
Example proposal of ‘me’ player • Here, the ‘me’ player is signaling about its true goal location.
The other participant decides whether to accept the proposal. If it accepts the proposal, the game ends. If it rejects, it can make a counter-proposal.
The end of the game The ‘me’ player was moved to its goal using one gray chip and one red chip. The other participant was moved to its goal using 2 cyan chips.
Agent design • SIGAL: SIGmoid Acceptance Learning. • Models people’s behavior using probability distributions: • Making/accepting offers. • Whether people reveal their goals. • Maximizes its expected benefit given the model using backward induction.
SIGAL strategy: Round 2 • As a responder: accepts any proposal that is beneficial in terms of its score in the game. • As a proposer: • Calculates its expected benefit from each proposal. • Chooses the offer that maximizes expected benefit.
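The round 2 proposer rule can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that a rejected round 2 offer ends the game without agreement (zero benefit), so the expected benefit of an offer is the proposer's benefit weighted by the modeled acceptance probability. The names `expected_benefit`, `best_offer`, and the toy acceptance model in the usage note are hypothetical.

```python
from typing import Callable, Iterable, Tuple

# An offer is summarized here as (benefit to proposer, benefit to responder).
Offer = Tuple[float, float]


def expected_benefit(offer: Offer, accept_prob: Callable[[Offer], float]) -> float:
    """Expected benefit of a last-round offer; rejection is assumed to yield 0."""
    proposer_benefit, _ = offer
    return accept_prob(offer) * proposer_benefit


def best_offer(offers: Iterable[Offer], accept_prob: Callable[[Offer], float]) -> Offer:
    """Return the offer with the highest expected benefit for the proposer."""
    return max(offers, key=lambda o: expected_benefit(o, accept_prob))
```

For instance, with a toy model that accepts with probability 0.9 when the responder gains at least 50 and 0.2 otherwise, the offers (100, 10), (60, 60), and (30, 90) have expected benefits of 20, 54, and 27, so `best_offer` picks (60, 60): the greedy offer loses to a more balanced one.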
SIGAL strategy: Round 1 • As a responder: accepts any proposal that gives it more than its expected benefit from round 2. • As a proposer: • Estimates its benefit from the other player’s counter-proposal. • Calculates the expected benefit from each proposal. • Chooses the offer that maximizes expected benefit.
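The backward-induction step for round 1 can be sketched like this. It is an illustrative sketch under stated assumptions: the round 2 values (`round2_expected_benefit`, `counter_offer_value`) are taken as already computed, and the function names are hypothetical rather than taken from the slides.

```python
def round1_responder_accepts(offered_benefit: float,
                             round2_expected_benefit: float) -> bool:
    """Accept only if the offer beats what the agent expects to get as round 2 proposer."""
    return offered_benefit > round2_expected_benefit


def round1_expected_benefit(benefit_if_accepted: float,
                            p_accept: float,
                            counter_offer_value: float) -> float:
    """Expected benefit of a round 1 proposal.

    With probability p_accept the proposal is accepted; otherwise the game
    moves to round 2, where the agent expects counter_offer_value from the
    other player's counter-proposal.
    """
    return p_accept * benefit_if_accepted + (1.0 - p_accept) * counter_offer_value
```

Because a rejected round 1 proposal still leaves a counter-proposal to fall back on, an ambitious first offer is less risky than the same offer would be in round 2.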
Revelation phase • Use decision theory: SIGAL calculates its expected benefit for both scenarios – revealing or not revealing. • Picks the best option.
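As a sketch, the revelation decision reduces to comparing two expected benefits. The option labels and the numbers in the usage note are hypothetical; how each expected benefit is obtained (by solving the two negotiation rounds under each scenario) is assumed to happen elsewhere.

```python
from typing import Dict


def choose_revelation(expected_benefit_by_option: Dict[str, float]) -> str:
    """Return the option ("reveal" or "hide") with the highest expected benefit."""
    return max(expected_benefit_by_option, key=expected_benefit_by_option.get)
```

For example, `choose_revelation({"reveal": 48.0, "hide": 52.5})` returns `"hide"`.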
Modeling acceptance of offers • Use a logistic function (Camerer 2001) that maps the social utility function to an acceptance probability.
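The slide's exact formula did not survive in this transcript. A minimal sketch, assuming the standard logistic form 1 / (1 + e^(-u)) applied to the social utility u, could look like this (the function name is hypothetical):

```python
import math


def acceptance_probability(social_utility: float) -> float:
    """Standard logistic function, written to avoid overflow for large |u|."""
    if social_utility >= 0:
        return 1.0 / (1.0 + math.exp(-social_utility))
    e = math.exp(social_utility)
    return e / (1.0 + e)
```

Under this form, a social utility of 0 gives an acceptance probability of 0.5, and the probability rises toward 1 as the utility increases.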
Social utility function • People are not fully rational: a proposal’s desirability depends on more than its benefit. • Weighted sum of social features.
Benefit feature • Benefit = proposed offer score – no agreement score • As an example, the score of the ‘me’ player from the demo game was 135. • Without reaching an agreement, its score is 30. • Its benefit from the proposal is 135-30=105.
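The benefit computation from the demo game, written out as a small helper (the function name is hypothetical):

```python
def benefit(offer_score: float, no_agreement_score: float) -> float:
    """Benefit = proposed offer score minus the no-agreement score."""
    return offer_score - no_agreement_score


# The 'me' player in the demo game: 135 with the agreement, 30 without it.
me_benefit = benefit(135, 30)  # 105
```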
Other social features • Difference in benefit = proposer’s benefit from the offer – responder’s benefit from the offer • Revelation decision. • Previous round: • The first proposal, if rejected, may affect the probability to accept the counter-proposal.
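Putting the features together, the weighted sum might be sketched as below. The feature encodings (for example, 1.0/0.0 for the revelation decision and a single number for the previous round) and all weight values are assumptions made for illustration. The slides do not give the encodings or the learned weights.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class OfferFeatures:
    benefit: float             # responder's benefit from the offer
    benefit_difference: float  # proposer's benefit minus responder's benefit
    revealed: float            # assumed encoding: 1.0 if the goal was revealed, else 0.0
    previous_round: float      # assumed encoding of the rejected first proposal


def social_utility(f: OfferFeatures, weights: Dict[str, float]) -> float:
    """Weighted sum of the social features."""
    return (weights["benefit"] * f.benefit
            + weights["benefit_difference"] * f.benefit_difference
            + weights["revealed"] * f.revealed
            + weights["previous_round"] * f.previous_round)


# Hypothetical weights: a negative weight on benefit_difference penalizes
# offers that favor the proposer.
example_weights = {"benefit": 0.02, "benefit_difference": -0.01,
                   "revealed": 0.5, "previous_round": 0.0}
```

With these made-up weights, an offer with responder benefit 100, benefit difference 50, and a revealed goal has social utility 0.02·100 − 0.01·50 + 0.5 = 2.0.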
Learning • We used a genetic algorithm to find the optimal weights for people’s social utility function. • Use density estimation to learn how people make offers. • Cross validation (10-fold). • Over-fitting removal: Stop learning at the minimum of the generalization error. • Error calculation on a held-out test set.
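Two of the steps above, the 10-fold split and stopping at the minimum of the generalization error, can be sketched in plain Python. This is a generic illustration, not the authors' training code. The genetic algorithm itself is omitted, and the interleaved fold assignment and function names are assumptions.

```python
from typing import List, Tuple


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices 0..n-1 into k (train, held_out) pairs.

    Fold i holds out every k-th index starting at i (an assumed scheme;
    shuffle the data first if its order is meaningful).
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i, held_out in enumerate(folds):
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        splits.append((train, held_out))
    return splits


def stopping_generation(validation_errors: List[float]) -> int:
    """Index of the generation with the lowest held-out error (first one on ties)."""
    return min(range(len(validation_errors)), key=validation_errors.__getitem__)
```

For example, if the held-out error over successive generations is 0.5, 0.3, 0.2, 0.25, 0.4, `stopping_generation` returns 2, the point after which further fitting only hurts generalization.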
Fit to data The percentages are averages over similar utility ranges (bins).
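A sketch of how such binned percentages could be computed from observed decisions. The bin width and the `(utility, accepted)` sample format are assumptions. The slides do not state how the bins were chosen.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def acceptance_rate_by_bin(samples: Iterable[Tuple[float, bool]],
                           bin_width: float = 1.0) -> Dict[float, float]:
    """Map each bin's lower edge to the percentage of accepted offers in that bin."""
    counts = defaultdict(lambda: [0, 0])  # bin index -> [accepted, total]
    for utility, accepted in samples:
        b = math.floor(utility / bin_width)
        counts[b][0] += int(accepted)
        counts[b][1] += 1
    return {b * bin_width: 100.0 * acc / total
            for b, (acc, total) in sorted(counts.items())}
```

Comparing these observed percentages with the model's predicted acceptance probability at each bin gives a simple visual check of the fit.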
What did SIGAL learn? • Which proposal to give in each round. • Whether to accept proposals or not in each round. • Whether to reveal its goal or not.
Empirical methodology • Israeli CS students and users over the web. • Subjects received an identical tutorial on revelation, and had to pass a test. • Each participant played two different boards. • Compared SIGAL’s performance with people’s performance when playing other people.
Why did SIGAL succeed? • Learned to ask for more from people when they reveal their goal. • Learned to make ‘fair’ proposals: • People dislike proposals with a large benefit difference in favor of the proposer. • Learned to exploit generous people: • If people propose a generous offer in the first round, they are more willing to accept the counter-offer.
People also benefit from SIGAL! People playing with SIGAL got much more than people playing with people!
Web users: Amazon Mechanical Turk • Web-based bulletin board for ‘Human Intelligence Tasks’. • Millions of ‘workers’ are exposed to your task. • We got 140 ‘qualified’ participants in 8 hours!
Baseline equilibrium agent There are lots of equilibria in the game. We’ve developed an agent based on a pure equilibrium strategy.
Related work • Did not take a decision-theoretic approach: repeated negotiation (Oshrat et al. 2009); Bayesian techniques (Hindriks et al. 2008); approximation heuristics (Jonker et al. 2007). • Did not evaluate a computer agent: repeated one-shot take-it-or-leave-it games (Gal et al. 2007).
Conclusions Revelation games are a new setting to study how people reveal information in a negotiation. Using a simple model, an agent can learn to outperform people in revelation games. Behavioral studies can actually help agent design. Combining decision theory with learning is a good approach for agent design.
Future work Extend the argumentation framework. More signaling and revelation possibilities. Develop a model that predicts to what extent private information should be revealed during the game. EEG: Can features of brain waves improve the prediction model?