Cultural Negotiating Agents. By: Elnaz Nouri, ISI NL Seminar, June 6th 2014. The Big Picture. Goal: Building virtual agents that can bargain and negotiate with people using natural language dialogue. Human Interacting with a Virtual Human.
The Need to Understand Human Negotiation
In addition to the challenges from a natural language perspective, the following aspects need to be considered: emotions, ethics, framing, social relationships, motivated illusions, culture, etc.
The Importance of Considering Culture
People's cultural background has been shown to affect the way they reach and fulfill agreements in negotiation. [Haim, 2012]
(Image: US Vice President Joe Biden and Japanese Prime Minister Shinzo Abe, December 2013)
Negotiations
Our focus is on modeling interpersonal decision making in negotiation.
Important factors in negotiation:
• Interpersonal strategic decision-making processes [Thompson, 2004] [Curhan, 2006]
• Resource allocation and potentially conflicting objectives [Pruitt, 1983]
• Exchange of propositions and responses
Components in a Spoken Dialogue System for Negotiation
(Diagram: components between input and output: Speech Recognizer, Language Understanding, Dialogue Manager, Decision Making Component, Response Generator, Text-to-Speech System)
My focus: simulating decision making in negotiation (offers, responses, assessment of relationship)
What is a suitable social decision-making model for a cultural negotiating agent?
Social Goals in Negotiation
A negotiator's goals determine how the negotiation unfolds.
• People care about the outcomes of others when making decisions. [Loewenstein, 1989]
Examples:
• People sacrifice their own interest to help loved ones or harm adversaries.
• Participants withdraw from experiments if they perceive inequity in remuneration. [Schmitt, 1972]
• Negotiations collapse when one party tries to maximize the opponent's displeasure rather than its own satisfaction. [Seigel, 1960]
• Disputants are concerned not only with their own outcome but also with the outcome of their opponents. [Pruitt, 1986]
Anomalies in behavioral games:
• Dictator game [Bolton, 1998]
• Prisoner's Dilemma [Rapoport, 1965]
Example 1: The Negotiation Example [Nouri et al., Interspeech 2013]
41 dyadic sessions (15 competitive, 13 individualistic and 13 cooperative)
Different goals:
• Individualistic: maximize points for yourself [Vself]
• Cooperative: maximize the joint gain with the other side [Vjoint]
• Competitive: maximize points for yourself and prevent the other side from getting points [Vcompete]
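The three goal conditions can be read as three different valuation functions over the same outcome. The sketch below is only an illustration: the function names, the candidate outcomes, and in particular the choice to score the competitive goal as a point difference are assumptions, not definitions taken from the cited study.

```python
# Hypothetical valuation functions for the three goal conditions.
# An outcome is a pair (own_points, other_points).

def v_self(own, other):
    """Individualistic goal: only the negotiator's own points count."""
    return own


def v_joint(own, other):
    """Cooperative goal: the joint gain of both sides counts."""
    return own + other


def v_compete(own, other):
    """Competitive goal. Assumption: scored as own points minus the other's."""
    return own - other


GOALS = {
    "individualistic": v_self,
    "cooperative": v_joint,
    "competitive": v_compete,
}


def preferred_outcome(goal, outcomes):
    """Return the outcome that the given goal values most."""
    return max(outcomes, key=lambda outcome: GOALS[goal](*outcome))


# Made-up candidate outcomes, chosen so that each goal prefers a different one.
OUTCOMES = [(6, 2), (5, 5), (5, 0)]

for goal in GOALS:
    print(goal, preferred_outcome(goal, OUTCOMES))
```

The same outcome set leads to three different choices, which is the sense in which the goal determines how the negotiation unfolds.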
Prediction Models for Negotiation Outcome and Goals [Nouri et al., Interspeech 2013]
Significant differences in offers, responses and distribution of offers
Tasks: predict outcome, predict strategy (figures: prediction accuracy for each task)
Classifier: SVM with the RBF kernel, evaluated with 10-fold cross-validation
Features:
• # of words spoken per speaker per turn
• # of turns taken
• # of negotiation-issue-related words
• Sentiment (positive, negative) and subjectivity scores
• Mean and standard deviation of the following acoustic features: peak slope, normalized amplitude quotient (NAQ), f0, voiced/unvoiced, energy, energy slope, spectral stationarity
• Amount of silence and speaking time per speaker
• # of offers, acceptances or rejections
Example 2: The Ultimatum Game [Güth, 1982]
Bargaining processes are often modeled as ultimatum bargaining games. [Stahl, 1972]
A 2-turn game over a certain amount of money:
• The proposer makes an offer.
• The responder either accepts (the money is split accordingly) or rejects (both get zero).
Dictator games are similar, but the responder can only accept.
Expected results according to game-theoretic models:
• Proposers offer the minimum amount possible.
• Responders accept any offer greater than zero.
Observed results (figures: frequency of offers made by proposers, acceptance ratio by responders):
• Proposers offer about half of the money.
• Responders reject a high number of low offers.
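The game-theoretic prediction above can be checked with a few lines of code. This is a minimal sketch assuming whole-unit offers and purely self-interested players. The function names are illustrative only.

```python
def payoffs(total, offer, accepted):
    """Return (proposer_payoff, responder_payoff) for one round."""
    if accepted:
        return (total - offer, offer)
    return (0, 0)  # a rejection leaves both players with nothing


def rational_responder(offer):
    """A purely self-interested responder accepts any offer greater than zero."""
    return offer > 0


def rational_proposer(total):
    """Pick the whole-unit offer that maximizes the proposer's own payoff,
    anticipating the rational responder's reaction."""
    offers = range(total + 1)
    return max(
        offers,
        key=lambda offer: payoffs(total, offer, rational_responder(offer))[0],
    )


best_offer = rational_proposer(10)
print(best_offer, payoffs(10, best_offer, rational_responder(best_offer)))
```

With 10 units at stake, the self-interested proposer offers the smallest positive amount, which is the behavior that human players are observed not to follow.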
A Suitable Model: The MARV Decision Making Model [Nouri and Traum, CMVC Workshop, Reykjavik, Iceland, 2011] [Nouri, Georgila and Traum, Journal of AI and Society, 2014]
Considers a combination of different valuation functions for evaluating utility:
• Self Interest (the agent's own utility) [Scott, 1972]
• Other Interest (the utility of another agent) [MacCrimmon and Messick, 1976]
• Total Utility (sum of individual utilities of all participating agents)
• Average Utility (may not be derivable from Total Utility)
• Relative Utilities (viewed in several ways, such as self/total, self-other, self/other, self/average)
  • Self/Total [MacCrimmon and Messick, 1976] [Loewenstein, 1989]
  • Self/Other [MacCrimmon and Messick, 1976] [Lurie, 1987]
  • Self/Average
  • Self-Other [Griesinger & Livingston, 1973] [Lurie, 1987]
• Minimum Utility (lower bound for any participant) [Rawls, "Theory of Justice", 1984]
• Uncertainty (variation among possible outcomes)
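For a two-party outcome, most of the valuation functions listed above reduce to one-line computations. The sketch below is an illustration, not the authors' implementation: the dictionary keys are made-up names, the zero-denominator handling is an arbitrary choice, and measuring uncertainty as a population standard deviation is an assumption.

```python
from statistics import pstdev


def valuations(own, other):
    """Compute several valuation functions for a two-party outcome."""
    total = own + other
    average = total / 2  # with exactly two known parties this follows from the total
    return {
        "self": own,
        "other": other,
        "total": total,
        "average": average,
        "self_minus_other": own - other,
        # Ratios fall back to 0.0 or infinity when the denominator is zero
        # (an arbitrary convention chosen for this sketch).
        "self_over_total": own / total if total else 0.0,
        "self_over_other": own / other if other else float("inf"),
        "self_over_average": own / average if average else 0.0,
        "minimum": min(own, other),
    }


def uncertainty(possible_payoffs):
    """Assumption: uncertainty is the spread (population standard deviation)
    of the payoffs an agent might end up with."""
    return pstdev(possible_payoffs)


print(valuations(7, 3))
print(uncertainty([0, 10]))  # e.g. an offer that is either rejected or accepted
```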
The MARV Decision Making Model [Nouri and Traum, CMVC Workshop, Reykjavik, Iceland, 2011]
• Assess the situation from multiple perspectives
• Combine valuation functions by assigning proper weights to each function (e.g. linear combination)
General formula (linear-combination case): $\mathrm{Value}(\mathrm{Choice}_i) = \sum_j w_j \, V_j(\mathrm{Choice}_i)$
• The valuation functions $V_j \in \{V_{self}, V_{other}, V_{total}, \dots\}$ depend on the problem.
• The weights $w_j$ depend on the decision maker.
Individual differences: different weights on the same valuation functions
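The weighted combination can be sketched as a small scoring function. Everything concrete here is an assumption made for illustration: the two candidate choices, their valuation numbers, and both weight profiles are invented, and only the linear-combination case mentioned on the slide is covered.

```python
def marv_value(weights, valuations):
    """Linear combination: sum over j of weight_j * V_j(choice)."""
    return sum(weight * valuations[name] for name, weight in weights.items())


def choose(weights, choices):
    """Return the label of the choice with the highest combined value."""
    return max(choices, key=lambda label: marv_value(weights, choices[label]))


# Two made-up ways of dividing 10 units, described by precomputed valuations.
CHOICES = {
    "keep_9": {"self": 9, "other": 1, "minimum": 1},
    "split_evenly": {"self": 5, "other": 5, "minimum": 5},
}

# Two hypothetical decision makers: same valuation functions, different weights.
SELFISH = {"self": 1.0}
PROSOCIAL = {"self": 0.5, "other": 0.5, "minimum": 1.0}

print(choose(SELFISH, CHOICES))    # keep_9
print(choose(PROSOCIAL, CHOICES))  # split_evenly
```

The valuation functions stay fixed for the problem, and only the weight profile changes between decision makers.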
Going Back to the Examples
Example 1: Cultures differ in their objectives in negotiation. Collectivistic cultures care more about the gain of the other party than individualistic cultures do. [Carnevale, 1997] [Adair, 2001]
Example 2: Offers and acceptance rates vary considerably across 4 cultures. [Roth, 1993; Camerer, 2003]
(Figures: frequency of offers made by proposers and acceptance ratio by responders, observed in 1 country and in 4 countries (Roth, 1993))
The MARV Decision Making Model [Nouri and Traum, CMVC Workshop, Reykjavik, Iceland, 2011]
• Assess the situation from multiple perspectives
• Combine valuation functions by assigning proper weights to each function (e.g. linear combination)
General formula (linear-combination case): $\mathrm{Value}(\mathrm{Choice}_i) = \sum_j w_j \, V_j(\mathrm{Choice}_i)$
• The valuation functions $V_j \in \{V_{self}, V_{other}, V_{total}, \dots\}$ depend on the problem.
• The weights $w_j$ depend on the decision maker.
Cultural differences are modeled by an appropriate weight setup.
Adapting MARV to Model Culture
$\mathrm{Value}(\mathrm{Choice}_i) = \sum_j w_j \, V_j(\mathrm{Choice}_i)$
How should the weights be set based on the culture? Use a model of culture.
Using a Culture Model to Set Up the Weights [Nouri and D. Traum, CMVC Workshop, Reykjavik, Iceland, 2011]
There are several existing models of culture: Hofstede, GLOBE, World Values Survey, Schwartz, …
Hofstede's dimensional model of culture:
• Power Distance (PDI)
• Individualism vs. Collectivism (IDV)
• Masculinity vs. Femininity (MAS)
• Uncertainty Avoidance (UAI)
• Long- vs. Short-Term Orientation (LTO)
VS08 Hofstede Survey Questions
• have sufficient time for your personal or home life (IDV)
• have a boss (direct superior) you can respect (PDI)
• get recognition for good performance (MAS)
• have security of employment (IDV)
• have pleasant people to work with (MAS)
• do work that is interesting (IDV)
• be consulted by your boss in decisions involving your work (PDI)
• live in a desirable area (MAS)
• have a job respected by your family and friends (IDV)
• have chances for promotion (MAS)
• keeping time free for fun (IVR)
• moderation: having few desires (IVR)
• being generous to other people (MON)
• modesty: looking small, not big (MON)
• If there is something expensive you really want to buy but you do not have enough money, what do you do? (LTO)
• How often do you feel nervous or tense? (UAI)
• Are you a happy person? (IVR)
• Are you the same person at work (or at school if you're a student) and at home? (LTO)
• Do other people or circumstances ever prevent you from doing what you really want to? (IVR)
• How would you describe your state of health these days? (UAI)
• How important is religion in your life? (MON)
• How proud are you to be a citizen of your country? (MON)
• How often, in your experience, are subordinates afraid to contradict their boss (or students their teacher)? (PDI)
• One can be a good manager without having a precise answer to every question that a subordinate may raise about his or her work (UAI)
• Persistent efforts are the surest way to results (LTO)
• An organization structure in which certain subordinates have two bosses should be avoided at all cost (PDI)
• A company's or organization's rules should not be broken, not even when the employee thinks breaking the rule would be in the organization's best interest (UAI)
• To what extent we should honor our heroes from the past (LTO)
Mapping the Culture Model to Weights on the Valuations
(Diagram: the culture model, Hofstede's dimensional model of culture, is mapped to the weights used in Value(Choice_i))
Mapping for the Ultimatum Game Example
• Four valuation functions: {Vself, Vother, Vself/other, Vlower-bound}
The Individualism Dimension (IDV)
A society's position on this dimension is reflected in whether people's self-image is defined in terms of "I" or "we."
Factors to model:
• High individualism: focus on self utility
• Low individualism (high collectivism):
  • Different valuations for in-group vs. out-group relationships
  • For in-group relationships, focus on other utility and fairness
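One way to turn these factors into weights is sketched below. The slides do not give the actual mapping, so every formula here is a hypothetical stand-in: the 0 to 100 score range, the even split of the collectivist share between other utility and the lower bound, and the use of the lower-bound valuation as a proxy for fairness are all assumptions.

```python
def weights_from_idv(idv, in_group):
    """Map an individualism score (assumed range 0-100) to weights on
    V_self, V_other and V_lower_bound. Illustrative mapping only."""
    if not 0 <= idv <= 100:
        raise ValueError("idv is expected to lie between 0 and 100")

    collectivism = 1.0 - idv / 100.0

    # Assumption: collectivist concern for the partner applies only to
    # in-group relationships; out-group partners get no such concern.
    concern = collectivism if in_group else 0.0

    return {
        "self": 1.0 - concern,        # high individualism: focus on self utility
        "other": concern / 2.0,       # in-group focus on the other's utility
        "lower_bound": concern / 2.0, # fairness, approximated by the lower bound
    }


# Hypothetical scores, not taken from any published country table.
print(weights_from_idv(90, in_group=True))
print(weights_from_idv(20, in_group=True))
print(weights_from_idv(20, in_group=False))
```

The resulting dictionary can be used directly as the weight profile in a linear combination of valuation functions.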
Integration with Virtual Humans
• The dialogue system uses the TACQ architecture
• Virtual humans play the Ultimatum Game with one another
• The policy for proposals and for acceptance or rejection by the responder is based on the MARV model calculations (weights are set based on the culture model)
(Image: humans playing the Ultimatum Game through natural language dialogue with virtual humans (US culture model))
Integration with Virtual Humans
(Image: virtual humans playing the Ultimatum Game with virtual humans)
Evaluation [Nouri and D. Traum, CMVC Workshop, Reykjavik, Iceland, 2011]
• Results of simulating the one-shot Ultimatum Game
Limitations
• The approach is manual
• It relies on previous culture models
  • These might not exist
• The interpretation of the culture model is done manually, based on the literature and personal understanding
  • It might not reflect the reality of the culture
Automatically Learning the Weights [E. Nouri, K. Georgila, and D. Traum. A Cultural Decision-Making Model for Negotiation based on Inverse Reinforcement Learning. CogSci 2012]
• If cultural decision-making data is available, Inverse Reinforcement Learning (IRL) can be used to learn the weights on the valuation functions automatically.
• IRL assumes that the expert is trying to optimize an unknown reward function that can be expressed as a linear combination of known features.
• The goal is to find the reward function that makes the agent's behavior similar to that of the goal (expert) data.
• Learn reward functions for 4 different cultures (US, Japan, Israel, Yugoslavia) playing the Ultimatum Game. [Roth et al., 1991]
• First with MARV values: {Vself, Vother, Vratio, Vlower-bound}
Inverse Reinforcement Learning
(Flow diagram) Starting from an initial random reward function, a reinforcement learning agent interacts with the environment and learns a policy, which generates simulated behavior. The simulated behavior is compared with the cultural behavior. If the two are similar enough, the reward function is kept; if not, the reward function is updated and the loop repeats.
Iterates until the simulated data is similar enough to the cultural human data.
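The loop can be illustrated with a heavily simplified stand-in. The sketch below is not the algorithm from the cited paper: it treats the proposer's decision as a single step, replaces the reinforcement learning agent with a softmax policy over a linear reward, and updates the weights by feature-expectation matching (in the spirit of maximum-entropy IRL). The feature set, the learning rate, and the `EXPERT` distribution are made-up values for illustration only.

```python
import math
import random

TOTAL = 10
OFFERS = list(range(TOTAL + 1))

# Hypothetical "cultural" distribution over offers 0..10 (made-up numbers).
EXPERT = [0.01, 0.01, 0.02, 0.05, 0.20, 0.50, 0.10, 0.05, 0.03, 0.02, 0.01]


def features(offer):
    """(V_self, V_other, V_lower_bound) of an offer, scaled to [0, 1],
    assuming the offer is accepted."""
    own, other = TOTAL - offer, offer
    return [own / TOTAL, other / TOTAL, min(own, other) / TOTAL]


def policy(weights):
    """Softmax distribution over offers under a linear reward function."""
    rewards = [sum(w * f for w, f in zip(weights, features(o))) for o in OFFERS]
    top = max(rewards)  # subtract the maximum for numerical stability
    exps = [math.exp(r - top) for r in rewards]
    norm = sum(exps)
    return [e / norm for e in exps]


def feature_expectation(dist):
    """Expected feature vector when offers are drawn from `dist`."""
    return [
        sum(p * features(o)[k] for p, o in zip(dist, OFFERS)) for k in range(3)
    ]


def learn_weights(expert_dist, steps=5000, lr=1.0, tol=1e-3, seed=0):
    """Iterate: simulate, compare with the expert data, update the reward."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(3)]  # initial random reward
    target = feature_expectation(expert_dist)
    for _ in range(steps):
        simulated = feature_expectation(policy(weights))
        gap = [t - s for t, s in zip(target, simulated)]
        if max(abs(g) for g in gap) < tol:  # "similar enough?"
            break
        weights = [w + lr * g for w, g in zip(weights, gap)]
    return weights


learned = learn_weights(EXPERT)
print([round(w, 2) for w in learned])
print([round(p, 3) for p in policy(learned)])
```

Because the first two features always sum to one, the learned weights are not unique. Only the behavior they induce is pinned down, which is why the comparison is made on simulated behavior and not on the weights themselves.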
Evaluation with Ultimatum Game [E. Nouri, K. Georgila, and D. Traum. A Cultural Decision-Making Model for Negotiation based on Inverse Reinforcement Learning. CogSci 2012]
• Experiment 1: compare learned policies to real cultural data
• Models outperform two baselines
  • First baseline: random rewards
  • Strong baseline: maximizing self interest
(Figure: KL divergences for IRL and the two baselines for all cultures and roles; labels: Human, Self interest, Random, IRL; Proposer, Responder)
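The comparison metric is easy to state in code. The sketch below computes a discrete KL divergence between two distributions over the same set of offers. The direction (human data first), the smoothing constant `eps`, and the example distributions are assumptions, since the slide does not specify how the divergence was set up.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """D_KL(p || q) in nats for two discrete distributions over the same
    outcomes. Entries of q below eps are clamped to avoid division by zero."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:  # terms with p_i == 0 contribute nothing
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


# Made-up distributions over three offer levels (low, medium, high).
human = [0.1, 0.7, 0.2]
model_a = [0.1, 0.6, 0.3]
model_b = [0.8, 0.1, 0.1]

print(kl_divergence(human, model_a))
print(kl_divergence(human, model_b))
```

A lower value means the model's behavior distribution is closer to the human one, so in this made-up example `model_a` would count as the better fit.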
Evaluation with Ultimatum Game
Experiment 2: use the reward function learned from each culture to train policies for each role (proposer, responder) for the agent against users of each culture playing the other role
(Diagram: the learned reward function for one culture, e.g. Japan, is used to train against Japan, US, Israel and Yugoslavia users and to test against Japan, US, Israel and Yugoslavia users; results are compared with real data)
Evaluation with Ultimatum Game
• A policy whose reward function and users come from the same culture is more like the observed data from that culture than other combinations of reward function and users
• Models for a culture outperform models for other cultures
(Tables, proposer and responder: cross-culture results, comparison with human data from different cultures (KL divergences); reward functions vs. train and test data)
Limitations
• Needs behavioral data to find the values of people
Summary and Conclusion
• Adapted a social utility decision-making model to culture with two approaches
• Showed how the decision-making model can be used to simulate cultures' behavior in simple negotiation