Agent Technology for e-Commerce

Agent Technology for e-Commerce Chapter 7: Elements of Strategic Interaction Maria Fasli http://cswww.essex.ac.uk/staff/mfasli/ATe-Commerce.htm

Agents in electronic markets • Agents have different (perhaps conflicting) goals • Each agent is trying to maximise its own payoff without necessarily being concerned about the welfare of other agents • Agents: consumers, producers • Resources, services, goods • Market infrastructure

Market economy • A market economy is a setting in which the goods and services that a consumer may acquire are available at known prices • The goods, initial endowments, and technological possibilities are owned by consumers. • The value is derived from consuming (or owning) goods and services. • Some of the consumer agents can use some of the commodities to produce others (producers).

Basic principles • The behaviour of agents in an economic setting is governed by two principles: • Optimisation principle: agents try to choose the best patterns of consumption that they can afford • Equilibrium principle: prices adjust until the amount that people demand of something is equal to the amount that is being supplied

Example: Market of flats • Two types of flats available: close to the University and further away • Students are interested in renting as close to the University as possible, but paying as little as possible • Landlords are interested in maximizing the rent from their flats

The demand curve

The supply curve

Equilibrium: Supply meets demand

Pareto efficient allocations • Pareto Efficiency is one possible way of checking if an economic system is producing an ‘optimal’ economic outcome. • An allocation x is Pareto efficient, if there is no other allocation in which • an agent is strictly better off • no other agent is worse off
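The definition above can be checked mechanically when the set of candidate allocations is small. The following is a minimal sketch, assuming each allocation is summarised by a tuple of agent utilities; the allocation names and numbers are hypothetical and are not taken from the slides.

```python
from typing import Dict, List, Sequence


def pareto_dominates(u_new: Sequence[float], u_old: Sequence[float]) -> bool:
    """True if u_new leaves no agent worse off and some agent strictly better off."""
    no_one_worse = all(a >= b for a, b in zip(u_new, u_old))
    someone_better = any(a > b for a, b in zip(u_new, u_old))
    return no_one_worse and someone_better


def pareto_efficient(allocations: Dict[str, Sequence[float]]) -> List[str]:
    """Names of the allocations that no other allocation Pareto-dominates."""
    return [
        name
        for name, u in allocations.items()
        if not any(
            pareto_dominates(v, u)
            for other, v in allocations.items()
            if other != name
        )
    ]


# Hypothetical utilities (agent 1, agent 2) under four candidate allocations.
allocations = {"a": (3, 1), "b": (2, 2), "c": (1, 1), "d": (1, 3)}
efficient = pareto_efficient(allocations)
print(efficient)
```

In this made-up example only allocation c is ruled out, because b makes both agents better off. Several allocations can be Pareto efficient at once, which is why the criterion alone does not pick a single outcome.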

Consumption bundles • The objects of consumer choice are called consumption bundles. • If bread is product x1 and butter is product x2 then (x1, x2) represents a consumption bundle, also written x • Often only two goods are used; one of them is called ‘all other goods’ so that we can focus on the trade off between one good and everything else • Agents have their own individual preferences over different consumption bundles

Preferences • (x1, x2) ≻ (y1, y2): a consumer strictly prefers bundle (x1, x2) to bundle (y1, y2) given the opportunity/choice • (x1, x2) ~ (y1, y2): a consumer is indifferent between the two bundles, i.e. would be just as satisfied with consuming bundle (x1, x2) as she would be with consuming bundle (y1, y2) • (x1, x2) ⪰ (y1, y2): a consumer weakly prefers bundle (x1, x2) to bundle (y1, y2), i.e. the consumer strictly prefers (x1, x2) or is indifferent between the two bundles

Axioms of consumer preference Assume a consumer i and consumption bundles x, y and z. Then: • Bundles x and y are comparable, that is x ⪰ y or y ⪰ x, or both, in which case the consumer is indifferent (completeness) • Any bundle is at least as good as an identical bundle, that is, x ⪰ x (reflexivity) • If x ⪰ y and y ⪰ z, then it follows that x ⪰ z (transitivity)

Properties of preference relations • A preference relation ⪰ is rational if it has the following two characteristics: • Completeness • Transitivity • If ⪰ is rational then: • ≻ is both irreflexive (x ≻ x never holds) and transitive • ~ is reflexive (x ~ x for all x), transitive (if x ~ y and y ~ z then x ~ z) and symmetric (if x ~ y then y ~ x) • if x ≻ y ⪰ z, then x ≻ z

Indifference Curves

Monotonicity • If (x1, x2) and (y1, y2) are two bundles and (y1, y2) has at least as much of both goods and more of one, then: (y1, y2) ≻ (x1, x2) • The indifference curve has a negative slope

Convexity • If two bundles x and x’ are both elements of the consumption set, then the bundle x’’ = ax + (1-a)x’ is also an element of the consumption set for any a ∈ [0,1] • Averages are preferred to extremes

Utilities • A utility function u(x) assigns a numerical value to each element in X (the set of consumption bundles), ranking the elements of X in accordance with the individual’s preferences • Ordinal utilities: only the ranking matters; the size of the utility difference does not • Cardinal utilities: the size of the utility difference does matter • An ordinal u can be transformed into another form through a monotonic transformation, which does not affect the ordering
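To illustrate why a monotonic transformation leaves an ordinal ranking unchanged, here is a small sketch. The utility function u(x1, x2) = x1 * x2 and the bundles are hypothetical choices made for the example, not taken from the slides.

```python
import math

# Hypothetical consumption bundles (x1, x2), deliberately listed out of order.
bundles = [(2, 2), (1, 1), (3, 2), (1, 3)]


def u(bundle):
    """Assumed utility function for the example: u(x1, x2) = x1 * x2."""
    x1, x2 = bundle
    return x1 * x2


def v(bundle):
    """A monotonic transformation of u: v = ln(u)."""
    return math.log(u(bundle))


# Rank the bundles from least to most preferred under each function.
rank_u = sorted(bundles, key=u)
rank_v = sorted(bundles, key=v)

# Gaps between neighbouring bundles in the ranking.
gaps_u = [u(b) - u(a) for a, b in zip(rank_u, rank_u[1:])]
gaps_v = [round(v(b) - v(a), 3) for a, b in zip(rank_v, rank_v[1:])]

print(rank_u == rank_v)  # the ordering is preserved
print(gaps_u, gaps_v)    # the sizes of the differences are not
```

The ranking is identical under u and v, but the gaps between utility values change. This is harmless for ordinal utilities and would matter for cardinal ones.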

Elements of a market economy • Assume a market where n goods are present • Each consumer i has a utility function ui(xi) which determines its preferences over various consumption bundles xi = (xi1, xi2, …, xin) • Each consumer i has an initial endowment (resources) ei • ei = (ei1, ei2, …, ein) is the vector describing consumer i’s initial endowment of each good • The initial total endowment of good g available in the economy is $\sum_i e_{ig}$

Producers can use some of the commodities to produce others • yj = (yj1, yj2, …, yjn) is the vector describing producer j’s production, where yjg is its production of commodity g • The total (net) amount of good g available in the economy is $\sum_i e_{ig} + \sum_j y_{jg}$ • The market has prices p = (p1, p2, …, pn); pg is the price of good g • Prices specify the goods’ exchange rates and also determine the value of the consumers’ initial endowments

The profit of producer j is p·yj • Each consumer i owns a share θij of producer j’s profits, with $\sum_i \theta_{ij} = 1$ • Hence i has a claim to a fraction θij of j’s profits
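The quantities defined on these slides can be computed directly. The sketch below assumes a two-good economy with two consumers and one producer. All names and numbers are hypothetical, and the production vector uses the net-output convention, in which a negative entry is an input; that convention is an assumption of this example.

```python
def dot(p, v):
    """Inner product of a price vector and a quantity vector."""
    return sum(pg * vg for pg, vg in zip(p, v))


prices = [2.0, 1.0]                                    # p = (p1, p2)
endowments = {"c1": [3.0, 0.0], "c2": [1.0, 4.0]}      # e_i for each consumer
production = {"f1": [-1.0, 3.0]}                       # y_j: uses 1 of good 1, makes 3 of good 2
shares = {"c1": {"f1": 0.25}, "c2": {"f1": 0.75}}      # ownership shares, summing to 1 per producer

# Total (net) amount of each good: endowments plus net production.
n_goods = len(prices)
total_goods = [
    sum(e[g] for e in endowments.values()) + sum(y[g] for y in production.values())
    for g in range(n_goods)
]

# Producer profits p.y_j and the value of each consumer's endowment p.e_i.
profits = {j: dot(prices, y) for j, y in production.items()}
endowment_value = {i: dot(prices, e) for i, e in endowments.items()}

# Each consumer's claim on profits through its ownership shares.
profit_claims = {
    i: sum(shares[i][j] * profits[j] for j in profits) for i in endowments
}

print(total_goods, profits, endowment_value, profit_claims)
```

With these numbers the producer earns a profit of 1.0, split 0.25 and 0.75 between the two consumers, and each consumer's endowment is worth 6.0 at the given prices.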

Equilibrium A market reaches an equilibrium state when there is no agent who wishes to deviate from that state

General equilibrium properties • Each general equilibrium is Pareto efficient • A general equilibrium exists if • there is a positive endowment of every good • the agents’ preferences are continuous, convex and monotone • A general equilibrium is unique if there is gross substitutability, i.e. raising the price of one good will not decrease the demand of another

Welfare Theorems First Welfare Theorem: Any competitive equilibrium is Pareto efficient Second Welfare Theorem: If the preferences and the technologies are convex, then any feasible Pareto optimal solution is a general equilibrium for some price vector and a set of endowments

Limitations A general equilibrium may not exist if: • Agents (consumers or producers) have market power • The aggregate excess demand function is noncontinuous (small changes in price result in big jumps in the quantity demanded) • Agents’ preferences have: • Externalities (some agent’s consumption or production directly influences another agent’s utility) • Nonconvexities • Complementarities (one commodity complements another)

Finding equilibrium solutions • How is a market equilibrium reached? • Algorithms need to take into account the tradeoffs between agents and the fact that the values of different goods to a single agent may be interdependent • Price tâtonnement process • A distributed algorithm proposed by Léon Walras • An iterative price adjustment scheme which uses a steepest-descent search method in order to find an efficient solution (provided that one exists)
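The iterative price adjustment idea can be sketched on a toy exchange economy. Everything specific here is an assumption made for illustration: two goods, two consumers with Cobb-Douglas utilities, good 2 used as the numeraire (p2 = 1), and a fixed step size. The price of good 1 is raised when there is excess demand for it and lowered when there is excess supply.

```python
def excess_demand_good1(p1, consumers):
    """Aggregate excess demand for good 1 when good 2 is the numeraire (p2 = 1).

    Each consumer is (alpha, (e1, e2)): alpha is the assumed Cobb-Douglas
    weight on good 1, and (e1, e2) is the consumer's endowment.
    """
    demand = 0.0
    supply = 0.0
    for alpha, (e1, e2) in consumers:
        wealth = p1 * e1 + e2            # value of the endowment at current prices
        demand += alpha * wealth / p1    # Cobb-Douglas demand for good 1
        supply += e1
    return demand - supply


def tatonnement(consumers, p1=1.0, step=0.5, tol=1e-9, max_iter=10_000):
    """Adjust p1 in the direction of excess demand until the market clears."""
    for _ in range(max_iter):
        z = excess_demand_good1(p1, consumers)
        if abs(z) < tol:
            return p1
        p1 += step * z
    raise RuntimeError("price adjustment did not converge")


# Hypothetical economy: consumer A owns 1 unit of good 1, consumer B owns 2 units of good 2.
consumers = [(0.6, (1.0, 0.0)), (0.3, (0.0, 2.0))]
p1_star = tatonnement(consumers)
print(round(p1_star, 6))
```

For this economy the excess demand for good 1 is 0.6/p1 - 0.4, so the market clears at p1 = 1.5, and by Walras's law the market for good 2 clears at the same prices. Convergence here relies on the well-behaved demand functions chosen for the example. The process is not guaranteed to converge in general.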

Game Theory • A game is a formal representation of a situation in which a number of agents interact in a setting of strategic interdependence • An agent’s welfare depends not only on its own decisions and actions but also on the other agents’ decisions and actions: the agents are in situations of strategic interdependence

Strategic games Elements of strategic interaction (game): • Players: Who plays the game; number of players • Rules: Who moves when? What can they do? • Outcomes: For each possible set of decision/ actions by the players, what is the outcome of the game? • Payoffs: What are the players’ preferences (utilities) over all the possible outcomes? • Information: What sort of information do players have? • Chance: Probability distribution over chance events, if any

Simple game example • Players: 2 players • Rules: Player A writes one of two words on a piece of paper ‘Top’ or ‘Bottom’. Player B simultaneously and independently writes ‘Left’ or ‘Right’ on a piece of paper. Each player submits their piece of paper • Outcomes and payoffs in normal form representation

Strategies • A strategy is a complete contingent plan or decision rule that describes how the player will act in each possible and distinguishable circumstance in which it is called upon to play • The player’s distinguishable circumstances (the player’s perspective) are represented by the set of its information sets • A pure strategy for player i specifies a deterministic choice (a single action smi) at each of its information sets • The strategy space is represented by a vector Si = (s1i, s2i, …, sni), where sni is the n-th strategy of player i

Extensive form representation [Figure: game tree. Player A chooses T or B; Player B chooses L or R; one label marks the information set for Player B. Terminal payoffs, left to right: (1,3), (1,1), (3,1), (1,0).] The extensive form representation of a game consists of: • The initial node (root) • branches • decision nodes • terminal nodes

Information in games • Perfect information • Imperfect information • Perfect recall • Common Knowledge • Games of complete (incomplete) information • Games of certainty (uncertainty) • Games of symmetric (asymmetric) information

Categories of games • Cooperative games: purpose is to develop mechanisms for cooperation • Competition games (zero sum games): the total benefit of all players in a game adds up to zero; one player can only benefit at the expense of another • Coexistence games: in biology, what is the right (equilibrium) mixture of behaviours (strategies) among a population of a particular species? • Commitment games: focus on commitment

Pareto efficiency in games • Agents are expected utility maximizers and therefore they prefer higher payoffs to lower ones • The players choose their strategies and arrive at a solution • A solution to a game is Pareto efficient if there is no other solution in which: • a player is strictly better off • no other player is worse off • A Pareto efficient solution is the socially optimal solution of a game

Dominant and dominated strategies • A strategy s*i is player i’s strictly dominant strategy if it maximises its payoff regardless of what the other players do • A dominant strategy may not be Pareto efficient • Some games do not have strictly dominant strategies • A strategy for a player i is dominated if there exists some alternative strategy that yields a greater payoff regardless of what the other players will do. Hence, a strategy s*i is a strictly dominant strategy if it dominates every other strategy in Si

Dominant Strategy Equilibrium • When all players have strictly dominant strategies, the outcome that ensues is called a dominant strategy equilibrium • The dominant strategy equilibrium concept makes no assumptions about the agents’ beliefs – very strong and robust • Not all games have a dominant strategy equilibrium
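A dominant strategy equilibrium can be found by checking, for each player, whether one strategy beats every alternative against every choice of the opponent. The sketch below does this for a two-player game, using the prisoner's dilemma payoffs that appear in the game tree later in this deck. The helper names `payoff` and `strictly_dominant` are hypothetical.

```python
# payoffs[(a, b)] = (payoff to Player A, payoff to Player B)
payoffs = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-10, 0),
    ("D", "C"): (0, -10),
    ("D", "D"): (-5, -5),
}
actions = ["C", "D"]


def payoff(player, own, other):
    """Payoff to `player` (0 = A, 1 = B) when it plays `own` and the opponent plays `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return payoffs[profile][player]


def strictly_dominant(player):
    """Return the player's strictly dominant strategy, or None if there is none."""
    for s in actions:
        if all(
            payoff(player, s, o) > payoff(player, t, o)
            for t in actions
            if t != s
            for o in actions
        ):
            return s
    return None


equilibrium = (strictly_dominant(0), strictly_dominant(1))
print(equilibrium)
```

Both players have D (defect) as a strictly dominant strategy, so (D, D) is the dominant strategy equilibrium, even though (C, C) would give both players a higher payoff.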

Nash Equilibrium • Battle of the Sexes (BoS): no dominant strategy equilibrium • An outcome (a pair of strategies) is a Nash equilibrium if each player’s strategy is an optimal choice given the other players’ strategies • A game can have more than one Nash equilibrium • A game may have no Nash equilibrium in pure strategies

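The Nash equilibrium can be defined in terms of the so-called best-response function: $B_i(s_{-i}) = \{ s_i \in S_i : u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i}) \text{ for all } s_i' \in S_i \}$ • Bi(s-i) may contain many strategies • At a Nash equilibrium, each agent’s strategy is an optimal response to the other agents’ strategies, i.e. $s_i^* \in B_i(s_{-i}^*)$ for every player i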
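The best-response characterisation translates directly into a search for pure-strategy Nash equilibria: a strategy pair is an equilibrium when each strategy is a best response to the other. The sketch below applies this to the Battle of the Sexes. The payoff table is an assumption, chosen to be consistent with the indifference equations this deck uses for Sally and Kevin in the mixed-strategy slides.

```python
from itertools import product

actions = ["Shopping", "Basketball"]

# payoffs[(sally, kevin)] = (payoff to Sally, payoff to Kevin); assumed values.
payoffs = {
    ("Shopping", "Shopping"): (2, 1),
    ("Basketball", "Basketball"): (1, 2),
    ("Shopping", "Basketball"): (0, 0),
    ("Basketball", "Shopping"): (0, 0),
}


def best_responses(player, other_action):
    """Set of actions maximising `player`'s payoff (0 = Sally, 1 = Kevin) against `other_action`."""

    def u(own):
        profile = (own, other_action) if player == 0 else (other_action, own)
        return payoffs[profile][player]

    best = max(u(a) for a in actions)
    return {a for a in actions if u(a) == best}


# A profile is a Nash equilibrium if each action is a best response to the other.
nash = [
    (s, k)
    for s, k in product(actions, actions)
    if s in best_responses(0, k) and k in best_responses(1, s)
]
print(nash)
```

The search finds two pure-strategy equilibria, one at each event, which is why the players face a coordination problem over which equilibrium to select.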

The Nash solution concept is underpinned by very strong assumptions • To play a Nash equilibrium in a single shot game • Every agent must have complete information about the others’ payoffs and preferences over outcomes (i.e. they must be common knowledge) • Rationality must be common knowledge • All agents must select the same Nash equilibrium

Mixed strategies • Players do not always make their choices with certainty. A player can randomise when faced with a choice • A mixed strategy σi is a probability distribution over pure strategies, where σi(smi) is the probability that i will choose strategy smi • Since σi is a probability distribution we require: $\sigma_i(s_{mi}) \ge 0$ for all m and $\sum_m \sigma_i(s_{mi}) = 1$

Mixed Strategy Nash Equilibrium • In the BoS game • Sally chooses Basketball with p>0 • Kevin chooses Shopping with q>0 • In equilibrium each player must be indifferent between its two pure strategies • For Sally: q(2)+(1-q)(0) = q(0)+(1-q)(1) • For Kevin: (1-p)(1)+p(0) = (1-p)(0)+p(2) • Hence p = q = 1/3

The mixed strategy in which each player chooses the other’s favourite event with a probability 1/3 and their own with probability 2/3 is a mixed strategy equilibrium – but inefficient • Every finite strategic-form game has a mixed strategy Nash equilibrium
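The indifference conditions can be verified with exact arithmetic. This sketch plugs p = q = 1/3 into the expected payoffs from the equations given for Sally and Kevin. The labelling of which side of each equation corresponds to Shopping or Basketball is my reading of those equations, not something the slides state explicitly.

```python
from fractions import Fraction

p = Fraction(1, 3)  # probability that Sally chooses Basketball
q = Fraction(1, 3)  # probability that Kevin chooses Shopping

# Sally's expected payoff from each pure strategy, given Kevin's mix.
sally_shopping = q * 2 + (1 - q) * 0
sally_basketball = q * 0 + (1 - q) * 1

# Kevin's expected payoff from each pure strategy, given Sally's mix.
kevin_shopping = (1 - p) * 1 + p * 0
kevin_basketball = (1 - p) * 0 + p * 2

print(sally_shopping, sally_basketball, kevin_shopping, kevin_basketball)
```

Each player is indifferent between its two options and expects a payoff of 2/3. That is lower than the payoff of at least 1 that each player would receive in either pure-strategy equilibrium, which is the sense in which the mixed equilibrium is inefficient.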

Interpretation of mixed strategies • An attempt by a player to behave unpredictably • As an expression of the other player’s beliefs regarding the pure strategy that the player itself is going to choose • Information that the players have about past interactions • Before the player moves it receives a private signal on which it can base its decision (but not consciously) • As above but the random factors now affect payoffs. Each player observes its own preferences, but not those of the others (this interpretation is due to Harsanyi)

Behaviour strategies • A behaviour strategy specifies, for each information set, the probability with which each action will be chosen, conditional on reaching that information set • Hence, a behaviour strategy specifies at each information set a conditional probability distribution over the actions available at that information set

The prisoner’s dilemma • The Pareto efficient solution is for both players to cooperate • Each player has a strictly dominant strategy to defect • Dominant strategy equilibrium which is not Pareto efficient • Problem lies in the uncertainty that each player faces: each has to speculate about the other’s move • Does cooperation among agents arise as a result of irrational behaviour?

Repeated PD • How do you play such a game? • Depends on whether the game is one-shot, or it is to be played a finite or infinite number of times • In repeated PD each agent has the opportunity to punish the other for defection

Finite number of times: if the number of rounds is known, then in each round both agents will choose to defect. Why? • Consider two players playing the PD for 10 rounds • What is going to happen in game 10? Both agents will defect: there is no point in cooperating as this is the last game • What is going to happen in game 9? Since the outcome of the last game is already determined, cooperating in game 9 cannot be rewarded later, so both players will attempt to exploit each other’s cooperative nature and defect • Using backward induction: both players will defect in each game

Infinite (or unknown) number of times • The situation now changes • As the players do not know how many games they will have to play and each player can punish the other’s defection in the next round, each player has the incentive to cooperate • Axelrod’s shadow of the future

Axelrod’s Tournament • In 1980 Robert Axelrod invited scientists to encode their PD strategies and compete against each other • Winning strategy: Anatol Rapoport’s tit-for-tat • in the first round cooperate • in round r>1, do whatever your opponent did in round r-1 • The significance of the result has been debated: does cooperation prevail after all?

Tit-for-tat is successful as • It offers an immediate punishment for the other agent’s defection • It is a forgiving strategy, as it only punishes the defector once • It is a rewarding strategy, if the other agent cooperates, it rewards this by continuing to cooperate
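The tit-for-tat rule is short enough to simulate. The sketch below plays a repeated prisoner's dilemma using the payoffs from the PD game tree in this deck. The `always_defect` opponent, the `play` helper and the choice of 10 rounds are hypothetical additions for illustration. This is not a reconstruction of Axelrod's tournament.

```python
# PAYOFFS[(a, b)] = (payoff to first player, payoff to second player)
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-10, 0),
    ("D", "C"): (0, -10),
    ("D", "D"): (-5, -5),
}


def tit_for_tat(own_history, opp_history):
    """Cooperate in the first round, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]


def always_defect(own_history, opp_history):
    """A simple opponent that defects in every round."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the repeated game and return the total payoff of each player."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(a, b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, always_defect, 10))
print(play(always_defect, always_defect, 10))
```

Two tit-for-tat players cooperate throughout and each loses only 1 per round. Against a constant defector, tit-for-tat is exploited once and then defects for the rest of the game, so the defector's gain from the first round is small compared with what mutual cooperation would have earned.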

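[Figure: prisoner’s dilemma game tree. Player A chooses C or D; Player B chooses C or D. Terminal payoffs, left to right: (-1,-1), (-10,0), (0,-10), (-5,-5).] Dynamic games • Games in which the players move sequentially • Agents that move later in the game can observe the others’ earlier moves, which may give them an advantage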
