A Proposed Game-Theoretic Model of Cooperation between Nodes in a MANET. Jim Catt, ECE 695, Department of Electrical and Computer Engineering, Purdue School of Engineering and Technology, Spring 2006.
Introduction and Motivation • In mobile ad hoc networks (MANETs), nodes must provide some level of relay service to other nodes for the network to operate at optimal global efficiency. • However, packet relay imposes a power cost on the relaying node. • Since MANET nodes are often battery powered, this cost shortens node lifetime. • The most rational local strategy for each node is therefore not to cooperate and to transmit only its own packets
Introduction and Motivation • If all nodes adopt this locally rational strategy, network connectedness drops to zero. • All nodes lose in this case – nodal utility drops to zero • Yet, if each node cooperates, there is the possibility to maximize the utility of all nodes. • This is a classical game theory scenario • Game theory has been utilized to analyze several aspects of MANET operation • This project is restricted to analysis of cooperation
Objective • The objective of this work is to develop a practical game-theoretic model of nodal cooperation that uses measurable, realistic parameters to make strategy choices, and when combined with feasible protocol modifications, can be reasonably implemented in MANET nodes.
Prisoner’s Dilemma • The Prisoner’s Dilemma is often used as a pedagogic example of game theory • Preliminaries • Player – an entity with preferences • Strategy – a set of actions available to a player, in response to the strategies of other players • Outcome – the result of the complete set of strategic choices by all players in the game • Utility – the amount of welfare a player derives from an outcome (or strategy) • Often expressed as a utility function, a mathematical mapping of the welfare received by the player from an outcome. • Payoff – usually formulated as: π = utility - cost
Prisoner’s Dilemma • The Prisoner’s Dilemma scenario: • Two people are arrested for armed robbery • There is not enough evidence to convict for armed robbery, but enough to convict for theft of the getaway car • Each prisoner is given the following choices: • If you confess and implicate your partner, but your partner doesn’t confess, you go free and she gets 10 years in prison • If you both confess, both get 5 years in prison • If neither confesses, both get 2 years for auto theft. • Utility (payoff) mapping: • Go free: 4 • 2 years: 3 • 5 years: 2 • 10 years: 0
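Prisoner’s Dilemma • The game can be represented in strategic form by a matrix of (Prisoner 1, Prisoner 2) payoffs, where "cooperate" means staying silent and "defect" means confessing: • Prisoner 1 cooperates, Prisoner 2 cooperates: (3, 3) • Prisoner 1 cooperates, Prisoner 2 defects: (0, 4) • Prisoner 1 defects, Prisoner 2 cooperates: (4, 0) • Prisoner 1 defects, Prisoner 2 defects: (2, 2) • The prisoners are separated and cannot communicate. • What will they decide?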
Prisoner’s Dilemma • Consider one prisoner at a time • For a specific strategy – either defect or cooperate – there are two possible payoffs • Which strategy offers the best set of potential payoffs? Or, equivalently, which strategy maximizes the minimum payoff?
Prisoner’s Dilemma • (Defect, Defect) is an equilibrium solution to the game (Nash Equilibrium) • However, this clearly isn’t the optimal solution, which is (Cooperate, Cooperate). • Hence, a Nash equilibrium isn’t necessarily an optimal solution to a game!
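The equilibrium argument can be checked mechanically. The sketch below is illustrative only: it encodes the payoff mapping from the scenario slide (go free = 4, 2 years = 3, 5 years = 2, 10 years = 0), treats "defect" as confessing and "cooperate" as staying silent, and uses function names that are hypothetical rather than taken from the presentation.

```python
from itertools import product

STRATEGIES = ("cooperate", "defect")

# PAYOFFS[(s1, s2)] = (payoff to prisoner 1, payoff to prisoner 2)
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),  # both get 2 years
    ("cooperate", "defect"): (0, 4),     # prisoner 1 gets 10 years, prisoner 2 goes free
    ("defect", "cooperate"): (4, 0),     # prisoner 1 goes free, prisoner 2 gets 10 years
    ("defect", "defect"): (2, 2),        # both get 5 years
}


def is_nash(s1, s2):
    """True if neither prisoner can gain by unilaterally switching strategy."""
    u1, u2 = PAYOFFS[(s1, s2)]
    p1_cannot_improve = all(PAYOFFS[(alt, s2)][0] <= u1 for alt in STRATEGIES)
    p2_cannot_improve = all(PAYOFFS[(s1, alt)][1] <= u2 for alt in STRATEGIES)
    return p1_cannot_improve and p2_cannot_improve


def nash_equilibria():
    """All pure-strategy Nash equilibria of the game."""
    return [pair for pair in product(STRATEGIES, repeat=2) if is_nash(*pair)]


def maximin_strategy():
    """Prisoner 1's strategy that maximizes the minimum payoff."""
    return max(
        STRATEGIES,
        key=lambda s: min(PAYOFFS[(s, other)][0] for other in STRATEGIES),
    )
```

Under these payoffs, mutual defection is the only pure-strategy equilibrium even though mutual cooperation pays both prisoners more, which is the point the slide makes.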
Strategies • Types of strategies: • Pure Strategy – a player chooses to play a certain strategy with probability 1. Usually only encountered in games of perfect information. • Mixed Strategy – a player has a set of strategies to choose from. A probability distribution describes the likelihood that a particular strategy will be chosen.
Game Theory and Cooperation in MANETs • Classical game theory models for cooperation in MANETs: • economic payment model • punishment/reward model. • Regardless of model, there is little consistency in the formulation of utility functions. • Many formulations employ abstractions for utilities and costs (less practical) • Some are based on some energy measure (more practical). • Many require extraordinary overhead in the exchange of information between nodes
Proposed Approach • Premise: the basic resource available to a node is its lifetime store of energy (battery life). • This resource is consumed by either computational functions or information exchange functions, both part of “mission” execution • Node behavior must strike a balance between: • achieving maximum lifetime • executing its mission.
Proposed Approach: Ground Rules • Sending and receiving packets requires cooperation. • Payment is in-kind (punishment/reward framework) • Payoff should be proportional to the benefit received. • Cost for cooperation: • decrease in potential lifetime, or • alternately, lost opportunity to transmit own packets in the future.
Problem Formulation • Dual objectives: • Maximization of the lifetime function • Subject to maintaining reward R ≥ 0. • Assumptions and conventions • Slotted communication intervals of fixed length • Packet length L is fixed for this study. • Data (symbol) rate Rb is fixed for this study. • One packet time = Tp = L/Rb.
Assumptions and Conventions (cont.) • On average, a node is connected to two or more adjacent nodes • nodes are uniformly distributed throughout the region of interest, and • The average mobility of the network is sufficiently high such that no node is confined to an edge or border region for long periods of time
Restrictions • Only selfish nodes are considered, not malicious nodes • The proposed approach is for steady state conditions. • Modification for startup conditions requires further study. • Energy consumption associated with packet reception is ignored because even a selfish node will listen for its own packets.
Playing the Game • A node has a relay buffer and an own buffer. • At each slot time, a node plays a mixed strategy, and may choose from the following action set: • Neither transmit nor relay • Transmit its own packet, given that a packet is available in its own buffer • Relay a received packet, given that a packet is available in its relay buffer. • For this version of the game, the node will not transmit if: • both its own buffer and its relay buffer are empty. • either sending its own packet or relaying a packet causes its cumulative payoff to be negative for the current slot time
Playing the Game • $P_R$ = probability that node i relays a packet. • $P_O$ = probability that node i sends its own packet. • $\pi_R$ = payoff received by node i when it relays a packet • $\pi_O$ = payoff received by node i when it sends its own packet • The expected payoff (reward) for node i is: $R = P_R \pi_R + P_O \pi_O$ • A rational node will act to maintain cumulative $R \ge 0$, or: $P_R \pi_R + P_O \pi_O \ge 0$ • Equality with zero is allowed because, temporarily, the only strategy available to node i may cause $R = 0$.
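As a rough numerical illustration of the expected payoff, the sketch below assumes the two probabilities sum to one (as asserted later in the deck) and that the payoffs are plain real numbers. The function name and the example values are hypothetical.

```python
def expected_payoff(p_relay, pi_relay, pi_own):
    """Expected per-slot reward R = P_R * pi_R + P_O * pi_O.

    Assumes P_O = 1 - P_R, i.e. the node either relays or sends its own
    packet whenever it transmits.
    """
    if not 0.0 <= p_relay <= 1.0:
        raise ValueError("p_relay must lie in [0, 1]")
    p_own = 1.0 - p_relay
    return p_relay * pi_relay + p_own * pi_own


# Hypothetical values: relaying costs the node (negative payoff), while
# sending its own packet pays off.
example_reward = expected_payoff(p_relay=0.5, pi_relay=-1.0, pi_own=2.0)
```

With these illustrative numbers the node breaks even (R = 0) when it relays two slots out of three, so any smaller relay probability keeps the reward non-negative.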
Definitions • Total available energy at t = 0 is ET. • k = 1, 2, …, N, the number of packets relayed by node i for other nodes • m = 1, 2, …, M, the number of own packets transmitted by node i. • The total number of relay nodes (end-to-end) required for node i’s m-th packet, is a random variable, • j = 0, 1, 2, …, J, set of links to adjacent nodes • The power used to transmit the m-th packet over the j-th link is a random variable denoted by: • The energy used to transmit the m-th packet over the j-th link is given by: • Denote relay energy as Er, and energy used to transmit own packet as Eo.
Energy usage function • Average CPU power is Wcpu. • At time t, the total energy remaining for node i is:
Lifetime function • The maximum possible lifetime is: • Maximum remaining lifetime at time t is:
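A minimal sketch of the energy and lifetime bookkeeping follows. It rests on assumptions that are not confirmed by the slides: remaining energy is taken to be ET minus CPU energy (Wcpu times elapsed time) minus the energy spent on relayed and own packets, and lifetime is taken to be energy divided by Wcpu, i.e. the time the node would survive if it never transmitted again. Reception energy is ignored, as stated under Restrictions. All function and parameter names are hypothetical.

```python
def remaining_energy(e_total, w_cpu, t, relay_energies, own_energies):
    """Assumed energy left at time t, in joules.

    e_total        -- total energy available at t = 0 (E_T)
    w_cpu          -- average CPU power in watts (W_cpu)
    relay_energies -- energy spent on each relayed packet so far (E_r values)
    own_energies   -- energy spent on each own packet so far (E_o values)
    """
    return e_total - w_cpu * t - sum(relay_energies) - sum(own_energies)


def max_lifetime(e_total, w_cpu):
    """Assumed maximum possible lifetime: the node only ever runs its CPU."""
    return e_total / w_cpu


def max_remaining_lifetime(e_remaining, w_cpu):
    """Assumed maximum remaining lifetime given the energy left at time t."""
    return max(e_remaining, 0.0) / w_cpu


# Hypothetical numbers: 1000 J battery, 0.5 W CPU, 100 s elapsed,
# two relayed packets and one own packet sent so far.
energy_left = remaining_energy(1000.0, 0.5, 100.0, [2.0, 3.0], [5.0])
```

Under this reading, every joule spent on transmission shortens the maximum remaining lifetime by 1/Wcpu seconds, which is what makes relaying costly to the node.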
Payoff functions • Payoff = utility - cost.
Constructing PR and PO • PR and PO give the strategy rule that can be used by the node to pick its strategy at each slot time. • PR and PO should be proportional to the payoffs received by node i, and the level of cooperation received by node i. • Define V as a measure of the relationship between the payoffs, or, the ratio of the absolute values of the payoffs: • The expected payoff R becomes :
Constructing PR and PO • Define the following events: • AQR = the event that there is a packet in the relay buffer • AQO = the event that there is a packet in own transmit buffer • AR = the event that a packet is relayed • AO = the event that own packet is transmitted • AT = the event that a packet is transmitted, either a relayed packet or own packet • ARS = the event that a relayed packet successfully reaches its destination • AOS = the event that the node’s own packet successfully reaches its destination
Constructing PR and PO • Assertions: • The relevant event space is AT = (AR ∪ AO) • PO = P(AO|AT), and PR = P(AR|AT) • PO + PR = 1 • [Figure: diagram of the events AQR, AQO, AOS, ARS, AO, and AR]
Constructing PR and PO • The cooperation experienced by node i for relay of its own packets is P(AOS|AO). • Define the weighted payoff, $\pi_O'$, and weighted $V'$: • as P(AOS|AO) → 0: $V'$ → 0, PO → 1, PR → 0. • as P(AOS|AO) → 1: $V'$, PO, and PR all approach equilibrium values
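The limiting behavior above can be reproduced with a simple mapping from payoffs to probabilities. Everything in this sketch is an assumption made for illustration: the weighted own payoff is taken as P(AOS|AO) times the own payoff, V' as the ratio of its absolute value to that of the relay payoff, and PR as V'/(1 + V'). These forms were chosen only because they satisfy PO + PR = 1 and the stated limits, and they may differ from the author's definitions.

```python
def strategy_probabilities(pi_own, pi_relay, cooperation):
    """Return (P_R, P_O) under the assumed forms described above.

    cooperation -- estimate of P(AOS|AO), the fraction of the node's own
                   packets that other nodes carry to the destination
    """
    if not 0.0 <= cooperation <= 1.0:
        raise ValueError("cooperation must lie in [0, 1]")
    if pi_relay == 0:
        raise ValueError("pi_relay must be non-zero for the ratio to exist")
    pi_own_weighted = cooperation * pi_own            # assumed weighted payoff
    v_weighted = abs(pi_own_weighted) / abs(pi_relay)  # assumed V'
    p_relay = v_weighted / (1.0 + v_weighted)          # assumed P_R
    return p_relay, 1.0 - p_relay
```

With no cooperation received the node never relays, and as cooperation rises the relay probability settles at a value fixed by the payoff ratio.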
Strategy Rule parameters • Define β as an estimate of P(AOS|AO): • Define an estimate of PR: • update each parameter prior to each new slot time
Strategy Rule parameters • Define the cumulative reward up to the current slot time: • Define the candidate updates for RC: • Define :
Strategy Rule algorithm • If AQR = 1, calculate $\pi_{R,k+1}$ and $R_R$. • If AQO = 1, calculate $\pi_{O,m+1}$ and $R_O$. • if (AQO = 1 & AQR = 0), • if $\pi_O$ > 0, then AO = 1 (send own packet), else do nothing • else if (AQO = 0 & AQR = 1), • if $R_R$ >= 0, then AR = 1 (accept relay request), else do nothing • else if (AQO = 1 & AQR = 1), • if [condition missing in source], then AR = 0 (reject relay), and if $\pi_O$ > 0, AO = 1 (send own packet), else do nothing • else if $R_R$ >= 0, then AR = 1 (accept relay request) • else if $\pi_O$ > 0, then AO = 1 (send own packet) • else do nothing • end • update β, PR and RC.
Strategy Rule algorithm • This algorithm can be applied on a global basis (no discrimination between nodes requesting relays) or on a node-by-node basis (a β parameter is calculated for each node).
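The branching of the strategy rule can be sketched as a single per-slot decision function. The condition that triggers relay rejection when both buffers hold a packet is not recoverable from the slides, so this sketch substitutes a random draw against the relay probability. That substitution is an assumption, as are the function name, argument names, and return values.

```python
import random


def choose_action(has_own, has_relay, pi_own, r_relay, p_relay, rng=random):
    """Pick one action for the current slot: 'own', 'relay', or 'idle'.

    has_own, has_relay -- whether the own / relay buffer holds a packet
    pi_own             -- payoff for sending the next own packet
    r_relay            -- candidate cumulative reward if the relay is accepted
    p_relay            -- current estimate of the relay probability P_R
    """
    if has_own and not has_relay:
        return "own" if pi_own > 0 else "idle"
    if has_relay and not has_own:
        return "relay" if r_relay >= 0 else "idle"
    if has_own and has_relay:
        # ASSUMPTION: the rejection test is a random draw against P_R;
        # the original condition is missing from the source.
        reject_relay = rng.random() >= p_relay
        if reject_relay:
            return "own" if pi_own > 0 else "idle"
        if r_relay >= 0:
            return "relay"
        if pi_own > 0:
            return "own"
    return "idle"
```

Applying the rule on a node-by-node basis would mean calling this function with a relay probability derived from the β of the node requesting the relay, rather than a single global value.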
Proposed Protocol Modifications for Own Packets • Routing Tables • For AODV, routing tables are modified to include all nodes on the path to the destination. However, the current routing method is still employed (i.e. next hop routing). • No change to DSR for path routing list • Furthermore, the routing table is modified by adding two fields to hold values that are used to estimate cooperation from other nodes. • NUM_PKT_OFFERED • NUM_PKT_ACCEPTED • These fields can be used to estimate each node’s unique β if distinguishing between nodes achieves better fairness. • Otherwise, when summed over all nodes, they can be used to calculate a global β
Proposed Protocol Modifications for Own Packets • Transport protocol must support an ACK mechanism in order to estimate P(AOS|AO) • A destination node k sends an ACK for each packet successfully received from node i (i.e., use a wireless, pseudo connection-oriented transport protocol) • To reduce overhead, an ACK could be applied to a block of packets, where block size is adjustable
Implementation for Own Packets • When node i transmits its own packet to destination node k: • If node j is an intermediate (relay) node on the path to node k, • NUM_PKT_OFFEREDj += 1. • If an ACK is received from node k, • NUM_PKT_ACCEPTEDj += 1 • If the ACK timer expires, execute normal transport protocol congestion adaptation • If a RERR is received for node k before the ACK timeout, • NUM_PKT_OFFEREDj -= 1.
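The counter updates above might be kept in a small per-node table. This sketch assumes β is estimated as the ratio NUM_PKT_ACCEPTED / NUM_PKT_OFFERED, either per relay node or summed over all nodes. The class, its method names, and the default value returned before any packet has been offered are hypothetical and not part of AODV or DSR.

```python
class CooperationTable:
    """Tracks offered and accepted packet counts per relay node."""

    def __init__(self):
        self.offered = {}   # NUM_PKT_OFFERED, keyed by relay node id
        self.accepted = {}  # NUM_PKT_ACCEPTED, keyed by relay node id

    def on_send(self, relays):
        """An own packet was transmitted along a path through `relays`."""
        for j in relays:
            self.offered[j] = self.offered.get(j, 0) + 1

    def on_ack(self, relays):
        """The destination acknowledged a packet sent through `relays`."""
        for j in relays:
            self.accepted[j] = self.accepted.get(j, 0) + 1

    def on_rerr(self, relays):
        """A route error arrived before the ACK timeout: undo the offer."""
        for j in relays:
            self.offered[j] = max(0, self.offered.get(j, 0) - 1)

    def beta(self, node=None, default=1.0):
        """Assumed estimate of P(AOS|AO); global when `node` is None."""
        if node is None:
            offered = sum(self.offered.values())
            accepted = sum(self.accepted.values())
        else:
            offered = self.offered.get(node, 0)
            accepted = self.accepted.get(node, 0)
        # ASSUMPTION: with no history, fall back to `default`.
        return accepted / offered if offered else default
```

A short usage example: after `on_send(["a", "b"])` and `on_ack(["a", "b"])`, followed by `on_send(["b"])` with no ACK, `beta("a")` is 1.0 and `beta("b")` is 0.5.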
Summary • Developed payoff functions that include parameters incorporating energy usage and cooperation level. • Can be calculated from available or reasonably measurable information, or from minor modifications to protocol • Developed a stochastic decision rule based on modified payoff functions, thereby taking into account the influence on battery life and cooperation • Proposed minor protocol modification and routing table modification that enable the strategy rule. • Developed an algorithm implementing the strategy rule
Future Work • Formally verify that the proposed approach achieves a stable and optimal or pseudo-optimal equilibrium. • Alternately, prove that the proposed framework is Pareto-efficient. • Test the model using a network simulation tool to verify that: • it achieves optimality • it is stable • it is insensitive to noisy β and estimate of PR • the proposed protocol modifications are viable and do not add unacceptable overhead cost. • Develop a better method to estimate P(AOS|AO), as the estimator should take into account the impact of packet loss due to congestion or noise, i.e., remove or reduce the influence of these effects on β. • β may also need smoothing to account for lag in feedback • Develop modifications to the model that take into account start up conditions
Utility functions • The utility function for a node transmitting its own packet is: • Utility has units of hops per joule. Maximizing utility with regard to resource usage also maximizes remaining lifetime.
Utility associated with relaying a packet • When node i relays a packet for node j, it should receive a benefit (utility) that is proportional to the utility accrued to node j. • Let hj be the total number of relay nodes required for j’s packet. Node i’s share of the utility accrued to j is:
Cost functions • The cost incurred by node i for either transmitting its own packet or relaying a packet is the incremental decrease in its potential future utility. • The incremental cost in lifetime for relaying a packet is:
Cost functions • Likewise, the incremental cost in lifetime for transmitting own packet is: • Let be the average utility received by node i in one packet time as a result of transmitting one of its own packets. Then, the incremental utility cost to node i when it relays a packet is proportional to the incremental cost in lifetime:
Cost functions • Likewise, the incremental utility cost to node i for transmitting its own packet is: