This research analyzes the dynamics of the Minority Game (MG) as a model of competition among adaptive agents, emphasizing learning methods and collective efficiency. We explore concepts such as Markovian behavior, neural networks, and reinforcement learning in simple and multistate worlds. The study investigates inefficiencies caused by strategic behaviors, offering remedies and generalizations to asymmetric problems. Utilizing results from the El Farol bar problem and employing sophisticated agents’ strategies, we uncover the implications of memory in decision-making and the optimization of prediction strategies in competitive environments.
Competition between adaptive agents: learning and collective efficiency
Damien Challet (Oxford University), Matteo Marsili (ICTP-Trieste, Italy), challet@thphys.ox.ac.uk
Outline • My definition of the Minority Game • Simple worlds (M = 0) • Markovian behavior • Neural networks • Reinforcement learning • Multistate worlds (M > 0) • Cause of large inefficiencies • Remedies • From El Farol to MG and back
'Truth is always in the minority' Kierkegaard
Zig-Zag-Zoug • Game played by Swiss children • 3 players, 3 feet, 3 magic words • “Ziiig” ... “Zaaag” .... “ZOUG!”
Minority Game Zig-Zag-Zoug with N players Aim: to be in the minority Outcome = #UP-#DOWN = #A-#B Model of competition between adaptive players Challet and Zhang (1997), from El Farol's bar problem (Arthur 1994)
Initial goals of the MG El Farol (1994): impossible to understand Drastic simplification, keeping key ingredients Bounded rationality Reinforcement learning Symmetrize the problem: 60/100 -> 50/50 Understand the symmetric problem Generalize results to the asymmetric problem
Repeated games • Why play again? Frustration: losers are in the majority • How to play? • Deduction: rationality, best answer. All lose! • Induction: limited capabilities; beliefs, strategies, personality; trial and error; learning
Minority Game • N agents, $i = 1, \ldots, N$ • Choice of agent $i$ at time $t$: $a_i(t) \in \{+1, -1\}$ • Aggregate outcome: $A(t) = \sum_{i=1}^{N} a_i(t)$ • Payoff to player $i$: $-a_i(t)\,A(t)$ • Total losses $= A^2$
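The definitions on this slide can be checked with a few lines of Python. This is only a minimal sketch: the function name `play_round` and the seven-player example are illustrative and not part of the talk.

```python
import numpy as np

def play_round(actions):
    """Evaluate one Minority Game round for an array of +1/-1 choices.

    Returns the aggregate outcome A, the payoff -a_i * A of every
    player, and the total loss (minus the sum of all payoffs).
    """
    a = np.asarray(actions)
    A = int(a.sum())                  # A(t) = sum_i a_i(t)
    payoffs = -a * A                  # minority side gets a positive payoff
    total_loss = int(-payoffs.sum())  # equals A**2 because a_i**2 = 1
    return A, payoffs, total_loss

# Illustrative round: five players choose +1, two choose -1.
A, payoffs, total_loss = play_round([+1, +1, +1, +1, +1, -1, -1])
```

Since every $a_i^2 = 1$, summing the payoffs gives $-A^2$, which is why the slide writes the total losses as $A^2$.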
Markovian learning • 'If it ain't broken, don't fix it' (Reents et al., Physica A 2000) • If I won, I stick to my previous choice • If I lost, I change to the other choice with probability $p$ • Results ($\sigma^2 = \langle A^2 \rangle$): • $pN = x = \text{const}$ (small $p$): $\sigma^2 = 1 + 2x(1 + x/6)$ • $p \sim N^{-1/2}$: $\sigma^2 \sim N$ • $p \sim 1$: $\sigma^2 \sim N^2$
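The win-stay / lose-shift rule is simple enough to simulate directly. The sketch below is an assumption-laden illustration, not the code behind the quoted results: the names `markovian_mg` and `sigma2`, the random initial condition, and the use of an odd N (so that A is never zero) are all choices made here.

```python
import numpy as np

def markovian_mg(N, p, steps, seed=0):
    """Simulate Markovian learning and return the time series of A(t).

    Winners (minority side) keep their choice; losers (majority side)
    switch to the other choice with probability p.
    Assumption: N is odd, so there is always a strict majority.
    """
    rng = np.random.default_rng(seed)
    a = rng.choice([-1, 1], size=N)      # assumed random initial choices
    history = np.empty(steps, dtype=int)
    for t in range(steps):
        A = int(a.sum())
        history[t] = A
        lost = a * A > 0                 # players on the majority side
        flip = lost & (rng.random(N) < p)
        a = np.where(flip, -a, a)
    return history

def sigma2(history, burn_in=0):
    """Estimate sigma^2 = <A^2> from a simulated time series."""
    A = history[burn_in:].astype(float)
    return float(np.mean(A ** 2))
```

Running `sigma2(markovian_mg(N, x / N, steps))` for several N at fixed x is one way to compare a simulation against the small-p expression quoted on the slide. For p = 1 every loser switches, so the whole population ends up on the same side and oscillates, which is the $\sigma^2 \sim N^2$ regime.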
Markovian learning II • Problem: if N is unknown, p = ? • Try: $p = f(t)$, e.g. $p = t^{-k}$ • Convergence for any N • Freezing • When to stop?
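A time-dependent switching probability can be sketched in the same way. Everything below is hypothetical scaffolding around the slide's example $p = t^{-k}$: the function names, starting the clock at t = 1, and the random initial condition are assumptions.

```python
import numpy as np

def flip_probability(t, k):
    """Schedule p = t**(-k) for t = 1, 2, ... (the slide's example f(t))."""
    return float(t) ** (-k)

def markovian_mg_scheduled(N, k, steps, seed=0):
    """Markovian learning where losers switch with probability t**(-k).

    Assumption: N is odd and time starts at t = 1, so p(1) = 1.
    """
    rng = np.random.default_rng(seed)
    a = rng.choice([-1, 1], size=N)
    history = np.empty(steps, dtype=int)
    for t in range(1, steps + 1):
        A = int(a.sum())
        history[t - 1] = A
        lost = a * A > 0                              # majority side
        flip = lost & (rng.random(N) < flip_probability(t, k))
        a = np.where(flip, -a, a)
    return history
```

Because p keeps shrinking, fewer and fewer losers move as time goes on, which is the freezing mentioned on the slide; the sketch does not answer the "when to stop" question.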
Neural networks • Simple perceptrons, learning rate R (Metzler ++ 1999) • $\sigma^2 = N + N(N-1)F(N,R)$ • $\sigma^2_{\min} = N(1 - 2/\pi) = 0.363\ldots N$
Reinforcement learning • Each player has a register $D_i$ • $D_i > 0$: + is better • $D_i < 0$: - is better • $D_i(t+1) = D_i(t) - A(t)$ • Choice: $\text{prob}(+ \mid D_i) = f(D_i)$, $f'(x) > 0$ (RL)
Reinforcement learning II • Central result: agents minimize $\langle A \rangle^2$ (predictability) for all $f$ • Stationary state: $\langle A \rangle = 0$ • Fluctuations = ? • Ex: $f(x) = (1 + \tanh(Kx))/2$, exponential learning, $K$ learning rate • $K < K_c$: $\sigma^2 \sim N$ • $K > K_c$: $\sigma^2 \sim N^2$
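The register dynamics and the tanh choice rule can be combined in a short simulation. This is a rough sketch under stated assumptions: the names `prob_plus` and `rl_mg` are made up here, and the Gaussian initial registers are an assumption, since the slides do not say how the registers are initialized.

```python
import numpy as np

def prob_plus(D, K):
    """Choice rule f(D) = (1 + tanh(K * D)) / 2 with learning rate K."""
    return 0.5 * (1.0 + np.tanh(K * np.asarray(D, dtype=float)))

def rl_mg(N, K, steps, seed=0):
    """Simulate reinforcement learning with D_i(t+1) = D_i(t) - A(t).

    Assumption: registers start from independent standard normal
    values; N should be odd so that A is never zero.
    """
    rng = np.random.default_rng(seed)
    D = rng.normal(size=N)
    history = np.empty(steps, dtype=int)
    for t in range(steps):
        a = np.where(rng.random(N) < prob_plus(D, K), 1, -1)
        A = int(a.sum())
        history[t] = A
        D = D - A                     # every register moves against A(t)
    return history
```

Estimating $\langle A^2 \rangle$ from `rl_mg` for a small and a large value of K is one way to look for the two regimes quoted on the slide; the sketch makes no claim about where $K_c$ lies.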
Reinforcement learning III • Market impact: each agent has an influence on the outcome • Naive agents: payoff $-A = -A_{-i} - a_i$ • Non-naive agents: payoff $-A + c\,a_i$ • Smart agents: payoff $-A_{-i}$ (cf. WLU, AU) • Central result 2: non-naive agents minimize $\langle A^2 \rangle$ (fluctuations) for all $f$ -> Nash equilibrium, $\sigma^2 \sim 1$
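The three payoff signals differ only in how much of an agent's own contribution is removed from A. The sketch below evaluates them for one round; the function name `payoff_signals` and the example actions are illustrative, and the smart-agent signal is read here as $-A_{-i}$ with $A_{-i} = A - a_i$.

```python
import numpy as np

def payoff_signals(actions, c):
    """Per-agent payoff signals for one round of +1/-1 actions.

    naive:      -A             (own contribution included)
    non_naive:  -A + c * a_i   (own contribution partly corrected)
    smart:      -A_{-i}        (own contribution removed, A_{-i} = A - a_i)
    """
    a = np.asarray(actions)
    A = a.sum()
    naive = -A * np.ones_like(a)
    non_naive = -A + c * a
    smart = -(A - a)
    return naive, non_naive, smart

# Illustrative round with three +1 and two -1 choices.
naive, non_naive, smart = payoff_signals([1, 1, 1, -1, -1], c=1.0)
```

With c = 0 the non-naive signal reduces to the naive one, and with c = 1 it coincides with $-A_{-i}$, so c interpolates between ignoring and fully accounting for one's own impact.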
Minority Games with memory • If an agent believes that the outcome depends on the past results, the outcome will depend on the past results. • Sunspot effect • Self-fulfilling prophecies • Fallacies of causal inference • Consequence: the other agents will change their behavior accordingly
Minority Games with memory: naïve agents • [Figure: $\sigma^2/N$ versus $P/N$] • Fixed randomly drawn strategies = quenched disorder • Tools of statistical physics give the exact solution in principle • Agents minimize the predictability • Predictability = Hamiltonian • Optimization problem ? • Numeric: Savit++ PRL99 • Analytic: Challet++ PRL99, Coolen+ J. Phys A 2002
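A game with memory and fixed random strategies can be sketched as follows. This is a common textbook form of the model written under several assumptions that the slides do not spell out: $P = 2^M$ histories, S = 2 strategies per agent, naive score updates $U_{i,s} \to U_{i,s} - a_{i,s}^{\mu} A$, ties between scores broken by taking the first strategy, and a history that records the winning side. The function name `minority_game` is hypothetical.

```python
import numpy as np

def minority_game(N, M, S=2, steps=2000, seed=0):
    """Simulate a Minority Game with memory M and return A(t).

    Each agent holds S fixed random lookup tables (quenched disorder)
    mapping each of the P = 2**M histories to an action +1 or -1, and
    plays the table with the highest score. Assumption: N is odd.
    """
    rng = np.random.default_rng(seed)
    P = 2 ** M
    strategies = rng.choice([-1, 1], size=(N, S, P))
    scores = np.zeros((N, S))
    mu = int(rng.integers(P))            # current history index
    agents = np.arange(N)
    history = np.empty(steps, dtype=int)
    for t in range(steps):
        best = scores.argmax(axis=1)     # ties go to the first strategy
        a = strategies[agents, best, mu]
        A = int(a.sum())
        history[t] = A
        # Naive update: every strategy is scored as if it had been played.
        scores -= strategies[:, :, mu] * A
        winning_bit = 1 if A < 0 else 0  # +1 was the minority side
        mu = ((mu << 1) | winning_bit) % P
    return history
```

Plotting the time average of $A^2/N$ from `minority_game` against $P/N$ for several values of M is one way to reproduce the kind of curve the figure on this slide refers to.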
Minority Games with memory: low efficiency P/N is not the right scaling for large fluctuations
Minority Games with memory: origin of low efficiency • Stochastic dynamical equation for strategy score $U_i$: slowly varying part (I) + correlated noise (II) • I: size independent • II $= K P^{-1/2}$ • When I << II, large fluctuations • Transition at I$/K = G/P^{1/2}$ • Critical signal-to-noise ratio $= G/P^{1/2}$
Minority Games with memory: origin of low efficiency • Determine $G$ • Predict critical points $G/P^{1/2}$ • Check: I$/K$
Minority Games with memory: origin of low efficiency • [Figure panels: AFTER, BEFORE]
Minority Games with memory: sophisticated agents Agents minimize fluctuations Optimization problem again
Reverse problem Many variations, different global utility functions • Grand canonical game (play or not play) • Time window of scores (exponential moving average) • Any payoff Hence, given a task (global utility function), one knows how to design agents (local utility). example: optimal defects combinations (cf. Neil's talk)
From El Farol to MG and back • El Farol: attendance between 0 and N, threshold L • MG: attendance between 0 and N, threshold L = N/2 • Differences, similarities? • Which results from MG are valid for El Farol?
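From El Farol to MG and back • [Figure: attendance axis from 0 to N with threshold L; labels $S$ and $N\langle a \rangle$] • Theorem: all results from MG apply to El Farol • Everything scales like $(L/N - \langle a \rangle)/S = g\,P^{1/2}$ • The El Farol problem with P states of the world is solved.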
From El Farol to MG and back: new results • If $(L/N - \langle a \rangle)/S = g\,P^{1/2}$ [?] $0$ and $P > P_c = 2S^2 / [\pi (L/N - \langle a \rangle)^2]$: no more phase transition.
Summary • AU/WLU suppresses large fluctuations -> Nash equilibrium • Design: agents must know they have an impact. Knowledge of the exact impact is not crucial • Reverse problem also possible • MG: simple, rich, fun, and useful • www.unifr.ch/econophysics/minority (102 commented references)