Exploring belief learning in an unstable infinite game using direct elicitation and payoff-table tracking. Subjects' beliefs and strategies are analyzed in an experimental setting to fit learning models and study how beliefs form.
Belief Learning in an Unstable Infinite Game
Paul J. Healy, CMU
Overview • Issue #1 • Issue #2 • Issue #3
Issue #1: Infinite Games • Typical learning model: • Finite set of strategies • Strategies get weight based on 'fitness' • Bells & whistles: experimentation, spillovers… • Many important games have infinitely many strategies • Duopoly, public goods (PG), bargaining, auctions, war of attrition… • Is quality of fit sensitive to grid size? • Models don't use the structure of the strategy space
Previous Work • Grid size and fit quality: • Arifovic & Ledyard • Groves-Ledyard mechanisms • Convergence failure of reinforcement learning with |S| = 51 • Strategy space structure: • Roth & Erev AER '99 • Quality-of-fit/error measures: • What's the right metric space? • Closeness in probabilities or closeness in strategies? (see the sketch below)
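To make the metric-space question concrete, here is a minimal sketch (with made-up numbers, not the paper's data) contrasting the two notions of closeness: a near miss in strategy space looks terrible when error is measured in probability space but mild when measured in strategy units.

```python
import numpy as np

# Hypothetical one-period-ahead prediction over the integer strategy grid.
grid = np.arange(-10, 11)                    # strategies -10, ..., 10
pred = np.exp(-0.5 * (grid - 2.0) ** 2)      # model's mass peaked near s = 2
pred /= pred.sum()
realized = 3                                 # the subject actually played 3

# (a) Probability-space error: how little mass sat on the realized strategy.
prob_error = 1.0 - pred[grid == realized][0]

# (b) Strategy-space error: expected distance between prediction and play.
strat_error = np.sum(pred * np.abs(grid - realized))

print(prob_error, strat_error)               # (a) looks bad, (b) looks mild
```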
Issue #2: Unstable Game • Learning studies usually predict convergence rates • Example: p-beauty contests • Instability: • Toughest test for learning models • Most statistical power
Previous Work • Chen & Tang '98 • Walker mechanism & an unstable Groves-Ledyard mechanism • Fit: Reinforcement > Fictitious Play > Equilibrium • Healy '06 • 5 PG mechanisms; predicting whether play converges • Feltovich '00 • Unstable finite Bayesian game • Fit varies by game and by error measure
Issue #3: Belief Learning • If subjects are forming beliefs, measure them! • Method 1: Direct elicitation • Incentivized guesses about s_{-i} • Method 2: Inference from payoff table usage • Tracking payoff 'lookups' may inform our models
Previous Work • Nyarko & Schotter '02 • Subjects best respond to stated beliefs • Stated beliefs are not too accurate • Costa-Gomes, Crawford & Broseta '01 • Mouselab to identify types • About how players solve games, not about learning
This Paper • Pick an unstable infinite game • Give subjects a calculator tool & track its usage • Elicit beliefs in some sessions • Fit models to the data in the standard way • Study the formation of "beliefs" • "Beliefs" inferred from calculator usage • "Beliefs" taken from elicited guesses
The Game • Walker's public goods mechanism for 3 players • Added a 'punishment' parameter
Parameters & Equilibrium • v_i(y) = b_i·y − a_i·y² + c_i • Pareto optimum: y = 7.5 • Unique PSNE: s_i* = 2.5 • Punishment γ = 0.1 • Purpose: not too wild; payoffs rarely negative • Guessing payoff: 10 − |g_L − s_L|/4 − |g_R − s_R|/4 • Game payoffs: Pr(payoff < 50) = 8.9%, Pr(payoff > 100) = 71%
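The two payoff formulas above are simple enough to transcribe directly; a minimal sketch follows, with the valuation parameters a_i, b_i, c_i left as arguments since their values are not shown on this slide.

```python
def valuation(y, a_i, b_i, c_i):
    """Quadratic valuation from the slide: v_i(y) = b_i*y - a_i*y**2 + c_i."""
    return b_i * y - a_i * y ** 2 + c_i

def guess_payoff(g_left, g_right, s_left, s_right):
    """Guessing payoff from the slide: 10 - |g_L - s_L|/4 - |g_R - s_R|/4."""
    return 10 - abs(g_left - s_left) / 4 - abs(g_right - s_right) / 4

# Perfect guesses of both neighbors' strategies earn the full 10 points:
print(guess_payoff(2.5, 2.5, 2.5, 2.5))  # 10.0
```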
Choice of Grid Size • S = [-10, 10]
Properties of the Game • Best response dynamics: unstable • One eigenvalue of the BR map is +2, so deviations from equilibrium roughly double each period
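Why an eigenvalue of +2 makes the dynamics so unstable: along that eigendirection, the deviation from equilibrium doubles every period. A tiny illustration (this scalar map is a stand-in for intuition, not the actual Walker-mechanism best response):

```python
s_star = 2.5                 # the unique PSNE strategy from the slide
s = 2.6                      # start one tenth away from equilibrium
for t in range(6):
    s = s_star + 2.0 * (s - s_star)   # deviation doubles each period
    print(t, round(s, 2))             # 2.7, 2.9, 3.3, 4.1, 5.7, 8.9
```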
Design • PEEL Lab, U. Pittsburgh • All sessions: • 3-player groups, 50 periods • Same group and ID#s for all periods • Payoffs etc. were common information • No explicit public goods framing • Calculator always available • 5-minute 'warm-up' with the calculator • Sessions 1-6: guess s_L and s_R (beliefs elicited) • Sessions 7-13: baseline, no guesses
Does Elicitation Affect Choice? • Total Variation: • No significant difference (p=0.745) • No. of Strategy Switches: • No significant difference (p=0.405) • Autocorrelation (predictability): • Slightly more without elicitation • Total Earnings per Session: • No significant difference (p=1) • Missed Periods: • Elicited: 9/300 (3%) vs. Not: 3/350 (0.8%)
Does Play Converge? • [Figures: average |s_i − s_i*| per period and average |y − y°| per period]
Accuracy of Beliefs • Guesses get better over time • [Figure: average ||guess − s_{-i}(t)|| per period, for elicited guesses and for calculator inputs]
Model 1: Parametric EWA • δ: weight on foregone payoffs of strategies not actually played • φ: decay rate on past attractions • ρ: decay rate on past experience • A(0): initial attractions • N(0): initial experience • λ: response sensitivity to attractions
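For reference, a minimal sketch of the standard Camerer-Ho EWA update these parameters enter; the grid representation and function names are mine, and this is a generic textbook form rather than the paper's exact implementation.

```python
import numpy as np

def ewa_update(A, N, played_idx, payoffs, delta, phi, rho):
    """One parametric-EWA step.

    A          : attractions, one entry per strategy on the grid
    N          : experience weight
    played_idx : index of the strategy actually played
    payoffs    : payoff each strategy would have earned against s_{-i}(t)
    """
    weight = np.full(len(A), delta)   # foregone payoffs get weight delta
    weight[played_idx] = 1.0          # the realized payoff gets full weight
    N_new = rho * N + 1.0
    A_new = (phi * N * A + weight * payoffs) / N_new
    return A_new, N_new

def choice_probs(A, lam):
    """Logit response: P(s_j) proportional to exp(lambda * A_j)."""
    z = np.exp(lam * (A - A.max()))   # subtract max for numerical stability
    return z / z.sum()
```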
Model 1': Self-Tuning EWA • N(0) = 1 • Replace δ and φ with deterministic functions of observed play:
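The slide's formulas for those functions did not survive extraction. In Ho, Camerer & Chong's self-tuning EWA, as I recall it, φ becomes a "change detector" and δ an "attention" function; the sketch below follows that form, and the exact construction of the frequency vectors should be treated as an assumption.

```python
import numpy as np

def phi_change_detector(hist_freq, recent_freq):
    """phi(t) = 1 - S(t)/2, where the surprise index S(t) compares the
    cumulative frequencies of opponents' strategies with recent play."""
    surprise = np.sum((hist_freq - recent_freq) ** 2)
    return 1.0 - 0.5 * surprise

def delta_attention(payoffs, realized_payoff):
    """delta_j(t) = 1 for strategies that would have paid at least as much
    as what the player actually earned, 0 otherwise."""
    return (payoffs >= realized_payoff).astype(float)
```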
STEWA: Setup • Only remaining free parameters: λ and A(0) • λ will be estimated • The 5 minutes of 'calculator time' pin down A(0): • A(0): average payoff from calculator trials
STEWA: Fit • Likelihoods are numerically 'zero' for all λ • Likely culprit: lots of near misses in the predictions • Alternative measure: quadratic scoring rule (QSR) • Best fit: λ = 0.04 (previous studies: λ > 4) • Suggests attractions are very concentrated
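One common form of the quadratic scoring rule for a forecast distribution is sketched below (the paper's exact normalization may differ). It shows why near misses hurt: a point forecast one strategy off scores the maximum penalty of 2, worse even than a uniform forecast.

```python
import numpy as np

def quadratic_score(pred, realized_idx):
    """QSR = sum_j (p_j - 1{j = realized})^2, in [0, 2]; lower is better."""
    target = np.zeros_like(pred)
    target[realized_idx] = 1.0
    return np.sum((pred - target) ** 2)

point = np.zeros(21); point[14] = 1.0            # all mass on s = 4
print(quadratic_score(point, 15))                # 2.0 when s = 5 was played
print(quadratic_score(np.full(21, 1 / 21), 15))  # ~0.95: uniform does better
```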
STEWA: Adjustment Attempts • The problem: near misses in strategy space, not in time • Suggests altering δ (weight on hypotheticals): • Original specification: QSR* = 1.193 @ λ* = 0.04 • δ = 0.7 (p-beauty estimate): QSR* = 1.056 @ λ* = 0.03 • δ = 1 (belief model): QSR* = 1.082 @ λ* = 0.175 • δ(k,t) = % of BR payoff: QSR* = 1.077 @ λ* = 0.06 • Altering φ: • 1/8 weight on surprises: QSR* = 1.228 @ λ* = 0.04
STEWA: Other Modifications • Equal initial attractions: worse • Smoothing: • Takes advantage of the strategy space structure • λ spreads probability across all strategies evenly; smoothing spreads it to nearby strategies • Smoothed attractions • Smoothed probabilities • But… no improvement in QSR* or λ*! • Tentative conclusion: STEWA is either not broken or can't be fixed this way…
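One way to implement the "smoothed attractions" idea is to kernel-average attractions over nearby strategies so a near miss still earns credit; the Gaussian kernel and the bandwidth below are my assumptions, not the paper's choices.

```python
import numpy as np

def smooth_attractions(A, grid, bandwidth=1.0):
    """Blend each strategy's attraction with its neighbors' on the grid."""
    diffs = grid[:, None] - grid[None, :]
    K = np.exp(-0.5 * (diffs / bandwidth) ** 2)
    K /= K.sum(axis=1, keepdims=True)   # each row is a set of weights
    return K @ A                        # smoothed attraction vector
```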
Other Standard Models • Nash equilibrium • Uniform mixed strategy ('Random') • Logistic Cournot BR • Deterministic Cournot BR • Logistic fictitious play (sketched below) • Deterministic fictitious play • k-period BR
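As one example from this list, a sketch of logistic fictitious play: beliefs are the empirical distribution of opponents' past play, and choice is a logit response to expected payoffs. Here `payoff_fn` is a stand-in for the game's payoff function, not something defined in the slides.

```python
import numpy as np

def fictitious_play_probs(history, grid, payoff_fn, lam):
    """history: list of past opponent profiles s_{-i}; grid: own strategies."""
    expected = np.array([
        np.mean([payoff_fn(s, s_mi) for s_mi in history]) for s in grid
    ])
    z = np.exp(lam * (expected - expected.max()))
    return z / z.sum()   # deterministic FP is the limit as lam grows large
```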
"New" Models • Best respond to stated beliefs (Sessions 1-6 only) • Best respond to calculator entries • Issue: how to aggregate calculator usage? • Decaying average of inputs (sketched below) • Reinforcement based on calculator payoffs • Decaying average of payoffs
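A minimal sketch of the first aggregation option, a decaying average of calculator inputs; the decay rate here is an arbitrary placeholder, not an estimated value.

```python
import numpy as np

def decaying_average(entries, decay=0.8):
    """Aggregate calculator entries, down-weighting older inputs."""
    weights = np.array([decay ** k for k in range(len(entries))][::-1])
    return np.dot(weights, entries) / weights.sum()

print(decaying_average([1.0, 2.0, 4.0]))  # ~2.56, closer to the last entry
```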
Model Comparisons • [Table of in- and out-of-sample model fits] • Estimates on the grid of integers {-10, -9, …, 9, 10} • In = periods 1-35, Out = periods 36-end
The "Take-Homes" • Methodological issues: • Infinite strategy space • Convergence vs. instability • The right notion of error • Self-tuning EWA fits best • Guesses & calculator input don't seem to offer any additional predictive power… ?!?!