From bounded rationality to learning

Bernard WALLISER (Paris School of Economics). Rationality, Heuristics and Motivation in Decision Making, Pisa, November 12-14, 2010.

Presentation Transcript


  1. From bounded rationality to learning
     Bernard WALLISER (Paris School of Economics)
     Rationality, Heuristics and Motivation in Decision Making, Pisa, November 12-14, 2010

  2. Introduction (1)
     ¤ Simon’s problem
       - decision considered as a reasoning process
       + limited capacities of information gathering and processing
       → bounded rationality procedures (satisficing, ...)
       But how exactly are these procedures related to the assumptions?
     ¤ Meta-optimization paradox
       If the decision-maker optimizes
       - the gathering of information (Winter)
       - the choice of the best choice procedure (Mongin-Walliser)
       he needs
       - to have prior information on the available information
       - to take into account the computing costs
       Hence meta-optimization leads to an infinite regress, which turns out to be a vicious one (Mongin-Walliser)

  3. Introduction (2)
     ¤ Bounded rationality was initially defined in a static way:
       • fixed environment
       • fixed beliefs and preferences
       • fixed choice rule (combining beliefs and preferences)
     ¤ Learning processes later introduced partial dynamics:
       • non-stationary environment (game)
       • revised beliefs when new information comes in
       • endogenous preferences when more experience is acquired
       • adaptive choice rule
     → comparison by distinguishing information gathering, information processing and choice
     → treatment in various contexts (epistemic logics, decision theory)

  4. Statics: limited information on context
     ¤ uncertainty on context
       - plainly probabilistic
       - hierarchical but always probabilistic: ambiguity
       - non-probabilistic: qualitative probabilities, belief functions
       Ex: Choquet utility maximization (Gilboa-Schmeidler)
     ¤ unawareness (not knowing, and not knowing that one does not know)
       - treated in epistemic logics
       Ex: precautionary principle
     ¤ limited crossed beliefs
       - p-accuracy reasoning
       - k-level reasoning
       - crossed awareness (Meier et al)
       Ex: cognitive hierarchy model (Camerer)
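
As an illustration of k-level reasoning, here is a minimal sketch. The game is an assumption, not taken from the slides: a "guess p times the average" contest in which a level-0 player guesses 50 and a level-k player best-responds to opponents assumed to be level k-1. The names `level_k_guess`, `p` and `level0_guess` are hypothetical. Note that the cognitive hierarchy model cited above instead lets each level respond to a mixture of all lower levels; that refinement is not implemented here.

```python
def level_k_guess(k: int, p: float = 2 / 3, level0_guess: float = 50.0) -> float:
    """Guess of a level-k player who believes all others reason at level k-1."""
    guess = level0_guess
    for _ in range(k):
        # Best response to opponents who all guess `guess`: p times their average.
        guess = p * guess
    return guess


if __name__ == "__main__":
    for k in range(4):
        print(k, round(level_k_guess(k), 2))
```

Each extra level of reasoning multiplies the guess by p, so bounded depth k keeps the guess away from the fully rational limit of 0.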

  5. Statics: limited information on preferences
     ¤ multidimensional preferences
       - preferences are decomposed and simplified
       Ex: satisficing (Simon), elimination by aspects (Tversky)
     ¤ random preferences
       - preferences correspond to alternative ‘moods’
       Ex: discrete choice model, quantal model
     ¤ context-dependent preferences
       - situated preferences
         Ex: context (history)-dependent aspiration levels for satisficing
       - reference points (status quo, norm)
         Ex: reference for gains vs losses in EU
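
Two of the examples above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the author's formulation: `satisfice` examines options in the order given and stops at the first one whose utility reaches a fixed aspiration level, and `logit_probabilities` is the standard logit form of a discrete choice model, with a hypothetical `precision` parameter controlling how sharply choice follows utility.

```python
import math
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def satisfice(options: Iterable[T], utility: Callable[[T], float],
              aspiration: float) -> Optional[T]:
    """Return the first option whose utility reaches the aspiration level."""
    for option in options:
        if utility(option) >= aspiration:
            return option
    return None  # no option is satisfactory


def logit_probabilities(utilities: List[float], precision: float = 1.0) -> List[float]:
    """Choice probabilities proportional to exp(precision * utility)."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(precision * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    print(satisfice([1, 5, 3, 9], lambda x: x, aspiration=3))
    print(logit_probabilities([1.0, 2.0, 3.0], precision=2.0))
```

With `precision` close to 0 the logit rule chooses almost uniformly; as it grows the rule approaches utility maximization.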

  6. Statics: simplified choice rule
     ¤ limited logical omniscience
       - treated in epistemic logics
       Ex: satisficing (sequential examination of actions) ?
     ¤ finite number of internal states
       - simple expression of ‘computation complexity’
       Ex: finite automata
     ¤ computation costs
       - approximate cost of mental calculation
       Ex: basic operations
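
To make the finite-automaton idea concrete, here is a minimal sketch of a strategy with two internal states. The choice of strategy (tit-for-tat in a repeated game with actions "C" and "D") is an assumption made for illustration, and the names `TIT_FOR_TAT` and `run_automaton` are hypothetical.

```python
from typing import Dict, List

# A two-state automaton: the state is simply the action to play next.
TIT_FOR_TAT: Dict[str, object] = {
    "initial": "C",
    "output": {"C": "C", "D": "D"},
    # (current state, opponent's move) -> next state
    "transition": {
        ("C", "C"): "C",
        ("C", "D"): "D",
        ("D", "C"): "C",
        ("D", "D"): "D",
    },
}


def run_automaton(automaton: Dict[str, object], opponent_moves: List[str]) -> List[str]:
    """Play the automaton against a fixed sequence of opponent moves."""
    state = automaton["initial"]
    played = []
    for move in opponent_moves:
        played.append(automaton["output"][state])
        state = automaton["transition"][(state, move)]
    return played


if __name__ == "__main__":
    print(run_automaton(TIT_FOR_TAT, ["C", "D", "D", "C"]))
```

The number of states is one simple measure of the complexity of the choice rule.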

  7. Dynamics: information search on context
     ¤ exogenous information
       - resulting from purchase from specialized institutes and characterized by its value (as opposed to its cost)
       Ex: signals about the actual state (correlated with it)
       → limited relevance
     ¤ endogenous free information
       - resulting from repeated observation
       Ex: observation of others’ actions in fictitious play
       → memory constraints
       → scope constraints (information neighbourhood)
     ¤ endogenously induced information
       - resulting from voluntary (suboptimal) action and characterized by its value (as opposed to the loss of utility)
       Ex: search procedures
       → ambiguous interpretation

  8. Dynamics: information search on one’s own preferences
     ¤ observation of one’s own past utility of actions
       - assuming that choice utility = (expected) felt utility
       Ex: CPR model
       → partial preferences (incompleteness)
     ¤ observation of others’ utility of actions
       - assuming that others’ utility = one’s own utility (in the same situations)
       Ex: imitation of successful opponents
       → biased preferences

  9. Dynamics: processing of information about context
     ¤ expectation process
       - especially of others’ strategies
       - stationarity assumption
       → extrapolative expectation
       Ex: fictitious play (probability = frequency of past actions)
     ¤ belief revision procedure
       - 3 contexts: updating, revising, focusing
       - possibility of contradiction between initial belief and message
       → simplified or distorted Bayes rule (judgment biases)
       Ex: weight between initial belief and message
     ¤ reconstruction of structural information
       - 3 types of information: factual (past), structural (constant), strategic (future)
       - pattern recognition (trends, cycles)
       - revelation of others’ preferences (abductive process)
       Ex: reputation effect
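
The fictitious-play expectation (probability = frequency of past actions) and a weighted belief revision can be sketched as follows. This is a minimal sketch under assumptions: payoffs are given as a dictionary keyed by (own action, other's action), the uniform belief used before any observation is an arbitrary choice, and the linear mixing in `revise_belief` is only one possible reading of "weight between initial belief and message". All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def empirical_belief(history: List[str], actions: List[str]) -> Dict[str, float]:
    """Belief about the other's next action = frequency of past actions."""
    if not history:
        return {a: 1.0 / len(actions) for a in actions}  # assumed uniform prior
    counts = Counter(history)
    return {a: counts[a] / len(history) for a in actions}


def best_response(belief: Dict[str, float], payoff: Dict[Tuple[str, str], float],
                  own_actions: List[str]) -> str:
    """Action maximizing expected payoff against the belief."""
    def expected(a: str) -> float:
        return sum(prob * payoff[(a, b)] for b, prob in belief.items())
    return max(own_actions, key=expected)


def revise_belief(prior: Dict[str, float], message: Dict[str, float],
                  weight: float) -> Dict[str, float]:
    """Linear mix of prior and message; weight = trust placed in the message."""
    return {s: (1 - weight) * prior[s] + weight * message.get(s, 0.0) for s in prior}


if __name__ == "__main__":
    coordination = {("L", "L"): 1, ("L", "R"): 0, ("R", "L"): 0, ("R", "R"): 1}
    belief = empirical_belief(["L", "L", "R", "L"], ["L", "R"])
    print(belief, best_response(belief, coordination, ["L", "R"]))
```

With `weight = 1` the message overrides the initial belief entirely; with `weight = 0` it is ignored.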

  10. Dynamics: processing of information about one’s own preferences
     ¤ performance indices
       - (average or cumulative) index for each action
       - stationarity assumption
       → proxy for utility function
       Ex: CPR rule
     ¤ adaptation of aspiration levels
       - adaptive level for global index (for instance, best past utility)
       → proxy for utility level
       Ex: dynamic satisficing (Simon)
     ¤ reconstruction of structural information
       - design of relative vs absolute preferences
       Ex: regret matching (unconditional regret index: difference, over the past, between the utility a given strategy would have yielded against others’ implemented strategies and the utility actually obtained)
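
The two indices above can be sketched as follows. This is a hedged illustration, not the author's specification: CPR is read here as a cumulative proportional reinforcement rule, in which each action keeps a cumulative payoff index and is chosen with probability proportional to it (payoffs are assumed strictly positive). The regret part follows the definition on the slide, averaged over a non-empty history of (own action, other's action) pairs, with actions then played in proportion to positive regret. All names are hypothetical.

```python
from typing import Dict, List, Tuple


def cpr_update(cumulative: Dict[str, float], action: str, payoff: float) -> Dict[str, float]:
    """Add the payoff just obtained to the index of the action played."""
    updated = dict(cumulative)
    updated[action] += payoff
    return updated


def cpr_probabilities(cumulative: Dict[str, float]) -> Dict[str, float]:
    """Choice probabilities proportional to cumulative payoff indices."""
    total = sum(cumulative.values())
    return {a: v / total for a, v in cumulative.items()}


def unconditional_regrets(history: List[Tuple[str, str]],
                          payoff: Dict[Tuple[str, str], float],
                          actions: List[str]) -> Dict[str, float]:
    """Average gain from having always played `a` instead of the actual play."""
    if not history:
        raise ValueError("history must be non-empty")
    obtained = sum(payoff[(own, other)] for own, other in history)
    return {
        a: (sum(payoff[(a, other)] for _, other in history) - obtained) / len(history)
        for a in actions
    }


def regret_matching_probabilities(regrets: Dict[str, float]) -> Dict[str, float]:
    """Play each action with probability proportional to its positive regret."""
    positive = {a: max(r, 0.0) for a, r in regrets.items()}
    total = sum(positive.values())
    if total == 0:
        return {a: 1.0 / len(regrets) for a in regrets}  # no regret: uniform (assumption)
    return {a: p / total for a, p in positive.items()}
```

For instance, starting from indices `{"A": 1.0, "B": 1.0}` and receiving a payoff of 2 after playing "A", the CPR probabilities become 0.75 for "A" and 0.25 for "B".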

  11. Dynamics: adaptive choice rule (1)
     ¤ inertial behaviour
       - repeated action if sufficient past payoff
       Ex: reaction to aspiration levels (continue if levels are reached)
     ¤ exploration behaviour
       - random exploration, fixed or decreasing
       - directed exploration
       Ex: randomized fictitious play
     ¤ exploitation behaviour
       - quasi-optimizing behaviour
       Ex: fictitious play
     ¤ stochastic reinforcement behaviour
       - noisy best response
       - stochastic matching (probabilistic behaviour monotonic with utility)
       → implicit exploration-exploitation dilemma
       Ex: CPR (decreasing exploration)
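
The exploration-exploitation trade-off can be illustrated with a minimal sketch. The specific rule is an assumption, not taken from the slides: with probability `epsilon` the agent explores uniformly at random, otherwise it exploits the action with the best current index, and `exploration_rate` gives one hypothetical decreasing schedule.

```python
import random
from typing import Dict


def exploration_rate(t: int, initial: float = 1.0) -> float:
    """Decreasing exploration probability at period t (hypothetical schedule)."""
    return initial / (1 + t)


def choose(indices: Dict[str, float], epsilon: float, rng: random.Random) -> str:
    """Explore at random with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(sorted(indices))   # random exploration
    return max(indices, key=indices.get)     # exploitation of the best index


if __name__ == "__main__":
    rng = random.Random(0)
    indices = {"A": 1.0, "B": 2.0}
    print([choose(indices, exploration_rate(t), rng) for t in range(10)])
```

A fixed `epsilon` corresponds to fixed random exploration; feeding in `exploration_rate(t)` makes exploration vanish over time.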

  12. Dynamics: adaptive choice rule (2)
     ¤ imitation
       - grounded on complementary preferences (preferential mimetism)
       - grounded on information differences (informational mimetism)
       - grounded on better experience (experienced mimetism)
       Ex: plain diffusion model; imitation of successful opponents
     ¤ analogy-based reasoning
       - previous contexts (case-based reasoning)
       - repetitive game structures
       Ex: case-based rule (Gilboa-Schmeidler); analogical equilibrium (Jehiel)

  13. Dynamics: adaptive choice rule (3)
     ¤ restricted choice rules
       - specific action set, for instance unidimensional
         Ex: stubborn rule (Laslier-Walliser)
       - specific beliefs, for instance objective probabilities
         Ex: stopping rules in search
       - specific preferences, for instance multicriteria choice
         Ex: choice 2 by 2 + synthesis
     ¤ context-adaptive choice rules
       - parlour games: chess, cards, Cluedo
         Ex: keep pawns tight
       - sports
         Ex: throwing a ball with constant angle
       - labyrinths, puzzles
         Ex: keep right rule

  14. Asymptotic results
     ¤ system’s trajectory
       - transitory state
       - asymptotic state (speed of convergence)
       → different time scales (role of random shocks)
     ¤ convergence of expectations
       - towards locally rational ones
     ¤ convergence of actions (or strategies)
       - elimination of (strictly) dominated strategies
       - convergence notions towards equilibrium states
       - convergence in time-average or action by action
         Ex: fictitious play
       - convergence towards a unique or multiple (point-)equilibrium (selection when exploration vanishes)
       - cyclical and chaotic attractors
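
Convergence in time-average can be illustrated with a small simulation. The setting is an assumption chosen for illustration: fictitious play in matching pennies, where player 0 wants to match and player 1 wants to mismatch. Actions keep cycling, yet the empirical frequencies approach the mixed equilibrium (1/2, 1/2). The initial fictitious observations and the tie-breaking rule are arbitrary choices, and the function name is hypothetical.

```python
from typing import Tuple


def simulate_matching_pennies(rounds: int) -> Tuple[float, float]:
    """Return the frequency of 'heads' for each player after fictitious play."""
    # counts[i][a]: number of times player i played action a (0 = heads, 1 = tails).
    # One arbitrary fictitious observation per player starts the process.
    counts = [[1, 0], [0, 1]]
    for _ in range(rounds):
        # Player 0 (matcher) copies player 1's most frequent past action.
        a0 = 0 if counts[1][0] >= counts[1][1] else 1
        # Player 1 (mismatcher) plays the opposite of player 0's most frequent action.
        a1 = 1 if counts[0][0] >= counts[0][1] else 0
        counts[0][a0] += 1
        counts[1][a1] += 1
    total = rounds + 1
    return counts[0][0] / total, counts[1][0] / total


if __name__ == "__main__":
    for rounds in (10, 100, 2000):
        print(rounds, simulate_matching_pennies(rounds))
```

Each player's action in a given round is deterministic, so there is no convergence action by action; only the time-averages settle down.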

  15. Conclusion (1)
     ¤ dispersed models, even if there are two main classes (grounded on cognitive capacities)
       - belief-based learning
         Ex: fictitious play
       - reinforcement learning
         Ex: CPR
     ¤ combination of models
       - models depending on choice context and results
       - hybrid models
       - models with heterogeneous agents
     ¤ need to consider the precise reasoning modes followed by agents:
       - counterfactual reasoning (simulation of opponents)
       - abductive reasoning (detection of structural or behavioral regularities)
       - analogical reasoning (situations treated as similar)
       - taxonomical reasoning (categorization)

  16. Conclusion (2)
     ¤ possibility of meta-learning
       - belief revision rule
         Ex: parameter trading off initial belief and message in extended Bayes rule
       - preferences
         Ex: degree of altruism in individual preferences
       - choice rule
         Ex: parameter in logit rule
       → learning levels again give rise to an infinite regress (to stop it, the highest level has to be given)
     ¤ infinite regress stopped by an evolutionary process, but
       - mix of evolutionary process (capacities and constraints imposed by evolution) and cultural process (capacities and constraints conditioned by society)
       - very slow time scale (compared with a fluctuating environment)
       - concrete mechanism not exhibited