This presentation covers mechanism design with regret-based Incremental Partial Revelation Mechanisms (iPRMs). It addresses how to elicit agent preferences efficiently without incurring substantial communication and computation costs. By trading off decision quality against revelation costs, the authors aim to maintain acceptable incentive properties in social choice settings. The proposed model minimizes maximum regret when choosing an allocation, shows how strategic information elicitation can improve decision quality, and concludes with experimental results.
Regret-based Incremental Partial Revelation Mechanism Design • Nathanaël Hyafil, Craig Boutilier • AAAI 2006 • Department of Computer Science, University of Toronto
Bargaining for a Car • Luggage Capacity? • Two Door? • Cost? • Engine Size? • Color? • Options?
Mechanism Design • Mechanism design tackles this: • Design the rules of the game to induce behavior that maximizes some objective (e.g., social welfare, revenue, ...) • Objective value depends on private information held by self-interested agents ⇒ Elicitation + Incentives
“Computational” Mechanism Design • The interesting questions: • what preference info is relevant to the task at hand? • when is the elicitation effort worth the improvement it offers in decision quality? • how to deal with incentives?
Overview • Mechanism Design Background • Incremental Partial Revelation Mechanism • Regret-based iPRMs • Experimental results • Conclusion / Future Work
Basic Social Choice Setup • Choice of x from outcomes X • Agents 1..n: type ti ∈ Ti and valuation vi(x, ti) • Type vectors: t ∈ T • Goal: implement social choice function f: T → X • e.g., social welfare SW(x, t) = Σi vi(x, ti) • Quasi-linear utility: • ui(x, pi, ti) = vi(x, ti) − pi • Our focus: social welfare maximization
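To make the setup concrete, here is a minimal sketch of social welfare and quasi-linear utility in Python, assuming valuations are represented as dictionaries from outcomes to values; the representation and the example numbers are illustrative, not from the paper.

```python
# Minimal sketch of the social choice setup: additive social welfare and
# quasi-linear utility. Valuations are plain dicts from outcomes to values.

def social_welfare(x, valuations):
    """SW(x, t) = sum_i v_i(x, t_i): total value of outcome x."""
    return sum(v_i[x] for v_i in valuations)

def quasi_linear_utility(x, payment, v_i):
    """u_i(x, p_i, t_i) = v_i(x, t_i) - p_i: valuation minus payment."""
    return v_i[x] - payment

# Two agents, two outcomes: the efficient choice maximizes social welfare.
valuations = [{"a": 3.0, "b": 1.0},   # agent 1's type, as a value per outcome
              {"a": 0.5, "b": 2.0}]   # agent 2's type
best = max(["a", "b"], key=lambda x: social_welfare(x, valuations))
print(best, social_welfare(best, valuations))   # -> a 3.5
```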
Basic Mechanism Design • A mechanism m consists of three components: • actions Ai • allocation function O: A → X • payment functions pi: A → ℝ • Mechanism is incentive compatible: • In equilibrium, agents reveal truthfully • Ex-post IC: • Assume others tell the truth and agent i knows the others’ types • Then agent i should tell the truth
Properties • Mechanism is efficient: • maximizes social welfare given reported types • ε-efficient: within ε of optimal social welfare • Ex post individually rational (IR): • no agent can lose by participating • ε-IR: can lose at most ε
Direct Mechanisms • Revelation principle: focus on direct mechanisms where agents directly and (in equilibrium) truthfully reveal their full types • For example, Groves schemes (e.g., VCG): • choose the efficient allocation and charge each agent via the Groves payment function • incentive compatible in dominant strategies • efficient, individually rational
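For reference, here is a hedged sketch of the Groves/VCG (Clarke) payment rule over a finite outcome set, reusing the dictionary representation above. This is the textbook full-revelation rule, not the paper's partial-revelation mechanism.

```python
# Sketch of VCG over a finite outcome set: pick the efficient outcome and
# charge each agent the externality it imposes on the others (Clarke pivot).

def vcg(outcomes, valuations):
    sw = lambda x, vals: sum(v[x] for v in vals)
    x_star = max(outcomes, key=lambda x: sw(x, valuations))      # efficient allocation
    payments = []
    for i in range(len(valuations)):
        others = valuations[:i] + valuations[i + 1:]
        best_without_i = max(sw(x, others) for x in outcomes)    # others' welfare without agent i
        payments.append(best_without_i - sw(x_star, others))     # agent i's externality
    return x_star, payments

x_star, p = vcg(["a", "b"], [{"a": 3.0, "b": 1.0}, {"a": 0.5, "b": 2.0}])
print(x_star, p)   # -> a [1.5, 0.0]
```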
Cost of Full Revelation • Communication costs • Computation costs • Cognitive costs • Privacy costs • INTRACTABLE! ⇒ Partial revelation?
Partial Revelation • Full revelation: • Not always necessary for optimal decision • When necessary, not always worth the costs • Partial revelation: • Elicit just enough to make optimal decision • Trade-off elicitation costs with decision quality • Can we maintain incentives?
Existing Work on Partial Revelation [Conen, Hudson, Sandholm, Parkes, Nisan & Segal, Blumrosen & Nisan] • Most work: • requires enough revelation to determine the optimal allocation and VCG payments • hence can’t offer savings in general [Nisan & Segal 05] • Exception: priority games [Blumrosen & Nisan 02] • specific settings (1-item, combinatorial auctions)
Overview • Mechanism Design Background • Incremental Partial Revelation Mechanism (iPRM) • Regret-based iPRMs • Experimental results • Conclusion / Future Work
Incremental Partial Revelation Mechanisms (iPRMs) • iPRM interacts with agents: • set of queries Qi (e.g., standard gamble: “v(car) > 5?”) • response r ∈ Ri(qi) interpreted as partial type θi(r) ⊆ Ti (e.g., bounds on each parameter) • Formal model (see paper)
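A small illustrative sketch of how a partial type and a bound query could be represented, assuming (as the slide suggests) that partial types are interval bounds on valuation parameters; the class and method names are hypothetical, not from the paper.

```python
# Illustrative partial type: interval bounds on valuation parameters,
# refined by yes/no bound queries such as "is v(car) > 5?".

class PartialType:
    def __init__(self, bounds):
        self.bounds = dict(bounds)          # parameter -> (lower, upper)

    def answer_bound_query(self, param, threshold, answer_is_yes):
        lo, hi = self.bounds[param]
        if answer_is_yes:                   # value exceeds the threshold
            self.bounds[param] = (max(lo, threshold), hi)
        else:                               # value is at most the threshold
            self.bounds[param] = (lo, min(hi, threshold))

theta = PartialType({"car": (0.0, 10.0)})
theta.answer_bound_query("car", 5.0, answer_is_yes=True)
print(theta.bounds)   # -> {'car': (5.0, 10.0)}
```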
iPRMs • Goal: • Trade off quality of allocation with revelation costs • Maintain acceptable incentive properties • At each step, given partial type θ, choose between: • Terminating (which allocation?) • Eliciting (which query?)
Minimax Regret: Utility Uncertainty • Regret of x relative to x′ at type t: R(x, x′, t) = SW(x′, t) − SW(x, t) • Max regret of x given partial type θ: MR(x, θ) = maxt∈θ maxx′ R(x, x′, t) • MMR-optimal allocation: x* = arg minx MR(x, θ)
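A minimal minimax-regret sketch, assuming each agent's valuation for each outcome is known only to lie in an interval and that the intervals are independent. Under that assumption the adversary simply pushes values for the alternative allocation to their upper bounds and values for the chosen allocation to their lower bounds; the data layout and function names are assumptions, not the paper's.

```python
# Max regret and MMR-optimal allocation under interval uncertainty.
# bounds[i][x] = (lower, upper) on agent i's valuation of outcome x.

def pairwise_max_regret(x, x_alt, bounds):
    """max over consistent types of SW(x_alt, t) - SW(x, t)."""
    if x == x_alt:
        return 0.0
    # Adversary sets each v_i(x_alt) to its upper bound, v_i(x) to its lower bound.
    return sum(b[x_alt][1] - b[x][0] for b in bounds)

def max_regret(x, outcomes, bounds):
    """MR(x, theta): worst-case loss of choosing x instead of the best outcome."""
    return max(pairwise_max_regret(x, x_alt, bounds) for x_alt in outcomes)

def mmr_allocation(outcomes, bounds):
    """x* = argmin_x MR(x, theta): the minimax-regret-optimal allocation."""
    return min(outcomes, key=lambda x: max_regret(x, outcomes, bounds))

bounds = [{"a": (2.0, 4.0), "b": (0.0, 1.0)},
          {"a": (0.0, 1.0), "b": (1.0, 3.0)}]
x_star = mmr_allocation(["a", "b"], bounds)
print(x_star, max_regret(x_star, ["a", "b"], bounds))   # -> a 2.0
```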
Overview • Mechanism Design Background • Incremental Partial Revelation Mechanism • Regret-based iPRMs • Experimental results • Conclusion / Future Work
Regret-based Elicitation • Find a query to reduce the MMR level • Several heuristics proposed for preference elicitation • We adapt the Current Solution Strategy (CSS): • Focus elicitation on the allocations involved in the regret computation
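A hedged sketch of a CSS-style query selector, reusing the minimax-regret helpers above: it targets a parameter of the current MMR-optimal allocation or of its adversarial witness, choosing the widest remaining interval and splitting it at the midpoint. The candidate set and the midpoint split are illustrative choices, not necessarily the paper's exact heuristic.

```python
# Current Solution Strategy (CSS) style query selection (illustrative).

def css_query(outcomes, bounds):
    x_star = mmr_allocation(outcomes, bounds)
    # The "witness": the alternative allocation achieving MR(x_star, theta).
    witness = max(outcomes, key=lambda x2: pairwise_max_regret(x_star, x2, bounds))
    # Candidate parameters: each agent's value for x_star or for the witness.
    candidates = [(i, x) for i in range(len(bounds)) for x in (x_star, witness)]
    i, x = max(candidates,
               key=lambda c: bounds[c[0]][c[1]][1] - bounds[c[0]][c[1]][0])
    lo, hi = bounds[i][x]
    midpoint = (lo + hi) / 2.0
    return i, x, midpoint          # ask agent i: "is your value for x > midpoint?"

# e.g. css_query(["a", "b"], bounds) -> (0, 'a', 3.0) with the bounds above
```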
Allocation Elicitation • Proposed allocation elicitation algorithm • Uses SW-regret computation and elicitation • See paper for details • Allocation elicitation phase terminates with: • ε-efficient allocation • Partial type θ
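Putting the earlier sketches together, an illustrative allocation-elicitation loop might look as follows; the termination tolerance, the query budget, and the ask callback are assumptions, and the paper's actual algorithm differs in its details.

```python
# Illustrative allocation-elicitation loop: elicit with CSS-style queries
# until max regret falls below a tolerance, then return the MMR allocation.

def allocation_elicitation(outcomes, bounds, ask, tolerance=0.0, max_queries=100):
    """ask(i, x, threshold) -> bool is a stand-in for querying agent i."""
    for _ in range(max_queries):
        x_star = mmr_allocation(outcomes, bounds)
        if max_regret(x_star, outcomes, bounds) <= tolerance:
            break
        i, x, threshold = css_query(outcomes, bounds)
        lo, hi = bounds[i][x]
        if ask(i, x, threshold):                   # "is your value for x > threshold?"
            bounds[i][x] = (max(lo, threshold), hi)
        else:
            bounds[i][x] = (lo, min(hi, threshold))
    # Returns an (approximately) regret-minimizing allocation and the partial type.
    return mmr_allocation(outcomes, bounds), bounds
```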
Incentive Properties • Let mechanism M = (x*, piT), with: • ε-efficient allocation function x* • payments: piT(x*; θ) = maxt∈θ piVCG(x*; t) • Theorem 1: • M is ε-efficient, δ-ex post IR, δ-ex post IC • δ = ε + Δ(θ) • Δ(θ): bound on payment uncertainty
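Under the same independent-interval assumption as the minimax-regret sketch, the upper-bound payment maxt∈θ piVCG(x*; t) reduces to a regret-style computation over the other agents' bounds (the two maximizations can be swapped). A sketch, reusing pairwise_max_regret from above; the function name is an assumption.

```python
# p_i^T(x*; theta) = max over consistent types of agent i's VCG payment.
# Swapping the maximizations, this is the worst-case gain of moving the
# other agents from x_star to any alternative outcome, over their bounds.

def upper_bound_vcg_payment(i, x_star, outcomes, bounds):
    others = bounds[:i] + bounds[i + 1:]             # drop agent i's own bounds
    return max(pairwise_max_regret(x_star, x_alt, others) for x_alt in outcomes)

# e.g. upper_bound_vcg_payment(0, "a", ["a", "b"], bounds) -> 3.0 with the bounds above
```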
Approximate Incentives • δ: bound on the utility gain from manipulation • But gain from manipulation is outweighed by the costs of manipulation: • don’t know types of others • must simulate the mechanism • Formal, approximate IC ⇒ practical, exact IC
2-Phase Approach • Bound on manipulability: • ε + Δ(θ): not known a priori • If Δ(θ) too large: • Elicit to reduce payment uncertainty • Payment elicitation strategy: based on CSS (P-CSS) • Terminates with a priori bounds: • (ε + Δ)-IC • δ-IR, ε-efficiency
Direct Optimization • Causes of manipulability: • efficiency loss + payment uncertainty • MMR w.r.t. SW only accounts for efficiency loss • Should minimize global worst-case manipulability: • u(best lie) - u(truth) • efficiency loss bounded by worst-case manipulability • Formulate as regret optimization and elicitation • ask queries that directly reduce global manipulability
Single-Phase Approach • Theorem 2: For M = (x*, piT), • if δ = max worst-case manipulability, • then M is: • δ-efficient • δ-ex post IC • δ-ex post IR
Overview • Mechanism Design Background • Incremental Partial Revelation Mechanism • Regret-based iPRMs • Experimental results • Conclusion / Future Work
Elicitation Strategies • Two Phase (2P): • SW loss and payment uncertainty for elicitation and decisions • Two Phase variant: • SW loss and payment uncertainty for elicitation • Manipulability for decisions • Common-Hybrid (CH): • Manipulability for elicitation and decisions • Myopically Optimal (MY): • Simulate all queries, ask the best
Test Domains • Car Rental Problem: • 1 client, 2 dealers • Car: 8 attributes, 2–9 values, ~12,000 cars • factored valuation/costs: 13 factors, size 1–4 • Total: 825 parameters • Small Random Problems: • supplier selection, 1 buyer, 2 sellers • 81 parameters
Results: Car Rental • Initial regret: 99% of optimal SW • Zero regret: 71/77 queries • Avg remaining uncertainty: 92% vs. 64% at zero manipulability • Avg number of parameters queried: 8% • Elicitation focuses on relevant parameters: • reduces revelation • improves decision quality
Conclusion • Theoretical model for iPRMs • Class of iPRMs with approximate incentives • Key point: • Approximation trades off cost vs. quality • Formal, approximate IC ⇒ practical, exact IC • Applicable to general mechanism design • Empirically very effective
Current + Future Work • More heuristics + test domains • Formal model of manipulation and revelation costs ⇒ formal, exact IC + explicit revelation/quality trade-off • Sequentially optimal elicitation • One-shot partial revelation mechanisms: “Mechanism Design with Partial Revelation” (draft, 2006)