Designing Games with a Purpose By Luis von Ahn and Laura Dabbish
Introducing games with a purpose • Many tasks are trivial for humans, but very challenging for computer programs • People spend a lot of time playing games • Idea: Computation + Game Play • People playing GWAPs perform basic tasks that cannot be automated. While being entertained, people produce useful data as a side effect.
Related work • Recognized utility of human cycles and motivational power of gamelike interfaces • Open source software development • Wikipedia • Open Mind Initiative • Interactive machine learning • Incorporating game-like interfaces
THE QUESTION IS … • How to design these games such that… • …People enjoy playing them! • …They produce high quality outputs!
Basic structure achieves several goals • Encourage players to produce correct outputs • Partially verify the output correctness • Provide an enjoyable social experience
Make GWAPs more entertaining… How? • Introduce challenge • Introduce competition • Introduce variation • Introduce communication
Ensure output accuracy… How? • Random matching • Player testing • Repetition • Taboo outputs
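Two of the mechanisms above, repetition and taboo outputs, can be sketched as simple bookkeeping per input. This is a minimal sketch under stated assumptions: the class name `LabelAggregator`, the threshold of agreeing pairs, and the policy of turning an accepted label into a taboo output are all hypothetical choices made for illustration, not details given on the slide.

```python
from collections import Counter


class LabelAggregator:
    """Hypothetical bookkeeping for one input (for example, one image)."""

    def __init__(self, repetition_threshold: int = 3):
        # Assumed policy: a label counts as correct only after this many
        # independent player pairs have agreed on it (repetition).
        self.repetition_threshold = repetition_threshold
        self.agreements = Counter()  # label -> number of pairs that agreed on it
        self.taboo = set()           # labels players may no longer enter

    def is_allowed(self, label: str) -> bool:
        """Return True if players may still enter this label."""
        return label.lower() not in self.taboo

    def record_agreement(self, label: str) -> bool:
        """Record that one pair agreed on `label`; return True once it is accepted."""
        label = label.lower()
        if label in self.taboo:
            raise ValueError(f"{label!r} is taboo for this input")
        self.agreements[label] += 1
        if self.agreements[label] >= self.repetition_threshold:
            # Assumed policy: accepted labels become taboo so that later
            # players are pushed toward new outputs.
            self.taboo.add(label)
            return True
        return False
```

With a threshold of 2, the first agreement on a label leaves it pending, and the second agreement accepts it and makes it taboo for later players.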
Other design issues • Pre-recorded Games • More than two players
How to judge GWAP success? • Expected Contribution = Throughput × Average Lifetime Play
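The metric is a product of two measured quantities, which the following sketch makes concrete. It assumes throughput is measured in problem instances solved per human-hour and average lifetime play in hours per player; the function name and the numbers are illustrative only and do not come from the slides.

```python
def expected_contribution(throughput_per_hour: float, alp_hours: float) -> float:
    """Expected number of problem instances a single player solves overall.

    throughput_per_hour: instances solved per human-hour of play (assumed unit).
    alp_hours: average lifetime play per player, in hours (assumed unit).
    """
    if throughput_per_hour < 0 or alp_hours < 0:
        raise ValueError("throughput and average lifetime play must be non-negative")
    return throughput_per_hour * alp_hours


# Hypothetical game: 200 labels per human-hour, 1.5 hours of lifetime play.
example = expected_contribution(200.0, 1.5)
```

Because the metric is a product, a game with high throughput that nobody plays for long can score the same as a slower game that players return to.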
Conclusion and future work • First general method for integrating computation and game play! • (Everyone could/should contribute to AI progress!) • Other GWAP game types? • How do problems fit into GWAP templates? • How to motivate not only accuracy but creativity and diversity? • What kinds of problems fall outside of GWAP approach?
Questions? Comments? • What do you think of this approach in general? Which problems are suitable for this approach? • What do you love about these games? What are the inefficiencies in these games? • How do we make these games more enjoyable and more efficient in producing correct results?
A GAME-THEORETIC ANALYSIS OF THE ESP GAME By Shaili Jain and David Parkes
Two Different Payoff Models • Match-early preferences • Want to complete as many rounds as possible • Reflect current scoring function in ESP game • Low effort is a Bayes-NE • Rarest-words-first preferences • Want to match on infrequent words • How can we accomplish this?
How can we assign scores to outcomes to promote desired behaviours?
The Model • Universe of words • Words relevant to an image • The game designer is trying to learn this • Dictionary size • Sets of words for a player to sample from • Word frequency • Probability of word being chosen if many people were asked to state a word relating to this image • Order words according to decreasing frequency • Effort level • Frequent words correspond to low effort
The Model continued • Two stages of the game: • 1st stage: choose an effort level • 2nd stage: choose a permutation on sampled dictionary • Only consider the strategies involving playing all words in the dictionary • Only consider consistent strategies: • Specify a total ordering on elements and apply that ordering to the realized dictionary • Complete strategy = effort level + word ordering
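The two-stage structure can be sketched in Python as follows. Several details are assumptions made for illustration rather than statements from the slides: the toy frequencies in `FREQ`, the reading of an effort level as a prefix of the frequency ordering (`effort_pool`), and the sampling procedure (sequential draws without replacement, weighted by frequency). The consistent strategies ↓ and ↑ are modelled as sorting the realized dictionary by decreasing or increasing frequency.

```python
import random
from typing import Dict, List, Sequence

# Hypothetical universe for one image: word -> frequency (values are made up).
FREQ: Dict[str, float] = {"dog": 0.4, "grass": 0.3, "ball": 0.2, "leash": 0.1}


def effort_pool(freq: Dict[str, float], n: int) -> List[str]:
    """Assumed reading of effort: the n most frequent words form the pool."""
    return sorted(freq, key=lambda w: freq[w], reverse=True)[:n]


def sample_dictionary(freq: Dict[str, float], pool: Sequence[str], d: int,
                      rng: random.Random) -> List[str]:
    """Draw d distinct words from pool, each draw weighted by frequency (assumed)."""
    remaining = list(pool)
    if d > len(remaining):
        raise ValueError("dictionary size exceeds pool size")
    chosen: List[str] = []
    for _ in range(d):
        weights = [freq[w] for w in remaining]
        word = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(word)
        remaining.remove(word)  # without replacement
    return chosen


def play_descending(dictionary: Sequence[str], freq: Dict[str, float]) -> List[str]:
    """Consistent strategy ↓: most frequent word first."""
    return sorted(dictionary, key=lambda w: freq[w], reverse=True)


def play_ascending(dictionary: Sequence[str], freq: Dict[str, float]) -> List[str]:
    """Consistent strategy ↑: rarest word first."""
    return sorted(dictionary, key=lambda w: freq[w])
```

A complete strategy in this sketch is a pair such as `(effort_pool(FREQ, 3), play_descending)`: the first stage fixes the pool the dictionary is sampled from, and the second stage fixes the order in which the sampled words are played.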
More Definitions • A match – first match • Probability of a match in a particular location • Outcome = word + location • Valuation function: a total ordering on outcomes • Utility
Match-Early Preferences • Lemma 1: Playing ↓ is not an ex-post NE. • Proof: • Player 2: D2 = {w2, w3}, s2: play w2, then w3 • Player 1: D1 = {w1, w2}, s1: play w1, then w2 • But player 1 is better off playing w2 first!
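The counterexample can be checked mechanically. The sketch below assumes that a match is completed in round k as soon as some word appears among the first k words of both players; the function name `first_match` and the alphabetical tie-break for simultaneous matches are illustrative assumptions.

```python
from typing import Optional, Sequence, Tuple


def first_match(s1: Sequence[str], s2: Sequence[str]) -> Optional[Tuple[str, int]]:
    """Return (word, location) of the first match, or None if there is none.

    Location k (1-indexed) is the first round after which some word appears
    in the first k words played by both players (assumed definition).
    """
    seen1, seen2 = set(), set()
    for k in range(max(len(s1), len(s2))):
        if k < len(s1):
            seen1.add(s1[k])
        if k < len(s2):
            seen2.add(s2[k])
        common = seen1 & seen2
        if common:
            # Several words may match in the same round; pick one deterministically.
            return (min(common), k + 1)
    return None


s2 = ["w2", "w3"]                               # player 2 plays ↓ on D2 = {w2, w3}
descending = first_match(["w1", "w2"], s2)      # player 1 plays ↓ on D1 = {w1, w2}
deviation = first_match(["w2", "w1"], s2)       # player 1 plays w2 first instead
```

Playing ↓ matches on w2 in location 2, while the deviation matches on w2 in location 1, which a player with match-early preferences strictly prefers once the dictionaries are known.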
Match-Early Preferences • Definition 6: stochastic dominance for 2nd stage strategy • (Lemma 2, 3) Stochastic dominance is sufficient and necessary for utility maximization. • (Lemma 5, 6) Playing ↓ is a strict best response to an opponent who plays ↓ • Theorem 1: (↓, ↓) is a strict Bayesian-Nash equilibrium of the 2nd stage of the ESP game for match-early preferences.
Match-Early Preferences • Definition 6: stochastic dominance for 2nd stage strategy • Fix opponent’s strategy s2 • Strategy s stochastically dominates s’ if • P(s, 1) + … + P(s, k) >= P(s’, 1) + … + P(s’, k), for all 1 <= k <= d
Match-Early Preferences • (Lemma 2, 3) Stochastic dominance is sufficient and necessary for utility maximization. • Proof by induction • Inductive step uses inductive hypothesis and stochastic dominance to establish result
MATCH-EARLY PREFERENCES • Key result (Lemma 4) Given effort level e, • D = {x, …}, D’ = {x’, …}, f(x) < f(x’) • D and D’ only differ by the element x and x’ • P(sampling D’) > P(sampling D) for effort level e
Match-Early Preferences • (Lemma 5, 6) Playing ↓ is a strict best response to an opponent who plays ↓ • Proof by induction • Base case (Lemma 5): the probability of a first match in location 1 is strictly maximized when player 1 plays her most frequent word first. • Inductive step (Lemma 6): Suppose player 2 plays ↓. Given that player 1 played her k highest frequency words first, the probability of a first match in locations 1 to k is strictly maximized when player 1 plays her (k+1)st highest frequency word next.
Match-Early Preferences • Proof for Lemma 5 and 6 (Idea: use Lemma 4) • Want Pr(sampling D in A) > Pr(sampling D in B) • f(wi) > f(wi+1) • A (wi highest word) = C (no wi+1) and D (has wi+1) • B (wi+1 highest word) • 1-to-1 mapping between C and B • P(sampling D in C) > P(sampling D in B)
Match-Early Preferences • (Lemma 5, 6) Playing ↓ is a strict best response to an opponent who plays ↓ • Theorem 1: (↓, ↓) is a strict Bayesian-Nash equilibrium of the 2nd stage of the ESP game for match-early preferences.
MATCH-EARLY PREFERENCES CONT’D • Definition 7: stochastic dominance for complete strategy • (Lemma 7, 8) Stochastic dominance is sufficient and necessary for utility maximization • (Lemma 12) Playing L stochastically dominates playing M. • Theorem 2: ((L, ↓), (L, ↓)) is a strict Bayesian-Nash equilibrium for the complete game.
MATCH-EARLY PREFERENCES CONT’D • (Lemma 12) Playing L stochastically dominates playing M • Randomized mapping from DM to DL • D in DM is transformed by: Take low words in DM, continue sampling from DL until we get enough words
MATCH-EARLY PREFERENCES CONT’D • (Lemma 12) Playing L stochastically dominates playing M • Lemma 10: Each dictionary in DM is mapped to a dictionary in DL which is at least as likely to match against the opponent’s dictionary • Lemma 11: The probability of sampling D from DL is the same as the probability of getting D by sampling D’ from DM and then transforming D’ into D under the randomized mapping.
Match-Early Preferences • Theorem 2: ((L, ↓), (L, ↓)) is a strict Bayesian-Nash equilibrium for the complete game.
RARE-WORDS-FIRST PREFERENCES • Definition 8: Rare-words first preferences • (Lemma 13, 14) Stochastic dominance is still sufficient and necessary for utility maximization • (Lemma 15) Suppose player 2 is playing ↓. For any dictionary, no consistent strategy of player 1 stochastically dominates all other consistent strategies. • (Lemma 16) Suppose player 2 is playing ↑. For any dictionary, no consistent strategy of player 1 stochastically dominates all other consistent strategies.
Rare-Words-First Preferences • Idea for proving Lemma 15 (and 16) • U = {w1, w2, w3, w4}, d = 2 • D1 = {w1, w2} • s1: w1, w2 • s1’: w2, w1 • x = Pr(D2 = {w2, w3} or D2 = {w2, w4}) • y = Pr(D2 = {w1, w2}) • z = Pr(D2 = {w1, w3} or D2 = {w1, w4}) • s1: (0, x, y+z, 0) s1’: (x, y, 0, z) • Neither s1 nor s1’ stochastically dominates the other
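The final claim can be verified numerically by comparing prefix sums of the two outcome distributions. The sketch assumes each vector lists outcome probabilities from most to least preferred, and the values chosen for x, y and z are hypothetical; any strictly positive values lead to the same conclusion, since s1’ is ahead after the first outcome (x > 0) and s1 is ahead after the third (y + z > y).

```python
from itertools import accumulate
from typing import Sequence


def dominates(p: Sequence[float], q: Sequence[float], tol: float = 1e-12) -> bool:
    """True if every prefix sum of p is at least the matching prefix sum of q."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same outcomes")
    return all(a >= b - tol for a, b in zip(accumulate(p), accumulate(q)))


# Hypothetical probabilities for the three opponent-dictionary events.
x, y, z = 0.3, 0.2, 0.4

s1 = (0.0, x, y + z, 0.0)   # play w1, then w2
s1_alt = (x, y, 0.0, z)     # play w2, then w1 (s1' on the slide)

neither = not dominates(s1, s1_alt) and not dominates(s1_alt, s1)
```

Here `neither` evaluates to `True`: the prefix sums cross, so no single ordering is best against every valuation consistent with rare-words-first preferences.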
Future Work • Sufficient and necessary conditions for playing ↑ with high effort being a Bayesian-Nash equilibrium? • Incentive structure for high effort? - To extend the labels for an image • Other types of scoring functions? • Rules of Taboo words? • Consider entire sequence of words suggested rather than only focusing on the matched word?
Questions? Comments? • What do you think of the model? Does everything in the model make sense? Can you suggest improvements to the model? • What incentive structure could possibly lead to high effort? Would the use of Taboo words be useful for this purpose?