Thesis Project By Willer Travassos
What is GWAP? • The term GWAP stands for Game With A Purpose, but what does that actually mean? • It means that we channel human brain power to solve tasks that computers can’t yet solve. • This is done through games, so that it is an enjoyable experience for the people playing.
What is GWAP? • We can find games of this type on www.gwap.com • It hosts games developed during the research of Prof. Luis von Ahn in the area of Internet Accessibility, using image, video, and song tagging. • These games involve two people seeing an image (or video, or audio) and describing it using one word. If the two words match, the word is added to the image description.
Possible Applications • But the main idea of GWAP need not be applied only to internet accessibility. • ReCAPTCHA, for example, uses human brain power to determine the spelling of words from scanned book pages. • There are still several tasks/problems that computers can’t solve, while humans solve them easily.
Possible Applications • Of the 5 senses, sight is the one that seems to offer the most promising results. • Building on this, and on my experience from the computer vision classes I took, I decided to explore thesis subjects for a GWAP that deals with sight. • More specifically, Face Recognition.
Face Recognition • Even though there has been extensive research and work on face recognition software, it is not perfect. • It has weaknesses, such as working properly only under certain environmental conditions and viewing angles. • But the biggest problem is that face recognition seems to be less effective when facial expressions are made.
Face Recognition • So how can we solve problems like these? • Since machines fail at things that we seem to figure out so easily, why not use humans to figure them out? • As you can see, this is the main idea of GWAP, and it is where it can aid us. • To help us further, there is already a game that consists of describing facial features, namely “Guess Who?”.
Guess Who? • Guess Who? is a board game in which the player’s goal is to guess the mystery person of his or her opponent. • The game starts with the players choosing their sides (blue or red), and then drawing one of the face cards. • After determining who starts the match, players take turns asking yes or no questions (e.g. Is this person blonde?).
Guess Who? • As a player receives answers to his or her questions, he or she eliminates the face cards that do not fit the answer. • Each turn can be used to ask a question or to guess the mystery person. • The game ends when one of the players correctly guesses the opponent’s mystery person.
What I want to do is… • Simply put, I want to use a Guess Who?-style GWAP to perform very reliable face recognition. • More importantly, I want to use my GWAP to create a deep database that captures the subjectivity of human facial analysis. • This means that I want to understand how the human brain decides that one face is the same as another, and which features lead us to that conclusion.
What I want to do is… • This approach is obviously not the fastest method of facial recognition, but it comes closest to our own capacity for detecting facial features. • This can lead to better face recognition algorithms, and help us understand the subjectivity that troubles today’s algorithms. • A second very important “side-effect” of this approach takes us to the field of Computer Vision.
What I want to do is… • What does this bring to Computer Vision? • Well, players will be feeding in data about faces every time they play the game. This means that a computer will learn more and more about a face and its details, no matter how the face is presented. • This bit of machine learning can be used to train computers, robots, etc. to better understand and investigate facial details. • To better recognize people, like children learning to recognize what certain facial expressions mean, or which people are part of their family…
Working with my GWAP • Like any other site, players who plan to fully use my GWAP need to register first. • I plan on allowing guests to play the game. • Once registered, users can access the game lobby, where they can chat, among other things (Note: only chat is implemented so far).
Working with my GWAP • What does it mean to fully use the game? • Aside from the obvious use of the other services the game provides, there is the important option of adding pictures of a person to his or her profile. • One difference from the other GWAPs is that I cannot keep providing pictures of faces by myself.
Working with my GWAP • Instead, I will let users add them and tag the pictures initially, as if they were on a social networking site. • This initial tag uses the most common attribute of the faces in these pictures, namely that person’s name. • To add such pictures, users first have to search for people who might already have been added to our database.
Adding Pictures • The search is made on the names already entered in the database, using a mix of the Double Metaphone and Levenshtein distance algorithms. • This then provides the possible results. Remember that people with equal names are allowed. • As tags are added to the pictures, users can search for people according to the tags that describe them.
Double Metaphone (DM) • Published by Lawrence Philips, DM is part of a class of algorithms called phonetic algorithms. • This type of algorithm attempts to detect relationships between words through the sounds they make. The version I am using was implemented by Adam Nelson. • DM works by producing 2 possible keys, each representing the sound a word makes in 4 characters (the sweet spot between specificity and generality).
Double Metaphone (DM) • The first key created represents the English pronunciation, while the second key represents an alternative (although it is not usually computed). • I then use these keys to create a unique tag that represents a picture, and store it in the database. • In case there are people with the exact same name, I added an extra field to tell them apart.
Example • DM is great because it allows good name searches even for misspelled words, since we concentrate on the possible sounds a word makes. • For example, Willer and Wiler would be considered the same. So if somebody was looking for me and misspelled my name, they could still find me.
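The idea of matching names by sound can be sketched with a toy phonetic key. This is not Double Metaphone and not Adam Nelson's implementation: the function name `toy_phonetic_key` and its rules (collapse doubled letters, drop vowels after the first letter, cut to 4 characters) are assumptions made only to show why Willer and Wiler can end up with the same key.

```python
import re

VOWELS = set("AEIOUY")


def toy_phonetic_key(name: str, length: int = 4) -> str:
    """Toy phonetic key for illustration only (NOT Double Metaphone)."""
    # Keep only the letters A-Z, in upper case.
    letters = re.sub(r"[^A-Z]", "", name.upper())
    # Collapse runs of the same letter, e.g. "LL" -> "L".
    collapsed = re.sub(r"(.)\1+", r"\1", letters)
    if not collapsed:
        return ""
    # Keep the first letter, then drop every later vowel.
    rest = "".join(c for c in collapsed[1:] if c not in VOWELS)
    # Cut the key to a fixed maximum length (4, as in the slides).
    return (collapsed[0] + rest)[:length]


if __name__ == "__main__":
    print(toy_phonetic_key("Willer"), toy_phonetic_key("Wiler"))
```

Both spellings produce the same toy key, so a search for the misspelled name would still reach the stored one. A real Double Metaphone implementation applies many more pronunciation rules and returns two keys instead of one.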
Levenshtein Distance • Created by Vladimir Levenshtein, this algorithm measures the difference (distance) between two strings, which makes it useful for fuzzy matching (like a spell checker). • It does so by calculating the minimum number of insertions, deletions, and substitutions needed to turn one string into the other. • Once DM has been used to generate the keys and save them in the database, I use Levenshtein to compare the keys generated by the name search with the keys of all name tags of saved pictures. • I then rank the search results according to the resulting Levenshtein distance, from lowest to highest.
Example • Words to be compared: ant, aunt • Levenshtein distance: 1 • One edit is needed, since we need to insert the u between a and n.
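The distance calculation and the ranking step can be sketched as follows. This is a minimal sketch of the standard dynamic-programming formulation, not the dotnetperls implementation used in the project; the helper `rank_by_distance` and the sample keys are hypothetical and only illustrate sorting from lowest to highest distance.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions to turn a into b."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def rank_by_distance(query_key: str, stored_keys: list[str]) -> list[str]:
    """Hypothetical helper: order stored keys from lowest to highest distance."""
    return sorted(stored_keys, key=lambda key: levenshtein(query_key, key))


if __name__ == "__main__":
    print(levenshtein("ant", "aunt"))
    # The keys below are made-up examples, not real Double Metaphone output.
    print(rank_by_distance("ALR", ["PTR", "ALTR", "ALR"]))
```

Only two rows of the distance table are kept at a time, which is enough because each cell depends only on the current and the previous row.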
The Game • The game mirrors Guess Who? with some needed changes. • Remember that the main idea here is to link different pictures of the same person, in a way that is more efficient than today’s algorithms. • Guess Who? is played with a set of 48 cards showing 24 people. Each person has 2 identical pictures.
The Game • The GWAP has 40 different pictures of 20 people in each match being played. • Players naturally tag different pictures of one person throughout a GWAP match, • by asking questions about his or her mystery person and asking the other player about that player’s screen.
The Game • Once the mystery person is correctly guessed, the game ends and both players are asked to rate one another. • The rating system tracks the actions and behavior of each player throughout a match. • Each player has a positive and a negative score, and the Reputation of a player is calculated by averaging both.
The Game • The Reputation score serves as a means to identify players that have a positive impact on the matches and the site. • It can also be used to prevent users with low scores, i.e. players that might cheat or behave improperly, from playing matches against other players.
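A minimal sketch of the Reputation rule, under stated assumptions: the slides only say that the positive and negative scores are averaged, so the class `PlayerRating`, the convention that the negative score is stored as a number less than or equal to zero, and the `threshold` parameter of `may_play` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlayerRating:
    positive: float  # accumulated positive score (assumed >= 0)
    negative: float  # accumulated negative score (assumed stored as <= 0)

    def reputation(self) -> float:
        # Reputation is the average of the two scores.
        return (self.positive + self.negative) / 2


def may_play(rating: PlayerRating, threshold: float = 0.0) -> bool:
    """Hypothetical gate: block players whose reputation falls below threshold."""
    return rating.reputation() >= threshold


if __name__ == "__main__":
    fair_player = PlayerRating(positive=8, negative=-2)
    print(fair_player.reputation(), may_play(fair_player))
```

With this convention, a player whose negative ratings outweigh the positive ones ends up with a reputation below zero and would be kept out of matches.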
Future Plans • Here are some of the ideas that I haven’t yet implemented, or would like to have in the future: • When adding pictures, I would like the users to have the option of either entering a face picture, or entering a normal picture and tagging the faces that appear in it.
Future Plans • Remove the part of the client lobby that shows available rooms, and switch to random pairing of people to play a match. • Create profile pages for a user, so that he or she can track scores and pictures added by him or her, or by another person. • Also, I would like to have a mechanism where people vote or agree on whether newly added pictures correspond to the same person.
Future Plans • Ideas so far have been: • Users check their profiles and confirm whether or not the added pictures are of them. • While loading a game, users can vote on whether newly added pictures correspond to a person, by comparing the new ones with the older ones. • The last, radical measure is to allow users to add only pictures of themselves.
Future Plans • Change the way questions are asked, either by restricting the options on written questions or by changing how questions are posed. • The GWAP that I have so far provides only basic functionality, but it is enough to base my thesis on.
Resources • Luis von Ahn and Laura Dabbish, “Labeling Images with a Computer Game”, ACM Conference on Human Factors in Computing Systems (CHI 2004), April 2004. • Luis von Ahn, Ruoran Liu and Manuel Blum, “Peekaboom: A Game for Locating Objects in Images”, ACM Conference on Human Factors in Computing Systems (CHI 2006), April 2006. • Luis von Ahn, Mihir Kedia and Manuel Blum, “Verbosity: A Game for Collecting Common-Sense Facts”, ACM Conference on Human Factors in Computing Systems (CHI 2006), April 2006.
Resources • Luis von Ahn, Shiry Ginosar, Mihir Kedia, Ruoran Liu and Manuel Blum, “Improving Accessibility of the Web with a Computer Game”, CHI 2006, April 2006. • George W. Furnas, Kevin Fox, Caterina Fake, Marc Davis, Luis von Ahn, Cameron Marlow, Joshua Schachter, Mor Naaman, Scott Golder, “Why Do Tagging Systems Work?”, CHI 2006 (Panel), April 2006.
Resources • “ReCAPTCHA web site”, http://recaptcha.net, [March 2, 2009] • “Double Metaphone implementation by Adam Nelson”, www.codeproject.com/KB/string/dmetaphone1.aspx, [June 2009] • “Levenshtein algorithm Implementation”, dotnetperls.com/levenshtein, [June 2009] • http://www.ehow.com/how_2054106_play-guess-who.html, [February 2009]