The Turing Test

The Turing Test: Conversational AI and the Loebner Prize Competition. Jen Brandner, for CSCI 405

What is a Turing Test? • It all started with A. M. Turing’s 1950 paper “Computing Machinery and Intelligence.” • Turing described an “imitation game” in which a man and a woman both try to convince an interrogator that they are the woman. • He then extended the game to a computer trying to convince an interrogator that it is human.

What is conversational AI? • Machines are programmed to carry on a conversation with the user. • Such programs are often called “chatbots.” • Requires natural language processing. • Examples: • ELIZA (the Rogerian psychotherapist) • AOL Instant Messenger’s “SmarterChild”

What is the Loebner Prize Contest? • Sponsored by Hugh Loebner • Annual event held since 1991 • 1991: 10 judges and 8 contestants (6 computers and 2 humans) • Judges had short conversations with each contestant and rated their human-ness. • To give the computers a fighting chance, each contestant was allowed to select a single topic of conversation.

Results of the First Contest in 1991 • Five judges rated the top contestant as human. • There were eight cases in which a computer was misclassified as human. • Winner: Joseph Weintraub’s program PC Therapist III • His topic: whimsical conversation • Relied on non sequiturs in conversation • Awarded $1,500

The 1996 Contest • Jason Hutchens entered two programs • HeX (primary entry) • MegaHAL • HeX was a simple one-month hack. Hutchens’s intent was to show the futility of Loebner’s contest. • “If I can beat those other systems with a program which took only a month to make then there is something wrong with the way the contest is structured.” - Jason • HeX was more complex than MegaHAL, and actually used MegaHAL as just a part of its programming. • Hutchens’s HeX won the contest in 1996, but neither of his creations won again after that year.

HeX’s Algorithm Iterate roughly in this order: • Parse the input sentence by sentence and split each sentence into words. Look for keywords in a database of hardwired replies (and use a reply only if it hasn't been used before). • If no stored reply can be found, check for a trick question, and if one is detected, give a witty reply. • Call MegaHAL and generate psychobabble. • Reformulate the user's input according to one of several hundred templates and spit it back. • Give a humorous response to silence. • Accuse the user of being ungrammatical, etc. • As a last resort, generate more psychobabble with MegaHAL.
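The cascade above can be sketched as a list of strategies tried in order, where the first one that produces a reply wins. This is only a minimal illustration, not Hutchens's code: the keyword table, the trick-question check, the single reflection template, and all reply strings are hypothetical stand-ins, and the MegaHAL and grammar-complaint steps are reduced to one canned last-resort line.

```python
from typing import Callable, List, Optional

# Hypothetical data: the real HeX database and templates are not shown in the slides.
CANNED = {"weather": "I never go outside, so I couldn't say."}
SILENCE_REPLY = "Cat got your tongue?"
LAST_RESORT = "That reminds me of something, but I forget what."
used_keywords = set()

def canned_reply(text: str) -> Optional[str]:
    """Return a hardwired reply for a known keyword, but only once per keyword."""
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word in CANNED and word not in used_keywords:
            used_keywords.add(word)
            return CANNED[word]
    return None

def trick_question(text: str) -> Optional[str]:
    """Very crude trick-question detector (assumption: 'how many' signals a trap)."""
    if "how many" in text.lower():
        return "Nice try. I only count on my friends."
    return None

def reflect(text: str) -> Optional[str]:
    """One reflection template standing in for the several hundred HeX used."""
    stripped = text.strip().rstrip(".!?")
    if stripped.lower().startswith("i am "):
        return f"Why do you say you are {stripped[5:]}?"
    return None

def silence(text: str) -> Optional[str]:
    """React to an empty line from the user."""
    return SILENCE_REPLY if not text.strip() else None

def babble(text: str) -> str:
    """Stand-in for the MegaHAL call; always produces something."""
    return LAST_RESORT

STRATEGIES: List[Callable[[str], Optional[str]]] = [
    canned_reply, trick_question, reflect, silence, babble,
]

def respond(text: str) -> str:
    """Try each strategy in order and return the first reply produced."""
    for strategy in STRATEGIES:
        reply = strategy(text)
        if reply is not None:
            return reply
    return LAST_RESORT
```

Because `babble` never returns `None`, the loop always ends with some reply, which mirrors the "last resort" step in the list.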

MegaHAL’s Algorithm MegaHAL constructs reply sentences using Markov models (sophisticated state machines) that predict which word should come next in the reply, based on the previous four words in the sentence. The “information” of a word w is the “surprise” it causes the Markov model, a function of the probability of the word given its context s: $I(w \mid s) = -\log_2 P(w \mid s)$ • Read the user's input, and segment it into an alternating sequence of words and non-words. • From this sequence, find an array of keywords and use it to generate many candidate replies. • Display the reply with the highest information to the user. • Use the user's input to update the Markov models, so that MegaHAL can learn from what the user types.
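To make the information measure concrete, here is a small sketch that trains a Markov model on a toy corpus, scores candidate replies by summing -log2 P(w|s) over their words, and picks the highest-scoring one. It is a simplified illustration under stated assumptions, not MegaHAL itself: it uses a context of one previous word rather than four, it scores every word rather than only keywords, and the class name, `FLOOR` constant, and toy corpus are all made up for the example.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

FLOOR = 1e-6  # assumed probability for transitions never seen in training

class MarkovModel:
    def __init__(self, order: int = 1) -> None:
        self.order = order
        # counts[context][word] = number of times word followed context
        self.counts: Dict[Tuple[str, ...], Dict[str, int]] = defaultdict(
            lambda: defaultdict(int)
        )

    def _context(self, words: Sequence[str], i: int) -> Tuple[str, ...]:
        """The (up to) `order` words preceding position i."""
        return tuple(words[max(0, i - self.order):i])

    def learn(self, words: Sequence[str]) -> None:
        """Update the counts from one sentence, as when learning from user input."""
        for i, word in enumerate(words):
            self.counts[self._context(words, i)][word] += 1

    def surprise(self, context: Sequence[str], word: str) -> float:
        """I(w|s) = -log2 P(w|s), with unseen transitions floored at FLOOR."""
        seen = self.counts.get(tuple(context), {})
        total = sum(seen.values())
        p = seen.get(word, 0) / total if total else 0.0
        return -math.log2(max(p, FLOOR))

    def information(self, words: Sequence[str]) -> float:
        """Total surprise of a whole candidate reply."""
        return sum(
            self.surprise(self._context(words, i), w) for i, w in enumerate(words)
        )

def best_reply(model: MarkovModel, candidates: List[List[str]]) -> List[str]:
    """Pick the candidate with the highest information."""
    return max(candidates, key=model.information)

model = MarkovModel(order=1)
for sentence in (["the", "cat", "sat"], ["the", "cat", "sat"],
                 ["the", "cat", "ran"], ["the", "dog", "sat"]):
    model.learn(sentence)

candidates = [["the", "cat", "sat"], ["the", "cat", "ran"]]
chosen = best_reply(model, candidates)
```

In this toy corpus "ran" follows "cat" less often than "sat" does, so "the cat ran" is the more surprising candidate and is the one chosen. Preferring surprising replies is what keeps the output from collapsing into the most common phrases.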

About HeX and MegaHAL • Strengths • HeX was easy to implement (only took one month to develop). • Weaknesses • MegaHAL sometimes generated sentences that did not make sense. Since HeX used MegaHAL in its algorithm, it had the same problem. • Just a glorified random sentence generator.

Most recent Loebner Prize contest • Winner in 2005: Rollo Carpenter’s “Jabberwacky” • Uses a unique learning algorithm that stores previous conversations and uses them as guides in future conversations. • You can talk to Jabberwacky on the web: www.jabberwacky.com • Awarded $3,000

"Jabberwacky learns from what you say and when you say it. Then, if the right moment comes up some time in the future, it says what you said ... and learns what someone ELSE says in response. So it's a giant feedback loop, and an imitator ... if you like, it's an unusually clever parrot. To really 'get' how it works you have to think about it in a rather backwards sort of way. There's no programming to make it claim to be human, yet it does so often - because most of the people speaking to it claim to be human. And it will often accuse users of being a robot..." - from the website of icogno, an AI company founded by Rollo Carpenter
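The "clever parrot" idea in this quote can be illustrated with a tiny retrieval bot: it remembers pairs of (something that was said, what a human answered), and when a user says something similar to a remembered line, it reuses the human answer. This is only a sketch of the general idea. Jabberwacky's actual matching algorithm is not described in these slides, so the word-overlap similarity, the `ParrotBot` class, and the fallback line are all assumptions made for illustration.

```python
import re
from typing import List, Set, Tuple

def tokens(text: str) -> Set[str]:
    """Lowercased words of a line, punctuation dropped."""
    return set(re.findall(r"[a-z']+", text.lower()))

def similarity(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two lines (an assumed measure)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

class ParrotBot:
    def __init__(self, fallback: str = "Tell me more.") -> None:
        self.pairs: List[Tuple[str, str]] = []  # (line said, human response to it)
        self.fallback = fallback

    def learn(self, said: str, human_response: str) -> None:
        """Store what a human answered when `said` came up in conversation."""
        self.pairs.append((said, human_response))

    def reply(self, user_text: str) -> str:
        """Reuse the human response to the stored line most like `user_text`."""
        best_score, best_response = 0.0, self.fallback
        for said, response in self.pairs:
            score = similarity(said, user_text)
            if score > best_score:
                best_score, best_response = score, response
        return best_response

bot = ParrotBot()
bot.learn("Are you a robot?", "No, I am human.")
bot.learn("What is your name?", "George.")
```

With these two remembered exchanges, `bot.reply("are you a robot")` returns "No, I am human.", which shows how such a bot can claim to be human without any rule telling it to. In a full feedback loop, every line the bot says and every user response to it would be passed to `learn`.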

About Jabberwacky • Strengths: • Learns by experience in past conversations. • Gets its replies from what humans have said to it before, so its replies are sure to be human-like. • Weaknesses: • Requires a huge database to store previous conversations. • Still has no concept of situation (what exactly it is talking about).

2005 Sample Transcripts
Transcripts of conversations at the 2005 Loebner Prize contest: http://loebner.net/Prizef/2005_Contest/Transcripts.html
Excerpt from one of the conversations with Jabberwacky:

PROGRAM: Ok then it was nice to share your hair brush.
JUDGE: What do mean by that? How have you shared my hair brush?
PROGRAM: Good.
JUDGE: Good?
PROGRAM: Yes.
JUDGE: This has become a very zen like conversation.
PROGRAM: And worse yet, I gave it to charity.
JUDGE: What? WHat did you give to charity? This conversation?
PROGRAM: Er.. 1 million Dollars?

HUMAN: How is the other entity treating you?
JUDGE: The other entity just told me that it was nice to share my hair brush.
HUMAN: Share your hair brush? You have a mad, perverted bot...
JUDGE: I don't even have a hair brush. Otherwise I would have thought it quite forward indeed.

Animated George (Jabberwacky’s character): http://www.biscit.biz/~data/s/george_jabberwacky_televirtual.wmv http://www.televirtual.com/movies/george_ITV.wmv

References
“Computing Machinery and Intelligence,” A. M. Turing, http://loebner.net/Prizef/TuringArticle.html, provided by Hugh Loebner
“Icogno,” Icogno Ltd, http://www.icogno.com/
“Lessons from a Restricted Turing Test,” Stuart M. Shieber, 1993, http://www.eecs.harvard.edu/shieber/Biblio/Papers/loebner-rev-html/loebner-rev-html.html
“MegaHAL,” Jason Hutchens, http://megahal.alioth.debian.org/
“Home Page of the Loebner Prize in Artificial Intelligence,” 2003, http://loebner.net/Prizef/loebner-prize.html
“How to Pass the Turing Test by Cheating,” Jason L. Hutchens, 1997, http://www.agent.ai/doc/upload/200403/hutc97_1.pdf
