
Detecting Deceptive Speech: Humans vs. Machines

This research explores the effectiveness of humans and machines in detecting deceptive speech. It examines various cues and factors that contribute to the detection of deception, including acoustic, prosodic, and lexical cues. The goal is to develop improved deception detection techniques and examine individual differences in deceptive behavior.


Presentation Transcript


  1. Detecting Deceptive Speech: Humans vs. Machines Julia Hirschberg Columbia University ICASSP 2018

  2. Co-Authors and Collaborators • Sarah Ita Levitan • Michelle Levine • Guozhen An • Andrew Rosenberg • Gideon Mendels • Rivka Levitan • Kara Schechtman • Mandi Wang • Nishmar Cestero • Kai-Zhan Lee • Yocheved Levitan • Angel Maredia • Jessica Xiang • Bingyan Hu • Zoe Baker-Peng • Ivy Chen • Meredith Cox • Leighanne Hsu • Yvonne Missry • Gauri Narayan • Molly Scott • Jennifer Senior • James Shin • Grace Ulinski

  3. The Columbia Speech Lab

  4. Deceptive Speech • Deliberate choice to mislead • Without prior notification • To gain some advantage or to avoid some penalty • Not: • Self-deception, delusion, pathological behavior • Theater • Falsehoods due to ignorance/error • Everyday (White) Lies very hard to detect • But Serious Lies may be easier to detect

  5. Why would Serious Lies be easier to identify? • Hypotheses in research and among practitioners: • Our cognitive load is increased when lying because we … • Must keep story straight • Must remember what we have said and what we have not said • Our fear of detection is increased if… • We believe our target is difficult to fool • Stakes are high: serious rewards and/or punishments • Makes it hard for us to control potential indicators of deception

  6. But Humans are very poor at Recognizing these Cues: Aamodt & Mitchell 2004 Meta-Study

  7. Current Approaches to Deception Detection • ‘Automatic’ methods (polygraph, commercial products) no better than chance • Train humans: John Reid & Associates • Behavioral Analysis: Interview/Interrogation no empirical support • Truth: I didn’t take the money vs. Lie: I did not take the money (but non-native speakers rarely use contractions so….) • Laboratory studies: Production and perception (facial expression, body posture/gesture, statement analysis, brain activation, odor,…)

  8. Our Approach • Conduct objective, experimentally verified studies of spoken cues to deception on large corpora which predict better than humans or polygraphs • Our method: • Collect speech data and extract acoustic, prosodic, and lexical cues automatically • Take gender, ethnicity, and personality factors into account as features in classification • Use Machine Learning techniques to train models to classify deceptive vs. non-deceptive speech

  9. Questions We Hope to Answer • Can we improve human deception detection • By providing new knowledge and training materials • By providing classifiers to aid in deception detection • Can we identify trust in humans – and mistrust • Can we control trust in machines: an ethical question… • When robots and avatars should be trusted • When they should not…

  10. Deception Detection from Text and Speech Corpus collection Classification Annotation, feature extraction Analysis

  11. Outline • Corpus collection • Classification of deception from text and speech • Individual differences in deceptive behavior • New goal: Acoustic-prosodic indicators of trust

  12. Columbia/SRI/Colorado Deception Corpus, 2003-5 • 7h of speech from 32 SAE-speaking subjects performing tasks and asked to lie about half • Lexical and acoustic-prosodic features identified from psycholinguistic literature for classification • Results • Classification accuracy (~70%) significantly better than human performance on our corpus • Considerable individual differences between speakers: Judges with certain personality traits performed better than our classifiers

  13. Columbia X-cultural Deception Corpus (2011–) • New questions to ask: • Can personality factors help in predicting individual differences in deception? • Can people who detect lies better also lie more successfully? • Do differences in gender and native language influence deceptive behavior? Judgment of deception? • New study: Pair native speakers of Standard American English with Mandarin Chinese speakers, speaking English, interviewing each other

  14. Our CXD Experiment NEO-FFI Survey Lying game Background survey Biographical Questionnaire Baseline sample


  17. Biographical Questionnaire


  20. The Big Five NEO-FFI (Costa & McCrae, 1992) • Openness to Experience: “I have a lot of intellectual curiosity.” • Conscientiousness: “I strive for excellence in everything I do.” • Extraversion: “I like to have a lot of people around me.” • Neuroticism: “I often feel inferior to others.” • Agreeableness: “I would rather cooperate with others than compete with them.”
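Each NEO-FFI trait score aggregates a set of Likert-scale items, some of which are reverse-keyed. A minimal illustrative scorer follows; the 1-5 scale and which items are reverse-keyed are assumptions for the example, not the published inventory keys:

```python
def score_trait(responses, reverse_keyed=(), scale_max=5):
    """Score one Big Five trait as the mean of 1-5 Likert responses,
    flipping reverse-keyed items (item keys here are illustrative,
    not the actual NEO-FFI scoring key)."""
    total = 0
    for i, r in enumerate(responses):
        total += (scale_max + 1 - r) if i in reverse_keyed else r
    return total / len(responses)
```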


  22. Our CXD Experiment


  24. Motivation and Scoring • Monetary motivation • Success for interviewer: • Add $1 for every correct judgment, truth or lie • Lose $1 for every incorrect judgment • Success for interviewee: • Add $1 for every lie the interviewer thinks is true • Lose $1 for every lie the interviewer thinks is a lie • Good liars tell the truth as much as possible when lying, so how do we know what’s true or false for follow-up questions? • Interviewees press T/F keys after every phrase
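The dollar scoring above maps directly to code. A sketch, where the function names and the boolean encoding (True meaning "judged true" / "statement was true") are mine:

```python
def interviewer_payoff(judgments, truths):
    """+$1 for each correct truth/lie judgment, -$1 for each miss."""
    return sum(1 if j == t else -1 for j, t in zip(judgments, truths))

def interviewee_payoff(judgments, truths):
    """Interviewees score only on their lies: +$1 for each lie the
    interviewer believed, -$1 for each lie the interviewer caught."""
    total = 0
    for judged_true, was_true in zip(judgments, truths):
        if not was_true:  # this statement was a lie
            total += 1 if judged_true else -1
    return total
```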

  25. Columbia X-Cultural Deception (CXD) Corpus • 340 subjects, balanced by gender and native language (American English, Mandarin Chinese): 122 hours of speech • Crowdsourced transcription, speech alignment • T/F keypress alignment • Segmented into • Inter-pausal units (IPUs) • Speaker turns • Question/answer sequences (Q/A and Q/A + follow-up)
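Given time-aligned words from the transcripts, IPU segmentation reduces to grouping consecutive words separated by silence shorter than a pause threshold. A sketch; the 50 ms threshold below is an assumed value, not necessarily the one used for CXD:

```python
def segment_ipus(words, pause_threshold=0.05):
    """Group time-aligned words into inter-pausal units (IPUs):
    maximal runs of words separated by silence >= pause_threshold
    seconds. Each word is a (token, start_sec, end_sec) tuple."""
    ipus, current = [], []
    for word in words:
        # Start a new IPU if the gap since the last word's end is long enough.
        if current and word[1] - current[-1][2] >= pause_threshold:
            ipus.append(current)
            current = []
        current.append(word)
    if current:
        ipus.append(current)
    return ipus
```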

  26. “Where were you born?” True or False?

  27. “Where were you born?”

  28. “What is the most you have ever spent on a pair of shoes?” True or False?

  29. “What is the most you have ever spent on a pair of shoes?”

  30. Outline • Corpus collection • Classification of deception from text and speech • Individual differences in deceptive behavior • Acoustic-prosodic indicators of trust

  31. Features We Extract • Currently: • Text-based: n-grams, psycholinguistic, Linguistic Inquiry and Word Count (LIWC) (Pennebaker et al.), word embeddings (GloVe trained on 2B tweets) • Speech-based: openSMILE IS09 (386 features) • Gender, native language • Five personality dimensions (NEO-FFI) • Next: Syntactic features (complexity) and all combined

  32. Corpus Segmentation • Inter-pausal unit (IPU) • Turn • Question-level (first answer, first + follow-up answers)

  33. Machine Learning Experiments • What are the best classification models? • What are the optimal segmentation units? • Which feature sets are most useful?

  34. Machine Learning Experiments • What are the best classification models? • Statistical machine learning – Logistic Regression, SVM, Random Forest • Neural networks – DNN, LSTM, hybrid system • What are the optimal segmentation units? • Which feature sets are most useful?
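As a stand-in for the off-the-shelf learners named above (logistic regression, SVM, random forest), here is a minimal from-scratch binary logistic-regression trainer over dense feature vectors; the learning rate and epoch count are arbitrary, and a real experiment would use a library implementation:

```python
import math

def train_logistic(X, y, lr=0.5, epochs=200):
    """Per-sample gradient descent for binary logistic regression.
    X: list of feature vectors; y: 0/1 labels. Returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            err = p - yi                     # gradient of log-loss w.r.t. z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi):
    """1 = deceptive, 0 = truthful (sign of the decision function)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
```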

  35. Machine Learning Experiments • What are the best classification models? • What are the optimal segmentation units? • Inter-Pausal Unit (IPU), speaker turns, Q/A, Q/A+follow-up As • Which feature sets are most useful?

  36. Machine Learning Experiments • What are the best classification models? • What are the optimal segmentation units? • Which feature types are most useful? • Text-based: n-grams, psycholinguistic-based, LIWC, GloVe word embeddings trained on 2B tweets • Individual difference features: gender, native language, personality • Speech-based: openSMILE IS09 acoustic-prosodic features (e.g. f0, intensity, speaking rate, VQ)
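For the statistical classifiers, combining these feature groups amounts to early fusion into a single vector. A sketch; the key-namespacing scheme is illustrative, not the project's actual pipeline:

```python
def build_feature_vector(lexical, acoustic, individual):
    """Early fusion: merge lexical, acoustic-prosodic, and
    individual-difference features into one flat dict, with
    namespaced keys so feature groups cannot collide."""
    merged = {}
    for prefix, group in (("lex", lexical), ("ac", acoustic), ("id", individual)):
        for name, value in group.items():
            merged[f"{prefix}:{name}"] = value
    return merged
```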

  37. Segmentation – Longer is Better: IPU, Turn, Question. Mendels, Levitan et al. 2017, “Hybrid acoustic lexical deep learning approach for deception detection”

  38. Deep Learning Approaches • BLSTM-lexical • DNN-openSMILE • Hybrid: BLSTM-lexical + DNN-openSMILE. Mendels, Levitan et al. 2017, “Hybrid acoustic lexical deep learning approach for deception detection”

  39. IPU Classification: Hybrid is Better. Mendels, Levitan et al. 2017, “Hybrid acoustic lexical deep learning approach for deception detection”

  40. Question-Level Random Forest; Context is Better

  41. Analysis – Acoustic Features

  42. Analysis – Lexical Features

  43. Classification with Gender and Native Language: 1st Answer and Chunks: Personal Info Helps with Less Context

  44. Results for Classification • Deception classification experiments • > 0.72 F1 score when using question segmentation to predict lies • Note that human interviewers’ F1 is 0.43 • Deep learning approaches for IPU segmentation probably the most promising route in future • Many more features to examine together
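The F1 scores quoted here follow the standard definition: the harmonic mean of precision and recall on the positive (deceptive) class.

```python
def f1_score(predicted, gold, positive=True):
    """Standard F1: harmonic mean of precision and recall
    for the positive (here, deceptive) class."""
    tp = sum(1 for p, g in zip(predicted, gold) if p == positive and g == positive)
    fp = sum(1 for p, g in zip(predicted, gold) if p == positive and g != positive)
    fn = sum(1 for p, g in zip(predicted, gold) if p != positive and g == positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```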

  45. Outline • Corpus collection • Classification of deception from text and speech • Individual differences in deceptive behavior and detection of deception • Acoustic-prosodic indicators of trust

  46. Some Individual Differences • Extroversion is correlated with success at deception, for English male speakers • Native English speakers perform better at deception when paired with a native Chinese interviewer

  47. Individual Differences in Deceptionand Truth-telling Levitan et al. 2018, “Linguistic indicators of deception and perceived deception in spoken dialogue”

  48. Differences in Deceptive Ability: How well did interviewees lie?

  49. Differences in Deception Detection: How well did interviewers judge deception?

  50. Results • There are gender and cultural/native language differences in deceptive behavior • There are differences in ability to deceive and in ability to detect deception • Understanding these differences can improve deception classification by machines and perhaps by humans…
