
Manual Annotation of Multimodal Behaviors in Emotional TV Interviews


  1. Manual Annotation of Multimodal Behaviors in Emotional TV Interviews
  J.-C. Martin, S. Abrilian, L. Devillers, LIMSI-CNRS, France

  2. Outline
  • Introduction
    • Goals
    • Requirements on annotation
    • Emotional parameters of multimodal behaviors
  • Coding scheme
    • 1st coding scheme and annotation
    • 2nd coding scheme and example on 1 video
  • Future directions

  3. Introduction

  4. Introduction: Goals
  • How do modalities correlate in non-acted emotions?
    • Annotations and models: one source of knowledge
    • Coordination between modalities during non-acted emotion
    • Synthesis of non-acted spontaneous multimodal emotions in ECAs
  • How to code/represent multimodal emotional behavior?
    • Methodology (which attributes can be annotated easily by hand)
    • Trade-off / intermediate level
      • Manual global free-text annotation of the whole video
      • Manual medium/high-order signs
      • Automatic low-level signs
  • WP5 + WP6 + WP4 + (WP3)

  5. Introduction: Requirements on the coding scheme
  • Enable annotation (or computation)
    • Literature: main attributes of emotional behaviors
    • Corpus-based approach: cover behaviors observed in EmoTV
  • Multi-level annotation of temporal data
    • Global annotation:
      • Manual annotation of multimodal signs for the global sequence
      • Computations from manual annotations in each modality (mono, red, comp)
    • Emotional segment level:
      • Computations from manual annotations in each modality (mono, red, comp)
  • Provide one source of knowledge for ECA specification
  • Enable reliability and readability
  • Annotation time
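To make the multi-level structure above concrete, here is a minimal Python sketch of one annotated clip, with global annotations, emotional segments, and per-modality annotations. All class and field names are hypothetical illustrations; the project's actual scheme is defined as an Anvil specification.

```python
# Minimal sketch of a multi-level annotation structure for one EmoTV clip.
# Class and field names are hypothetical, not the project's actual schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Annotation:
    track: str        # e.g. "gesture", "torso", "facial"
    attribute: str    # e.g. "speed", "fluidity"
    value: str        # e.g. "fast", "jerky"
    start: float      # seconds from clip start
    end: float

@dataclass
class EmotionalSegment:
    label: str                        # e.g. "anger", "serenity"
    start: float
    end: float
    annotations: List[Annotation] = field(default_factory=list)

@dataclass
class Clip:
    media_file: str
    global_annotations: List[Annotation] = field(default_factory=list)
    segments: List[EmotionalSegment] = field(default_factory=list)
```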

  6. Introduction: Emotional parameters of multimodal behaviors
  • Psychology & behavior
    • Montepare, J., Koff, E., Zaitchik, D. and Albert, M. (1999). "The use of body movements and gestures as cues to emotions in younger and older adults." Journal of Nonverbal Behavior.
    • Wallbott, H. G. (1998). "Bodily expression of emotion." European Journal of Social Psychology.
  • Detection of emotions + relevant non-verbal behaviors
    • Acted data
    • +/- basic emotions
    • Age, gender
    • Facial expression masked
  • Expressivity in ECAs (Hartmann & Pelachaud 2004)

  7. Introduction: Emotional parameters of multimodal behaviors

  8. Introduction: Multimodal corpora from TV clips
  • Communicative functions
    • Kipp (2003)
    • MUMIN (Allwood et al. 2004)
    • Musical Score (Magno Caldognetto et al. 2004)
  • Emotions / informal annotation
    • Orage (Atifi and Marcoccia 2001)

  9. Coding Scheme

  10. Current status
  • 1st coding scheme
    • Annotation of 35 clips from EmoTV with 2 coders
  • 2nd coding scheme
    • Iterative definition and application to 1 clip of EmoTV using Anvil (SA, JCM)
    • Annotation guide written
    • 1 meeting with Catherine Pelachaud (Paris 8) to investigate use for WP6

  11. Movement quality: annotated vs. computed
  • Quality (annotated)
    • Number of repetitions
    • Fluidity: smooth / normal / jerky
    • Strength: soft / normal / hard
    • Speed: slow / normal / fast
    • Spatial expansion: contracted / normal / expanded
  • Computed
    • Start / end / duration
    • Movement direction, type, angle approximation
    • Torso: computed from the pose track
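The split between annotated and computed attributes can be illustrated with a small sketch: given two consecutive annotated torso poses, the movement's start, end, duration, direction and angle fall out automatically. The Pose fields and the angle convention here are assumptions for illustration, not the actual pose-track format.

```python
# Hedged sketch: deriving "computed" movement attributes (start, end,
# duration, direction, angle) from two consecutive annotated torso poses.
from dataclasses import dataclass

@dataclass
class Pose:
    time: float        # seconds from clip start
    bend_angle: float  # sagittal bend, degrees (coarse approximation)

def torso_movement(p1: Pose, p2: Pose) -> dict:
    """Movement between two poses along the sagittal (bend) dimension."""
    return {
        "start": p1.time,
        "end": p2.time,
        "duration": p2.time - p1.time,
        "direction": "forward" if p2.bend_angle > p1.bend_angle else "backward",
        "angle": abs(p2.bend_angle - p1.bend_angle),
    }

# Example: a fast forward bend annotated as two poses 0.4 s apart.
print(torso_movement(Pose(12.0, 5.0), Pose(12.4, 30.0)))
```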

  12. Annotation #1: Multimodal coding scheme
  • Speech
    • Transcription including non-verbal events (laughter, cry, …)
  • Posture
    • Pose; posture shift including speed and action (4 cues with 3 to 10 attributes per cue, e.g. cue = action, attribute = walk)
  • Gestures
    • Phases of gesture (preparation, stroke, retraction)
    • Handedness, speed, energy, spatial region, hand shape, direction of gesture, gesture type (beats, adaptors, deictic, …)
  • Facial expressions
    • Subset of Facial Animation Parameters (FAPs)
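For reference, the track/attribute organisation above can be summarised as a simple mapping. The value lists below are abridged and illustrative; the real scheme is an Anvil XML specification.

```python
# Illustrative summary of the first coding scheme as a
# track -> attribute -> values mapping (abridged).
CODING_SCHEME = {
    "speech": {
        "transcription": "free text, incl. non-verbal events (laughter, cry, ...)",
    },
    "posture": {
        "action": ["walk", "sit", "turn"],          # one of 4 cues, values abridged
        "shift_speed": ["slow", "normal", "fast"],
    },
    "gesture": {
        "phase": ["preparation", "stroke", "retraction"],
        "type": ["beat", "adaptor", "deictic"],     # abridged
        "handedness": ["left", "right", "both"],
    },
    "facial": {
        "FAP_subset": "subset of MPEG-4 Facial Animation Parameters",
    },
}
```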

  13. Annotation #1: Statistics
  • Most frequently annotated behaviors: facial expressions (78.6% of annotated multimodal behaviors for coder 1, 80.4% for coder 2), gestures (11.3% for coder 1, 11.9% for coder 2), posture (10% for coder 1, 7.7% for coder 2).
  • Most frequent attributes: gaze direction (26.8% for coder 1, 17% for coder 2), head movements (23.5% for coder 1, 21% for coder 2), blinking (15.8% for coder 1, 17.6% for coder 2), eyebrow movements (10% for coder 1, 9.3% for coder 2).
  • The coders agreed quantitatively on some attributes (number of annotations of preparation and stroke gesture phases, number of annotations of posture shift speed).
  • Coder 1 was more sensitive than coder 2 in all modalities.
  • Disagreements occurred on body poses, gesture type and energy. Coder 1 annotated subtle body movements, whereas coder 2 annotated only clearly visible movements. Coder 2 conflated a gesture's energy with its speed, while coder 1 differentiated the two attributes, perceiving that a gesture can have high energy and slow motion.
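The modality percentages above are simple relative frequencies over each coder's annotations. A sketch of the computation follows; the counts are made up so that the output approximates coder 1's figures.

```python
# Sketch: per-coder modality frequencies over flat annotation lists.
# Field names and counts are hypothetical illustrations.
from collections import Counter

def modality_frequencies(annotations):
    """% of annotated multimodal behaviors per modality for one coder."""
    counts = Counter(a["modality"] for a in annotations)
    total = sum(counts.values())
    return {m: 100.0 * n / total for m, n in counts.items()}

coder1 = ([{"modality": "facial"}] * 786
          + [{"modality": "gesture"}] * 113
          + [{"modality": "posture"}] * 100)
print(modality_frequencies(coder1))  # facial ~78.7%, gesture ~11.3%, posture ~10.0%
```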

  14. Annotation #1: Statistics (continued)
  • Many cues in coder 1's annotations are shared by several emotion labels (blinking, head movements, …), but some cues are typical of specific emotions, such as lowering the hands when despaired or slow body movement for serenity.
  • There is a difference between behaviors linked to strong emotions (anger, exaltation, …) and weak ones (irritation, serenity, …); the discriminating attributes are speed and energy for gestures, and speed for body movement.
  • Serenity involves no gestures, whereas exaltation is often accompanied by fast and energetic gestures.
  • Anger is correlated with fast and intense gestures, whereas irritation involves slow and low-intensity gestures.

  15. Annotation #1: Quantitative analysis
  • Low intercoder agreement on some attributes
  • Reduce the number of values: 7 => 3
  • Improve the annotation protocol & guide
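A minimal sketch of the value-reduction step: collapsing a 7-value scale to 3 values before re-measuring intercoder agreement with Cohen's kappa. The 7 labels and the collapse mapping are invented for illustration.

```python
# Sketch: merge a 7-value scale into 3 values, then measure agreement.
from collections import Counter

COLLAPSE = {"very slow": "slow", "slow": "slow", "rather slow": "slow",
            "normal": "normal",
            "rather fast": "fast", "fast": "fast", "very fast": "fast"}

def cohen_kappa(a, b):
    """Cohen's kappa for two coders' label sequences of equal length."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n          # observed agreement
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb.get(k, 0) for k in ca) / n**2   # chance agreement
    return (po - pe) / (1 - pe)

c1 = [COLLAPSE[v] for v in ["very slow", "slow", "normal", "fast", "very fast"]]
c2 = [COLLAPSE[v] for v in ["slow", "rather slow", "normal", "rather fast", "fast"]]
print(cohen_kappa(c1, c2))  # agreement rises once near-values are merged
```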

  16. Tracks or groups
  • Tracks
    • Torso
    • Head
    • Facial expressions
    • Global body
    • Shoulders
    • (Arms)
    • (Gestures)
  • Alternation of poses and movements
    • Torso, head, shoulders
  • Common values for attributes:
    • Asymmetry, other

  17. Methodology
  • Annotation guide
  • Track per track
  • Annotate emotion vs. communication
    • Emotionally rich clips
    • Reduced interaction (monologue in interviews)
    • Exaggerated mouth / brow movements

  18. Torso
  • Movement direction to be computed from pose
  • Poses
    • 3 dimensions
      • twist, side-side, bend
      • rotational, lateral, sagittal
    • Labels + approximation of angles
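A rough illustration of the "labels + approximation of angles" idea: each of the three pose dimensions carries a label plus a coarse angle. The labels and angle values below are assumptions, not the scheme's actual value sets.

```python
# Illustrative sketch: the three annotated torso pose dimensions,
# each stored as (label, approximate angle in degrees).
TORSO_DIMENSIONS = {
    "twist":     "rotational",  # e.g. left / none / right
    "side-side": "lateral",     # e.g. left / upright / right
    "bend":      "sagittal",    # e.g. backward / upright / forward
}

# One annotated pose, one (label, angle) pair per dimension.
pose = {
    "twist":     ("right", 20),
    "side-side": ("upright", 0),
    "bend":      ("forward", 30),
}
```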

  19. Torso pose: twist

  20. Torso pose: side-side / bend

  21. Example: fast torso movement

  22. Head movements
  • Numerous and combined => direction annotated in the movement track
  • Primary & secondary
  • Position
  • Movement
  • FACS

  23. Example: head, 2 directions, speed

  24. Gesture structural transcription (Kipp 2004; Efron 1941; McNeill 1992)

  25. Gesture functional transcription

  26. Example: homogeneous sequence of strokes

  27. Example: manipulator gesture

  28. Gesture annotation attributes
  • Deictic target: self / camera
  • Manipulator target: chest / hair / eyebrows / nose / mouth
  • Object in hand: if the character is holding an object, enter the name of the object
  • Spatial region: up / head / chest / down / extreme periphery
  • Directness: linear / shaped pathway
  • Vertical direction: upward / downward
  • Horizontal direction: leftward / rightward
  • Sagittal direction: forward / backward
  • Hands relationship: independent / mirror / asymmetric
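The attribute set above maps naturally onto typed fields; a sketch follows, with Python Literal types standing in for the Anvil value sets. The class name and the decision to make every field optional are assumptions for illustration.

```python
# Sketch of the gesture attribute set as typed, optional fields.
from dataclasses import dataclass
from typing import Literal, Optional

@dataclass
class GestureAnnotation:
    deictic_target: Optional[Literal["self", "camera"]] = None
    manipulator_target: Optional[
        Literal["chest", "hair", "eyebrows", "nose", "mouth"]] = None
    object_in_hand: Optional[str] = None   # free text: name of the held object
    spatial_region: Optional[
        Literal["up", "head", "chest", "down", "extreme periphery"]] = None
    directness: Optional[Literal["linear", "shaped pathway"]] = None
    vertical_direction: Optional[Literal["upward", "downward"]] = None
    horizontal_direction: Optional[Literal["leftward", "rightward"]] = None
    sagittal_direction: Optional[Literal["forward", "backward"]] = None
    hands_relationship: Optional[
        Literal["independent", "mirror", "asymmetric"]] = None

# Example: a self-directed deictic performed in the chest region.
g = GestureAnnotation(deictic_target="self", spatial_region="chest")
```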

  29. Other annotations
  • Limited set of annotations for:
    • Facial expression
      • Label + Action Unit (combination)
      • Gaze, brows, mouth, chin, nose
    • Shoulders
    • Arms
    • Global pose and movement

  30. Future directions
  • Modifications for potential use as one source of knowledge for WP6 / WP4
    • Adding temporal evolution in segments
    • Wrist position
    • Fluidity only between gestures, or for repetitions?
  • Integration with other sources of knowledge (temporal)
  • Validation of the annotation
    • Perceptual tests at the different levels of multimodal annotation
    • Segments of multimodal behavior
    • Annotate common segments + intercoder agreement
  • Annotation of several videos
    • Evaluation of annotation time
    • Correlations between emotions and multimodal annotations

  31. Architectural Principles of a Software Platform for the Management of Multimodal Emotional Corpora

  32. Goals
  • Guidelines
  • Illustrative combinations of tools

  33. Surveys of annotation tools for multimodal corpora
  • Tools
    • Anvil, TASX
  • Surveys
    • ISLE D10, NITE, Harper Eurospeech, NISLab LREC 2004 paper
    • LREC workshops 2002 / 2004

  34. Anvil (Kipp 2001): http://www.dfki.uni-sb.de/~kipp/research/index.html

  35. TASX: http://tasxforce.lili.uni-bielefeld.de/
  • Panel switch
  • Tiers
  • Start/end point

  36. Metadata: MPI tools
  • Editor
  • Browser

  37. Platform examples: Wizard of Oz (Buisine et al. 2003)

  38. Requirements / description
  • Requirements of such a platform for emotion
    • Continuous / discrete
    • Replay / validation
  • Description
    • Software
    • Data files: media, metadata
    • Annotations: manual, automatic, mixed
    • Coding schemes
    • Documentation files
    • Paper forms
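As a rough illustration of the "description" side, one corpus entry in such a platform could bundle media, metadata, annotations and documentation. All file names and fields below are hypothetical.

```python
# Illustrative sketch of one corpus entry managed by the platform.
corpus_entry = {
    "media": ["clip_03.avi", "clip_03.wav"],
    "metadata": {"source": "TV interview", "consent": True},
    "annotations": {
        "manual": ["clip_03.coder1.anvil", "clip_03.coder2.anvil"],
        "automatic": ["clip_03.pitch.csv"],
        "mixed": [],
    },
    "coding_scheme": "emotv_scheme_v2.xml",
    "documentation": ["annotation_guide.pdf"],
}
```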

  39. Architecture
  • Tools
    • Input / output
  • Use during various iterations
    • Segmentation
    • Agreement / vote / reduce number of classes
    • Re-annotation: audio only, video only, audio-video
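A minimal sketch of the "agreement / vote" step, assuming one label per coder per segment: a majority vote, with ties flagged for re-annotation. Function and variable names are illustrative.

```python
# Sketch: majority vote over several coders' labels for one segment.
from collections import Counter
from typing import List, Optional

def majority_vote(labels: List[str]) -> Optional[str]:
    """Return the winning label, or None on a tie (segment to re-annotate)."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

print(majority_vote(["anger", "anger", "irritation"]))  # 'anger'
print(majority_vote(["anger", "irritation"]))           # None -> re-annotate
```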

  40. Manual Annotation of Multimodal Behaviors in Emotional TV Interviews
  J.-C. Martin, S. Abrilian, L. Devillers, LIMSI-CNRS, France

  41. Introduction: Emotional parameters of multimodal behaviors
