Introducing the RECOLA Multimodal Corpus of Remote Collaborative and Affective Interactions

F. Ringeval, A. Sonderegger, J. Sauer, D. Lalanne
Department of Informatics – Psychology, Université de Fribourg – Universität Freiburg, Switzerland
2nd International Workshop on Emotion Representation, Analysis and Synthesis in Continuous Time and Space, emoSPACE 2013, April 26th, 2013

Corpus Design
• Why create a new emotion corpus?
  • The idea originally comes from the EmotiBoard project (enhancing emotional awareness in remote collaborative interactions)
  • The context of remote collaboration has not been studied so far
  • No existing corpus offers both audio-visual and physiological data, and none features French speakers
• Objective of the corpus
  • Provide rich and consistently annotated multimodal data of natural human behaviour in a context of remote dyadic collaboration
Figure: overlay of emotional feedback on audiovisual data of the SEMAINE database; publication submitted to ACII 2013

Corpus Design
• Videoconference situation (2 persons working together)
• 2 x 2 between-subjects design
• Independent variables
  • Emotion feedback (yes/no): study the impact of EmotiBoard
  • Emotion manipulation (positive/negative): increase the difference in emotional valence between the participants of a team
• Participants
  • 46 students (58.7% female)
  • Mean age: 22 years ± 3 (min: 18, max: 32)
  • French speakers of different origins: 33 French, 4 German, 8 Italian and 1 Portuguese

Corpus Design
• EmotiBoard: emotional feedback generation
  • Vertical interactive surface on which multiple users can interact using different devices
  • Java client/server library to transmit and display Wizard-of-Oz ratings of the user’s emotion (arousal & valence)

Corpus Design
• Collaborative task
  • As simple as possible, while ensuring that people are both motivated and sufficiently involved emotionally
  • Winter survival exercise: 15 items have to be ranked according to their importance for survival in a deserted and hostile area (plane crash)

Corpus Design
• Procedure
  • 1st self-report: emotion questionnaire (SAM)
  • Individual ranking of the items of the survival task; 10 min.
  • Display of a film clip for emotion induction; 5 min.
  • 2nd self-report: emotion questionnaires (SAM & PANAS)
  • Discussion to agree on the final ranking of the 15 items; 20 min.
  • 3rd self-report: emotion questionnaires (SAM & PANAS), subjective workload, team collaboration and team satisfaction
Figures: SAM’s manikins for arousal; SAM’s manikins for valence

Corpus Design
• Participant’s location
  • Separate rooms in a semi-basement with thick closed curtains and neon ceiling lighting; conditions kept constant across all sessions

Multimodal Recordings
• Audio sensor
  • HQ unidirectional headset + LQ omnidirectional microphones (built into the webcam)
  • External sound cards: (1) phantom power for the microphone, (2) Skype videoconference and (3) biosignals synchronisation
  • Recording with Audacity software; 44.1 kHz, 16 bits
Figures: AKG 520L microphone; Lexicon Omega Studio external sound card; Audacity audio recording software

Multimodal Recordings
• Video sensor
  • HD 720p webcam; Logitech C270, 1080x720p, 25 Hz
  • 2 webcams per participant: one for Skype and one for video recording
  • LQ audio signal captured for post-synchronisation of the HQ audio with the video data
  • Recording with the webcam’s software; gain and contrast fixed once and auto-adjustment turned off
Figures: Logitech C270 webcam; Logitech webcam’s recording software

Multimodal Recordings
• Physiological sensors
  • ECG: palm of the right hand, right and left inner ankles
  • EDA: end of the index and middle fingers
  • Biopac MP36 unit and Biopac Student Lab software (BSL Pro); 1 kHz
  • Synchronisation pulses are emitted every second to the external sound card once recording begins (DB9 output → mono jack)
Figures: EDA sensors; back of the BIOPAC MP36 unit; BSL Pro recording software (from top to bottom: EDA, ECG and RR biosignals)
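The once-per-second pulses can be located on the audio timeline with a simple threshold crossing. The sketch below is illustrative only and is not the authors' tool: the function names, the 0.5 threshold, the normalised sample range and the idea that the first pulse marks biosignal sample 0 are all assumptions made for the example.

```python
def find_pulse_onsets(sync_channel, sample_rate_hz, threshold=0.5):
    """Return the times (in seconds) at which the signal rises above `threshold`.

    `threshold` is a hypothetical value for samples normalised to [-1, 1].
    """
    onsets = []
    was_above = False
    for index, value in enumerate(sync_channel):
        is_above = value >= threshold
        if is_above and not was_above:  # rising edge = start of a pulse
            onsets.append(index / sample_rate_hz)
        was_above = is_above
    return onsets


def biosignal_to_audio_time(sample_index, first_onset_s, biosignal_rate_hz=1000):
    """Map a biosignal sample index to a time on the audio timeline.

    Assumes (hypothetically) that the first pulse marks biosignal sample 0.
    """
    return first_onset_s + sample_index / biosignal_rate_hz
```

With the first onset known, any sample of the 1 kHz biosignals can be placed on the 44.1 kHz audio timeline by calling `biosignal_to_audio_time`.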

Multimodal Recordings
• Data Synchronisation
  • Video and HQ audio signal: localisation of a sync event in both HQ and LQ audio signals + cross-correlation maximisation (20 ms); precision of 1 ms
  • Biosignals and HQ audio signal: synchronisation pulses (right channel) make synchronisation trivial; precision of 1 ms
Figures: cross-correlation signal between HQ and LQ audio data; left (audio) and right (sync pulses) channels of the HQ signal
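The cross-correlation step can be sketched as a search for the lag that maximises the dot product between the two audio signals. This is a minimal pure-Python illustration under stated assumptions, not the authors' code: `best_lag` and `lag_to_ms` are hypothetical names, both signals are assumed to share one sample rate, and reading the 20 ms figure as the search window around the sync event is an interpretation.

```python
def best_lag(reference, other, max_lag):
    """Return the lag (in samples) that maximises the cross-correlation.

    A positive result means `other` is delayed with respect to `reference`.
    `max_lag` bounds the search window on both sides (hypothetical parameter).
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_value in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += ref_value * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def lag_to_ms(lag_samples, sample_rate_hz):
    """Convert a lag in samples to milliseconds."""
    return 1000.0 * lag_samples / sample_rate_hz
```

At 44.1 kHz, a 20 ms window would correspond to `max_lag = 882` samples; a real implementation would normally use an FFT-based correlation for speed.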


Data Annotation
• ANNEMO: ANNotating EMOtions
  • Web-based annotation interface; Google Chrome web browser
  • Emotional behaviours: arousal and valence (continuous time & values)
  • Social behaviours: agreement, dominance, engagement, performance and rapport (discrete time & values)

Data Annotation
• Annotation Data Collection
  • 6 French-speaking annotators (3M + 3F) annotated the whole corpus
  • Oral instructions (4-page document) + practice on 4 sequences
  • Automatic check of the annotation data by a dedicated algorithm, e.g., blanks, missing sequences, wrong order of annotation
  • Only the first 5 minutes of each interaction were annotated
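An automatic check of this kind might look like the sketch below. The actual algorithm and data format are not described on the slide, so everything here is hypothetical: the record layout (sequence id plus a list of `(time, value)` samples), the 1-second gap threshold used to flag a blank, and the reading of "wrong order" as sequences annotated out of the expected order.

```python
def check_annotations(records, expected_sequences, max_gap_s=1.0):
    """Return a list of (problem, sequence_id) tuples found in `records`.

    `records` is a list of (sequence_id, samples) in the order they were
    annotated, where `samples` is a list of (time_s, value) pairs.
    """
    problems = []
    seen = [sequence_id for sequence_id, _ in records]

    # Missing sequences: expected but never annotated.
    for sequence_id in expected_sequences:
        if sequence_id not in seen:
            problems.append(("missing", sequence_id))

    # Wrong order: annotated sequences do not follow the expected order.
    expected_order = [s for s in expected_sequences if s in seen]
    if [s for s in seen if s in expected_sequences] != expected_order:
        problems.append(("wrong_order", None))

    # Blanks: no samples at all, or a gap longer than `max_gap_s`.
    for sequence_id, samples in records:
        if not samples:
            problems.append(("blank", sequence_id))
            continue
        for (t0, _), (t1, _) in zip(samples, samples[1:]):
            if t1 - t0 > max_gap_s:
                problems.append(("blank", sequence_id))
                break
    return problems
```

An empty result would mean that an annotator's submission passed all three checks.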

Data Annotation
• Post-processing and analysis
  • Piecewise cubic interpolation and binning into 40 ms frames
  • Local normalisations: zero-mean and synchronisation
  • Good inter-annotator agreement for the affective dimensions, and fairly good agreement for the social dimensions
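The resampling and zero-mean steps can be illustrated as follows. This is a sketch under assumptions rather than the authors' pipeline: it uses cubic Hermite interpolation with finite-difference tangents as one possible piecewise cubic scheme, reads "binning into 40 ms frames" as evaluation on a regular 40 ms grid, and all function names are hypothetical. Timestamps are assumed to be strictly increasing, with at least two samples.

```python
import bisect
import math


def finite_difference_tangents(times, values):
    """Estimate a slope at each sample (one-sided at both ends)."""
    slopes = []
    last = len(times) - 1
    for i in range(len(times)):
        lo, hi = max(i - 1, 0), min(i + 1, last)
        slopes.append((values[hi] - values[lo]) / (times[hi] - times[lo]))
    return slopes


def resample_to_frames(times, values, frame_s=0.04):
    """Evaluate a piecewise cubic Hermite interpolant every `frame_s` seconds."""
    slopes = finite_difference_tangents(times, values)
    first_frame = math.ceil(times[0] / frame_s)
    last_frame = math.floor(times[-1] / frame_s)
    frame_times, frame_values = [], []
    for k in range(first_frame, last_frame + 1):
        t = k * frame_s
        # Index of the segment [times[i], times[i + 1]] containing t.
        i = min(max(bisect.bisect_right(times, t) - 1, 0), len(times) - 2)
        h = times[i + 1] - times[i]
        s = (t - times[i]) / h
        h00 = 2 * s**3 - 3 * s**2 + 1
        h10 = s**3 - 2 * s**2 + s
        h01 = -2 * s**3 + 3 * s**2
        h11 = s**3 - s**2
        frame_times.append(t)
        frame_values.append(h00 * values[i] + h10 * h * slopes[i]
                            + h01 * values[i + 1] + h11 * h * slopes[i + 1])
    return frame_times, frame_values


def zero_mean(values):
    """Subtract the mean so that the trace is centred on zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]
```

A 40 ms step matches the 25 Hz video frame rate, so each resampled annotation value lines up with one video frame.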

Conclusion
• RECOLA: a new corpus of REmote COLlaborative and Affective interactions in French
• 3 well-synchronised HQ signals: audio, visual and ECG+EDA
• Rich and consistent annotations of socio-affective behaviours; internal (self-reporting) and external (3M+3F)
• From 27 subjects (5.5 h of multimodal data) to 34 subjects (7 h of audiovisual data), considering only participants who gave positive consent forms
• ANNEMO: a new web-based emotion annotation tool
All of it will be made publicly available soon. Stay informed at: http://diuf.unifr.ch/diva/recola
