Distant Speech Recognition in Smart Homes Initiated by Hand Clapping within Noisy Environments. Florian Bacher & Christophe Sourisse [623.400] Seminar in Interactive Systems
Agenda • Introduction • Methodology • Experiment Description • Implementation • Results • Conclusion
Introduction • Smart homes have become a major field of research in information and communication technologies. • Possible way of interaction: voice commands. • Goal of our experiment: evaluate the possibility of recognizing voice commands initiated by hand claps in a noisy environment. • Gather a set of voice commands uttered by various speakers.
Methodology • Main method: Lecouteux et al. • Deals with speech recognition in distress situations. • Problem: no background noise was considered. • Chosen methodology: adapt the protocol of Lecouteux et al., considering: • Noisy settings. • Initiating recognition with hand claps.
Methodological issues • Choice of the room setting • Lecouteux et al.: a whole flat. • Vovos et al.: one room with a microphone array. • Choice: one room with 2 microphones. • Choice of background noises • Hirsch and Pearce: NOISEX-92 database. • Moncrieff et al.: “Background noise is defined as consisting of typical regularly occurring sounds.” • Choice: background noises of daily house life.
Experiment Settings • Performed in a 3 m × 3 m room. • Sounds were captured by two microphones hidden in the room.
Experimental Protocol • 20 participants (10 men, 10 women, 25.5 ± 11 years) took part in a 2-phase experiment. • 1st phase: recognize a word (“Jeeves”) as a command • The system’s attention is caught by a double clap. • 4 scenarios. • Background noises tested: footsteps, opening doors, moving chairs, a radio show. • 2nd phase: gather a set of voice commands • List of 15 command words. • Reference recording for pronunciation issues. • Each word is uttered 10 times.
Implementation • Used technologies: • C# library System.Speech.Recognition: interface to the speech recognition engine used by Windows. • Microphones: two dynamic microphones with a cardioid polar pattern (Sennheiser BF812/e8155) • Line6 UX1 audio interface • Line6 Pod Farm 2.5
Implementation • Signal is captured in real time. • If there are exactly two signal peaks within a certain timeframe, the software classifies them as a double clap. • After a double clap has been detected, the actual speech recognition engine is activated (i.e. the software is waiting for commands).
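The detection logic on this slide can be sketched as follows. This is a minimal illustrative sketch, not the authors' actual C# implementation: the function names, the amplitude threshold, the refractory window, and the maximum inter-clap gap are all assumptions chosen for illustration.

```python
# Hypothetical sketch of the double-clap detector described above.
# All thresholds and window lengths are illustrative assumptions.

def find_peaks(samples, threshold=0.6, refractory=1600):
    """Return indices where |amplitude| exceeds `threshold`, skipping
    `refractory` samples after each hit so one clap is not counted twice."""
    peaks = []
    i = 0
    while i < len(samples):
        if abs(samples[i]) >= threshold:
            peaks.append(i)
            i += refractory  # skip the decaying tail of this clap
        else:
            i += 1
    return peaks

def is_double_clap(samples, sample_rate=16000, max_gap_s=0.7):
    """Classify the buffer as a double clap iff it contains exactly two
    peaks separated by at most `max_gap_s` seconds; on success, the
    caller would then activate the speech recognition engine."""
    peaks = find_peaks(samples)
    if len(peaks) != 2:
        return False
    return (peaks[1] - peaks[0]) / sample_rate <= max_gap_s
```

Requiring *exactly* two peaks, as the slide states, means a single clap or a burst of three or more impulses (e.g. a dropped object) is rejected rather than triggering the recognizer.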
Conclusion • A new idea of how to initiate speech recognition in human-computer interaction. • An evaluation of the potential influence of a noisy environment. • Results: encouraging, but not yet satisfying. • Next step: perform this experiment in a real smart-home context.
References • B. Lecouteux, M. Vacher and F. Portet. Distant speech recognition in a smart home: comparison of several multisource ASRs in realistic conditions. Interspeech, 2011. • A. Fleury, N. Noury, M. Vacher, H. Glasson and J.-F. Serignat. Sound and speech detection and classification in a health smart home. 30th Annual International IEEE EMBS Conference, Vancouver, British Columbia, Canada, August 2008. • M. Vacher, N. Guirand, J.-F. Serignat and A. Fleury. Speech recognition in a smart home: Some experiments for telemonitoring. Proceedings of the 5th Conference on Speech Technology and Human-Computer Dialogue, pages 1–10, June 2009. • J. Rouillard and J.-C. Tarby. How to communicate smartly with your house? Int. J. Ad Hoc and Ubiquitous Computing, 7(3), 2011. • S. Moncrieff, S. Venkatesh, G. West and S. Greenhill. Incorporating contextual audio for an actively anxious smart home. Proceedings of the 2005 International Conference on Intelligent Sensors, Sensor Networks and Information Processing, pages 373–378, Dec. 2005. • M. Vacher, D. Istrate, F. Portet, T. Joubert, T. Chevalier, S. Smidtas, B. Meillon, B. Lecouteux, M. Sehili, P. Chahuara and S. Méniard. The Sweet-Home project: Audio technology in smart homes to improve well-being and reliance. 33rd Annual International IEEE EMBS Conference, Boston, Massachusetts, USA, 2011. • A. Vovos, B. Kladis and N. Fakotakis. Speech operated smart-home control system for users with special needs. Proc. Interspeech 2005, pages 193–196, 2005. • H.-G. Hirsch and D. Pearce. The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions. ASR-2000, pages 181–188.
Thank you for your attention! Questions?