
Fidget Detection for Audio Video Meeting Analysis

This presentation describes a system that captures meetings with off-the-shelf microphones and web cameras and analyzes them with video and audio processing. Video analysis uses fidget detection, built on frame differencing and temporal histograms, which needs no background image. Audio analysis uses fast Bayesian acoustic localization to estimate the direction of the sound source. Results are shown for the temporal histograms, height estimation, and combined acoustic localization and fidget detection.


Presentation Transcript


  1. Fidget Detection for Audio Video Meeting Analysis. Prashant K. Oswal, Department of Electrical and Computer Engineering, Clemson University, Clemson, SC. 7th July, 2006

  2. OUTLINE • Introduction • Video Processing • Audio Processing • System Architecture • Results • Conclusion

  3. INTRODUCTION Importance of Meeting Capture and Analysis: • Takes care of schedule conflicts • Avoids note taking • Helps in decision making

  4. INTRODUCTION Sub-systems:

  5. INTRODUCTION Related work in Data Capture: • Portable Meeting Recorder: Ricoh Innovations • Distributed Meetings: Microsoft Research • CAMEO: Carnegie Mellon University

  6. INTRODUCTION Related work in Data Analysis: • DM System: sound source localization (SSL); head and shoulder profiles, face detection, multi-cue tracking, hierarchical verification; key frame extraction. • Portable Meeting Recorder: SSL; luminance variation and geometric feature analysis; background/foreground extraction. • CAMEO system: parts-based face detection. • Yong Rui et al.: motion detection and statistical skin-color tracking.

  7. INTRODUCTION Related work in UI design: • DM System • Portable Meeting Recorder

  8. INTRODUCTION Other systems: • TeamSpace (Georgia Tech, IBM, Boeing) • Media Enriched Conference Room (FX Palo Alto Laboratory) • Quindi Meeting Companion

  9. INTRODUCTION Our approach: • DATA CAPTURE (off-the-shelf microphones and web cameras) • DATA ANALYSIS (fidget detection and fast Bayesian acoustic localization) • Direction of sound source: θ, φ

  10. VIDEO PROCESSING Fidget detection: • Frame differencing. • No background image required. • Temporal histograms. • Participant mug-shots. • Short-term histograms. • Works well on low resolution images.
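
A minimal sketch of frame differencing feeding a temporal histogram may help make the bullet points concrete. The slides do not give the exact histogram definition, so this sketch assumes a histogram over image columns that accumulates the number of moving pixels per column. The names `difference_image`, `column_activity`, and `update_histogram`, the `threshold` value, and the use of a `decay` factor to obtain a short-term histogram are all illustrative assumptions, not the presenter's implementation.

```python
def difference_image(prev, curr, threshold=20):
    """Binary motion mask: 1 where |curr - prev| exceeds the threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def column_activity(mask):
    """Number of moving pixels in each image column."""
    return [sum(col) for col in zip(*mask)]


def update_histogram(histogram, mask, decay=1.0):
    """Add per-column motion to the histogram.

    decay=1.0 keeps everything (long-term); decay < 1 forgets old motion,
    which is one assumed way to obtain a short-term histogram.
    """
    activity = column_activity(mask)
    return [decay * h + a for h, a in zip(histogram, activity)]


# Two tiny 3x4 grayscale frames; only column 2 changes between them.
frame_a = [[10, 10, 10, 10],
           [10, 10, 10, 10],
           [10, 10, 10, 10]]
frame_b = [[10, 10, 90, 10],
           [10, 10, 90, 10],
           [10, 10, 10, 10]]

mask = difference_image(frame_a, frame_b)
long_term = update_histogram([0.0] * 4, mask)
short_term = update_histogram([4.0, 0.0, 0.0, 0.0], mask, decay=0.5)
```

Because only consecutive frames are compared, no clean background image is needed, which matches the second bullet above.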

  11. FIDGET DETECTION: FLOWCHART Image at time (t-1), image at time t → difference image → connected components → fit 1D Gaussian on largest component → construct temporal histogram → differentiate histogram to detect slope changes → detect peaks → estimate height → extract mug shot → participant 1 mug shot, ..., participant n mug shot. Stages: MOTION DETECTION, TEMPORAL HISTOGRAM, PEAK DETECTION.
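
The step "differentiate histogram to detect slope changes" can be sketched as follows. This is an assumed reading of the flowchart: a peak is reported wherever the first difference of the histogram turns from positive to non-positive. The function name `detect_peaks` and the `min_height` parameter are hypothetical, and the Gaussian fitting and height estimation steps are not covered here.

```python
def detect_peaks(histogram, min_height=1.0):
    """Indices where the slope turns from positive to non-positive."""
    # First difference: slope[i] = histogram[i + 1] - histogram[i].
    slope = [b - a for a, b in zip(histogram, histogram[1:])]
    peaks = []
    for i in range(1, len(histogram) - 1):
        rising = slope[i - 1] > 0
        falling = slope[i] <= 0
        if rising and falling and histogram[i] >= min_height:
            peaks.append(i)
    return peaks


# Toy temporal histogram with two active regions (two participants).
hist = [0, 1, 5, 2, 0, 0, 3, 7, 7, 4, 0]
peaks = detect_peaks(hist)
```

With this toy input each peak would correspond to one participant's column position. A flat-topped peak is reported once, at its left edge.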

  12. VIDEO PROCESSING Extracted Mug-shots Temporal Histograms, Peak Detection Input Images Motion Detection

  13. AUDIO PROCESSING Fast Bayesian acoustic localization: • Computationally efficient approach. • Sampled hemisphere. • Cross-correlation. • Azimuth and elevation angles.
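
To illustrate the cross-correlation bullet, here is a small sketch that correlates two microphone signals over a range of lags and picks the lag with the strongest match. It is a plain time-domain correlation written for clarity, not speed. The function name `cross_correlation` and the dictionary-of-lags return format are assumptions made for this example, and the slides do not say which correlation variant the system uses.

```python
def cross_correlation(x, y, max_lag):
    """Return {lag: sum over n of x[n] * y[n + lag]} for |lag| <= max_lag."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                total += xn * y[m]
        result[lag] = total
    return result


mic1 = [0, 0, 1, 3, 1, 0, 0, 0]
mic2 = [0, 0, 0, 0, 1, 3, 1, 0]  # same pulse, arriving two samples later
corr = cross_correlation(mic1, mic2, max_lag=3)
best_lag = max(corr, key=corr.get)
```

The lag of the correlation maximum is the estimated delay, in samples, between the two microphones; delays from several microphone pairs constrain the azimuth and elevation of the source.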

  14. SAMPLED HEMISPHERE AND CORRELATION VECTOR INDICES Number of latitudes, number of longitudes → FIND CANDIDATE LOCATIONS. Candidate locations, microphone 1 location, speed of sound → FIND CANDIDATE TO MICROPHONE TIME; candidate locations, microphone 2 location, speed of sound → FIND CANDIDATE TO MICROPHONE TIME. Both times, sampling rate → FIND CORRELATION INDICES → correlation vector indices.
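
The precomputation on this slide can be sketched as below: sample candidate directions on a hemisphere, compute the travel time from each candidate to each microphone, and convert the time difference into an index (a lag in samples) into the correlation vector. The hemisphere radius, the 343 m/s speed of sound, the even angular spacing, and the names `candidate_locations` and `correlation_index` are assumptions for illustration; the slides do not specify them.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def candidate_locations(n_lat, n_lon, radius=1.0):
    """Sample (azimuth, elevation, xyz) points on the upper hemisphere."""
    points = []
    for i in range(n_lat):
        elevation = (math.pi / 2) * i / n_lat   # 0 up to just below 90 degrees
        for j in range(n_lon):
            azimuth = 2 * math.pi * j / n_lon
            xyz = (radius * math.cos(elevation) * math.cos(azimuth),
                   radius * math.cos(elevation) * math.sin(azimuth),
                   radius * math.sin(elevation))
            points.append((azimuth, elevation, xyz))
    return points


def correlation_index(xyz, mic_a, mic_b, sampling_rate):
    """Expected delay between two microphones for a candidate, in samples."""
    t_a = math.dist(xyz, mic_a) / SPEED_OF_SOUND
    t_b = math.dist(xyz, mic_b) / SPEED_OF_SOUND
    return round((t_b - t_a) * sampling_rate)


# Hypothetical microphone pair 20 cm apart on the x axis, 44.1 kHz audio.
mic1 = (0.1, 0.0, 0.0)
mic2 = (-0.1, 0.0, 0.0)
points = candidate_locations(n_lat=3, n_lon=8)
indices = [correlation_index(p[2], mic1, mic2, 44100) for p in points]
```

Since the indices depend only on geometry and sampling rate, they can be computed once before the meeting starts, which is one plausible source of the method's efficiency.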

  15. FAST BAYESIAN ACOUSTIC LOCALIZATION Microphone 1 signal, microphone 2 signal → PRE-FILTER (each) → CORRELATE → find probability of source at each candidate location (using the correlation vector indices) → source probability vector. Source probability vectors are obtained by correlating signals from: Mic 1 and Mic 2, Mic 1 and Mic 3, Mic 1 and Mic 4, Mic 2 and Mic 3, Mic 2 and Mic 4, Mic 3 and Mic 4 → SUM PROBABILITIES → find candidate location with highest probability → estimated θ, φ.
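
The combination step on this slide (one probability vector per microphone pair, summed, then maximized) might look like the sketch below. How the system turns correlation values into probabilities is not stated in the slides; this example assumes negative correlations are clipped to zero and each pair's vector is normalized to sum to 1. The names `pair_probabilities` and `localize`, and the representation of a correlation as a `{lag: value}` dictionary, are hypothetical.

```python
from itertools import combinations


def pair_probabilities(corr, indices):
    """Look up the correlation at each candidate's lag and normalise to sum 1."""
    scores = [max(corr.get(lag, 0.0), 0.0) for lag in indices]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)  # no evidence: uniform
    return [s / total for s in scores]


def localize(correlations, index_tables):
    """Sum the per-pair probability vectors and return the best candidate."""
    n_candidates = len(next(iter(index_tables.values())))
    summed = [0.0] * n_candidates
    for pair, corr in correlations.items():
        probs = pair_probabilities(corr, index_tables[pair])
        summed = [s + p for s, p in zip(summed, probs)]
    best = max(range(n_candidates), key=summed.__getitem__)
    return best, summed


# Toy data: four microphones give six pairs; three candidate locations.
pairs = list(combinations(range(1, 5), 2))
index_tables = {pair: [-1, 0, 1] for pair in pairs}
correlations = {pair: {-1: 0.1, 0: 0.2, 1: 0.7} for pair in pairs}
best, summed = localize(correlations, index_tables)
```

The index of the winning candidate would then be mapped back to the azimuth and elevation of that sampled location to give the estimated θ, φ.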

  16. AUDIO PROCESSING Mapping audio results onto image frame: • Azimuth → Column • Elevation → Row [Figures: top view of compact array with camera at centre (microphones 1 to 4, X and Y axes); side view of compact array with camera at centre, showing the camera field of view]
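
One simple way to realize the azimuth-to-column and elevation-to-row mapping is a linear map across the camera field of view, sketched below. This is an approximation (a pinhole camera model would use the tangent of the angle), and every parameter here is an assumption: angles are measured in degrees from the camera's optical axis, positive azimuth points right, positive elevation points up, and the field-of-view values in the usage lines are made up for illustration.

```python
def angles_to_pixel(azimuth_deg, elevation_deg, width, height,
                    h_fov_deg, v_fov_deg):
    """Linear map from angles (relative to the optical axis) to (row, col).

    Returns None when the direction lies outside the camera field of view.
    """
    if abs(azimuth_deg) > h_fov_deg / 2 or abs(elevation_deg) > v_fov_deg / 2:
        return None
    col = int((azimuth_deg / h_fov_deg + 0.5) * width)
    row = int((0.5 - elevation_deg / v_fov_deg) * height)  # row 0 is the top
    # Clamp the far edge so the result is always a valid pixel index.
    return min(row, height - 1), min(col, width - 1)


# Hypothetical 320x240 web camera with a 40 by 30 degree field of view.
centre = angles_to_pixel(0, 0, 320, 240, 40, 30)
corner = angles_to_pixel(20, 15, 320, 240, 40, 30)
left = angles_to_pixel(-10, 0, 320, 240, 40, 30)
outside = angles_to_pixel(30, 0, 320, 240, 40, 30)
```

Directions outside the field of view return `None`, reflecting that a single forward-facing camera cannot show every source the hemisphere search can find.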

  17. SYSTEM ARCHITECTURE Meeting capture system: • VIDEO CAPTURE (Logitech QuickCam web camera) • AUDIO CAPTURE (Delta44 sound card, microphone element, pre-amplifier)

  18. SYSTEM BLOCK DIAGRAM Meeting area: microphones 1 to 4 and web camera. Microphones → pre-amplifiers → M-Audio sound card (PCI slot) → CPU. Web camera → USB port → CPU.

  19. SYSTEM ARCHITECTURE Software architecture consists of five classes: • Meeting Analysis • Capture Direct Show • Capture M-Audio • Fidget Detector • Acoustic Localizer

  20. S/W DIAGRAM: AUDIO CAPTURE AND PROCESSING

  21. S/W DIAGRAM: VIDEO CAPTURE AND PROCESSING

  22. RESULTS

  23. SHORT-TERM HISTOGRAM and LONG-TERM HISTOGRAM at frames 192, 3600, 3837, 4255

  24. SHORT-TERM HISTOGRAM and LONG-TERM HISTOGRAM at frames 80, 1425, 2851, 3583

  25. HEIGHT ESTIMATION Frames 63, 217, 1440; frames 2396, 2821, 2963

  26. ACOUSTIC LOCALIZATION & FIDGET DETECTION Frames 1613, 1813, 2217, 2648; frames 11, 139, 812, 1477

  27. ACTUAL RESULTS FILTERED ACOUSTIC LOCALIZATION RESULTS

  28. CONCLUSION • New approach to meeting analysis. • Meeting capture using off-the-shelf microphones and web cameras. • Fidget detection - no clean background image required. • Fast Bayesian acoustic localization.

  29. FUTURE WORK • Omni-directional camera system. • Tracking participants. • Mapping audio results to image frame. • User interface design.

  30. QUESTIONS?

  31. ACKNOWLEDGEMENTS • Dr. Stanley T. Birchfield (Adviser) • Dr. John N. Gowdy (Committee) • Dr. Stephen J. Hubbard (Committee) • Miheer Gurjar and Prashanth Govindaraju – Experiments.

  32. THANK YOU.
