
Xuan Bao Romit Roy Choudhury

MoVi: Mobile Phone based Video Highlights via Collaborative Sensing. Xuan Bao, Romit Roy Choudhury.


Presentation Transcript


  1. MoVi: Mobile Phone based Video Highlights via Collaborative Sensing. Xuan Bao, Romit Roy Choudhury

  2. Context: Next-generation smartphones will have a large number of sensors: cameras, microphones, accelerometers, GPS, compasses, health monitors, …

  3. A mobile phone can be regarded as a window of views. Retrieve views from the physical world.

  4. Abundance of electronic information. Which view is of “interest”?

  5. Can “interesting” views be automatically pulled out?

  6. Information distillation is a broad topic. To get a handle, we narrow down the problem space: can phones create a highlight movie of a social gathering without human intervention?

  7. MoVi Goal • Envisioning the end product • Imagine a social party of the future • Assume phones are wearable • GOAL: Create 10-min movie highlights without human intervention • The Idea: • Mobile phones sense ambience • Collaboratively infer an “interesting event” • Select phone with good view of the event • Stitch the recorded clips to form the highlights • Figure labels: Nokia Morph, Apple iPod Nano, Microsoft SenseCam

  8. The problem is analogous to “event monitoring” in conventional sensor networks. Except, events are social … and occur over multiple sensing dimensions.

  9. MoVi: Mobile Video Highlights • Introduction • Challenges • System Design • Experiment Results • Limitation and Ongoing Work

  10. Challenge 1: Who collaborates with whom? • Identifying social groups • Dynamic over time, not necessarily spatial • Figure label: Collaboration Group

  11. Challenge 2: What is interesting? • The notion of “interesting events” is subjective • Need patterns … training … or a rule book

  12. Challenge 3: Which camera angle to choose? • View selection: need to select the best view from the available cameras • Figure labels: Face, Sky, Good, Blur, Blocked

  13. Challenge 4: When did the event start and end? • Need to select the right time span of an event • Clues may come after the start of the event • Requires rewinding to the logical start of the event • Timeline figure: Joe talks, Alice jokes, Bob talks, Sense Laughter, with a marker labeled “Recording should start here”

  14. MoVi: Mobile Video Highlights • Introduction • Challenges • System Design • Experiment Results • Limitation and Ongoing Work

  15. MoVi Architecture • Phones are grouped according to ambience • Identify multi-modal event triggers • Select video with the best view • Select the time span for events

  16. Who collaborates with whom? • Group Management

  17. Group Management • Acoustic: Ringtone, Ambience • Visual: View Similarity, Light

  18. Acoustic: Ringtone • High-frequency ringtone • Like a wireless beacon … measures distance to transmitter • Grouping based on overhearing
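
As a rough illustration of "grouping based on overhearing", the sketch below treats each overheard ringtone as a link between two phones and reports the connected components as groups. The input format (`heard_pairs` as `(listener, emitter)` tuples) and the function name are assumptions made for this example; the slides do not say how MoVi represents or merges overhearing reports.

```python
from collections import defaultdict


def group_by_overhearing(phones, heard_pairs):
    """Group phones into connected components of the 'overheard' relation.

    phones: iterable of phone identifiers.
    heard_pairs: iterable of (listener, emitter) tuples (hypothetical format).
    """
    parent = {p: p for p in phones}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for listener, emitter in heard_pairs:
        root_a, root_b = find(listener), find(emitter)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for p in parent:
        groups[find(p)].add(p)
    # Sorted output keeps the result deterministic.
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    pairs = [("A", "B"), ("B", "C"), ("D", "E")]
    print(group_by_overhearing(["A", "B", "C", "D", "E"], pairs))
```

Connected components make grouping transitive (A hears B, B hears C, so A and C share a group). Whether that is desirable depends on the venue, so it is a design choice of this sketch rather than something the slides state.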

  19. Acoustic: Ambience • Grouping based on ambient sound correlation • Use MFCC (Mel-Frequency Cepstral Coefficients) as features • Classify using SVM • Couples together most times • Few misclassifications
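
The slide describes MFCC features classified with an SVM. The sketch below is a deliberately simpler stand-in that only illustrates the "ambient sound correlation" idea: it compares the loudness envelopes of two phones with a Pearson correlation and calls them co-located above a threshold. The envelope representation, the function names, and the 0.8 threshold are all assumptions for illustration, not values from the slides.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat envelope carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def same_ambience(envelope_a, envelope_b, threshold=0.8):
    """True if two loudness envelopes are correlated above the threshold.

    The threshold is a hypothetical value chosen for this example.
    """
    return pearson(envelope_a, envelope_b) >= threshold


if __name__ == "__main__":
    kitchen_1 = [0.1, 0.4, 0.9, 0.3, 0.2]
    kitchen_2 = [0.2, 0.5, 1.0, 0.4, 0.3]
    print(same_ambience(kitchen_1, kitchen_2))
```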

  20. Visual: View Similarity & Light • Grouping based on view similarity • Exploiting spatiograms (spatial + color histogram) • Originally used to track objects • Classify based on light intensity • 3 simple buckets

  21. Identifying interesting social events • Group Management • Event Detection

  22. Event Detection • Specific signature: Laughter • Group behavior: View similarity, Ambience fluctuation, Group rotation • Neighbor assistance

  23. Specific Event Signature - Laughter • Detected using MFCC and SVM over an audio training set • Detecting the time at which laughter happens
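
The "3 simple buckets" for light intensity could look roughly like the sketch below, which averages grayscale pixel values and assigns one of three labels. The bucket names and the cut-offs (85 and 170 on a 0-255 scale) are hypothetical; the slides give no thresholds.

```python
from collections import defaultdict


def light_bucket(pixels, dark_below=85, bright_from=170):
    """Classify a frame by mean grayscale intensity (0-255) into 3 buckets.

    The two cut-offs are illustrative assumptions, not values from MoVi.
    """
    if not pixels:
        raise ValueError("frame has no pixels")
    mean_intensity = sum(pixels) / len(pixels)
    if mean_intensity < dark_below:
        return "dark"
    if mean_intensity >= bright_from:
        return "bright"
    return "medium"


def group_by_light(frames):
    """Map bucket label -> sorted list of phone ids; frames maps id -> pixels."""
    groups = defaultdict(list)
    for phone_id, pixels in frames.items():
        groups[light_bucket(pixels)].append(phone_id)
    return {label: sorted(ids) for label, ids in groups.items()}


if __name__ == "__main__":
    frames = {"p1": [10, 20, 30], "p2": [200, 220], "p3": [15, 25, 5]}
    print(group_by_light(frames))
```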

  24. Group Behavior - View Similarity Detecting people paying unusual attention to the same object

  25. Group Behavior - Ambience Fluctuation: detecting bursts of sound, fluctuations in accelerometer readings, unusual changes of light, …

  26. Neighbor Assistance • If a human expresses interest, phones can follow • Phones taking pictures will send out signals • Brings the human into the loop • Human choices are given priority

  27. Which camera view to use? • Group Management • Event Detection • View Selector

  28. View Selection • Face count • Accelerometer ranking • Human assistance • Light intensity ranking
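
One simple way to flag a "burst" or "unusual change" in a sensor stream is to compare each reading against the mean and spread of a short trailing window. The sketch below does that; the window size, the multiplier `k`, and the `min_sigma` floor are assumptions for this example, and the slides do not specify the detector MoVi actually uses.

```python
from statistics import mean, pstdev


def fluctuation_triggers(readings, window=5, k=3.0, min_sigma=1.0):
    """Return indices whose reading deviates sharply from the trailing window.

    A reading triggers when it differs from the window mean by more than
    k standard deviations. min_sigma keeps a perfectly flat history from
    making every tiny change look like a burst.
    """
    triggers = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = max(pstdev(history), min_sigma)
        if abs(readings[i] - mu) > k * sigma:
            triggers.append(i)
    return triggers


if __name__ == "__main__":
    sound_level = [10, 11, 10, 9, 10, 11, 10, 50, 10, 10]
    print(fluctuation_triggers(sound_level))
```

The same function can be applied separately to sound level, accelerometer magnitude, and light intensity, with per-sensor parameters.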

  29. View Selection: four heuristics • (1) Face count: often interesting to humans • (2) Accelerometer ranking: stable camera • (3) Light intensity: rule out blocked views • (4) Human in the loop: manually taken pictures are better • Figure label: Good view

  30. When did the event start … end? • Group Management • Event Detection • View Selector • Event Segmentation

  31. Event Segmentation • Classify sound states (voice gender, music, pitch …) • Search for distinct transitions before/after the trigger • Timeline figure: Joe talks, Alice jokes, Bob talks, Sense Laughter
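
The four heuristics can be combined in many ways; the sketch below shows one possible ordering, assuming that light intensity acts as a hard filter, that manually taken pictures win outright, and that face count is compared before camera stability. The `View` fields, the `min_light` cut-off, and the tie-breaking order are assumptions made for illustration and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class View:
    phone: str
    face_count: int          # faces detected in the frame
    accel_variance: float    # lower means a steadier camera
    light: float             # mean intensity, 0-255
    human_taken: bool = False


def select_view(views, min_light=30.0):
    """Pick one view, or None if every view looks blocked.

    Order used here (an assumption): drop dark/blocked views, prefer
    human-taken pictures, then more faces, then a steadier camera.
    """
    candidates = [v for v in views if v.light >= min_light]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda v: (v.human_taken, v.face_count, -v.accel_variance),
    )


if __name__ == "__main__":
    views = [
        View("p1", face_count=3, accel_variance=0.9, light=120),
        View("p2", face_count=3, accel_variance=0.2, light=110),
        View("p3", face_count=5, accel_variance=0.1, light=5),  # blocked
    ]
    print(select_view(views).phone)
```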
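
A minimal sketch of "search for distinct transitions before/after the trigger", assuming the audio has already been classified into one sound-state label per time slot. It rewinds from the trigger to the start of the preceding run of identical labels and extends forward to the end of the trigger's own run. How far to rewind (`rewind_runs`) and the label names are assumptions of this example; the slides do not give the actual rule.

```python
def segment_event(states, trigger, rewind_runs=1):
    """Return (start, end) slot indices, inclusive, around a trigger.

    states: list of per-slot sound-state labels (hypothetical labels).
    trigger: index of the slot where the trigger fired.
    rewind_runs: how many earlier runs of identical labels to include.
    """
    if not 0 <= trigger < len(states):
        raise IndexError("trigger outside the recording")

    # Walk back to the first slot of the trigger's own run.
    start = trigger
    while start > 0 and states[start - 1] == states[trigger]:
        start -= 1

    # Include earlier runs, stopping at each state transition.
    for _ in range(rewind_runs):
        if start == 0:
            break
        previous_state = states[start - 1]
        start -= 1
        while start > 0 and states[start - 1] == previous_state:
            start -= 1

    # Walk forward to the last slot of the trigger's run.
    end = trigger
    while end + 1 < len(states) and states[end + 1] == states[trigger]:
        end += 1
    return start, end


if __name__ == "__main__":
    states = ["male_voice", "male_voice", "female_voice", "female_voice",
              "female_voice", "laughter", "laughter", "male_voice"]
    print(segment_event(states, trigger=5))
```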

  32. MoVi: Mobile Video Highlights • Introduction • Challenges • System Design • Experiment Results • Limitation and Ongoing Work

  33. Field Experiments • Real social gatherings: Thanksgiving party, SmartHome tour • All system features • Figure labels: Thanksgiving, Duke Smart Home

  34. Field Experiments • Experiment setup: 5 students taped iPod Nanos to their shirt pockets; carried Nokia N95 phones on belt clips; 1 dedicated video camera recorded the entire party (2 hours) • Offline evaluation: automatic highlights (20 min); manually created highlights; evaluate overlap

  35. Zoom-In View • Figure labels: MoVi, Human

  36. Thanksgiving Party • Figure legend: MoVi selected, Non-Overlap, Human selected, Captured

  37. SmartHome Tour • Figure legend: MoVi selected, Non-Overlap, Human selected, Captured

  38. Metrics • Thanksgiving: 38%, SmartHome: 31% • Thanksgiving: 39%, SmartHome: 48% • Thanksgiving: 21%, SmartHome: 23%
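
The "evaluate overlap" step can be sketched as interval arithmetic over the two highlight reels, assuming each reel is a list of `(start_second, end_second)` clips on the timeline of the full recording. The representation and function names are assumptions, and the two ratios printed at the end are generic coverage fractions, not necessarily the metrics reported on the slides.

```python
def merge(intervals):
    """Merge overlapping (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_length(intervals):
    """Total time covered by a set of possibly overlapping intervals."""
    return sum(end - start for start, end in merge(intervals))


def overlap_length(a, b):
    """Total time covered by both interval sets."""
    a, b = merge(a), merge(b)
    i = j = 0
    total = 0
    while i < len(a) and j < len(b):
        low = max(a[i][0], b[j][0])
        high = min(a[i][1], b[j][1])
        if high > low:
            total += high - low
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


if __name__ == "__main__":
    movi = [(0, 60), (100, 160)]     # hypothetical automatic highlights
    human = [(30, 90), (150, 200)]   # hypothetical manual highlights
    shared = overlap_length(movi, human)
    print(shared / total_length(movi), shared / total_length(human))
```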

  39. MoVi: Mobile Video Highlights • Introduction • Challenges and Solutions • System Design • Experiment Results • Limitation and Ongoing Work

  40. Limitations • MoVi is at a very early stage • Limited trigger space … sensor set • Difficult to infer “socially interesting” • Battery, privacy, computation power. Ongoing work • Exploring a larger set of triggers • More sensors (virtual sensors) • Energy efficiency of phones • Possibility to combine with other devices: wall-mounted cameras, webcams, …

  41. Take Away • Sensing, computing, communications … converging on the mobile platform • The future will allow users to zoom into the world and look at it at much higher resolution … • However, let’s not take this for granted. Our lives already have excessive information … Let’s not add more noise.

  42. Questions? Thank You! Visit the SyNRG research group @ http://synrg.ee.duke.edu/

  43. Controlled Experiment • Artificial social gathering of students • 5 students taped phones to their shirt pockets • Gathered in a group: watching movies, playing video games • Triggers used to select “interesting” clips • Mainly to test whether the triggers perform correctly

  44. Controlled Experiment Results • Trigger detection • Table columns: Event name, Time of occurrence, Effective trigger, Time of detection
