
Multimedia Grand Challenge 2012

Multimedia Grand Challenge 2012. Mei-Chen Yeh, 04/24/2012. Midterm Report. Submission due date: May 8. Deliverables: a report and a short presentation (10 minutes). Report format: max 4 pages, double column (Word template or LaTeX). Task: come up with a solution to one of the grand challenges.



Presentation Transcript


  1. Multimedia Grand Challenge 2012 Mei-Chen Yeh 04/24/2012

  2. Midterm Report • Submission due date: May 8 • Report • Short presentation (10 minutes) • Max 4 pages, double column • Word template • LaTeX • Come up with a solution to one of the grand challenges • http://www.acmmm12.org/call-for-multimedia-grand-challenge-solutions/

  3. Why should I care about this? • I want to pass this course. • Look for ideas for your final project / thesis. • Writing a report and doing a project always takes time. Why not turn the report/project into something beneficial?

  4. Here comes the opportunity! • 6 problems that Google, HP, NHK and other companies see in the future of multimedia • Cash award • 3 prizes last year • For every finalist team this year

  5. Make your resume stand out! (Win!) [Slide graphic: sample resumes with Education entries "master, NTNU" and "master, NXU", Experiences sections, and a Publication entry: xxx, “A new approach for automatic music video generation”, ACM Multimedia Grand Challenges, 2012.]

  6. Great experience and great location! Beijing, 2009 Florence, 2010 Scottsdale, 2011 Nara, 2012

  7. 2012 Challenges • Google: Automatic Music Video Generation • 3DLife / Huawei: Realistic Interaction In Online Virtual Environments • HP: Understanding the Emotional Impact of Images and Videos • NHK: “Where is beauty?” Video Segment Extraction Based on Aesthetic Quality Assessment • NTT Docomo: Event Understanding through Social Media and its Text-Visual Summarization • Technicolor: Audiovisual Recognition of Specific Events

  8. Google Challenge: Automatic Music Video Generation

  9. Google Challenge • Music Video = Visual + Audio • A befitting soundtrack makes a video compelling; likewise, Lady Gaga’s music videos greatly enhance her songs. • Automatic Music Video Generation • How to auto-suggest a cool soundtrack for a user-generated video? • How to auto-generate interesting music videos?

  10. Use Case 1 • You have shot a few family videos on your smartphone, but you don’t want to upload them to YouTube because they look boring. • What if you could find a matching soundtrack? Wouldn’t it improve the appeal of the video and make you want to upload it? • Goal: make a video much more attractive for sharing by adding a matching soundtrack to it. • Bonus point: the application runs on Android or iPhone.
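One way to picture the "matching soundtrack" idea in Use Case 1 is as a nearest-neighbour lookup between a video and candidate songs. The sketch below is only an illustration, not the challenge's method: it assumes each video and each track has already been reduced to a small feature vector (the two features, "energy" and "calmness", the track names, and all values are hypothetical) and picks the track with the highest cosine similarity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def suggest_soundtrack(video_vec, tracks):
    """Return the name of the track whose features are closest to the video's."""
    return max(tracks, key=lambda name: cosine(video_vec, tracks[name]))

# Hypothetical (energy, calmness) features; real features would come from
# audio and video analysis that this sketch does not attempt.
video = (0.9, 0.2)
tracks = {
    "upbeat_pop": (0.8, 0.1),
    "slow_piano": (0.1, 0.9),
    "ambient": (0.3, 0.6),
}
choice = suggest_soundtrack(video, tracks)
print(choice)
```

Cosine similarity is used here because it compares the direction of the feature vectors rather than their magnitude; any other similarity measure could be swapped in.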

  11. Use Case 2 • Suppose you are hosting a home party. You have a playlist of party music, but you don’t have any matching music videos to show on your 50-inch TV. • Goal: automatically generate entertaining music videos that match the songs. • Bonus point: personalize the music videos to the people who are viewing them. • You may focus on either of the two use cases.

  12. Evaluation • Novelty of the music video generation system • Entertainment value of the produced music videos

  13. http://www.mtv.com/

  14. http://www.mtv.com.tw/

  15. HP Challenge: Understanding the Emotional Impact of Images and Videos

  16. HP Challenge • Images and videos can serve as a powerful communications vehicle, conveying a wealth of information as well as emotional impact.

  17. HP Challenge • Images and videos are used extensively by professionals on web sites, magazine covers and printed advertisements to draw attention, communicate a message and leave a lasting emotional impression.

  18. HP Challenge • Understanding the Emotional Impact of Images and Videos: 6 research problems: • How do we characterize the response categories and levels of emotional impact? • What attributes of images and videos are associated with their emotional impact? • The color, composition, content, lighting, sharpness, and movement of an image or a video, … • What affective models can be used to predict the emotional impact of images?

  19. HP Challenge • How can we use the affective models to rank images and videos? • Can we use image and video transformations to change the emotional impact? • What are the applications of affective models?

  20. HP Challenge • Evaluation • How well a deep understanding of emotional impact is used to create novel and compelling applications on the web, for mobile devices, and for social networks.

  21. NHK Challenge: “Where is beauty?” Video Segment Extraction Based on Aesthetic Quality Assessment

  22. NHK Challenge • Goal: • “Where is beauty?” -- Automatic recognition of beautiful scenes in broadcast programs • Two key questions: • how beauty is defined • how to approach beauty • Dataset provided!

  23. NHK Challenge • Input • Broadcast video program “Japan’s Scenic Beauty” (25 min x 10 programs) • Video format: MPEG-1 (704 x 480 pixels) • Audio: MPEG Audio, 44.1 kHz, stereo, 224 kbps (English) • Shot boundary data (XML file) • Output • List of extracted beautiful scenes ranked in the top 10% • Each scene should be described by the provided shot number, or by frame number and duration • Recommended video: • A 1- to 2-minute short video composed of the extracted beautiful scenes
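The "top 10%" output requirement can be sketched as a simple ranking step, assuming some aesthetic scoring model has already assigned one score per shot. The scores below are made up for illustration, and the function name and the `percent` parameter are hypothetical; how to compute the scores is the actual research problem and is not addressed here.

```python
def top_percent(scores, percent=10):
    """Return shot numbers ranked in the top `percent` by score, best first.

    `scores` maps shot number -> aesthetic score (higher is more beautiful).
    At least one shot is returned whenever `scores` is non-empty.
    """
    if not scores:
        return []
    keep = max(1, (len(scores) * percent) // 100)
    ranked = sorted(scores, key=lambda shot: scores[shot], reverse=True)
    return ranked[:keep]

# Hypothetical scores for 20 shots, keyed by shot number.
scores = {
    1: 0.31, 2: 0.88, 3: 0.12, 4: 0.45, 5: 0.52,
    6: 0.27, 7: 0.93, 8: 0.40, 9: 0.61, 10: 0.05,
    11: 0.33, 12: 0.70, 13: 0.18, 14: 0.66, 15: 0.49,
    16: 0.22, 17: 0.81, 18: 0.37, 19: 0.58, 20: 0.09,
}
beautiful_shots = top_percent(scores)
print(beautiful_shots)
```

The shot numbers returned could then be mapped back to the provided shot boundary data to produce the submitted list.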

  24. NHK Challenge • Evaluation • Originality and adequacy of proposed algorithm • Reliability and variety of submitted beautiful scenes • Quality of the submitted short video (if submitted)

  25. NTT Docomo Challenge: Event Understanding through Social Media and its Text-Visual Summarization

  26. NTT Docomo Challenge • Goal: • Data mining on social media to retrieve, summarize, and visualize events for a selected topic • Example • Topic: “local events for New York City” • Summarize Twitter/Flickr data to produce a magazine like “New York of the Day.”

  27. NTT Docomo Challenge • Input: Researchers working on this challenge should collect the necessary data from Twitter or Flickr. There are at least three types of data required for this challenge. • Images: Twitter or Flickr, or both • Text: Tweets from Twitter • 3rd party contents: News websites such as the New York Times, blogs, and others. • Output: The output could be in the format of a magazine, in which each article represents an event and is associated with related images, texts, or both. These images and texts should be self-explanatory with respect to the article. The magazine could be summarized on a daily basis, an hourly basis, or even more frequently.

  28. NTT Docomo Challenge • Research problems • Extract the local events from the Twitter data • Assign the location information to the image • Create a text summary of each local event with tweets and other 3rd party contents • Assign the most relevant images to each local event • Lay out the articles and design the magazine
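As a rough illustration of the "assign the most relevant images to each local event" problem, the sketch below matches events to images by word overlap (Jaccard similarity) between an event's keywords and an image's tags. This is an assumed baseline, not a method prescribed by the challenge; the event names, image ids, and tags are all invented.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of words."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_image(event_words, images):
    """Return the id of the image whose tags overlap most with the event words."""
    return max(images, key=lambda image_id: jaccard(event_words, images[image_id]))

# Hypothetical events (keywords from tweets) and images (tags from Flickr).
events = {
    "marathon": ["marathon", "runners", "central", "park"],
    "concert": ["concert", "jazz", "brooklyn"],
}
images = {
    "img_a": ["runners", "park", "crowd"],
    "img_b": ["jazz", "stage", "brooklyn", "night"],
    "img_c": ["pizza"],
}
assignment = {name: best_image(words, images) for name, words in events.items()}
print(assignment)
```

A real system would likely combine text overlap with location, time, and visual features; the overlap score only shows where such signals would plug in.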

  29. NTT Docomo Challenge • Evaluation • Relevance of the summary/article to the actual topic • Relevance of the related images to the abstract text, or vice-versa • Quality of magazine design

  30. Technicolor Challenge: Audiovisual Recognition of Specific Events

  31. Technicolor Challenge • Goal • Given a short video sequence, with audio, stemming from the coverage of a public event, the system should produce precise textual information about it.

  32. Technicolor Challenge • A description at the event identity level: • Which event is it? • When and where did it take place? • What is its context? • What is precisely happening in the audio-visual scene? • In particular, who are the persons in the scene? • Where are they in the image? • What are they doing or saying?

  33. Example

  34. Key ideas • Automatically extract as much information as possible from the audio-visual query and use it to search the intertwined textual, audio and visual data available online! • Extraction of compact low-level audio-visual signatures • Detection and recognition of text present in the images • Detection and recognition of speech present in the audio track • Semantic analysis of the audio-visual content

  35. Huawei/3DLife Challenge: Realistic Interaction in Online Virtual Environments

  36. Huawei/3DLife Challenge • Goal • Support real-time realistic interaction between humans in online virtual environments • Scenario • An online dance class where a dance teacher and a student perform a series of movements

  37. Huawei/3DLife Challenge • Not limited to certain capture technology • Visual sensing techniques: a single camera, a camera network, wearable inertial motion sensing • Gaming controllers: the Nintendo Wii, the Microsoft Kinect

  38. Huawei/3DLife Challenge • Work with the provided data set to illustrate key technical components that would be required to realize this kind of online interaction and communication: • 3D data acquisition and processing from multiple sensor data sources • Realistic (optionally real-time) rendering of 3D data based on noisy or incomplete sources • Realistic and naturalistic marker-less motion capture • Human factors around interaction modalities in virtual worlds http://perso.telecom-paristech.fr/~essid/3dlife-gc-12

  39. Huawei/3DLife Challenge • A data set is provided, including: • Synchronization data between each of the multiple calibrated sources capturing the student’s movements; • Original music excerpts consisting of a few tracks at different tempos varying from slow to fast; • Inertial (accelerometer + gyroscope + magnetometer) sensor data captured from multiple sensors on the student’s body; • Depth maps for student performance captured using a Microsoft Kinect; • Ratings of the student performances by the teacher; • A form of annotation of the choreographies (mostly basic steps and movements for salsa beginners) performed.

  40. Start Early! • Upload your report on Moodle by 11:55pm, May 8, 2012 • Less than 4 pages, using the ACM MM template • Prepare for a short presentation (<10-mins) for sharing your ideas on a challenge • More information: • http://www.acmmm12.org/call-for-multimedia-grand-challenge-solutions/
