Digital Video Library

Presentation Transcript

  1. Digital Video Library: Experience in Large Scale Content Management. VIEW Technologies Symposium – CUHK – August 2002. Howard Wactlar, Carnegie Mellon University, USA

  2. Video Life Cycle [diagram]. Stages: Acquisition; Analysis and Organization; Distribution. Other labels: Broadcast TV, Radio, Surveillance, Training Film; Digital Compression; Segmentation, Speech Recognition, Image Analysis, Natural Language Interpretation; Database; Cable, Satellite, Internet, PDA, Cell Phone.

  3. Informedia Overview
     Establishment of large video libraries as a network searchable information resource
     Mission: Enable Search and Discovery in the Video Medium
     REQUIREMENTS:
     • Automated process for information extraction from video
     • Full-content search and retrieval from all spoken language and visual documents
     APPROACH:
     • Integration of machine speech, image and natural language understanding for library creation and exploration

  4. Sample Corpora
     CNN News Broadcasts 1997-2002 (2,050 hours)
     • 68,000 segments/stories
     • 1.7 million "shots"
     China Historical and Cultural Documentaries (100 hours)
     • English language
     • Western perspective
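The CNN corpus figures on this slide imply some rough averages. The sketch below only divides the slide's own numbers; the variable names are illustrative and the averages are derived here, not stated in the presentation.

```python
# Figures taken from the slide (CNN News Broadcasts 1997-2002).
HOURS = 2050
SEGMENTS = 68_000
SHOTS = 1_700_000

total_seconds = HOURS * 3600

# Derived averages (not stated on the slide).
avg_segment_seconds = total_seconds / SEGMENTS   # roughly 108.5 s per story
avg_shot_seconds = total_seconds / SHOTS         # roughly 4.3 s per shot
shots_per_segment = SHOTS / SEGMENTS             # 25 shots per story

print(f"average story length: {avg_segment_seconds:.1f} s")
print(f"average shot length:  {avg_shot_seconds:.2f} s")
print(f"shots per story:      {shots_per_segment:.0f}")
```

On these numbers a typical story runs a little under two minutes and is made up of about 25 shots of a few seconds each.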

  5. Some Examples

  6. Why is Multimedia Difficult?

  7. Challenges of Data Extraction

  8. Recognizing Scene Text and Faces Scene Text Detection

  9. Interpreting Images Containing Similar Content

  10. Understanding Speech in Natural Settings
     • Style Variations: careful, clear, articulated, formal, casual, spontaneous, normal, read, dictated, intimate
     • Voice Quality: breathy, creaky, whispery, tense, lax, modal
     • Speaking Rate: normal, slow, fast, very fast
     • Context: sport, professional, interview, free conversation, man-machine dialogue
     • Stress: in noise, with increased vocal effort (Lombard reflex), emotional factors (e.g. angry), under cognitive load

  11. Gathering Information with Faulty Technology
     • Retrieval performance in the presence of inaccuracy and ambiguity in the underlying cognitive processing
     • Approximate match in meaning and visualization
     • Presentation and reuse of library content
     • New data type with space and time dimensions
     • Restricted use intellectual property
     • Interoperability in the absence of standards
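One way to picture retrieval over inaccurate recognizer output is approximate string matching against errorful transcripts. The sketch below is an illustration only, not the Informedia method: the function name `approximate_hits`, the sample transcripts, and the 0.8 cutoff are all assumptions, and the matching relies on Python's standard `difflib` module.

```python
from difflib import get_close_matches

def approximate_hits(query, transcripts, cutoff=0.8):
    """Return ids of segments whose transcript has a word close to the query.

    `transcripts` maps a segment id to (possibly misrecognized) text.
    `cutoff` is the minimum difflib similarity ratio counted as a match.
    """
    q = query.lower()
    hits = []
    for seg_id, text in transcripts.items():
        words = text.lower().split()
        if get_close_matches(q, words, n=1, cutoff=cutoff):
            hits.append(seg_id)
    return hits

# Hypothetical transcripts; "draught" stands in for a recognition error.
sample = {
    "seg1": "severe draught hits farmers",
    "seg2": "floods close the highway",
    "seg3": "drought relief announced",
}

print(approximate_hits("drought", sample))
```

An exact keyword search would miss `seg1`; the approximate match retrieves it at the cost of possible false positives, which is the trade-off the slide points to.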

  12. Challenge of Continuous Production

  13. Annual Video and Audio Production
     Commercial
     • 4,500 motion pictures -> 9,000 hours/year (4.5 TB)
     • 33,000 TV stations x 4 hrs/day -> 48,000,000 hrs/yr (24,000 TB)
     • 44,000 radio stations x 4 hrs/day -> 65,500,000 hrs/yr (3,275 TB)
     Personal
     • Photographs: 80 billion images -> 410,000 TB/yr
     • Home videos: 1.4 billion tapes -> 300,000 TB/yr
     • X-rays: 2 billion -> 17,000 TB/yr
     Surveillance
     • Airports: 14,000 terminals x 140 cameras x 24 hrs/day -> 48 M hrs/day
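The hour totals on this slide can be approximately reproduced from the station counts. The sketch below is a back-of-the-envelope check, assuming 365 production days per year and decimal units (1 TB = 1,000 GB); the helper names are illustrative, and the per-hour storage rates are derived here rather than stated on the slide.

```python
DAYS_PER_YEAR = 365  # assumption: production every day of the year

def hours_per_year(stations, hours_per_day):
    """Total hours produced per year across all stations."""
    return stations * hours_per_day * DAYS_PER_YEAR

def implied_gb_per_hour(terabytes, hours):
    """Storage rate implied by a (TB, hours) pair, using 1 TB = 1000 GB."""
    return terabytes * 1000 / hours

tv_hours = hours_per_year(33_000, 4)        # slide rounds this to 48,000,000
radio_hours = hours_per_year(44_000, 4)     # slide states 65,500,000
airport_hours_per_day = 14_000 * 140 * 24   # slide rounds this to 48 M

print(tv_hours, radio_hours, airport_hours_per_day)

# Implied storage rates from the slide's own (TB, hours) figures.
print(implied_gb_per_hour(4.5, 9_000))          # motion pictures
print(implied_gb_per_hour(24_000, 48_000_000))  # television
print(implied_gb_per_hour(3_275, 65_500_000))   # radio
```

The TV and airport products land close to the slide's rounded figures. The radio product comes out slightly below the slide's 65,500,000 hours; the slide does not say how that figure was derived. The storage figures imply about 0.5 GB per hour for film and television and about 0.05 GB per hour for radio.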

  14. Annual Print Production
     Commercial
     • 22,600 newspapers x 30 pgs/day -> 124 TB/year
     • 80,000 periodicals x 5,000 pgs/yr -> 52 TB/yr
     • 40,000 scholarly journals x 1,700 pgs/yr -> 9 TB/yr
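For comparison with the video figures, the print totals can be turned into page counts and an implied size per page. This is a rough sketch assuming 365 newspaper issues per year and 1 TB = 1,000,000 MB; the names and the per-page sizes are derived here, not taken from the slide.

```python
MB_PER_TB = 1_000_000  # decimal units, an assumption

def mb_per_page(terabytes, pages):
    """Average storage per page implied by a yearly (TB, pages) pair."""
    return terabytes * MB_PER_TB / pages

newspaper_pages = 22_600 * 30 * 365   # assumes daily publication all year
periodical_pages = 80_000 * 5_000
journal_pages = 40_000 * 1_700

for label, tb, pages in [
    ("newspapers", 124, newspaper_pages),
    ("periodicals", 52, periodical_pages),
    ("scholarly journals", 9, journal_pages),
]:
    print(f"{label}: {pages:,} pages/yr, {mb_per_page(tb, pages):.2f} MB/page")
```

Even the largest print category (124 TB/year) is small next to the 24,000 TB/year the previous slide gives for television alone.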

  15. Video Visualization: Summarizing and Visualizing the Result Set

  16. Summarizing Thousands of Videos. Example: Map Collage. [Figure: map collage summarizing "El Niño effects", showing distribution by nation with overlaid thumbnails. Map labels: Drought, Fire, Floods, North Pacific Ocean, South Pacific Ocean.]

  17. The Need for Visualization Strategies
     • As digital video assets grow, so do possible result sets
     • We transmit with limited bandwidths to limited screen "real estate"
     • As automated processing improves, more metadata enables more dimensions and interfaces into the video content
     • Users want to apply multiple perspectives interchangeably
     • Direct manipulation interfaces are required to place the user in control

  18. Some Examples

  19. Video Digests
     Overview first, zoom and filter, then details-on-demand
     • Concatenate scene elements into a single panoramic view
     • Visualize word-based relationships
     • Establish timelines showing trends against time
     • Present maps (or diagrams) showing geographic (or spatial) correlations
     • Combine digests into a single view, or animate them into a temporal presentation (the auto-documentary)

  20. Content-based Metadata Extraction Enables Video Visualization and Summarization [diagram]. Components: Metadata Extractor, Summarizer. Metadata types: People, Event, Affiliation, Location, Topics, Time. Other labels: Personalized Presentation, User Perspective Templates.

  21. Information Goals
     • Generate information perspectives on-demand, e.g., by time, location, personalities, events
     • Eliminate redundancy
     • Link all the way back to source content to interactively and dynamically provide any level of detail and summarization
     • Communicate results
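Generating a perspective on demand can be pictured as regrouping the same segment metadata by a different facet. The sketch below is a minimal illustration, not the Informedia implementation: the `SegmentMetadata` record, the `perspective` function, and the sample segments are hypothetical, with field names loosely following the metadata types listed on slide 20. Keeping the segment id in every group is one simple way to link back to source content.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class SegmentMetadata:
    """Hypothetical per-segment record of extracted metadata."""
    segment_id: str          # link back to the source video segment
    time: str
    location: str
    event: str
    people: list = field(default_factory=list)
    topics: list = field(default_factory=list)

def perspective(segments, facet):
    """Group segment ids by one facet (e.g. 'time', 'location', 'topics')."""
    groups = defaultdict(list)
    for seg in segments:
        value = getattr(seg, facet)
        # List-valued facets (people, topics) place a segment in several groups.
        values = value if isinstance(value, list) else [value]
        for v in values:
            groups[v].append(seg.segment_id)
    return dict(groups)

# Hypothetical sample data for illustration only.
segments = [
    SegmentMetadata("seg-001", "1997", "Indonesia", "drought", topics=["El Niño"]),
    SegmentMetadata("seg-002", "1998", "Peru", "floods", topics=["El Niño", "relief"]),
    SegmentMetadata("seg-003", "1998", "Indonesia", "fire", topics=["El Niño"]),
]

print(perspective(segments, "location"))
print(perspective(segments, "time"))
```

The same result set yields a map-style grouping when the facet is `location` and a timeline-style grouping when it is `time`, without touching the underlying records.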

  22. Knowledge Goals
     • Detect trends
     • Reveal relationships
     • Infer causality
     • Discover anomalies
     • ...

  23. Video Life Cycle [diagram, repeated from slide 2 with "$$" markers added].

  24. Application Space
     Consumer and Business
     • Evolving and archived news and information
     • Education and training
     • Sports and entertainment
     • Interactive television
     • Personal memory aids
     Professional and Enterprise
     • Conventions and tradeshows
     • Meetings/corporate memory

  25. Digital Video Library. Thank you.