
Multimodal Alignment of Scholarly Documents and Their Presentations

Multimodal Alignment of Scholarly Documents and Their Presentations. Bamdad Bahrani. JCDL 2013 Submission. Feb 2013. Motivation. How many papers do you read every week? How many do you read deeply? How many do you just skim? Title, abstract and conclusion → Enough?





Presentation Transcript


  1. Multimodal Alignment of Scholarly Documents and Their Presentations • Bamdad Bahrani • JCDL 2013 Submission • Feb 2013

  2. Motivation • How many papers do you read every week? • How many do you read deeply? • How many do you just skim? • Title, abstract and conclusion → Enough? • A summary of the paper → Most important issues

  3. Motivation • Slide presentation as a summary • It includes important content from the paper • It is made by the same author • But • Not detailed enough • Misses some technical parts of the paper

  4. Introduction • The Paper • and its Slide Presentation • Alignment map

  5. Previous Works • Hayama et al., 2005 • Japanese technical papers and presentation sheets • Using HMM • Kan, 2007 • SlideSeer • Crawling of paper-presentation pairs, aligning them, and a GUI • Beamer and Girju, 2009 • Detailed analysis of different similarity measures • Only textual content

  6. Slide Analysis

  7. Error Analysis • Around 70% are showing “Evaluation and Result”

  8. Alignment Modalities • Text Similarity • Between each slide and each section • The core aligner unit • The baseline • A cosine similarity measure over TF-IDF vectors • Linear Ordering • The ordering between slides and sections is monotonic • Visual appearance of slides
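
The text-similarity baseline described on this slide can be illustrated with a minimal sketch. It assumes a very simple tokenizer and a smoothed IDF formula, neither of which is stated in the slides; the function names (`tfidf_vectors`, `similarity_matrix`, `best_sections`) are hypothetical and this is not the author's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, so weights stay positive.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(slides, sections):
    """matrix[i][j] is the similarity of slide i to section j."""
    vectors = tfidf_vectors(slides + sections)
    slide_vecs = vectors[:len(slides)]
    section_vecs = vectors[len(slides):]
    return [[cosine(s, p) for p in section_vecs] for s in slide_vecs]


def best_sections(slides, sections):
    """Align each slide to its highest-scoring section, ignoring order."""
    matrix = similarity_matrix(slides, sections)
    return [max(range(len(sections)), key=lambda j: row[j]) for row in matrix]
```

Each slide is scored against every section and aligned to the best-scoring one; ordering and visual cues are left out of this sketch on purpose.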

  9. Text Extraction Unit • Presentation: Slides → MS PowerPoint VB compiler → Slide Title text, Slide Body text, Slide Number • Paper: PDF → PDFx Parser (via Python) → XML → Section Title, Section Body

  10. Slide Image Classifier Unit • Pipeline: Slides → Take Snapshot → Image → Image Classifier • Classes: 1. Text • 2. Outline • 3. Drawing • 4. Results

  11. Image Class Instructions • 1. Text • Text similarity alignment weight → Increase 2/3 • 2. Outline • Text similarity alignment weight → Decrease 1/3 • Linear ordering alignment weight → Decrease 1/3 • 3. Drawing • Uniform probability for all weights • 4. Results • Exceptional rule: Align directly to “Experiment and Result” section
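
The per-class rules on this slide can be read as a small dispatch function. The sketch below rests on two assumptions that the slide does not settle: "increase 2/3" and "decrease 1/3" are treated as relative changes to the current weight, and "uniform" is treated as equal weights for the two signals. The function name `adjust_weights` and the class labels are hypothetical.

```python
from fractions import Fraction

# Section label used by the exceptional rule for "Results" slides.
RESULT_SECTION = "Experiment and Result"


def adjust_weights(image_class, text_weight, order_weight):
    """Return (text_weight, order_weight, forced_section) for one slide.

    Assumption: "increase 2/3" / "decrease 1/3" are relative changes,
    i.e. multiply by (1 + 2/3) or (1 - 1/3).
    """
    if image_class == "text":
        return text_weight * (1 + Fraction(2, 3)), order_weight, None
    if image_class == "outline":
        scale = 1 - Fraction(1, 3)
        return text_weight * scale, order_weight * scale, None
    if image_class == "drawing":
        # Assumption: "uniform" means neither signal is preferred.
        return Fraction(1, 2), Fraction(1, 2), None
    if image_class == "results":
        # Exceptional rule: bypass scoring and align directly.
        return text_weight, order_weight, RESULT_SECTION
    raise ValueError(f"unknown image class: {image_class!r}")
```

For example, `adjust_weights("results", Fraction(1, 2), Fraction(1, 2))` leaves both weights unchanged and returns the forced section label as the third element.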

  12. Image Classifier Experiment and Result • 750 manually annotated slides • Linear SVM • Feature extraction: Histogram of Oriented Gradients • Blurring filters • Normalization • 10-fold cross-validation

  13. Experiments • Experiment 1: • Baseline • Paragraph-to-slide alignment • Only textual data • Experiment 2: • Section-to-slide alignment • Only textual data

  14. Experiments • Experiment 3: • The effect of Linear Ordering alignment was added. • Textual data and ordering information • Experiment 4: • The effect of Image Classification was added. • Textual data, ordering information and visual content
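
One way to use the monotonic-ordering assumption from Experiment 3 is a dynamic program over the slide-by-section similarity matrix, so that section indices never decrease as the slides advance. The slides do not say how ordering was combined with text similarity, so the sketch below is an illustrative alternative, not the method used; `monotonic_alignment` is a hypothetical name.

```python
def monotonic_alignment(sim):
    """sim[i][j] is the score of slide i against section j.

    Returns one section index per slide, non-decreasing across slides,
    maximizing the total score.
    """
    n_slides = len(sim)
    if n_slides == 0:
        return []
    n_sections = len(sim[0])
    best = [list(sim[0])]           # best[i][j]: best total with slide i on section j
    back = [[0] * n_sections]       # back[i][j]: section chosen for slide i - 1
    for i in range(1, n_slides):
        row, ptr = [], []
        best_prev, best_prev_j = float("-inf"), 0
        for j in range(n_sections):
            # Running maximum over sections <= j keeps the path monotonic.
            if best[i - 1][j] > best_prev:
                best_prev, best_prev_j = best[i - 1][j], j
            row.append(sim[i][j] + best_prev)
            ptr.append(best_prev_j)
        best.append(row)
        back.append(ptr)
    # Backtrack from the best final section.
    j = max(range(n_sections), key=lambda c: best[-1][c])
    path = [j]
    for i in range(n_slides - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path
```

A per-slide argmax on the same matrix may jump backwards through the paper; the dynamic program gives up some local score to keep the alignment in reading order.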

  15. Results • [Chart comparing the Baseline, Section, Ordering and Image Class configurations; a “25%” label is shown]

  16. Conclusion • Many slides with images and drawings • Textual data is not enough • Taking advantage of graphical features of slides

  17. Future Tasks • Bigger dataset • More efficient text similarity measures • Differentiate between Title and Body text weights • Support more input file formats • A GUI to view aligned documents

  18. Thank you…!

  19. System Architecture • Inputs: Presentation, Document • Units: Text Extraction • Textual Similarity • Linear Ordering • Slide Image Classifier (1. Text, 2. Index, 3. Drawing (nil), 4. Results) • Multimodal Fusion • Output: Alignment
