
Hierarchical Segmentation: Finding Changes in a Text Signal

Malcolm Slaney and Dulce Ponceleon, IBM Almaden Research Center





Presentation Transcript


  1. Hierarchical Segmentation: Finding Changes in a Text Signal. Malcolm Slaney and Dulce Ponceleon, IBM Almaden Research Center

  2. Problem Statement • Problem: How do we browse video? • Goal: Create a table of contents • Solution: Look for topic changes in text

  3. TOC Example (figure labels: Chapter 1, Chapter 2)

  4. Overview of This Talk • Goal and approach • Latent semantic indexing (LSI) • Scale space • Combination • Results (diagram labels: LSI, Scale Space Filter, Segment)

  5. Approach • Sentences -> Semantic Space • Filter at multiple scales • Look for large jumps • Three subjects (loops) shown • Loop 1: Polychromaticity Artifacts • Loop 2: Emission Tomography • Loop 3: Ultrasound Tomography

  6. Building on Previous Work • LSI and clustering • Text tiling • Change point analysis • Segmentation • Scale space (image courtesy of Jianbo Shi, CMU)

  7. Latent Semantic Indexing • Collect histogram of word frequencies • Use SVD to capture frequent combinations • Orthogonal decomposition • Represent in low-dimensional space (figure labels: Docs, Words, 10D)

  8. LSI Within a Document • Split into chunks (fixed size or sentences) • Compute histograms • Perform SVD • Look at results • Sources: “Principles of Computerized Tomographic Imaging” and PBS News Hour
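The first two steps on slide 8 (split into chunks, compute histograms) can be sketched in a few lines. This is a minimal illustration and not the authors' code: the naive sentence splitter, the tokenizer regex, the function names, and the toy text are all assumptions made for the example.

```python
import re
from collections import Counter

def split_sentences(text):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def word_histograms(chunks):
    # One Counter of lowercase word counts per chunk.
    return [Counter(re.findall(r"[a-z']+", c.lower())) for c in chunks]

def term_chunk_matrix(histograms):
    # Rows are words (sorted vocabulary), columns are chunks.
    vocab = sorted(set().union(*histograms))
    matrix = [[h[w] for h in histograms] for w in vocab]
    return vocab, matrix

# Toy text, invented for illustration only.
text = "X-rays pass through tissue. Tissue absorbs x-rays! Ultrasound uses sound waves."
chunks = split_sentences(text)
vocab, matrix = term_chunk_matrix(word_histograms(chunks))
```

The resulting word-by-chunk count matrix is the kind of input an SVD step would then decompose; the SVD itself is left out here.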

  9. LSI – 2D Projection (figure: Chapter 4 of “Principles of Computerized Tomographic Imaging”)

  10. LSI – Self-similarity • Measure similarity • Cosine of angle between “documents” • Plot all pairs of chunks/sentences • Look for block diagonal (figure: Chapter 4 of “Principles of Computerized Tomographic Imaging”)
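The self-similarity measure on slide 10 is the cosine of the angle between every pair of chunk vectors. A minimal sketch follows; the four 2-D vectors are made-up stand-ins for reduced LSI coordinates, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    # Cosine of the angle between two vectors; 0.0 if either is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def self_similarity(vectors):
    # sim[i][j] is the cosine similarity between chunk i and chunk j.
    return [[cosine(u, v) for v in vectors] for u in vectors]

# Invented data: two chunks on one topic, then two chunks on another.
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
sim = self_similarity(vectors)
```

With data like this, entries within each pair are close to 1 and entries across pairs are small, which is the block-diagonal pattern the slide says to look for when the matrix is plotted.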

  11. Scale-space Filtering • What size are the features? • Look at different scales! • Continuous scale • Used for object recognition and feature detection

  12. Scale-space Movie • 10D semantic space • Scale varies from 1 to 400 sentences • Green line marks best high-level segmentation

  13. Scale-space Segmentation • Low pass filter signal • Form image of scale vs. time • Look for changes • Track peaks of vector derivative across scale
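The steps on slide 13 can be illustrated on a toy signal. The sketch below assumes a Gaussian low-pass filter, clamped edges, and a first difference as the derivative; none of these choices is stated in the slides. It also only locates peaks at each scale independently, so the tracking of peaks across scale is not implemented.

```python
import math

def gaussian_kernel(sigma):
    # Normalized Gaussian weights, truncated at about 3 sigma (an assumption).
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    # Low-pass filter a list of vectors; edges are clamped by repeating end samples.
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n, dim = len(signal), len(signal[0])
    out = []
    for t in range(n):
        acc = [0.0] * dim
        for k, w in enumerate(kernel):
            src = min(max(t + k - radius, 0), n - 1)
            for d in range(dim):
                acc[d] += w * signal[src][d]
        out.append(acc)
    return out

def derivative_magnitude(signal):
    # Euclidean length of the first difference between consecutive samples.
    return [math.dist(a, b) for a, b in zip(signal, signal[1:])]

def scale_space_image(signal, sigmas):
    # One row per scale: derivative magnitude as a function of position.
    return [derivative_magnitude(smooth(signal, s)) for s in sigmas]

def local_peaks(row):
    # Interior indices where the row rises and then does not rise further.
    return [i for i in range(1, len(row) - 1) if row[i - 1] < row[i] >= row[i + 1]]

# Toy signal: 20 "sentences" on one topic followed by 20 on another.
signal = [[1.0, 0.0]] * 20 + [[0.0, 1.0]] * 20
image = scale_space_image(signal, sigmas=[1.0, 2.0, 4.0])
peaks_per_scale = [local_peaks(row) for row in image]
```

In this toy case each scale shows a single peak between samples 19 and 20, where the topic changes, and the peak gets lower and wider as the scale grows.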

  14. Scale-space Example • Derivative as function of scale and sentence

  15. LSI and Scale Space • Putting it all together • Split document/transcript • Perform LSI analysis • Look at change in angle • Perform scale-space segmentation • Show tree

  16. Scale-Space Image • Peaks in scale-space derivative • Peaks traced to their origin

  17. Results – CT • Comparison: scale-space vs. book headings

  18. Results – News • Comparison: scale-space vs. ground truth

  19. Results – Autocorrelation • Block sentences • Measure correlation • Positive Peak • Anti-correlation

  20. Discussion Issues • Evaluation (and ground truth) • Lafferty’s measure • Temporal properties • Histogram/SVD chunking size • Autocorrelation

  21. Computational Effort • Histogram: O(N) • SVD: O(N^3) • Scale space: O(N^2) • N < 1000 • Number of sentences in a video or document is not large

  22. LSI Document Lookup • Histogram documents • Entropy term weighting • Compute SVD • Use first 10-100 vectors to model space • Encode query as histogram • Look for documents in similar direction
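The weighting and lookup steps on slide 22 can be sketched as follows. The slide does not say which entropy weighting is meant; this example assumes the common log-entropy scheme (local weight log(1 + tf), global weight 1 + Σ p log p / log n). The SVD projection is deliberately skipped, so the lookup runs in the full weighted term space rather than a 10-100 dimensional one. The matrix, term labels, and function names are invented for illustration.

```python
import math

def entropy_weights(matrix):
    # matrix[i][j] = count of term i in document j.
    # Global weight: 1 + sum_j p_ij * log(p_ij) / log(n_docs), p_ij = tf_ij / gf_i.
    n_docs = len(matrix[0])
    weights = []
    for row in matrix:
        total = sum(row)
        entropy = 0.0
        for count in row:
            if count:
                p = count / total
                entropy += p * math.log(p)
        weights.append(1.0 + entropy / math.log(n_docs))
    return weights

def weight_vector(counts, weights):
    # Local weight log(1 + tf) scaled by the per-term global weight.
    return [g * math.log(1 + c) for g, c in zip(weights, counts)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank(query_counts, matrix, weights):
    # Encode the query as a weighted histogram and sort documents by cosine score.
    q = weight_vector(query_counts, weights)
    docs = [weight_vector(col, weights) for col in zip(*matrix)]
    scores = [cosine(q, d) for d in docs]
    return sorted(range(len(docs)), key=lambda j: -scores[j])

# Hypothetical counts; rows are the terms "equations", "algorithms", "the".
matrix = [[2, 0, 0],
          [0, 3, 1],
          [1, 1, 1]]
weights = entropy_weights(matrix)
order = rank([1, 0, 0], matrix, weights)  # query containing only "equations"
```

A term spread evenly over all documents (the third row) receives a weight near zero, while a term concentrated in one document keeps a weight of 1, so the query is matched mainly on the distinctive term.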

  23. LSI Example • Collection of book titles • Differential equations vs. algorithms and applications
