
Visualization Taxonomies and Techniques Text: Documents and Collections

Visualization Taxonomies and Techniques. Text: Documents and Collections. University of Texas – Pan American, CSCI 6361, Spring 2014. Overview: visualizations for document sets, words and sentences, analysis metrics, concepts and themes; recall and sensemaking.


Presentation Transcript


  1. Visualization Taxonomies and Techniques. Text: Documents and Collections • University of Texas – Pan American, CSCI 6361, Spring 2014

  2. Text: Documents and Collections: Overview • Visualizations for • Document sets • Words & sentences • Analysis metrics • Concepts and themes • Recall, sensemaking • Gaining a better understanding of the facts at hand in order to take some next steps • (Better definitions in the VA lecture) • Information visualization can help make a large document collection understandable more quickly

  3. Overviews of Documents: Document Cards • Provide a quick browsing and overview UI, perhaps especially useful on small screens • The paper has useful detail about text processing • Document Cards • Strobelt et al., TVCG (InfoVis) ‘09 • Compact visual representation of a document • Show key terms and important images

  4. Document Cards • Video without sound • http://www.informatik.uni-konstanz.de/en/deussen/publications/#c87338

  5. Representation • Layout algorithm searches for empty-space rectangles in which to place terms and images
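
The empty-space search on this slide can be illustrated with a brute-force sketch. This is not the Document Cards layout algorithm itself (the paper has the details); it only shows the general idea of scanning for a free rectangle and claiming it. The occupancy grid and the names `find_free_rect` and `place` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Grid = List[List[bool]]  # True means the cell is already occupied


def find_free_rect(grid: Grid, w: int, h: int) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the first free w x h rectangle, scanning from the top-left."""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows - h + 1):
        for c in range(cols - w + 1):
            if all(not grid[r + dr][c + dc] for dr in range(h) for dc in range(w)):
                return (r, c)
    return None  # no empty rectangle of that size is left


def place(grid: Grid, w: int, h: int) -> Optional[Tuple[int, int]]:
    """Put a w x h item into the first free rectangle and mark its cells occupied."""
    spot = find_free_rect(grid, w, h)
    if spot is not None:
        r, c = spot
        for dr in range(h):
            for dc in range(w):
                grid[r + dr][c + dc] = True
    return spot


# Hypothetical card of 4 rows x 6 columns: one image, then two key terms.
card = [[False] * 6 for _ in range(4)]
image_pos = place(card, 3, 2)
term_a_pos = place(card, 3, 1)
term_b_pos = place(card, 4, 1)
```

Placing the largest items first, as here, tends to leave more usable space for the smaller terms.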

  6. Interaction • Hover over non-image space shows abstract in tooltip • Hover over image and see caption as tooltip • Click on page number to get full page • Click on image goes to page containing it • Clicking on a term highlights it in overview and all tooltips

  7. Interaction • Hover over non-image space shows abstract in tooltip • Hover over image and see caption as tooltip • Click on page number to get full page • Click on image goes to page containing it • Clicking on a term highlights it in overview and all tooltips

  8. Bohemian Bookshelf • Video • http://vimeo.com/39034060

  9. Bohemian Bookshelf • Serendipitous browsing • Video • Thudt et al., CHI ‘12

  10. Themail • Visualize one’s email history • With whom and when has a person corresponded • What words were used • Answer questions like: • What sorts of things do I (the owner of the archive) talk about with each of my email contacts? • How do my email conversations with one person differ from those with other people?

  11. Themail • Text analysis to seed visualization • Monthly & yearly words
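
One plausible way to pick "monthly words" is to rank each month's words by how much more frequent they are in that month than in the archive as a whole. The sketch below uses that ratio; it is an assumption for illustration and not necessarily the scoring Themail uses. The function name `monthly_words` and the sample messages are hypothetical.

```python
from collections import Counter


def monthly_words(messages, top_n=3):
    """messages: list of (month, text) pairs. Returns month -> most distinctive words."""
    per_month = {}
    overall = Counter()
    for month, text in messages:
        words = text.lower().split()
        per_month.setdefault(month, Counter()).update(words)
        overall.update(words)

    overall_total = sum(overall.values())
    result = {}
    for month, counts in per_month.items():
        month_total = sum(counts.values())
        # Relative frequency in the month divided by relative frequency overall.
        score = {
            w: (c / month_total) / (overall[w] / overall_total)
            for w, c in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(score, key=lambda w: (-score[w], w))
        result[month] = ranked[:top_n]
    return result


# Hypothetical archive with one correspondent.
sample = [
    ("2004-01", "thesis draft thesis"),
    ("2004-02", "wedding plans"),
    ("2004-02", "the wedding draft"),
]
words_by_month = monthly_words(sample)
```

The same function applied to (year, text) pairs would give yearly words.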

  12. Themail Query UI

  13. PaperLens Video, but no sound

  14. PaperLens • Focus on academic papers • Visualize doc metadata such as author, keywords, date, … • Multiple tightly-coupled views • Analytics questions • Effective in answering questions regarding: • Patterns such as frequency of authors and papers cited • Themes • Trends such as number of papers published in a topic area over time • Correlations between authors, topics and citations

  15. PaperLens • a) Popularity of topic • b) Selected authors • c) Author list • d) Degrees of separation of links • e) Paper list • f) Year-by-year top ten cited papers/authors (can be sorted by topic)

  16. NetLens • More Document Info • Highlight entities within documents • People, places, organizations • Document summaries • Document similarity and clustering • Document sentiment

  17. Jigsaw

  18. Jigsaw • Targeting sense-making scenarios • Visual analytics • Variety of visualizations ranging from word-specific, to entity connections, to document clusters • Primary focus is on entity-document and entity-entity connection • Search capability coupled with interactive exploration • Stasko, Görg, & Liu Information Visualization ‘08 • Will see video after Visual Analytics section
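
The entity-document and entity-entity connections can be sketched with two small indexes. This is a minimal illustration that assumes entities have already been extracted from each document; it is not Jigsaw's implementation, and the data and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: document id -> entities already identified in it.
docs = {
    "doc1": {"Alice", "Acme Corp", "Boston"},
    "doc2": {"Alice", "Bob"},
    "doc3": {"Bob", "Acme Corp", "Boston"},
}


def entity_to_docs(docs):
    """Entity-document connections: which documents mention each entity."""
    index = defaultdict(set)
    for doc_id, entities in docs.items():
        for entity in entities:
            index[entity].add(doc_id)
    return dict(index)


def entity_pairs(docs):
    """Entity-entity connections: how many documents mention both entities."""
    counts = defaultdict(int)
    for entities in docs.values():
        for a, b in combinations(sorted(entities), 2):
            counts[(a, b)] += 1
    return dict(counts)


mentions = entity_to_docs(docs)
links = entity_pairs(docs)
```

Here two entities count as connected when they appear in the same document, and the count gives a rough connection strength.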

  19. Jigsaw: Document View

  20. Jigsaw: List View • Entities listed by type

  21. Jigsaw: Document Cluster View • Entities listed by type

  22. Jigsaw: Document Grid View • Here showing sentiment analysis of docs

  23. Jigsaw Video

  24. Kohonen’s Feature Maps: Organizing Document Collections • AKA Self-Organizing Maps (SOMs) • Map complex, non-linear relationships between high-dimensional data items onto simple geometric relationships on a 2-D display • Uses neural network techniques • Lin, Visualization ‘92 • Differently colored areas correspond to different concepts in the collection • Size of an area corresponds to its relative importance in the set • Neighboring regions indicate commonalities in concepts • Dots in regions can represent documents
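
A minimal self-organizing map can be written in plain Python. The sketch below assumes each document is already a numeric vector (for example, normalized term weights in [0, 1]); the grid size, learning-rate schedule, neighborhood schedule, and function names are illustrative assumptions, not the settings used by Lin.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid node whose weight vector is closest to x (squared Euclidean distance)."""
    return min(
        weights,
        key=lambda n: sum((wi - xi) ** 2 for wi, xi in zip(weights[n], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
        for x in rng.sample(data, len(data)):    # visit documents in random order
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                influence = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    # Pull the node toward x, more strongly near the winner.
                    w[i] += lr * influence * (x[i] - w[i])
    return weights


# Hypothetical document vectors over three terms.
doc_vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
som = train_som(doc_vectors, rows=3, cols=3)
positions = [best_matching_unit(som, v) for v in doc_vectors]
```

Each document is then drawn as a dot at its best-matching node, so documents with similar vectors tend to land in the same or neighboring regions of the map.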

  25. Kohonen’s Feature Maps: Organizing Document Collections

  26. Kohonen’s Feature Maps: Organizing Document Collections

  27. Kohonen’s Feature Maps: Organizing Document Collections

  28. Work at PNNL

  29. Work at PNNL • Group has developed a number of visualization techniques for document collections • http://www.pnl.gov/infoviz • Galaxies • Themescapes • ThemeRiver • ...

  30. Galaxies • Presentation of documents in which similar ones cluster together

  31. Themescapes • Self-organizing maps didn’t reflect the density of regions all that well • Use a 3D representation, with height representing the density or number of documents in a region
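
The "height represents density" idea can be sketched by binning 2D document positions into a grid and using the count per cell as the terrain height. This is only an illustration under the assumption that documents already have (x, y) positions in a square of side `extent`; the function name `density_grid` is hypothetical, and a real landscape would normally smooth the counts.

```python
def density_grid(points, bins, extent=1.0):
    """Count documents per cell; points are (x, y) pairs with 0 <= x, y <= extent."""
    grid = [[0] * bins for _ in range(bins)]
    for x, y in points:
        # Clamp so that points exactly on the far edge fall into the last cell.
        col = min(int(x / extent * bins), bins - 1)
        row = min(int(y / extent * bins), bins - 1)
        grid[row][col] += 1
    return grid


# Hypothetical document positions: two near one corner, two near the opposite one.
doc_positions = [(0.1, 0.1), (0.15, 0.2), (0.9, 0.9), (1.0, 1.0)]
heights = density_grid(doc_positions, bins=2)
```

Cells with higher counts become peaks and empty cells stay flat.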

  32. WebTheme

  33. Spire Video

  34. Maps of Science • Maps of Science • http://scimaps.org • Visualize the relationships of areas of science, emerging research disciplines, the impact of particular researchers or institutions, etc. • Often use documents as the “input data”

  35. Maps of Science • Book and web site • http://scimaps.org

  36. Map of Science http://scimaps.org/maps/map/map_of_scientific_pa_55/

  37. Map of Science http://scimaps.org/maps/map/maps_of_science_fore_50/

  38. Map of Science • Science Related Wikipedia Activity • http://scimaps.org/maps/map/science_related_wiki_49/

  39. Temporal Issues • Semantic map gives no indication of the chronology of documents • Can we show themes and how they rise or fall over time? • ThemeRiver

  40. ThemeRiver • Time flows from left to right • Each band/current is a topic or theme • Width of a band is the “strength” of that topic in documents at that time
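
The band geometry can be sketched as a stack of topic strengths centered on a horizontal axis at each time step. This is a simplified illustration, not the ThemeRiver implementation: it computes only the band boundaries at each time step and omits the smooth interpolation between them. The function name `river_layout` and the sample data are assumptions.

```python
def river_layout(strengths):
    """strengths: topic -> list of values, one per time step.

    Returns topic -> list of (lower, upper) band boundaries per time step,
    with the whole stack centered on y = 0.
    """
    topics = list(strengths)
    steps = len(strengths[topics[0]])
    layout = {t: [] for t in topics}
    for i in range(steps):
        total = sum(strengths[t][i] for t in topics)
        lower = -total / 2.0  # start half the total width below the axis
        for t in topics:
            upper = lower + strengths[t][i]
            layout[t].append((lower, upper))
            lower = upper     # the next band sits directly on top
    return layout


# Hypothetical topic strengths at two time steps.
bands = river_layout({"war": [2, 4], "oil": [2, 0]})
```

A topic whose strength drops to zero gets a band of zero width, which is how a theme visually disappears from the river.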

  41. Topic Modeling • High-interest topic in text analysis and visualization • Latent Dirichlet Allocation • Unsupervised learning • Produces “topics” evident throughout the doc collection, each modeled by sets of words/terms • Describes how each document contributes to each topic
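
For reference, the standard generative model behind Latent Dirichlet Allocation can be written as follows. The notation is an assumption chosen for this note: K topics, D documents, N_d words in document d, and Dirichlet hyperparameters alpha and beta.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), & n &= 1,\dots,N_d \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}})
\end{align*}
```

Here each \phi_k is a topic (a distribution over words), each \theta_d gives the topic proportions of document d, z_{d,n} is the topic assigned to the n-th word of document d, and w_{d,n} is the observed word. Inference estimates \phi and \theta from the words alone.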

  42. TIARA • Keeps basic ThemeRiver metaphor • Liu et al CIKM ‘09, KDD ‘10, VAST ‘10 • Embed word clouds into bands to tell more about what is in each • Magnifier lens for getting more details • Uses Latent Dirichlet Allocation to do text analysis and summarization

  43. TIARA

  44. TIARA Features • Lens shows email senders & receivers • Documents containing “cotable”

  45. TextFlow • Showing how topics merge and split • Cui et al TVCG (InfoVis) ‘11

  46. ParallelTopics • Dou et al., VAST ‘11

  47. End

  48. Sources • Stasko, J. (2013). CS 7450 - Information Visualization at Georgia Tech • Hearst, M. (2009) Search User Interfaces http://www.searchuserinterfaces.com/ • Ch. 10: Information Visualization for Search Interfaces: 10.3, 10.4, 10.9, 10.10; http://searchuserinterfaces.com/book/sui_ch10_visualization.html • Ch. 11: Information Visualization for Text Analysis; http://searchuserinterfaces.com/book/sui_ch11_text_analysis_visualization.html • F. Viegas, M. Wattenberg, "Tag Clouds and the Case for Vernacular Visualization", interactions, Vol. 15, No. 4, Jul-Aug 2008, pp. 49-52.

  49. Video Resources • TileBars (file) 5+ min • “Document Cards” video (vimeo), no sound • Parallel Tag Clouds http://www.youtube.com/watch?v=rL3Ga6xBgLw (long) • FeatureLens http://www.cs.umd.edu/hcil/textvis/featurelens/ (long, find spots) • FacetAtlas http://vimeo.com/76069684 (17 minutes) • http://in-spire.pnnl.gov/ (3:50) • Galaxy, etc. • Jigsaw (file): VISUAL ANALYTICS EXAMPLE • PaperLens (file) • FeatureLens (file) • Mani-Wordle, Koh et al., TVCG (InfoVis) ‘10 (video)
