
Worcester Polytechnic Institute

Worcester Polytechnic Institute. XmdvTool: Interactive Visual Data Exploration System for High-dimensional Data Sets. http://davis.wpi.edu/~xmdv. Matthew O. Ward, Elke A. Rundensteiner, Jing Yang, Punit Doshi, Geraldine Rosario, Allen R. Martin, Ying-Huey Fua, Daniel Stroe.


Presentation Transcript


  1. Worcester Polytechnic Institute
     XmdvTool: Interactive Visual Data Exploration System for High-dimensional Data Sets
     http://davis.wpi.edu/~xmdv
     Matthew O. Ward, Elke A. Rundensteiner, Jing Yang, Punit Doshi, Geraldine Rosario, Allen R. Martin, Ying-Huey Fua, Daniel Stroe
     This work was partially funded by NSF Grants IIS-9732897, IRIS-9729878 and IIS-0119276.

  2. XmdvTool Features
     • Hierarchical visualization and interaction tools for exploring very large high-dimensional data sets to discover patterns, trends and outliers
     • Applications:
       • Bioterrorism Detection
       • Bioinformatics and Drug Discovery
       • Space Science
       • Geology and Geochemistry
       • Systems Monitoring and Performance Evaluation
       • Economics and Business
       • Simulation Design and Analysis
     • Multi-platform support (Unix, Linux, Windows)
     • Public domain software: http://davis.wpi.edu/~xmdv

  3. Xmdv: Main Features
     • Scale-up to High Dimensions: Visual Hierarchical Dimension Reduction
     • Scale-up to Large Data Sets: Interactive Hierarchical Displays, Database Backend with MinMax Encoding, Semantic Caching and Adaptive Prefetching
     • Interlinked Multi-Displays: Parallel Coordinates, Glyphs, Scatterplot Matrices, Dimensional Stacking
     • Visual Interaction Tools: N-Dimensional Brushes, Structure-Based Brushing, InterRing

  4. Scale-Up for Large Number of Dimensions
     Solution to High Dimensional Datasets:
     • Group Similar Dimensions into Dimension Hierarchy
     • Navigate Dimension Hierarchy by InterRing
     • Form Lower Dimensional Spaces by Dimension Clusters
     • Convey Dimension Cluster Information by Dissimilarity Display
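The first step on the slide above, grouping similar dimensions into a hierarchy, can be illustrated with a small agglomerative clustering sketch. This is not XmdvTool's actual algorithm: the dissimilarity measure (1 minus the absolute Pearson correlation), the single-linkage merge rule, and all function names are assumptions chosen only to make the idea concrete.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant value lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def dissimilarity(a, b):
    """Assumed measure: strongly correlated dimensions count as similar."""
    return 1.0 - abs(pearson(a, b))


def build_dimension_hierarchy(columns):
    """columns maps a dimension name to its values.

    Returns the hierarchy as nested 2-tuples of dimension names.
    """
    # Each cluster is (subtree, list of member dimension names).
    clusters = [(name, [name]) for name in columns]

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(dissimilarity(columns[a], columns[b])
                   for a in c1[1] for b in c2[1])

    while len(clusters) > 1:
        # Merge the two clusters that are currently most similar.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


columns = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8.5],   # nearly proportional to "a"
    "c": [4, 1, 3, 2],     # weakly related to both
}
hierarchy = build_dimension_hierarchy(columns)
print(hierarchy)
```

In this toy data set, "a" and "b" merge first because they are almost perfectly correlated, and "c" joins at the root. Each internal node of such a tree is a dimension cluster that could stand in for its members in a lower-dimensional display.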

  5. Visual Hierarchical Dimension Reduction Process

  6. Visual Hierarchical Dimension Reduction Process
     [Figure panels: A 42-dimensional Data Set; A 4-Dimensional Subspace; Dimension Hierarchy; Interaction Tool: InterRing]

  7. InterRing - Dimension Hierarchy Navigation and Manipulation
     • Roll-up/Drill-down
     • Rotate
     • Zoom in/out
     • Modify
     • Distort

  8. Dissimilarity Display
     • Three Axes Method
     • Diagonal Plot Method
     • Axis Width Method
     • Mean-Band Method

  9. Scale-up for Large Number of Records
     Solution to Large Scale Datasets:
     • Group Similar Records into Data Hierarchy
     • Navigate Data Hierarchy by Structure-Based Brushing
     • Represent Data Clusters by Mean-Band Method
     • Provide Database Backend Support using MinMax Tree, Caching, Prefetching

  10. Interactive Hierarchical Display
      [Figure panels: 2D example; Hierarchical Clustering; Structure-Based Brushing]

  11. Interactive Hierarchical Display
      [Figure panels: Flat Display; Hierarchical Display; Mean-Band Method in Parallel Coordinates]

  12. Interactive Hierarchical Display
      [Figure panels: Flat Display; Hierarchical Display; Mean-Band Method in Parallel Coordinates]
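The mean-band idea named on these slides, drawing one summary per cluster instead of one polyline per record, can be sketched as a per-dimension summary. The sketch assumes that the band spans the cluster's minimum and maximum on each axis and that the center line is the mean; the function name and the tuple layout are hypothetical.

```python
def mean_band(records):
    """Summarize a cluster of records for a parallel-coordinates display.

    records is a non-empty list of equal-length numeric tuples.
    Returns one (low, mean, high) triple per dimension: the mean gives
    the center line, low and high give the edges of the band.
    """
    dimensions = list(zip(*records))  # transpose rows into columns
    return [(min(d), sum(d) / len(d), max(d)) for d in dimensions]


cluster = [(1.0, 10.0), (3.0, 20.0), (2.0, 30.0)]
band = mean_band(cluster)
print(band)
```

A hierarchical display would compute one such summary per cluster at the chosen level of detail, so the number of drawn primitives depends on the number of clusters rather than on the number of records.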

  13. Scalability of Data Access
      • Approach
        • Attach database system to visualization front-end
        • MinMax hierarchy encoding
          • Key idea: avoid recursive processing
          • Pre-computed
        • Caching
          • Key idea: reduce response time and network traffic
        • Prefetching
          • Key idea: use application hints and predict user patterns
          • Performed during idle time

  14. Scalability of Data Access: MinMax Hierarchy Encoding
      • Pre-compute object positions
        • level-of-detail (L)
        • extent values (x, y)
        • preserve tree structure
      • New query semantics
        • objects are now rectangles
        • select objects that touch L
        • select objects that touch (x, y)
        • structure-based brush = intersection of two selections
      • query = (x, y, L)
      [Figure labels: level of detail L; extent values x, y]
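A minimal sketch of the encoding described on this slide follows. It assumes that leaves are numbered left to right, that a node's extent is the range of leaf positions below it, and that a node "touches" level L when L falls between its own depth and the depth of its children. These labeling rules and every name in the code are illustrative assumptions, not the scheme XmdvTool actually stores.

```python
from dataclasses import dataclass, field
from math import inf


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)
    lo: float = 0.0       # left extent value
    hi: float = 0.0       # right extent value
    lod_lo: float = 0.0   # first level of detail at which the node shows
    lod_hi: float = 0.0   # level at which its children replace it


def label(node, depth=0, next_pos=0):
    """Off-line pass: assign extents and level ranges in one traversal.

    Returns the next free leaf position.
    """
    node.lod_lo = depth
    node.lo = next_pos
    if not node.children:
        node.lod_hi = inf            # a leaf stays visible at deeper levels
        next_pos += 1
    else:
        node.lod_hi = depth + 1
        for child in node.children:
            next_pos = label(child, depth + 1, next_pos)
    node.hi = next_pos
    return next_pos


def flatten(node):
    """Yield every node; stands in for loading the labels into a table."""
    yield node
    for child in node.children:
        yield from flatten(child)


def brush(table, x, y, level):
    """Structure-based brush: one flat scan, no recursion at query time."""
    return [n.name for n in table
            if n.lod_lo <= level < n.lod_hi   # rectangle touches L
            and n.lo < y and n.hi > x]        # rectangle touches (x, y)


root = Node("root", [
    Node("A", [Node("a1"), Node("a2")]),
    Node("B", [Node("b1")]),
    Node("c"),
])
label(root)
table = list(flatten(root))
print(brush(table, 0, 4, 1))
print(brush(table, 1.5, 2.5, 2))
```

The recursion happens once, during labeling. After that, each brush is a pair of range predicates over flat rows, which is the kind of condition a relational backend can evaluate without walking the tree.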

  15. Scalability of Data Access: Caching
      • Purpose
        • reduce response time and network traffic
      • Issues
        • visual query cannot directly translate into object IDs
        • high-level cache specification to avoid complete scans
      • Semantic caching
        • queries are cached rather than objects
        • minimize cost of cache lookup
        • dynamically adapt cached queries to patterns of queries
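The core of semantic caching, keeping query regions rather than individual objects, can be sketched with one-dimensional range queries. The class below is a simplified illustration: the `fetch` callback stands in for a hypothetical database call, containment is the only reuse rule, eviction is plain least-recently-used, and the adaptive behavior mentioned on the slide is not modeled.

```python
class SemanticCache:
    """Caches range queries (lo, hi) together with their result rows."""

    def __init__(self, fetch, capacity=4):
        self.fetch = fetch        # hypothetical backend: (lo, hi) -> rows
        self.capacity = capacity
        self.entries = []         # [((lo, hi), rows)], most recent last
        self.backend_calls = 0

    def query(self, lo, hi):
        for i, ((c_lo, c_hi), rows) in enumerate(self.entries):
            if c_lo <= lo and hi <= c_hi:
                # The new query lies inside a cached one: answer locally.
                self.entries.append(self.entries.pop(i))  # mark as recent
                return [r for r in rows if lo <= r[0] <= hi]
        # Cache miss: ask the backend and remember the query region.
        rows = self.fetch(lo, hi)
        self.backend_calls += 1
        self.entries.append(((lo, hi), rows))
        if len(self.entries) > self.capacity:
            self.entries.pop(0)   # evict the least recently used query
        return rows


# Toy backend: rows are (position, value) pairs.
data = [(i, i * i) for i in range(10)]


def fetch(lo, hi):
    return [row for row in data if lo <= row[0] <= hi]


cache = SemanticCache(fetch)
first = cache.query(2, 8)    # goes to the backend
second = cache.query(3, 5)   # contained in (2, 8), served from the cache
third = cache.query(0, 3)    # not contained, goes to the backend
print(second, cache.backend_calls)
```

Lookup cost depends on the number of cached queries, not on the number of cached objects, which is one way to read the slide's "minimize cost of cache lookup".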

  16. Scalability of Data Access: Prefetching
      • Strategy
        • Speculative (no specific hints)
          • navigation remains local
          • both user and data set influence exploration
        • Adaptive (strategy changes over time)
          • Evolves as more knowledge becomes available
        • Non-pure (interruptible prefetching)
          • leave buffer in consistent state
      • Requirements
        • non-pure prefetching + large transactions & small object size + semantic caching → small granularity (object level)
        • speculative, non-pure prefetcher → cache replacement policy + guessing method

  17. Scalability of Data Access: Experimental Evaluation
      • Conclusions:
        • Caching reduces response time by 80%
        • Prefetching further reduces response time by 30%
        • Designing better prefetching strategies might help further reduce response time

  18. Scalability of Data Access: Prefetching
      [Figure labels: Current Navigation Window; Hot Regions; m(n-2), m(n-1), m(n), m(n+1); (m-1), m, (m+1)]
      Strategy labels shown on the slide: Localized Speculative Strategies, Random Strategy, Direction Strategy, Focus Strategy, Data Set Driven Strategy, Vector Strategies, Mean Strategy, Exponential Weight Average Strategy
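The slide gives only the strategy names, so the following sketch is one plausible reading of three of them, not the definitions used in XmdvTool. It assumes each past move m(n) is a signed one-dimensional displacement of the navigation window and that a strategy guesses the next displacement from that history; the smoothing factor `alpha` and all function names are hypothetical.

```python
def direction_strategy(moves):
    """Guess that the user repeats the most recent move."""
    return moves[-1]


def mean_strategy(moves):
    """Guess the plain average of all past moves."""
    return sum(moves) / len(moves)


def ewa_strategy(moves, alpha=0.5):
    """Exponentially weighted average: recent moves count more."""
    estimate = moves[0]
    for move in moves[1:]:
        estimate = alpha * move + (1 - alpha) * estimate
    return estimate


def predict_next(position, moves, strategy):
    """Predicted window position, i.e. the region worth prefetching."""
    return position + strategy(moves)


history = [1.0, 1.0, 3.0]   # displacements m(n-2), m(n-1), m(n)
for strategy in (direction_strategy, mean_strategy, ewa_strategy):
    print(strategy.__name__, predict_next(10.0, history, strategy))
```

A prefetcher built on such guesses would fetch the predicted region during idle time and would need to be interruptible, as the requirements on slide 16 state.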

  19. Xmdv System Implementation
      • Tools
        • C/C++
        • TCL/TK
        • OpenGL
        • Oracle 8i
        • Pro*C
      [Architecture diagram labels: OFF-LINE PROCESS; MinMax Labeling; Hierarchical Data; DB; Flat Data; Loader; Schema Info; User; Translator; GUI; Rewriter; MEMORY; Exploration Variables; Buffer; Queries; Prefetcher; Library; Estimator; ON-LINE PROCESS; Random; Direction; Focus; Mean; EWA]

  20. Publications (available at http://davis.wpi.edu/~xmdv)
      • Jing Yang, Matthew O. Ward and Elke A. Rundensteiner, "InterRing: An Interactive Tool for Visually Navigating and Manipulating Hierarchical Structures", InfoVis 2002, to appear
      • Punit R. Doshi, Elke A. Rundensteiner, Matthew O. Ward and Daniel Stroe, "Prefetching For Visual Data Exploration", Technical Report WPI-CS-TR-02-07, 2002
      • Jing Yang, Matthew O. Ward and Elke A. Rundensteiner, "Interactive Hierarchical Displays: A General Framework for Visualization and Exploration of Large Multivariate Data Sets", Computers and Graphics Journal, 2002, to appear
      • Daniel Stroe, Elke A. Rundensteiner and Matthew O. Ward, "Scalable Visual Hierarchy Exploration", Database and Expert Systems Applications, pages 784-793, Sept. 2000
      • Ying-Huey Fua, Matthew O. Ward and Elke A. Rundensteiner, "Hierarchical Parallel Coordinates for Exploration of Large Datasets", IEEE Proc. of Visualization, pages 43-50, Oct. 1999
      • Ying-Huey Fua, Matthew O. Ward and Elke A. Rundensteiner, "Navigating Hierarchies with Structure-Based Brushes", IEEE Proceedings of Visualization, pages 43-50, Oct. 1999
