
Mining Complex Evolutionary Phenomena

Mining Complex Evolutionary Phenomena. D. Thompson, B. Gatlin Center for Computational Systems Mississippi State University. M. Jiang, M. Coatney, S. Mehta, S. Parthasarthy, R. Machiraju Computer and Information Science The Ohio State University. T-S. Choy, S. Barr, J. Wilkins



Presentation Transcript

  1. Mining Complex Evolutionary Phenomena D. Thompson, B. Gatlin Center for Computational Systems, Mississippi State University M. Jiang, M. Coatney, S. Mehta, S. Parthasarthy, R. Machiraju Computer and Information Science The Ohio State University T-S. Choy, S. Barr, J. Wilkins Department of Physics Ohio State University

  2. Complex Evolutionary Data

  3. Insights Into Evolutions • Study evolution through simulations • Model them using continuum models • Obtain discrete models and solve • Generate data • However, …

  4. Data Horror Stories … • 4.5 million points • 1500 time steps with full volume output every 4 time steps (375 solutions) • 750 MB per solution • 281.25 GB of data • O(10^8) grid points • Generates >10 terabytes per day (every day) • Writes to disk once every 1000 time steps (99.9% discarded) • Final database ~1 terabyte • All analysis is done after the final database is obtained …
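The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 GB = 1000 MB), which is what the quoted 281.25 GB total implies; the variable names are illustrative only.

```python
# Figures quoted on the slide; decimal units (1 GB = 1000 MB) are an assumption.
time_steps = 1500
output_interval = 4        # full volume written every 4 time steps
mb_per_solution = 750

solutions = time_steps // output_interval          # number of stored solutions
total_gb = solutions * mb_per_solution / 1000      # total stored volume in GB

# Second story: only one time step in every 1000 is written to disk.
kept_fraction = 1 / 1000
discarded_percent = (1 - kept_fraction) * 100

print(solutions, total_gb, round(discarded_percent, 1))
```

Both quoted totals follow directly from the output interval and the per-solution size.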

  5. Analysis, Mining, Visualization

  6. Solutions ! • Get the rings of the smoke • Track them in time • Mine their properties • Use some science drivers

  7. Driver 1 - CFD Vortices

  8. Swirling Features

  9. Swirling Features

  10. CFD Of Interest – Bronchial Flow • Complex Non-rigid, Fractal-like Geometry • Deep recursive branching structure • Need insights into how flow changes • Study Vortices, swirling flow • Q: Persistence of vortex ? • Implications • Pulmonary drug delivery • Carcinogen Deposition

  11. Flow Evolution – Internal Flow

  12. Object of Study: Vortices • Swirling regions • Core (Center of vortex) and swirling streamlines …

  13. Driver 2 - Material Formation • Grain • Grain Boundary

  14. MD Of Interest – Defect Evolution • Active device sizes (Si-based transistors) and passive components (alloys) are shrinking • At sub-micron levels extended defects affect performance • Extended defects • Si is doped with Boron in a “Hot Bath” • Non-uniform solidification • Arise from point defects • Study evolution of point defects and formation of extended defects • Q: What structures finally remain ?

  15. Defect Evolution

  16. Object of Study: Defects Defect Atoms - Red ! • Point defects – interstitial and vacancy • Interstitial – Si atoms located at non-bulk position

  17. Problem Statement • Need – Locating, Characterizing & Tracking Structures in Large Domains. • Acts of Discovery and Perseverance! • Approach desired • Tied to simulations • Multiple time scales • Organized Search • Encode Structure, dynamics and relationships • Incorporate complex physics in discovery • Classification and categorization (similarity) • Verification of discovered entities for veracity • Generalize to other domains

  18. Framework • Application (CFD, MD, …) • Sensor • Multires Transforms • Meta-stability Detection • Transient Detection • Feature Mining • Event Detection • Feature Tracking • Catalog • Spatio-temporal Rule Mining

  19. Components • Sensors – • Monitoring a stream • Swirl (CFD), Energy (MD) • Multiresolution Analysis • Temporal wavelet transform • Causal transforms • Eulerian Framework • Can be used with a spatial sub-division • Event Detection • Changes in Feature Demographics • Birth, death, continuation • Aggregation, bifurcation • Has impact on tracking
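The event types listed on this slide (birth, death, continuation, aggregation, bifurcation) can be illustrated with a minimal sketch. It assumes each feature is stored as a set of grid-cell ids and that two features in consecutive time steps correspond when they share cells. The function name `detect_events`, the `min_overlap` parameter, and the overlap criterion are illustrative assumptions, not the method used in the talk.

```python
from collections import defaultdict

def detect_events(prev, curr, min_overlap=1):
    """Classify feature events between two consecutive time steps.

    prev, curr: dict mapping a feature id to the set of grid-cell ids it covers.
    Two features correspond when they share at least min_overlap cells
    (an assumed criterion, chosen here for simplicity).
    """
    fwd = defaultdict(list)   # previous feature -> overlapping current features
    bwd = defaultdict(list)   # current feature -> overlapping previous features
    for p, p_cells in prev.items():
        for c, c_cells in curr.items():
            if len(p_cells & c_cells) >= min_overlap:
                fwd[p].append(c)
                bwd[c].append(p)

    events = []
    for p in prev:
        if not fwd[p]:
            events.append(("death", p))
        elif len(fwd[p]) > 1:
            events.append(("bifurcation", p, tuple(sorted(fwd[p]))))
    for c in curr:
        if not bwd[c]:
            events.append(("birth", c))
        elif len(bwd[c]) > 1:
            events.append(("aggregation", tuple(sorted(bwd[c])), c))
        elif len(fwd[bwd[c][0]]) == 1:
            events.append(("continuation", bwd[c][0], c))
    return events

prev = {"a": {1, 2, 3}, "b": {10, 11}, "d": {30, 31, 32, 33}, "e": {40}, "f": {42}}
curr = {"A": {2, 3, 4}, "D1": {30, 31}, "D2": {32, 33}, "N": {50}, "EF": {40, 41, 42}}
events = detect_events(prev, curr)
print(events)
```

Because births, deaths, splits, and merges change how many features exist, a tracker has to consult these events before assigning correspondences, which is one way to read "has impact on tracking".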

  20. Tracking - Correspondence Lagrangian Framework

  21. Feature Mining Mechanics • Do not just use raw data • Features – A feature is a manifestation of the correlations between various parameters • Feature Mining – • Extract meta-stable features using underlying physics • Describe features as tangible shapes

  22. Shapes • Point cloud • Proximity graphs • Conical frusta

  23. Similar Efforts - CFD Marusic, Kumar, Karypis, Interrante, U of Minn. Frequent subgraphs

  24. Similar Efforts - MD • Defect is infrequent, atomsets of bulk are not ! • Run common substructure discovery algorithm • Get bulk ! • Remove atoms contained in common substructure atomsets • Remainder of structure is defect! Alloys (Ni3Al) I1 Defect !

  25. Our Efforts Finding Needles In HayStack

  26. Feature Mining 1 Data Transform Tour Grid Operator Aggregate Classify Points Denoise Track Rank Catalog ROIs Classify-Aggregate

  27. Applying To Defect Detection • Visit all atom sites • Atom-site: Is it part of defect ? • Spatially aggregate atoms in located areas ! • Works for quenched defects (local equilibria)

  28. Feature Mining for Defects • Build spatially local classifiers • Define Bulk • Form Rules to define Bulk --- C1, C2,…,Cn • Typical Rules: • C1 = prescribed bond length • C2 = prescribed bond angle • Defect is not bulk

  29. Feature Mining for Defects • Core Defect Atoms will satisfy C = ~C1 AND ~C2 AND ~C3 … AND ~Cn • Find neighborhood by locating atoms which satisfy D = ~C1 OR ~C2 OR ~C3 … OR ~Cn • Defect = Embed C graph in D graph • D is needed to deal with noise and uncertainty of conditions Ci • Cluster all atoms in D
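The two rule sets above can be sketched in code. The sketch assumes each atom carries one boolean per bulk rule Ci, set to True when the atom violates that rule; it clusters the D atoms by a distance cutoff and keeps only clusters that contain at least one core (C) atom. Reading "embed C graph in D graph" this way is an interpretation, and `find_defects` and `cutoff` are hypothetical names.

```python
import math
from collections import deque

def find_defects(positions, violations, cutoff):
    """positions: list of (x, y, z) coordinates, one per atom.
    violations: per-atom list of booleans, one per bulk rule Ci,
    True when the atom violates that rule (~Ci)."""
    core = {i for i, v in enumerate(violations) if all(v)}   # C: every rule violated
    cand = {i for i, v in enumerate(violations) if any(v)}   # D: at least one violated

    # Cluster the D atoms: two atoms are linked when closer than `cutoff`.
    unvisited, clusters = set(cand), []
    while unvisited:
        seed = unvisited.pop()
        cluster, queue = {seed}, deque([seed])
        while queue:
            i = queue.popleft()
            near = {j for j in unvisited
                    if math.dist(positions[i], positions[j]) <= cutoff}
            unvisited -= near
            cluster |= near
            queue.extend(near)
        clusters.append(cluster)

    # Keep only clusters anchored by a core atom; the rest are treated as noise.
    return [sorted(c) for c in clusters if c & core]

positions = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (20, 0, 0)]
violations = [[True, True], [True, False], [False, True], [True, False], [False, False]]
print(find_defects(positions, violations, cutoff=1.5))
```

In the example, atom 3 violates one rule but sits far from any core atom, so it is discarded as noise, while atoms 1 and 2 are retained because they neighbor the core atom 0.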

  30. Results – I3 Defects I3A Defect I3B Defect

  31. Related Work - SAL Aggregate Classify Original Redescribe Yip&Zhao 96

  32. Does It Work Always ? • Compute Swirl • Local Classification Method • Swirling regions contain vortices • False Positives ! • Cannot extract structures! Classify-Aggregate

  33. Solution - Feature Mining 2 Data Transform Tour Grid Operator Verify Aggregate Denoise Track Rank Catalog ROIs Aggregate-Classify (Verify)

  34. Classify-Aggregate Yellow: Good Green:Bad Yellow ones really swirl !

  35. Classifier • Simple and efficient ! • Can be error-prone • Since one verifies • Point-based approach: • Label neighbors • Combinatorial: • Locally check for complete triangles
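One plausible reading of the point-based "label neighbors" step is sketched below for a 2D velocity field: each neighbor's velocity vector is labeled by the angular sector it points into, and a grid point is flagged when its neighbors cover every sector. The choice of three sectors, the 8-point neighborhood, and the function names are assumptions made for illustration; the slide does not specify them.

```python
import math

def direction_label(u, v, n_sectors=3):
    """Index of the angular sector that the vector (u, v) points into."""
    angle = math.atan2(v, u) % (2 * math.pi)
    return min(int(angle / (2 * math.pi / n_sectors)), n_sectors - 1)

def swirl_candidates(U, V, n_sectors=3):
    """Return interior grid points (i, j) whose 8 neighbors cover every sector.

    U, V: 2D lists holding the velocity components at each grid point.
    """
    rows, cols = len(U), len(U[0])
    hits = []
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            labels = set()
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue
                    u, v = U[i + di][j + dj], V[i + di][j + dj]
                    if u != 0 or v != 0:          # skip zero vectors
                        labels.add(direction_label(u, v, n_sectors))
            if len(labels) == n_sectors:
                hits.append((i, j))
    return hits

# Rigid rotation about the center of a 3x3 grid: velocity = (-y, x).
U = [[-(i - 1) for j in range(3)] for i in range(3)]
V = [[(j - 1) for j in range(3)] for i in range(3)]
print(swirl_candidates(U, V))
```

A test of this kind is cheap and purely local, which is why it can produce false positives and why a separate verification step follows in the pipeline.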

  36. 2 Swirling Criteria Verification Tools Verification

  37. Non-verifiable Regions

  38. Defects at Finite Temp. • Visit all atom sites • Atom-site: Is it part of defect ? • Spatially aggregate atoms in located areas ! • Quench defect to verify

  39. Current Work • Streaming • Tracking and Correspondence • Shape Descriptors • Data Structures for Data Management • Spatio-temporal associations

  40. Summary • Computational sciences need computational instruments • Need to be scalable and use all lessons learned from parallel, distributed, streaming and out-of-core implementations • Need to exploit underlying source of data • Should provide good hooks to data-mining and intelligent systems • Need very interdisciplinary work !

  41. Questions ?
