
Data Collection, Reduction, Analysis and Imaging for Scientific User Facilities






Presentation Transcript


  1. Data Collection, Reduction, Analysis and Imaging for Scientific User Facilities
  Christine Sweeney, LANL, Co-Lead (cahrens@lanl.gov)
  Sean Hearne, ORNL, Co-Lead
  Thomas Proffen, ORNL, Co-Lead
  Jack Wells, ORNL, Co-Lead
  August 20, 2019
  Breakout for ORNL AI for Science Town Hall

  2. Slides Submitted to ORNL Breakout
  Science data-related:
  • ML-generated metadata
  Imaging:
  • Leave No Measured Bit Behind: Solving Inverse Imaging/Scattering Problems with Machine Learning
  Facilities control:
  • Analytics for Nuclear Facilities
  • Prognostic Nuclear System Health Management
  • AI for Accelerator Physics
  • Application of ML to HPC Performance Enhancement Through Monitoring, Analysis, and Feedback

  3. Subgroups
  • Making every user a Nobel Prize-level expert
  • Imaging/scattering workflow
  • Accelerator facility resilience

  4. Breakthrough: Making Every User a Nobel Prize-Level Expert
  Three core components:
  • Personalized AI Suite
  - Captures the researcher's scientific intention and hypotheses
  - Supports experiment preparation, execution, and evaluation with knowledge, tools, and decision support
  • AI-Assisted Instrument
  - Understands everything about a specific instrument, correlates experience with that of other similar experiments, can provide advice and guidance to users, captures all required data for its own purposes, and, in collaboration with the personalized AI Suite, optimizes its own operation
  • Extreme Data Reduction
  - What if all data were used to train models? Could those models be reused to inform future experiments? Where are the boundaries? Decide what data to keep and where to use it.
  Foundational qualities:
  • Trust, Integrity, Explainability, Interpretability, Provenance, Reproducibility, Validation, Boundaries of Model and Theory Transferability, Bias
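The "extreme data reduction" component above asks how an instrument could decide which data to keep. One minimal sketch of that idea (not from the slides; the frame sizes, thresholds, and injected events are all hypothetical) is to keep only the frames a model trained on earlier data reconstructs poorly, on the theory that well-explained frames carry little new information. Here a low-rank PCA model stands in for a trained AI model:

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=64)          # stand-in for a repeating detector pattern

def make_frames(n):
    """Simulate flattened detector frames: a fixed signal plus small noise."""
    return base + 0.01 * rng.normal(size=(n, 64))

# Fit a low-rank model on frames from a calibration run.
train = make_frames(100)
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
basis = vt[:5]                      # top 5 principal directions

def reconstruction_error(frames):
    """How poorly the low-rank model explains each frame."""
    c = frames - mean
    return np.linalg.norm(c - c @ basis.T @ basis, axis=1)

# Threshold from a held-out batch of normal frames.
threshold = 3.0 * np.median(reconstruction_error(make_frames(100)))

# Experimental stream: mostly redundant frames, plus two injected events.
stream = make_frames(50)
stream[10] += 0.5                   # hypothetical "interesting" events
stream[30] -= 0.5
keep = np.where(reconstruction_error(stream) > threshold)[0]
print(keep.tolist())                # only the informative frames are kept
```

A deployed version would replace the PCA model with whatever the AI-assisted instrument has learned, but the decision rule is the same: store what the model cannot already predict.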

  5. IMPACT
  • Rather than furthering one or two grand challenges, BES facilities support 30,000 individual users each year, each with the potential to create a breakthrough and make a discovery that wins the next Nobel Prize. Each user differs from all their peers and uses the instruments in their own individual way.
  • Impacts will be seen across a broad spectrum of challenges, from materials by design and new catalysts to active design of efficient bio-crops.

  6. Imaging/Scattering Breakthrough
  Utilize AI to iteratively optimize a scattering/imaging experiment workflow:
  • Before the experiment: experimental design combining AI and modeling
  • During the experiment: AI-guided real-time decisions, changing multiple parameters during an experiment, live multidimensional visualization of scattering/imaging data
  • After the experiment: bounding the problem, denoising, background and artifact removal/reduction
  Challenges of current approaches:
  • Large-scale facilities are expensive and heavily oversubscribed, so efficient usage is essential.
  • Oversampling is often chosen to ensure publication-quality data.
  • Uninformed decisions are made during experiments.
  AI approaches:
  • Model bank for denoising and for background and artifact removal/reduction
  • Medical research: use propagation physics to predict what we should see in an image; develop predictive models for building synthetic training data
  • Materials research: solve 3D Bragg-edge imaging (limited boundary conditions and partial information lead to multiple answers). The holy grail is mapping residual stress in 3D at micron resolution, with impact on understanding materials behavior.
  • Training on synthetic scattering data generated via HPC simulations
  Reference: TomoGAN: Low-Dose X-Ray Tomography with Generative Adversarial Networks. Zhengchun Liu, Tekin Bicer, Rajkumar Kettimuthu, Doga Gursoy, Francesco De Carlo, and Ian Foster.
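The "training on synthetic data" idea from this slide can be illustrated at toy scale. TomoGAN itself is a generative adversarial network; the sketch below substitutes a plain least-squares linear denoiser so it stays self-contained, but the workflow is the same: simulate clean signals (standing in for HPC-generated training data), corrupt them with noise (standing in for low-dose acquisition), fit a denoiser on the pairs, and apply it to unseen noisy measurements. All signal shapes and noise levels are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_pairs(n, length=32):
    """Stand-in for HPC simulation: smooth clean signals + noisy versions."""
    t = np.linspace(0, 1, length)
    clean = np.array([np.sin(2 * np.pi * rng.uniform(1, 3) * t
                             + rng.uniform(0, 2 * np.pi)) for _ in range(n)])
    noisy = clean + 0.5 * rng.normal(size=clean.shape)   # "low-dose" noise
    return noisy, clean

# Train a linear denoiser W on synthetic pairs (a GAN/CNN would replace this).
noisy_tr, clean_tr = simulate_pairs(2000)
W, *_ = np.linalg.lstsq(noisy_tr, clean_tr, rcond=None)

# Apply the learned denoiser to unseen noisy measurements.
noisy_te, clean_te = simulate_pairs(200)
denoised = noisy_te @ W
err_before = np.mean((noisy_te - clean_te) ** 2)
err_after = np.mean((denoised - clean_te) ** 2)
print(f"MSE before: {err_before:.3f}, after: {err_after:.3f}")
```

The key point is that no real measured data is consumed during training; the denoiser's knowledge comes entirely from the simulator, which is what makes the approach attractive when beamtime is scarce.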

  7. Impact of Imaging/Scattering Breakthrough on Science Grand Challenge(s)
  What science grand challenge(s) can you enable via this breakthrough?
  • Spatially resolved applied/residual stress in 3D in advanced materials
  • Evolution of material characteristics (magnetic field, applied/residual stress, etc.) over time to yield better material design; a turbine blade is a great example.
  • Identifying new topological and quantum phases in materials
  • Nanosecond resolution of time-resolved manufacturing processes
  • Materials design, e.g. designing a turbine blade so it cannot fail
  When can the breakthrough impact science grand challenges (next 3-5 years, 5-10 years, or later)?
  • 5-10 years (stress)
  • 3-5 years (quantum)
  • 3-5 years (manufacturing)
  Why must AI be used for this breakthrough to impact the science grand challenges; can it happen without AI?
  • We are reaching the information limit when pushing spatial and temporal resolution.
  • With partial experimental data, model-based solutions are too complex to make sense of; AI is needed to categorize them.

  8. Accelerator Facility Resilience Breakthrough
  Eliminate unscheduled downtime:
  • Identify failure precursors
  • Understand which accelerator sensor data is good and which data indicates an upcoming failure
  • High uptime (99%+) of the accelerator supports accelerator-driven reactors (ADRs)
  • This is urgent, as the end users' science experiments are preempted if the accelerator fails.
  Current data available:
  • Beam properties and status of accelerator equipment
  • Unspecified failure modes (the "check engine light")
  • Maintenance inspections of the accelerator facility
  • Human-entered electronic logbook

  9. Accelerator Facility Resilience (cont.)
  What do you expect to find in the data that has not been analyzed?
  • Identification of which piece of equipment is about to fail; anomalous data
  AI approaches: look to other fields with similar problems to motivate approaches.
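Flagging a failure precursor in a single sensor channel, as described in the slides above, can be sketched with a classical CUSUM drift detector. This is a deliberately simple stand-in for a learned model; the simulated sensor channel, its drift onset, and all thresholds are hypothetical:

```python
import numpy as np

def cusum_alarm(readings, mean, sigma, slack=0.5, threshold=8.0):
    """One-sided CUSUM: accumulate standardized deviations above `slack`
    sigmas and return the time index at which the running sum exceeds
    `threshold`, or None if no alarm is raised."""
    s = 0.0
    for t, x in enumerate(readings):
        s = max(0.0, s + (x - mean) / sigma - slack)
        if s > threshold:
            return t
    return None

# Simulated sensor channel: stable around 20.0, then a slow upward
# drift -- the failure precursor -- starting at t = 400.
rng = np.random.default_rng(2)
signal = 20.0 + 0.05 * rng.normal(size=500)
signal[400:] += 0.02 * np.arange(100)

# Calibrate the baseline on known-good history, then monitor the rest.
mean, sigma = signal[:200].mean(), signal[:200].std()
alarm = cusum_alarm(signal[200:], mean, sigma)
print(None if alarm is None else 200 + alarm)   # alarm shortly after t = 400
```

CUSUM accumulates small, persistent deviations, so it catches a slow drift well before any single reading looks abnormal, which is exactly the "precursor before failure" behavior the breakthrough targets; a learned model would extend this across many correlated channels at once.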

  10. Impact of Accelerator Facility Resilience on Science Grand Challenge(s)
  What science grand challenge(s) can you enable via this breakthrough?
  • Keeping accelerators up more than 99% of the time would enable accelerator-driven reactors, which would allow changing nuclear waste products into less dangerous elements.
  • Additional impacts are more productive users and facility utilization improvements (e.g., building better next-generation facilities).
  When can the breakthrough impact science grand challenges (next 3-5 years, 5-10 years, or later)?
  • Equipment is cycled over time and can be upgraded for better data collection when life-cycled, which can take many years.
  Why must AI be used for this breakthrough to impact the science grand challenge; can it happen without AI?
  • This is a search for patterns in sensor data. Because the data is high-dimensional, no human can sort through data of this size and scope; automating the search also eliminates bias. No model exists for this today.
  Other fields have similar challenges with complex machines:
  • Z machine and other light sources
  • Cyber-security
  • HPC data centers
  • Power grid
  • ESnet

  11. Takeaways and Common Themes
  • Accelerate discoveries at user facilities; build an ecosystem of AI-enabled technologies to make facility users' jobs easier.
  • Make facilities more capable of doing new science, with faster time/space resolution.
  • Infrastructure for collecting data has needs beyond those of an HPC facility.
  • Common question: high-value data is generated at user facilities; who should host it and make it available?
