
A Framework for Visualizing Science at the Petascale and Beyond


Presentation Transcript


  1. A Framework for Visualizing Science at the Petascale and Beyond Kelly Gaither, Research Scientist and Associate Director, Data and Information Analysis, Texas Advanced Computing Center

  2. Outline of Presentation • Science at the Petascale • Scaling Resources • Scaling Applications • Access Mechanisms • Issues and Impediments • Framework

  3. Science at the Petascale • Global Weather Prediction • Formation and Evolution of Stars and Galaxies in the Early Universe • Understanding Chains of Reactions within Living Cells

  4. Scaling HPC Resources • Mission • Provide greater computational capacity/capability to the science community to compute ever larger simulations • Enablers: • Commodity multi-core chip sets with low power/cooling requirements • Efficient packaging for a compact footprint • High-speed commodity interconnects for fast communications • Affordable! (Nodes with 8 cores, 2 GB/core memory, in the ~$2K/node price range)

  5. TeraGrid Network Map: BT2 Largest single machine 96 TF

  6. TeraGrid Network Map: AT2 Ranger Feb 2008 (0.5 PF)

  7. TeraGrid Network Map: AT2 Kraken June 2008 (~1PF) Ranger Feb 2008 (0.5 PF)

  8. TeraGrid Network Map: AT2 Track2C 2010 (>1PF) Kraken June 2008 (~1PF) Ranger Feb 2008 (0.5PF)

  9. TeraGrid Network Map: AT1 Track1 2010 (10PF) Track2C 2010 (>1PF) Kraken June 2008 (~1PF) Ranger Feb 2008 (0.5PF)

  10. Scaling Analysis Resources • Mission • Provide an interactive interface allowing users to manipulate/view the results of their science • Enablers: • Commodity chips with low power/cooling requirements? Commodity graphics chips yes, low power/cooling no! • Efficient packaging for a compact footprint? Until recently, desktop box packaging; now available in rack-mounted 2U boxes • High-speed commodity interconnects for fast communications? Yes! • Affordable? (Nodes with 8 cores, 6 GB/core memory, in the ~$10K/node price range) No!

  11. TeraGrid Network Map: BT2 UIC/ANL Cluster: 96 nodes, 4GB/node, 96 GPUs Largest single machine 96 TF Maverick: 0.5TB shared memory, 16 GPUs, 128 cores

  12. TeraGrid Network Map: AT2 Track1 2010 (10PF) Track2C 2010 (>1PF) UIC/ANL Cluster: 96 nodes, 4GB/node, 96 GPUs Kraken June 2008 (~1PF) Ranger Feb 2008 (0.5PF) Spur: 1TB aggregate memory, 32 GPUs, 128 cores

  13. Impediments to Scaling Analysis Resources • Power and cooling requirements: 10x more power needed for an analysis resource! • Footprint: 2x more space needed for an analysis resource! • Cost: 5x more money needed for a comparable analysis resource!

  14. Scaling HPC Applications • Mission • As the number of processing cores increases, scale as close to linearly as possible • Enablers: • Science-driven need to solve larger and larger problems – so a significant body of intellectual work has been applied to scaling applications • There is basic information that you know ahead of time (see the sketch below): • Size of the problem you want to solve • Number of unknowns that you are trying to solve for • Decomposition strategy • Communication patterns between nodes
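The items a solver knows ahead of time map directly onto code. A minimal sketch, assuming a 1D slab decomposition with a one-cell halo exchanged over MPI; the grid size and the field being exchanged are illustrative placeholders, not from the talk:

```c
/* Hedged sketch: a 1D slab decomposition with halo exchange, the kind of
 * "known ahead of time" structure (problem size, decomposition strategy,
 * communication pattern) that lets HPC applications scale predictably. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1000000   /* global number of cells (assumed for the example) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Decomposition strategy fixed up front: contiguous slabs, one per rank,
     * with one ghost cell on each side of the local slab. */
    int local_n = N / size + (rank < N % size ? 1 : 0);
    double *u = calloc(local_n + 2, sizeof(double));  /* u[0] and u[local_n+1] are ghosts */

    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Communication pattern also known ahead of time: exchange one boundary
     * cell with each neighbour every iteration. */
    MPI_Sendrecv(&u[1],           1, MPI_DOUBLE, left,  0,
                 &u[local_n + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[local_n],     1, MPI_DOUBLE, right, 1,
                 &u[0],           1, MPI_DOUBLE, left,  1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    if (rank == 0)
        printf("ranks=%d, local cells on rank 0=%d\n", size, local_n);

    free(u);
    MPI_Finalize();
    return 0;
}
```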

  15. Application Examples: DNS/Turbulence

  16. Application Examples: DNS/Turbulence Courtesy: P.K. Yeung, Diego Donzis, TG 2008

  17. Application Example: Earth Sciences Mantle Convection, AMR Method Courtesy: Omar Ghattas et al.

  18. Application Example: Earth Sciences Courtesy: Omar Ghattas et al.

  19. Scaling Analysis Applications • Mission • As the number of processing cores increases, scale as close to linearly as possible • Enablers: • Science-driven need to solve larger and larger problems? Yes, but it’s more complicated than that • Is there basic information that you know ahead of time? • Size of the problem you want to analyze? Yes • Decomposition strategy? Tricky! • Communication patterns between nodes? Dependent on your decomposition strategy!

  20. Impediments to Scaling Analysis Applications • The decomposition strategy is a moving target – it is tied to the viewpoint (see the sketch below) • There is an additional requirement for interactive frame-rate performance!
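Why the decomposition is tied to the viewpoint can be seen from a per-block work estimate: the screen-space footprint of each data block, and therefore its rendering cost, changes every time the camera moves. A minimal sketch, with an invented bounding-sphere block structure and a simple pinhole-projection estimate; none of this is from the talk:

```c
/* Hedged sketch (not from the talk): estimate the screen-space footprint of a
 * data block from the current viewpoint.  Blocks whose projected area grows as
 * the camera approaches need more rendering work, so any static assignment of
 * blocks to nodes drifts out of balance as the user navigates. */
#include <math.h>

#define PI 3.14159265358979

typedef struct { double cx, cy, cz, radius; } Block;  /* bounding sphere of a data block */

/* Approximate number of pixels a block covers, given the eye position, a
 * vertical field of view, and the viewport height in pixels. */
double projected_pixels(const Block *b,
                        double eye_x, double eye_y, double eye_z,
                        double fov_y_radians, int viewport_height)
{
    double dx = b->cx - eye_x, dy = b->cy - eye_y, dz = b->cz - eye_z;
    double dist = sqrt(dx * dx + dy * dy + dz * dz);
    if (dist <= b->radius) dist = b->radius;           /* camera inside the block */

    /* Projected radius in pixels: r/d scaled by the screen's focal length. */
    double focal_pixels = (viewport_height / 2.0) / tan(fov_y_radians / 2.0);
    double r_pixels = (b->radius / dist) * focal_pixels;
    return PI * r_pixels * r_pixels;                   /* area of the projected disc */
}
```

Rebalancing blocks across nodes from estimates like this has to happen every few frames, which is exactly where it collides with the interactive frame-rate requirement above.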

  21. Accessing HPC Applications • Mission: • Provide mechanisms for submitting jobs and perhaps monitoring their performance • Enablers: • Schedulers for submitting jobs – but they come with a price! • Impediments: • Weak support for interactive applications • Still in the mode of hypothesize, run, check…

  22. Accessing Analysis Applications • Mission: • Provide mechanisms for interactively running applications to analyze data • Enablers: • Lots of intellectual capital in remote and collaborative access mechanisms – this is where we are ahead of the HPC community • Remote desktop • VNC • AccessGrid

  23. Impediments to Reaching the Petascale • 10x power requirement • 2x space requirement • 5x more expensive • Tenuous balance between requirement for interactive performance and need to scale to more processing cores • Retrofitting our access mechanisms to work with batch schedulers

  24. Requirements for Designing a Framework for Visualizing at the Petascale • 10x power requirement, 2x space requirement, 5x more expensive – not something I can address short term • Address the balance between the requirement for interactive performance and the need to scale to more processing cores • Retrofit our access mechanisms to work with batch schedulers

  25. Requirements for Designing a Framework for Visualizing at the Petascale • Minimize Data Movement – users can generate 100s of TB of data, but can’t move it off the storage local to the machine it was generated on • Optimize for the platforms that we can run on – data-starved cores become much more apparent • Reduce the barriers to entry

  26. SCOREVIS Software Stack • Scalable, Collaborative and Remote Visualization • NSF STCI-funded project that began March 1 • Balance goals: • Accessibility: provide remote and collaborative access to visualization applications over common networks with standard communications protocols. • Rendering: include data decomposition, the transformation from data primitives to geometric primitives, and the transformation from geometric primitives to pixels. • Scalability: choose between image decomposition or data decomposition depending on the underlying size of the data and the number of processors available (a sketch of this choice follows).
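The scalability goal names a concrete runtime decision: split the image (each process renders a screen region) or split the data (each process renders its own subset and the partial images are composited). A minimal sketch of one possible heuristic, with invented thresholds and an invented cost model; the slides do not spell out the actual SCOREVIS policy:

```c
/* Hedged sketch of the decomposition choice named above.  The thresholds and
 * weights are invented for illustration; the slides only say the choice
 * depends on data size and processor count. */
#include <stddef.h>

typedef enum { IMAGE_DECOMPOSITION, DATA_DECOMPOSITION } decomp_t;

/* Pick a strategy from the data size, the number of render processes, and the
 * output resolution.  Data too large for one node's memory forces a data
 * split (sort-last, partial images composited afterwards); otherwise small
 * data on many processes favours splitting the image (sort-first). */
decomp_t choose_decomposition(size_t data_bytes, int nprocs,
                              size_t node_memory_bytes,
                              int image_width, int image_height)
{
    if (data_bytes > node_memory_bytes)
        return DATA_DECOMPOSITION;          /* data cannot be replicated on every node */

    /* Rough per-process cost: pixels owned under an image split vs. bytes
     * owned under a data split (assumed 64:1 weighting, purely illustrative). */
    double pixels_per_proc = (double)image_width * image_height / nprocs;
    double bytes_per_proc  = (double)data_bytes / nprocs;

    return (bytes_per_proc > 64.0 * pixels_per_proc)
               ? DATA_DECOMPOSITION
               : IMAGE_DECOMPOSITION;
}
```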

  27. SCOREVIS Requirements • Minimize Data Movement • Address the balance between the requirement for interactive performance and the need to scale to more processing cores • Retrofit our access mechanisms to work with batch schedulers • Optimize for the platforms that we can run on – data-starved cores become much more apparent • Reduce the barriers to entry

  28. SCOREVIS Approach • Minimize Data Movement – move analysis to where the data is generated • Address the balance between the requirement for interactive performance and the need to scale to more processing cores – address data decomposition and scaling of applications and core algorithms • Retrofit our access mechanisms to work with batch schedulers – allow remote and collaborative access • Reduce the barriers to entry – a phased approach providing access to familiar, OpenGL-based applications

  29. Traditional OpenGL Architecture [architecture diagram] – The application on the application/client host calls OpenGL through libGL/Xlib; GLX and the X protocol carry the rendering to the X server, OpenGL, and graphics hardware driving the screen on the user’s local display. A second stack in the diagram shows the fully local case: application → OpenGL → hardware → screen.
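For reference, the traditional path in the diagram is what a plain GLX application does: it links against libGL/Xlib, and GLX plus the X protocol deliver the rendering to whichever X server owns the display. A minimal sketch, assuming a standard GLX setup with most error handling trimmed:

```c
/* Hedged sketch of the traditional OpenGL/GLX path shown in the diagram. */
#include <GL/gl.h>
#include <GL/glx.h>
#include <X11/Xlib.h>
#include <stdio.h>

int main(void)
{
    /* The application talks to libGL/Xlib; GLX and the X protocol carry the
     * requests to the X server that owns the hardware and the screen. */
    Display *dpy = XOpenDisplay(NULL);      /* $DISPLAY: local or remote X server */
    if (!dpy) { fprintf(stderr, "no X display\n"); return 1; }

    int attribs[] = { GLX_RGBA, GLX_DOUBLEBUFFER, GLX_DEPTH_SIZE, 16, None };
    XVisualInfo *vi = glXChooseVisual(dpy, DefaultScreen(dpy), attribs);
    if (!vi) { fprintf(stderr, "no suitable GLX visual\n"); return 1; }

    /* Create a window whose colormap matches the chosen visual. */
    Colormap cmap = XCreateColormap(dpy, DefaultRootWindow(dpy), vi->visual, AllocNone);
    XSetWindowAttributes swa = { 0 };
    swa.colormap = cmap;
    Window win = XCreateWindow(dpy, DefaultRootWindow(dpy), 0, 0, 512, 512, 0,
                               vi->depth, InputOutput, vi->visual, CWColormap, &swa);
    XMapWindow(dpy, win);

    GLXContext ctx = glXCreateContext(dpy, vi, NULL, True);  /* True: try direct rendering */
    glXMakeCurrent(dpy, win, ctx);

    glClearColor(0.f, 0.f, 0.f, 1.f);       /* ordinary OpenGL calls from here on */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glXSwapBuffers(dpy, win);                /* pixels reach the screen via the X server */

    glXDestroyContext(dpy, ctx);
    XDestroyWindow(dpy, win);
    XCloseDisplay(dpy);
    return 0;
}
```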

  30. SCoReViS Architecture [architecture diagram] – Compute nodes run the application/UI and rendering (hardware-accelerated OpenGL or Mesa software rendering, plus application processing in some cases); Chromium handles compositing and feeds a Chromium server and VNC server on the login node; multiple VNC clients (including a Chromium VNC client) connect for remote and collaborative viewing.
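The “Compositing” box is the step a sort-last configuration adds: each compute node renders only its own data subset, so the partial images must be merged by depth before anything reaches the VNC clients. In SCoReViS that role is played by Chromium; the sketch below is only an illustrative direct-send composite in MPI, not the project’s code:

```c
/* Hedged sketch of the "Compositing" step above: each render process holds a
 * partial image (colour + depth) of its own data subset, and rank 0 keeps the
 * nearest fragment per pixel.  A direct-send composite is the simplest
 * stand-in for what a sort-last back end such as Chromium does. */
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

#define W 512
#define H 512

void composite_to_rank0(const unsigned char *rgba, const float *depth,
                        unsigned char *out_rgba /* significant on rank 0 only */)
{
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank != 0) {                        /* everyone else ships its partial image */
        MPI_Send(rgba,  W * H * 4, MPI_UNSIGNED_CHAR, 0, 0, MPI_COMM_WORLD);
        MPI_Send(depth, W * H,     MPI_FLOAT,         0, 1, MPI_COMM_WORLD);
        return;
    }

    float *zbuf = malloc(W * H * sizeof(float));
    memcpy(out_rgba, rgba, W * H * 4);      /* start from rank 0's own image */
    memcpy(zbuf, depth, W * H * sizeof(float));

    unsigned char *c = malloc(W * H * 4);
    float *z = malloc(W * H * sizeof(float));
    for (int src = 1; src < size; ++src) {
        MPI_Recv(c, W * H * 4, MPI_UNSIGNED_CHAR, src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(z, W * H,     MPI_FLOAT,         src, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        for (int p = 0; p < W * H; ++p)
            if (z[p] < zbuf[p]) {           /* keep the fragment nearest the eye */
                zbuf[p] = z[p];
                memcpy(out_rgba + 4 * p, c + 4 * p, 4);
            }
    }
    free(c); free(z); free(zbuf);
}
```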

  31. SCOREVIS To Date • Has been successful in providing remote and collaborative access to visualization applications based on OpenGL – with a caveat (ParaView, VisIt, and home-grown applications) • Did get “interactive” frame rates – 6-10 fps • Has been successful in profiling to better understand where the bottlenecks exist in the analysis pipeline: • I/O (Lustre parallel file system, ~32 GB/sec; a minimal bandwidth-probe sketch follows) • Core visualization algorithms (current apps do not do a good job of load balancing) • Rendering in Mesa – we quickly found out that native Mesa does not handle multiple cores • Also developing quick-and-dirty ways to handle in-situ analysis
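Profiling the I/O stage typically starts with a bandwidth probe far simpler than the applications themselves. A minimal MPI-IO sketch, with an assumed per-rank chunk size, that times a collective write and reports aggregate bandwidth for comparison against the file system’s nominal ~32 GB/sec; this is illustrative, not the project’s profiling code:

```c
/* Hedged sketch: a minimal MPI-IO probe for the kind of I/O profiling
 * described above - time a collective write and report aggregate bandwidth. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_BYTES (64 * 1024 * 1024)      /* per-rank chunk size (assumed) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char *buf = malloc(CHUNK_BYTES);        /* contents don't matter for a bandwidth probe */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "io_probe.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    /* Each rank writes its own contiguous region; the collective call lets the
     * MPI-IO layer coordinate the access pattern on the parallel file system. */
    MPI_Offset offset = (MPI_Offset)rank * CHUNK_BYTES;
    MPI_File_write_at_all(fh, offset, buf, CHUNK_BYTES, MPI_BYTE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Barrier(MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("aggregate write bandwidth: %.2f GB/s\n",
               (double)size * CHUNK_BYTES / (t1 - t0) / 1e9);

    free(buf);
    MPI_Finalize();
    return 0;
}
```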

  32. Acknowledgements: • Thanks to the National Science Foundation Office of Cyberinfrastructure for supporting this work through STCI Grant #0751397.

  33. For more information, contact: Kelly Gaither kelly@tacc.utexas.edu Texas Advanced Computing Center Questions?
