
UTAM Reproducible Research Package and C++ Seismic Library

Samuel Brown, November 5, 2008.


Presentation Transcript


  1. UTAM Reproducible Research Package and C++ Seismic Library • Samuel Brown • November 5, 2008

  2. Outline • URRP: UTAM Reproducible Research Package • UCSL: UTAM C++ Seismic Library

  3. URRP Motivation • Create a centralized, version-controlled software release with regression tests, promoting code reuse by students and sponsors. • Provide the ability to reproduce results from all UTAM reports in a uniform and comprehensible manner.

  4. URRP Distribution • URRP is available through a secure Subversion repository. • Reproducibility scripts have an interface for downloading images and intermediate results through SFTP.

  5. Firewall? No svn, no sftp? • Compressed repository snapshots for annual and midyear releases will be available on the UTAM website. • Compressed images and intermediate results reside in a directory tree on the UTAM website, which mirrors the URRP reports directory.

  6. Source Code/Compilation • bash and csh scripts for environment setup. • Top-level SConstruct file with autoconf-like functionality for C/C++/F90 programs. • MATLAB library.

  7. Reproducibility • Report directories contain: • paper.tex: LaTeX report • run.py: Python script • other scripts, directories, etc.

  8. run.py • run.py: uniform interface for reproducing results • Python only; does not use SCons • imperative, not declarative • generates and runs shell scripts interactively • simple mechanisms for downloading data and controlling computation with sources and targets • can interface with PBS

  9. run.py • run.py consists of a number of individual processes • A ‘process’ is a call to the URRP Python function process( ) • 1 process for downloading images • 1 process for compiling the LaTeX paper • 1 or more processes for reproducing results or downloading intermediate results
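The source/target mechanism mentioned for run.py can be pictured as a make-style freshness check. The sketch below is an assumption about how such a check might work, not URRP's actual code: `needs_run` and `sketch_process` are hypothetical names, and the rule used here (run when a target is missing or older than a source) is the usual make convention rather than something the slides state.

```python
import os
import subprocess


def needs_run(sources, targets):
    """Return True if any target is missing or older than any source.

    Hypothetical helper; the make-style rule is assumed for illustration.
    """
    if not targets:
        return True  # nothing to compare against, so always run
    if any(not os.path.exists(t) for t in targets):
        return True
    # Missing sources are ignored here; a real tool might download them instead.
    newest_source = max(
        (os.path.getmtime(s) for s in sources if os.path.exists(s)),
        default=0.0,
    )
    oldest_target = min(os.path.getmtime(t) for t in targets)
    return newest_source > oldest_target


def sketch_process(cmds, sources=(), targets=(), wdir="."):
    """Run shell commands in wdir only when the targets are out of date.

    Returns the list of commands that were actually executed.
    """
    src_paths = [os.path.join(wdir, s) for s in sources]
    tgt_paths = [os.path.join(wdir, t) for t in targets]
    if not needs_run(src_paths, tgt_paths):
        return []
    for cmd in cmds:
        subprocess.run(cmd, shell=True, cwd=wdir, check=True)
    return list(cmds)
```

Under this rule, rerunning a report script repeats only the steps whose inputs have changed, which is one plausible reading of "controlling computation with sources and targets".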

  10. A Simple Process • Process(cmds=['matlab -nosplash -nodisplay < xcorr.m'])

  11. A More Involved Process • Process(cmds=['ucsl_fdac par=mod1.par'], sources=['vp.rsf', 'recv_coord.txt'], targets=['csg1.su'], docmds=1, wdir='./csgs', bdir='./batch', sdir='./mod', pbs=1, nodes=4, ppn=2, walltime='0:30:00')
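The pbs, nodes, ppn and walltime arguments suggest that the process is turned into a batch job. The following is a minimal sketch of how those values could map onto standard PBS directives. `make_pbs_script` and the job name `urrp_job` are hypothetical; the slides do not show how URRP actually writes its batch scripts.

```python
import os


def make_pbs_script(cmds, wdir=".", nodes=1, ppn=1, walltime="1:00:00",
                    name="urrp_job"):
    """Build the text of a PBS batch script (hypothetical helper, not URRP code)."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {name}",
        f"#PBS -l nodes={nodes}:ppn={ppn}",   # node count and processes per node
        f"#PBS -l walltime={walltime}",       # wall-clock limit as h:mm:ss
        # Batch jobs do not start in the submit directory, so use an absolute path.
        f"cd {os.path.abspath(wdir)}",
    ]
    lines.extend(cmd.strip() for cmd in cmds)
    return "\n".join(lines) + "\n"


# Values taken from the slide's example call.
script = make_pbs_script(["ucsl_fdac par=mod1.par"], wdir="./csgs",
                         nodes=4, ppn=2, walltime="0:30:00")
print(script)
```

The resulting text would normally be written to a file and submitted with `qsub`.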

  12. Outline • URRP: UTAM Reproducible Research Package • UCSL: UTAM C++ Seismic Library

  13. UCSL Motivation • Provide a library for rapid development of flexible, robust, high-performance research codes. • Find an appropriate balance between imperative and object-oriented programming. • Provide a high level of abstraction to enable performance and flexibility, e.g., task-specific file objects with optional MPI I/O.

  14. Applications • Development initially driven by: • Flexible 2D/3D modeling and reverse time migration (RTM). • Ray tracing and interferometric imaging of earthquake data.

  15. Forward Modeling – PML • Problem: • When implementing perfectly matched layers (PMLs) in 3D, there are up to 26 regions, each requiring a different combination of fields/damping. • To get the best result, all valid regions should be implemented, with full ghost region communication. • For simplicity and performance in the FD kernel, PML regions should reside in separate volumes. • This greatly complicates domain decomposition and communications.
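The figure of 26 follows from simple counting in a 3D box: along each axis a region is either low-side PML, interior, or high-side PML, which gives 3 × 3 × 3 = 27 combinations, and removing the all-interior one leaves 26. The short sketch below enumerates them; the tuple encoding and the `damped_axes` helper are illustrative choices, not part of UCSL.

```python
from collections import Counter
from itertools import product

# Position of a region along one axis:
# -1 = low-side PML, 0 = interior, +1 = high-side PML.
positions = (-1, 0, 1)

# Every (z, y, x) combination except the all-interior one is a PML region.
pml_regions = [r for r in product(positions, repeat=3) if any(r)]


def damped_axes(region, names="zyx"):
    """Return the axes along which this region applies damping."""
    return "".join(n for n, p in zip(names, region) if p != 0)


# Group by number of damped axes: 1 = face, 2 = edge, 3 = corner.
kinds = Counter(sum(p != 0 for p in r) for r in pml_regions)

print(len(pml_regions))               # 26
print(kinds[1], kinds[2], kinds[3])   # 6 12 8
```

Faces damp along one axis, edges along two, and corners along three, which is why each region needs its own combination of fields and damping terms.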

  16. cart_decomp • C++ domain decomposition object. • Computes balanced domain decomposition distributed along any combination of axes using a boundary condition cost function. • Uses MPI topologies. • Builds arrays of subdomain volumes for each processing element (PE). • The FD application only has to ask for as many volume arrays as are necessary for a given implementation.
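To illustrate what balancing with a boundary-condition cost function can mean, here is a one-axis sketch in which cells inside the boundary layers cost more than interior cells, and cuts are placed so that each PE receives roughly the same total cost. The cost model (a flat `pml_cost` per boundary cell) and the names `cell_costs` and `balanced_cuts` are assumptions made for this example; the slides do not describe the cost function that cart_decomp actually uses.

```python
from bisect import bisect_left
from itertools import accumulate


def cell_costs(n, npml, pml_cost=3.0):
    """Per-cell cost along one axis: boundary-layer cells at both ends cost more."""
    return [pml_cost if (i < npml or i >= n - npml) else 1.0 for i in range(n)]


def balanced_cuts(costs, nparts):
    """Split cells into nparts contiguous ranges of roughly equal total cost.

    Returns a list of (start, stop) index pairs, one per part.
    Requires len(costs) >= nparts.
    """
    prefix = list(accumulate(costs))
    total = prefix[-1]
    bounds = [0]
    for k in range(1, nparts):
        # First cell at which the cumulative cost reaches the k-th share.
        cut = bisect_left(prefix, k * total / nparts) + 1
        # Keep every part non-empty and leave room for the remaining parts.
        cut = max(cut, bounds[-1] + 1)
        cut = min(cut, len(costs) - (nparts - k))
        bounds.append(cut)
    bounds.append(len(costs))
    return list(zip(bounds[:-1], bounds[1:]))


costs = cell_costs(100, 10)
parts = balanced_cuts(costs, 4)
print(parts)  # [(0, 15), (15, 50), (50, 85), (85, 100)]
```

In this example the two end subdomains hold only 15 cells each while the middle ones hold 35, yet every part carries the same total cost of 35 units.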

  17. cart_halo • C++ ghost region communication object. • Each subdomain group requiring ghost region communication creates a halo object. For acoustic modeling that is second order in time, the subdomain group would be a 2 x nsubdomain volume array. • cart_halo handles all communication between local and remote subdomains with two functions: • start_update(int tslice, int half) • finish_update() • Ability to overlap communication and computation, with an option to split subdomains along the z axis.
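The start/finish pair is what allows communication to overlap with computation: the update is started, work that needs no ghost cells proceeds, and the update is completed before the cells next to the ghost regions are touched. The sketch below mimics that ordering in a single process, with no MPI. `MockHalo` is a hypothetical stand-in rather than the cart_halo interface, and it omits the `tslice` and `half` arguments because the slides do not define them.

```python
class MockHalo:
    """Single-process stand-in for a split-phase ghost-cell update.

    Each subdomain is a list with one ghost cell at each end:
    [ghost, interior..., ghost].
    """

    def __init__(self, subdomains):
        self.subdomains = subdomains
        self.pending = []  # (destination subdomain, ghost index, value)

    def start_update(self):
        """Snapshot neighbour edge values, as a non-blocking send/receive would."""
        subs = self.subdomains
        for i in range(len(subs) - 1):
            left, right = subs[i], subs[i + 1]
            # Left's high ghost cell gets right's first interior cell.
            self.pending.append((i, len(left) - 1, right[1]))
            # Right's low ghost cell gets left's last interior cell.
            self.pending.append((i + 1, 0, left[-2]))

    def finish_update(self):
        """Deliver the snapshots into the ghost cells, as a wait would."""
        for dest, idx, value in self.pending:
            self.subdomains[dest][idx] = value
        self.pending.clear()


# Two subdomains of a 1D field; ghost cells start as None.
a = [None, 1.0, 2.0, 3.0, None]
b = [None, 4.0, 5.0, 6.0, None]
halo = MockHalo([a, b])

halo.start_update()
# Work that needs no ghost cells can run here, overlapping the "communication".
interior_sum = sum(a[2:-2]) + sum(b[2:-2])
halo.finish_update()
# Cells adjacent to the ghost regions can be updated from this point on.
print(a[-1], b[0])  # 4.0 3.0
```

The outermost ghost cells stay unset because they lie on the physical boundary, where a boundary condition rather than a neighbour supplies the values.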

  18. Flexible Implementations • Source and receiver groups are also abstracted as objects. • Implementing a new parallel modeling code is simply a matter of: • Providing kernels for interior and boundary regions. • Providing a time-stepping loop. • Writing a small amount of initialization code.
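The three ingredients listed above (kernels, a time-stepping loop, and a little initialization) can be seen in miniature in a serial 1D acoustic example. This is a toy under stated assumptions, not UCSL code: it is written in pure Python, the function names are hypothetical, and the boundary kernel is a simple damping sponge rather than a PML. It stores two time levels and writes each new level over the oldest one, echoing the two-slice layout mentioned for cart_halo.

```python
def interior_kernel(p, cur, old, c2, lo, hi):
    """Second-order acoustic update; writes the new time level over the old one."""
    pc, po = p[cur], p[old]
    for i in range(lo, hi):
        po[i] = 2.0 * pc[i] - po[i] + c2 * (pc[i + 1] - 2.0 * pc[i] + pc[i - 1])


def boundary_kernel(p, old, nb, damp):
    """Toy absorbing layer: scale the freshly written level near both ends."""
    po = p[old]
    n = len(po)
    for i in range(nb):
        po[i] *= damp
        po[n - 1 - i] *= damp


def run(n=101, nt=200, courant=0.5, nb=10, damp=0.9):
    """Initialization plus the time-stepping loop; returns the final wavefield."""
    c2 = courant ** 2
    p = [[0.0] * n, [0.0] * n]            # two time levels
    p[0][n // 2] = p[1][n // 2] = 1.0     # impulse with zero initial velocity
    cur, old = 0, 1
    for _ in range(nt):
        interior_kernel(p, cur, old, c2, 1, n - 1)
        boundary_kernel(p, old, nb, damp)
        cur, old = old, cur               # the overwritten level is now current
    return p[cur]


field = run()
print(len(field))
```

In a parallel code built along the lines the slide describes, the decomposition and halo objects would presumably supply the index ranges and ghost values, leaving the kernels and the loop as the application-specific parts.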
