
DiFX Performance Testing

Chris Phillips, eVLBI Project Scientist, 25 June 2009. DiFX was developed by Adam Deller at Swinburne University of Technology (now at NRAO) to replace the LBA S2 correlator and allow disk-based correlation. It has been the production correlator of the LBA (Australia) since 2007.


Presentation Transcript


  1. DiFX Performance Testing • Chris Phillips, eVLBI Project Scientist • 25 June 2009

  2. DiFX history • Developed by Adam Deller at Swinburne University of Technology (now at NRAO) to replace the LBA S2 correlator and allow disk-based correlation • Production correlator of the LBA (Australia) since 2007 • Verified against the LBA, VLBA and Bonn hardware correlators

  3. DiFX overview • FX-style correlator implemented in C++ • 95% optimised C vector function calls (heavy reliance on Intel IPP libraries) • Non-clocked system, unlike hardware correlators (HWCs) • Maximum performance without compromising generality or ease of maintenance • Modular design to support generality and enable “3rd party” contributors and local system optimisation
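
To illustrate what "FX-style" means on the slide above: each station's samples are first Fourier transformed (F), then the spectra of a station pair are cross-multiplied and accumulated per channel (X). The following is a minimal sketch in pure Python, not DiFX code. The function names `dft` and `fx_correlate` are hypothetical, the naive transform stands in for the optimised vector routines DiFX actually relies on, and delay and fringe-rotation corrections are omitted.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform (the 'F' step); O(N^2), illustration only."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fx_correlate(station_a, station_b, n_chan):
    """Cross-correlate two sample streams FX-style (hypothetical helper).

    Both streams are cut into segments of n_chan samples. Each segment is
    transformed to the frequency domain, then the two spectra are
    cross-multiplied channel by channel and averaged over all segments.
    """
    n_seg = min(len(station_a), len(station_b)) // n_chan
    if n_seg == 0:
        raise ValueError("need at least n_chan samples per station")
    visibility = [0j] * n_chan
    for s in range(n_seg):
        spec_a = dft(station_a[s * n_chan:(s + 1) * n_chan])
        spec_b = dft(station_b[s * n_chan:(s + 1) * n_chan])
        for k in range(n_chan):
            # 'X' step: multiply by the complex conjugate and accumulate.
            visibility[k] += spec_a[k] * spec_b[k].conjugate()
    return [v / n_seg for v in visibility]
```

Correlating a stream with itself, for example `fx_correlate(x, x, 4)`, yields a real, non-negative power spectrum, which is a quick sanity check for a sketch like this.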

  4. Capabilities • Near-arbitrary time and frequency resolution • Advanced pulsar gating • eVLBI (LBA has done 1 Gbps eVLBI) • Correlate anything it can unpack (1/2/4/X Gbps) • Most new formats easy to implement

  5. Supported formats • Input: LBA, Mk5A (Mk4/VLBA), K5 (via translation), Mk5B, VDIF (end 2009) • Output: RPFITS, FITS-IDI

  6. Current users • Long Baseline Array (Australia) • VLBA (USA) • MPIfR (Bonn, Germany) • AuScope geodetic array (Australia/NZ, 2009) • E-LOFAR (EU)

  7. Future/Imminent Capabilities • Single pass, multiple phase centres • Improved (faster) fringe rotation • Band matching (e.g. 2x64 MHz with 1x128 MHz) • Baseband pulsar "folder" • Native geodetic output format • Phase cal extraction • Frequency division multiplexing of VDIF • Polyphase filterbank

  8. DiFX architecture • [Diagram: source data feeds DataStreams 1…N, which send baseband data to Cores 1…M; labels include processing buffers, visibility buffers, "Timerange, destination" and visibilities returned to the Master Node] • Large, segmented ring buffer (up to 100s of MB, a few seconds or more) • MPI is used for inter-process communication • Each data transfer is double buffered

  9. Computational Distribution • Currently: only time division multiplexing • VDIF will allow frequency division multiplexing: implementation style? • As currently implemented all baselines must still be correlated on one Core
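
The time division multiplexing described above can be sketched as follows: the observation is cut into time slices, each slice goes to one core, and that core correlates every baseline for its slice. This is an illustrative sketch only. The round-robin policy and the names `assign_time_slices` and `baselines` are assumptions made for the example, not DiFX's actual scheduler.

```python
def assign_time_slices(total_seconds, slice_seconds, n_cores):
    """Hand out consecutive time slices to cores round-robin (assumed policy).

    Returns {core_index: [(start, end), ...]} with times in seconds.
    """
    if slice_seconds <= 0 or n_cores <= 0:
        raise ValueError("slice_seconds and n_cores must be positive")
    schedule = {core: [] for core in range(n_cores)}
    start = 0
    index = 0
    while start < total_seconds:
        end = min(start + slice_seconds, total_seconds)
        schedule[index % n_cores].append((start, end))
        start = end
        index += 1
    return schedule


def baselines(n_stations):
    """All station pairs; under time division every core processes all of them."""
    return [(a, b) for a in range(n_stations) for b in range(a + 1, n_stations)]
```

With 6 stations, `baselines(6)` gives 15 pairs, and every core has to handle all 15 for each slice it receives. That is the limitation the last bullet points at: adding cores spreads the time slices, not the baselines.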

  10. Benchmarking • Need to eliminate disk I/O to get a clear indication of the potential speed of a specific setup • eVLBI! • Live eVLBI not suitable, as it has a fixed data rate • VLBIFAKE program generates an eVLBI data stream (LBADR, Mark5B and VDIF; TCP and UDP) • Only TCP usable for benchmarking • Shell script to run the correlator and save logs • Rate determined by the median transfer rate from VLBIFAKE CSIRO. eVLBI-Aus
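
The last bullet above, taking the median transfer rate as the benchmark figure, can be sketched as below. The input format is an assumption made for the example: a list of `(seconds, cumulative_bytes)` samples, which is not the real VLBIFAKE log layout. Both function names are hypothetical.

```python
from statistics import median


def interval_rates_mbps(samples):
    """Convert (seconds, cumulative_bytes) samples into per-interval rates in Mbps.

    The sample layout is assumed for illustration, not taken from VLBIFAKE.
    """
    rates = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        rates.append((b1 - b0) * 8 / (t1 - t0) / 1e6)
    return rates


def benchmark_rate_mbps(rates):
    """Summarise a run by its median rate, which ignores brief stalls or bursts."""
    if not rates:
        raise ValueError("no transfer rates recorded")
    return median(rates)
```

A median is less sensitive than a mean to a slow first interval while the correlator starts up, which is one plausible reason to prefer it for this kind of measurement.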

  11. Cuppa • 20 nodes, dual CPU, quad core • 6 stations • Up to 12 processing nodes • Testing number of threads and processing cores

  12. Scaling with Cores

  13. Data Rate Per Compute Node

  14. Scaling with Threads

  15. Scaling with Threads

  16. Scaling with Spectral Points

  17. Scaling with Stations

  18. APSR • 18 compute nodes, dual CPU, quad core • 5 I/O nodes, dual CPU, dual core • 4 stations • Up to 18 processing nodes

  19. APSR • 18 compute nodes, dual CPU, quad core • 5 I/O nodes, dual CPU, dual core • 4 stations • Up to 18 processing nodes

  20. Data Rate Per Compute Node

  21. Code collaboration status • Entire codebase has been organised on SVN (hosted by ATNF) • DiFX wiki (hosted by Curtin): http://cira.ivec.org/dokuwiki/doku.php/difx/index • Mailing list: difx-users@googlegroups.com • To get on the difx-users list, search for difx-users on Google Groups and request access, or email me

  22. Contact Us • Phone: 1300 363 400 or +61 3 9545 2176 • Email: enquiries@csiro.au • Web: www.csiro.au • Thank you • ATNF • Chris Phillips, eVLBI Project Scientist • Phone: +61 2 93724608 • Email: Chris.Phillips@csiro.au • Web: www.atnf.csiro.au/vlbi

  23. Benchmarks • Non-clocked system, unlike HWCs • Indicative number of CPU cores required to correlate in real time: • LBA @ 1 Gbps (256 MHz aggregate bandwidth, 2 bit): 100 • VLBA @ 4 Gbps (1 GHz aggregate bandwidth, 2 bit): 800 • Weak dependencies on e.g. number of channels • 160 CPU core system (exceeding VLBA HWC capacity) costs <$100k including networking, annual electricity ~$10k
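
The data rates quoted above follow from the bandwidth and bit depth, assuming real-valued Nyquist sampling (two samples per second per hertz of bandwidth). The sketch below works through that arithmetic and the implied cores per Gbps. The sampling assumption and the function names are ours, not stated on the slide.

```python
def data_rate_bps(aggregate_bandwidth_hz, bits_per_sample):
    """Data rate for real-valued Nyquist sampling (assumed): 2 samples/s per Hz."""
    return 2 * aggregate_bandwidth_hz * bits_per_sample


def cores_per_gbps(cores, rate_bps):
    """Indicative CPU cores needed per Gbps of input, from the slide's figures."""
    return cores / (rate_bps / 1e9)


lba_rate = data_rate_bps(256_000_000, 2)     # 1_024_000_000 bps, i.e. about 1 Gbps
vlba_rate = data_rate_bps(1_000_000_000, 2)  # 4_000_000_000 bps, i.e. 4 Gbps
```

Using the slide's core counts, `cores_per_gbps(800, vlba_rate)` is 200, roughly double the LBA figure, so the cost per Gbps is not constant across the two arrays. The slide does not say why.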
