HST Pipeline Project Review March 14, 2003
Review Objectives • Re-familiarize Project (and others) with production data processing done by STScI • Familiarize everyone with new processing hardware and how we plan to use it • Describe the steps we will be taking to shift development, I&T, and production data processing from the old systems to the new systems
Introduction • History – Long View • History – Last Year • Data processing requirements • Goals of this project • Overall plan
What do we mean by “data processing”? • Receipt of science and engineering data • Reformatting, quality checking, calibration, etc. needed to prepare data for the archive • Archiving the data • Retrieving the data • Processing and calibration of retrieved data • Sending data off to the user • User access tools
History – Long View • Original plan (1981) • TRW provides OSS and PODPS as two of three major pieces of SOGS • OSS to be used for real-time decision making • PODPS to process science data, including calibration, for users • STScI provides SDAS (analysis tools) • Established FITS format as basic science data format • Data provided to users on 1600/9600 bpi tapes • No archive
History – Long View • Pre-launch changes (1981-1990) • Astrometry and Engineering Data to come to STScI • PODPS to run STSDAS-based calibrations • STScI to develop CDBS (calibration data base system) • Archive activities started • STScI developed DMF, a prototype optical-disk-based archive, pressed into service ~L+1 year • DADS development started at Loral • StarView development started at STScI
History – Long View • Post-Launch changes I (1990-1996) • DADS delivered, data transferred from DMF to DADS • StarView released • OMS developed for engineering data and jitter files • OPUS replaced OSS and PODPS • Consolidated software systems • Important technology upgrade to support future growth • Pipeline development for STIS and NICMOS started
History – Long View • Post-Launch Changes II (1996 – 2001) • Data volume doubled with STIS and NICMOS • Archive utilization increased substantially • UNIX version of OPUS developed for FUSE • Archive upgraded • Magneto-Optical media replaced Optical Disks • NSA project opened DADS architecture to multiple storage media • Spinning disks considered, but judged too expensive • CDBS re-implemented • OTFR deployed • Reduced archive volume • Provided up-to-date calibrations to users
History – Long View • Additional improvements and consolidations have been in our plans over the last few years • DADS evolution • Remove VMS dependencies • Make future technology migrations easier • Improve services based on community usage of HST archive • Replace OMS • Remove VMS dependency • Simplify system • ACS data and data processing • Increased volume • Drizzle algorithms for geometric correction and image co-addition
History – Last Year • Several parts of the system exhibited unacceptable performance • Processing of data from HST to the Archive • Response time to user requests for data from Archive • Several specific causes • NFS mount problems • Disk corruption in OPUS • Jukebox problems • Other specific hardware problems • Symptomatic of more general problems with the data processing systems
History – Last Year • [Chart not reproduced; annotations: 1 day/week = 14%; Goal: <5%]
History – Last Year • Immediate steps were taken to upgrade available hardware • Added 6 CPUs and memory to Tru64 systems • Added CPUs and memory to Sun/Solaris systems • Added and reconfigured disk space • Large ACS data sets moved to an FTP site to avoid load on archive system • EROs and GOODS data sets • FTP site off-loaded ~10 GBytes/day from archive in last several months (~20% effect)
Current status • System keeping up with demands • Running ~50% capacity on average • Loading in various places is very spiky • Instability of system, and diversion of resources, has put delivery of data to ECF and CADC substantially behind schedule • Expect load to increase in spring as ACS data become non-proprietary
Bulk distribution backlog • In absolute numbers: ~40,000 POD files • Archive Branch does not believe the current AutoBD can keep up with the current data volume, much less catch up • Implement FTP tool to augment transfer • Tool accesses data on MO directly • May be able to bypass DADS by using safestores and a development jukebox (JB) or stand-alone reader • Distribution re-design • CADC/ECF will be included as beta test sites in parallel operations starting ~April 1, 2003 • New engine allows operators to prioritize requests • New engine supports transfer of compressed data • Consolidation of operating systems should improve reliability • With all these solutions, preliminary estimate is that backlog could be eliminated in a few months
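The "eliminated in a few months" estimate depends on how far transfer capacity exceeds the rate at which new files arrive. A minimal sketch of that arithmetic follows. Only the ~40,000-file backlog comes from the slide; the daily arrival and transfer rates are hypothetical placeholders, not measured values.

```python
import math

BACKLOG_POD_FILES = 40_000  # from the slide (~40,000 POD files)


def days_to_clear(backlog_files: float,
                  incoming_per_day: float,
                  capacity_per_day: float) -> float:
    """Days until the backlog reaches zero.

    Returns math.inf when capacity does not exceed the arrival rate,
    i.e. the distribution system can never catch up.
    """
    net_per_day = capacity_per_day - incoming_per_day
    if net_per_day <= 0:
        return math.inf
    return backlog_files / net_per_day


if __name__ == "__main__":
    # HYPOTHETICAL rates, chosen only to illustrate the calculation.
    incoming = 300.0   # new POD files per day (assumed)
    capacity = 700.0   # files transferred per day with the new tools (assumed)
    days = days_to_clear(BACKLOG_POD_FILES, incoming, capacity)
    print(f"Backlog cleared in about {days:.0f} days")
```

With these assumed rates the net drain is 400 files/day. If capacity only matches arrivals, the function returns infinity, which is the "cannot keep up, much less catch up" case described for the current AutoBD.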
Data Processing Requirements • Performance requirements (Astronomy community expectations) • Data volume requirements • Into system from HST • Out of system to Astronomy community • Programmatic goals • Fit within declining HST budget at STScI • Expect archive to live beyond HST operational lifetime • Expect archive will be used to support JWST
Performance Requirements-I • Average time from observation execution to data receipt < 1 day • Average time from observation execution to data availability in archive < 2 days • 98% of data available in archive in < 3 days
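The three thresholds above can be checked mechanically against a set of per-observation latencies. The sketch below is illustrative only: the function name and the sample latencies are assumptions, and only the three limits (1 day, 2 days, 98% within 3 days) come from the slide.

```python
from statistics import mean


def check_latency_requirements(receipt_days, archive_days):
    """Evaluate the three Performance Requirements-I thresholds.

    receipt_days: days from observation execution to data receipt.
    archive_days: days from observation execution to archive availability.
    """
    within_3_days = sum(d < 3.0 for d in archive_days) / len(archive_days)
    return {
        "avg_receipt_under_1_day": mean(receipt_days) < 1.0,
        "avg_archive_under_2_days": mean(archive_days) < 2.0,
        "98pct_archived_under_3_days": within_3_days >= 0.98,
    }


if __name__ == "__main__":
    # Made-up sample latencies, for illustration only.
    receipt = [0.5, 0.8, 1.2]
    archive = [1.0] * 98 + [4.0] * 2
    for name, ok in check_latency_requirements(receipt, archive).items():
        print(f"{name}: {'met' if ok else 'NOT met'}")
```

Note that the first two requirements are averages, so a few slow observations can be tolerated, while the third is a percentile bound that limits the slow tail directly.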
Performance Requirements-II • Archive availability: 95% • (1 day/week = 14%) Goal: <5% • Median retrieval times • Defined as time from request to when data is ready for transmission. Does not include transmission time. • Non-OTFR data (not recalibrated): 5 hours • OTFR data (recalibrated): 10 hours
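As a worked check on the percentages above, one day per week is 1/7 of the time, or about 14%. A 95% availability target, read as a per-week budget, leaves roughly 8.4 hours. Treating availability as a fraction of a calendar week is an assumption made here for illustration.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def fraction_of_week(hours: float) -> float:
    """Fraction of a week represented by the given number of hours."""
    return hours / HOURS_PER_WEEK


def weekly_downtime_budget_hours(availability: float) -> float:
    """Hours per week that may be unavailable at a given availability target."""
    return (1.0 - availability) * HOURS_PER_WEEK


if __name__ == "__main__":
    print(f"1 day/week = {fraction_of_week(24):.1%} of the time")
    print(f"95% availability allows {weekly_downtime_budget_hours(0.95):.1f} h/week")
```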
Performance Requirements-III • User support • Unlimited number of registered users • Support increased level of requests • Currently ~2000/month • Expect to grow at 20% per year (guess) • Reduce unsuccessful requests to <5% • Routinely handle highly variable demand • Daily request volume varies by more than factor of 10 • Insulate pre-archive processing from OTFR load
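The growth figure above compounds quickly. A small sketch of the projection, using the slide's ~2000 requests/month and its 20% per year guess (the projection horizon is an arbitrary choice for illustration):

```python
def projected_requests_per_month(current: float,
                                 annual_growth: float,
                                 years: int) -> float:
    """Compound growth: current * (1 + annual_growth) ** years."""
    return current * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    CURRENT_PER_MONTH = 2000  # from the slide (~2000/month)
    GROWTH = 0.20             # from the slide (20% per year, labeled a guess)
    for year in range(0, 6):
        n = projected_requests_per_month(CURRENT_PER_MONTH, GROWTH, year)
        print(f"year +{year}: ~{n:.0f} requests/month")
```

At this rate the monthly request count roughly doubles in four years and reaches about 2.5 times the current level in five.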
Data Volume Requirements-I • Data volume from HST - now • Currently receive ~120 GBits/week from HST • Currently ingest ~100 GBytes/week into the archive • Currently handle ~2000 observations/week • Data volume from HST – after SM4 • Expect ~200 GBits/week from HST • Expect to ingest ~160 GBytes/week into archive • Expect to handle ~2000 observations/week
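The slide mixes units: volume received from HST is in GBits/week, while archive ingest is in GBytes/week. Converting both to bytes makes the ratio of ingested to received volume explicit. The sketch assumes decimal prefixes (1 GBit = 10^9 bits) and 8 bits per byte; it makes no claim about why the ingested volume is larger.

```python
BITS_PER_BYTE = 8
SECONDS_PER_WEEK = 7 * 24 * 3600


def gbits_to_gbytes(gbits: float) -> float:
    """Convert gigabits to gigabytes."""
    return gbits / BITS_PER_BYTE


def ingest_to_received_ratio(ingest_gbytes: float, received_gbits: float) -> float:
    """Ratio of archive ingest volume to volume received, both in bytes."""
    return ingest_gbytes / gbits_to_gbytes(received_gbits)


def average_rate_mbit_per_s(gbits_per_week: float) -> float:
    """Weekly volume expressed as a sustained average rate (decimal units)."""
    return gbits_per_week * 1000.0 / SECONDS_PER_WEEK


if __name__ == "__main__":
    # (received GBits/week, ingested GBytes/week), both from the slide.
    cases = {"now": (120, 100), "after SM4": (200, 160)}
    for label, (received, ingested) in cases.items():
        print(f"{label}: {gbits_to_gbytes(received):.0f} GBytes/week received, "
              f"ingest/received = {ingest_to_received_ratio(ingested, received):.1f}x, "
              f"average downlink = {average_rate_mbit_per_s(received):.2f} Mbit/s")
```

On these figures the archive ingests between six and seven times the byte volume received from the telescope, both now and after SM4.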
Data Volume Requirements-II • Data distribution today • More than 300 GBytes/week from archive • More than 70 GBytes/week from FTP site • Data distribution projection • Distribution volume determined by world-wide Astronomy community – very unpredictable • Large increase expected as Cycle 11 data become non-proprietary • Should expect 500-1000 GBytes/week in a few years
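These distribution figures can be cross-checked against the earlier statement that the FTP site off-loads ~10 GBytes/day, a ~20% effect. The sketch treats the "more than" values as lower bounds and uses them as point estimates, which is an approximation.

```python
ARCHIVE_GB_PER_WEEK = 300   # from the slide (lower bound)
FTP_GB_PER_WEEK = 70        # from the slide (lower bound)
FTP_GB_PER_DAY = 10         # from the earlier "History – Last Year" slide
PROJECTED_RANGE = (500, 1000)  # GBytes/week expected in a few years


def ftp_share(archive_gb: float, ftp_gb: float) -> float:
    """Fraction of total distributed volume served by the FTP site."""
    return ftp_gb / (archive_gb + ftp_gb)


def growth_factors(current_total: float, projected: tuple) -> tuple:
    """Projected range expressed as multiples of today's total volume."""
    low, high = projected
    return low / current_total, high / current_total


if __name__ == "__main__":
    total = ARCHIVE_GB_PER_WEEK + FTP_GB_PER_WEEK
    print(f"FTP per week from daily figure: {FTP_GB_PER_DAY * 7} GBytes")
    print(f"FTP share of distribution: {ftp_share(ARCHIVE_GB_PER_WEEK, FTP_GB_PER_WEEK):.0%}")
    low, high = growth_factors(total, PROJECTED_RANGE)
    print(f"Projected volume: {low:.1f}x to {high:.1f}x today's ~{total} GBytes/week")
```

The daily and weekly FTP figures agree (10 GBytes/day is 70 GBytes/week), and the FTP share of roughly 19% is consistent with the "~20% effect" quoted earlier.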
Programmatic Goals • Reduce total cost of data processing activities • Simplify hardware and network architecture • Reduce Operating Systems from 3 to 1 • Terminate use of VMS and Tru64 • Eliminate passing of data through various OSs • Consolidate many boxes into two highly reliable boxes • Flexible allocation of computing resources • Support easy re-allocation of CPU and Disk resources among tasks • Provide simple growth paths, if needed
Current Architecture • [Diagram not reproduced; systems labeled Tru64, VMS, and Solaris]
Programmatic Goals • Provide common development, test, and operational environments • Current development and test systems cannot replicate load of operational systems • Reduce complexity of development and test environments (drop VMS, Tru64) • Improve ability to capture performance data, metrics, etc. • Current systems too diverse • Difficult to transfer performance measurement on development/test systems to operations
Current Development and I&T Environment • [Diagram not reproduced; systems labeled Tru64 and VMS]
New Architecture • [Diagram not reproduced: Sun Fire 15K domain configuration with 7 dynamically re-configurable domains, each shown with EMC storage; domain labels: OPUS/Archive OPS, Databases OPS, Code Development, System Test, Database Test, OS/Security Test]
Programmatic Goals • Continue planned pipeline evolution • DADS Distribution redesign provides more flexibility to users and operators • Reflect advent of OTFR • Reflect community utilization of the archive • Provide operators more control over priority and loadings • Storing copy of Raw Data on EMC will dramatically reduce load and reliance on Jukeboxes • Ingest redesign provides opportunity to finally end the arbitrary boundary between OPUS and DADS
Programmatic Goals • Future growth paths for HST • To first order, we expect HST to live within the capabilities of this architecture through SM4 to EOL • Input data volume will increase some, but not a lot • Plan to adjust distribution techniques and user expectations to live within the 15K/EMC resources • However, we will encourage ever more and better use of HST science data • Beyond HST End-of-Life • HST data distribution would need to be revisited based on utilization at the time (seven years from now) and progress of NVO initiatives • Architecture is the planned starting point for JWST; hardware is very likely to need major upgrades
Remainder of the Review • Architecture • New hardware (Sun Fire 15K, EMC) • What it is, how it works • Steps to make it operational • Moving development, I&T, databases • Moving operational processing • OPUS processing • Raw data off Jukeboxes onto EMC • Archive software upgrades