ASTRO-WISE federation – OmegaCEN
AstroWise: a Virtual Survey System
OmegaCAM – LOFAR – AstroGrid – ((G)A)VO
Edwin A. Valentijn
Paranal Mid 2005 2007
Handling of the data is non-trivial
• Pipeline data reduction
• Calibration with very limited resources
• Things change in time:
  • Physical changes (atmosphere, various gains)
  • Code (new methods, bugs)
  • Human insight in changes
• Working with source lists
Science can only be archive based

Large Data Volume
• Wide-field imaging = vast amounts of data
• VST sees the equivalent of the Southern sky with 0.2″ pixels in 3 years: 100 Tbyte of image data and Tbytes of source list data
• A project like KIDS (1000 sq deg) has >10^6 raw data images of 8 Mpix each
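The volume figures above can be sanity-checked with a back-of-envelope calculation. The sketch below is only illustrative: the 2 bytes per pixel and the "half the celestial sphere" definition of the Southern sky are assumptions, not numbers from the slides.

```python
# Back-of-envelope data volume estimate.
# ASSUMPTIONS (not from the slides): 16-bit raw pixels, and the
# "Southern sky" taken as half of the full celestial sphere.

ARCSEC_PER_DEG = 3600.0
FULL_SKY_SQ_DEG = 41252.96  # area of the whole celestial sphere


def pixels_per_square_degree(pixel_scale_arcsec: float) -> float:
    """Number of pixels needed to tile one square degree exactly once."""
    return (ARCSEC_PER_DEG / pixel_scale_arcsec) ** 2


def terabytes(n_pixels: float, bytes_per_pixel: int) -> float:
    """Convert a pixel count to terabytes (1 TB = 1e12 bytes)."""
    return n_pixels * bytes_per_pixel / 1e12


southern_sky_sq_deg = FULL_SKY_SQ_DEG / 2
single_pass_pixels = southern_sky_sq_deg * pixels_per_square_degree(0.2)
single_pass_tb = terabytes(single_pass_pixels, bytes_per_pixel=2)

# The slide quotes more than 10^6 raw images of 8 Mpix each.
raw_images_tb = terabytes(1e6 * 8e6, bytes_per_pixel=2)

print(f"one pass over the southern sky: {single_pass_tb:.1f} TB")
print(f"10^6 raw 8 Mpix images:         {raw_images_tb:.1f} TB")
```

Under these assumptions a single pass at 0.2″ comes to roughly 13 TB and the raw KIDS-like image set to 16 TB. The gap to the quoted 100 Tbyte presumably reflects multiple filters, overlapping exposures, calibration frames and derived products, none of which this sketch models.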
ASTRO-WISE Mission: Virtual Survey System
• Environment that provides systematic and controlled
  • Access to all raw and calibration data
  • Execution and modification of image/calibration pipelines
  • Execution of source extraction algorithms (catalogues)
  • Archiving, or dynamic regeneration on the fly
• Paradigm: no static data releases, but dynamic data on request
• Federated to link different data centers
• Dynamical archive grows continuously and can be used for
  • small or large science projects
  • generating and checking calibration data
  • exchanging methods, scripts and configuration
raw pixel data – pipelines/cal files – catalogues
Keys – Solution
• Procedurizing
  • Data taking at telescope for both science and calibration data
  • Observing Modes: Stare, Jitter, Dither, SSO
  • Observing Strategies: Stan, Deep, Freq, Mosaic
  • Full integration with data reduction
• Design – ADD
  • Data model (classes) defined for data reduction and calibration
  • View pipeline as an administrative problem
[Diagram labels: Calibration procedures – Sanity checks – Image pipeline – Source pipeline – Calibration procedures – Quality control]
AstroWise Pipelines
[Diagram labels: Bias pipeline – Source pipeline – Flatfield pipeline – Photometric pipeline – Image pipeline]
DB – engine
• Long search: Objectivity, CERN HEP
• July 2002 Oracle contract – reference – licenses – consultancy
• Oracle 9i -> 10g:
  • Full OO support + Python-to-db persistency
  • Terabyte scalability through “partitioned tables”
  • Administration tools
  • Federation through “Advanced replication”, to evolve into cross-site links (= references = pointers)
  • Interoperability: STREAMS connects to others: Sybase, MySQL
  • Python I/F: SQL – OCI – Oracle db (Python binding)
Concepts of federation
• Federation maintained by a single database – Oracle9i
• Full history tracking by linking (joins, references) of all input that went into a result, providing on-the-fly reprocessing
• Dynamical archive – context as object attributes
  • Project: Calibration, Science, Survey, Personal
  • Owner: Pipeline, Developer, User
  • Strategy: Standard, Deep, Freq, Mosaic
  • Mode: Stare, Jitter, Dither, SSO
  • Time: time stamping
• VO I/F – Publish
• Software standards
  • Classes/data model/procedures
  • OO – inheritance/persistency
  • Python scripts/C libraries
USER – Python
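The idea of "context as object attributes" combined with history tracking by references can be sketched in a few lines of Python. This is a minimal illustration, not the AstroWise data model: the class name `DataObject`, its field names and the `history` method are hypothetical, and only the attribute values (Project, Owner, Strategy, Mode) come from the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List

# Context values as listed on the slide.
Project = Enum("Project", "CALIBRATION SCIENCE SURVEY PERSONAL")
Owner = Enum("Owner", "PIPELINE DEVELOPER USER")
Strategy = Enum("Strategy", "STANDARD DEEP FREQ MOSAIC")
Mode = Enum("Mode", "STARE JITTER DITHER SSO")


@dataclass
class DataObject:
    """Hypothetical archive object: context attributes plus input links."""
    name: str
    project: Project
    owner: Owner
    strategy: Strategy
    mode: Mode
    # Time stamp assigned when the object is created.
    created: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    # References to every object that went into this result.
    inputs: List["DataObject"] = field(default_factory=list)

    def history(self) -> List[str]:
        """Names of all ancestors, each listed once, inputs before outputs."""
        seen: List[str] = []
        for parent in self.inputs:
            for name in parent.history() + [parent.name]:
                if name not in seen:
                    seen.append(name)
        return seen


def make_obj(name: str, *inputs: DataObject) -> DataObject:
    """Helper for the example: same context for every object."""
    return DataObject(name, Project.SURVEY, Owner.PIPELINE,
                      Strategy.STANDARD, Mode.DITHER, inputs=list(inputs))


raw_bias, raw_flat, raw_sci = (make_obj(n) for n in
                               ("raw_bias", "raw_flat", "raw_sci"))
bias = make_obj("bias", raw_bias)
flat = make_obj("flat", raw_flat, bias)
reduced = make_obj("reduced", raw_sci, bias, flat)
print(reduced.history())
```

Because every result keeps references to its inputs, the full lineage can be walked back to the raw data, which is what makes reprocessing on request possible in principle.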
Contents of federation
• Raw data
  • Observed images
  • Ancillary information
• Calibration results
  • Calibration files, time stamped
• Reduced images
  • Single observation
  • Coadded images
• Software
  • Methods (pipelines) for processing calibration
  • Configuration files
• Source lists – catalogues
  • Extracted source information
  • Associated among different data objects
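Time-stamped calibration files make it possible to pick the calibration that applies to a given observation. The sketch below assumes one plausible selection rule (the most recent calibration taken at or before the observation); the slides do not state the actual rule, and the function name and file names are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple


def select_calibration(calfiles: List[Tuple[datetime, str]],
                       obs_time: datetime) -> Optional[str]:
    """Return the latest calibration stamped at or before obs_time.

    calfiles must be sorted by time stamp. Returns None when no
    calibration is old enough to apply.
    ASSUMPTION: "latest earlier calibration wins" is an illustrative rule.
    """
    stamps = [stamp for stamp, _ in calfiles]
    i = bisect_right(stamps, obs_time)  # calibrations with stamp <= obs_time
    return calfiles[i - 1][1] if i else None


bias_frames = [
    (datetime(2003, 4, 1), "bias_0401"),
    (datetime(2003, 4, 10), "bias_0410"),
    (datetime(2003, 4, 20), "bias_0420"),
]
print(select_calibration(bias_frames, datetime(2003, 4, 17)))
```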
Tbyte source lists: brains make the associations. Link-lists as fast as possible
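Associating two source lists amounts to producing a list of index pairs that link matching entries. The following is a deliberately naive sketch (a brute-force nearest-neighbour match within a radius), not the method used in AstroWise; the function names and the 1″ radius are illustrative assumptions.

```python
import math
from typing import List, Tuple

Source = Tuple[float, float]  # (ra_deg, dec_deg)


def angular_separation_arcsec(a: Source, b: Source) -> float:
    """Great-circle separation via the haversine formula, in arcseconds."""
    ra1, dec1, ra2, dec2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def associate(list_a: List[Source], list_b: List[Source],
              radius_arcsec: float) -> List[Tuple[int, int]]:
    """Link each source in list_a to its nearest list_b source in radius."""
    links = []
    for i, a in enumerate(list_a):
        best_j, best_sep = None, radius_arcsec
        for j, b in enumerate(list_b):
            sep = angular_separation_arcsec(a, b)
            if sep <= best_sep:
                best_j, best_sep = j, sep
        if best_j is not None:
            links.append((i, best_j))
    return links


catalogue_a = [(10.0, -30.0), (150.0, 2.0)]
catalogue_b = [(150.0001, 2.0001), (10.0, -30.0002), (200.0, 45.0)]
print(associate(catalogue_a, catalogue_b, radius_arcsec=1.0))
```

The double loop costs time proportional to the product of the list lengths, which is hopeless at Tbyte scale; that is why some form of spatial indexing is needed before the exact distance test.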
Status
• From abstract design to working prototype
• FIRST VIRTUAL LIGHT 17 April 2003
  • WFI@2.2
  • INT
• Federated
  • USM
  • OAC
• Summer 2004
  • First massive ingest
New components
• Python binding to Oracle – 5LS ✓
• Scripting paradigm ✓
• Wrappers
• Make metaphor
• Persistency, links as attributes
• Fileserver ✓
• Parallelisation code ✓
• Catalogue class / associating HTM ✓
• Federating with Oracle Streams
• User interfaces
• "Tell me everything" tool <-> on-the-fly reprocessing
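The "make metaphor" can be illustrated with a small recursive builder: a requested target is produced only after its dependencies, and anything already built is reused. This is a sketch of the general idea, not AstroWise code; the `rules` table, the `make` function and the dependency order between the pipelines are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# target name -> (dependency names, recipe taking the built dependencies)
Rules = Dict[str, Tuple[List[str], Callable[..., str]]]

build_log: List[str] = []  # records which recipes actually ran


def make(target: str, rules: Rules, built: Dict[str, str]) -> str:
    """Build target after its dependencies, reusing existing results."""
    if target in built:  # already up to date: nothing to do
        return built[target]
    deps, recipe = rules[target]
    inputs = [make(dep, rules, built) for dep in deps]
    built[target] = recipe(*inputs)
    build_log.append(target)
    return built[target]


# ASSUMPTION: this dependency order is illustrative only.
rules: Rules = {
    "bias": ([], lambda: "bias"),
    "flat": (["bias"], lambda b: f"flat({b})"),
    "reduced": (["bias", "flat"], lambda b, f: f"reduced({b},{f})"),
    "sources": (["reduced"], lambda r: f"sources({r})"),
}

archive: Dict[str, str] = {}
print(make("sources", rules, archive))
print(build_log)
```

A second request for `sources`, or for any intermediate product, returns the stored result without running a recipe again, which mirrors the choice between archiving and regenerating on request.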
New technologies – paradigm
• raw data centralized – on the fly reprocessing = VSS
• sociology – project management
• ++
• Python at top layer(s) = VSS
• parallel – IP – SETI@home
• user tunable pipelines = VSS
• ++ db
• 5LS – 5 lines script = VSS
• Distribution – VO = VSS
Development issues
• Indexing Terabytes on the sky
• Associating entries
• World-wide linking of entries/distribution
• (referencing) – registries
• centralized – replication – p2p
• Other p2p's?
• Visualisation – imaging
• Visualisation – multidimensional data
• VO world – Euro-VO
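To give a feel for what indexing on the sky involves, here is a minimal sketch that buckets sources into declination zones so that a positional query only inspects nearby entries. It is a simpler stand-in for illustration, not the HTM scheme mentioned in the slides; the class name `ZoneIndex` and the zone height are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Entry = Tuple[float, float, str]  # (ra_deg, dec_deg, source_id)


class ZoneIndex:
    """Bucket sources into fixed-height declination zones."""

    def __init__(self, zone_height_deg: float = 0.5) -> None:
        self.zone_height_deg = zone_height_deg
        self.zones: Dict[int, List[Entry]] = defaultdict(list)

    def _zone(self, dec_deg: float) -> int:
        # Zone 0 starts at the south celestial pole (dec = -90).
        return int(math.floor((dec_deg + 90.0) / self.zone_height_deg))

    def insert(self, source_id: str, ra_deg: float, dec_deg: float) -> None:
        self.zones[self._zone(dec_deg)].append((ra_deg, dec_deg, source_id))

    def candidates(self, dec_deg: float, radius_deg: float) -> List[Entry]:
        """Sources in every zone overlapping dec_deg +/- radius_deg.

        Only declination is filtered here; callers still need an exact
        distance test (and an RA cut) on the returned candidates.
        """
        lo = self._zone(dec_deg - radius_deg)
        hi = self._zone(dec_deg + radius_deg)
        return [e for z in range(lo, hi + 1) for e in self.zones.get(z, [])]


index = ZoneIndex(zone_height_deg=1.0)
index.insert("a", 10.0, -30.2)
index.insert("b", 200.0, -30.7)
index.insert("c", 10.0, 45.0)
print(sorted(e[2] for e in index.candidates(dec_deg=-30.5, radius_deg=0.1)))
```

In a database the zone number would typically be stored as an indexed column, so that the candidate lookup becomes a range query rather than a scan.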