
SkyQuery: A distributed query engine for astronomy

László Dobos¹, Tamás Budavári², Alex Szalay², István Csabai¹ (¹ Eötvös Loránd University, Hungary; ² Johns Hopkins University, Baltimore). The multiwavelength sky: infrared (2MASS), visible (DSS), ultraviolet (Galex).


Presentation Transcript


  1. László Dobos¹, Tamás Budavári², Alex Szalay², István Csabai¹
     ¹ Eötvös Loránd University, Hungary
     ² Johns Hopkins University, Baltimore
     SkyQuery: A distributed query engine for astronomy

  2. The multiwavelength sky
     • infrared (2MASS)
     • visible (DSS)
     • ultraviolet (Galex)

  3. Crossmatching
     • Astronomical catalogs
     • in RDBMS
     • o(100 million) objects
     • o(1TB – 10TB) DB size
     • Done by coordinates
     • RA, Dec
     • Astrometric error
     • Different sky coverage
     • Different wavelength range
     • Moving objects etc.
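
The coordinate-based matching on this slide can be illustrated with a small Python sketch. It is not taken from the presentation: the function names and the tolerance parameter are hypothetical, and the haversine formula is just one common way to compute the angle between two (RA, Dec) positions. Two detections are treated as match candidates when their separation is within a tolerance that stands in for the astrometric error.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds; all inputs are in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula: stays accurate for the very small angles
    # that are typical when comparing catalog positions.
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    angle_rad = 2.0 * math.asin(min(1.0, math.sqrt(h)))
    return math.degrees(angle_rad) * 3600.0


def is_candidate_match(pos_a, pos_b, tolerance_arcsec):
    """pos_a and pos_b are (ra, dec) tuples in degrees (hypothetical layout)."""
    sep = angular_separation_arcsec(pos_a[0], pos_a[1], pos_b[0], pos_b[1])
    return sep <= tolerance_arcsec
```

Note that the same RA difference corresponds to a smaller angle on the sky at higher declination, which is why a plain comparison of RA and Dec differences is not enough.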

  4. Crossmatching on demand
     • Crossmatch any number of catalogs
     • All combinations cannot be precomputed
     • Maybe catalog pairs?
     • User can specify
     • List of catalogs to match
     • Region of interest
     • Priors for non-coordinate-based matching

  5. Problem description
     • Astronomers "script" what they do
     • multiple re-runs, tweak parameters etc.
     • huge web forms: no-no
     • All data in RDBMS
     • run computation inside the database
     • use multiple servers and parallelize
     • must be transparent for users
     • Problem description in SQL
     • functions and language extensions to support astronomy
     • syntax to formulate the coordinate-based probabilistic join
     • spatial constraints: celestial regions

  6. Sample SQL query

         SELECT s.objId, g.objID, t.objID,
                s.ra, s.dec, g.ra, g.dec, t.ra, t.dec, x.ra, x.dec
         FROM SDSSDR7:Galaxies AS s
         CROSS JOIN Galex:Galaxies AS g
         CROSS JOIN TwoMASS:ExtendedSources AS t
         XMATCH BAYESIAN AS x
             MUST s ON POINT(s.cx, s.cy, s.cz), 0.1
             MUST g ON POINT(g.ra, g.dec), 0.2
             MAY t ON POINT(t.ra, t.dec), 0.5
             HAVING LIMIT 1e3
         REGION CIRCLE J2000 165.7, 0.3, 60

     Slide labels: Standard SQL • Probabilistic crossmatch • Spatial constraint
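
The sample query gives positions both as POINT(ra, dec) and as POINT(cx, cy, cz), and restricts the search to a circular region. The Python sketch below shows how the two forms can relate and how a circle test might work. It rests on assumptions not stated on the slide: that cx, cy, cz are the Cartesian unit-vector components of the position, that the circle is given as center RA, center Dec in degrees, and that the radius is in arcminutes. The function names are hypothetical and are not part of SkyQuery.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Convert (RA, Dec) in degrees to a Cartesian unit vector (cx, cy, cz)."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def in_circle(ra_deg, dec_deg, center_ra, center_dec, radius_arcmin):
    """True if the point lies within the circle (radius unit is an assumption)."""
    p = unit_vector(ra_deg, dec_deg)
    c = unit_vector(center_ra, center_dec)
    # The dot product of two unit vectors is the cosine of the angle between them,
    # so "angle <= radius" becomes "dot >= cos(radius)".
    dot = sum(a * b for a, b in zip(p, c))
    return dot >= math.cos(math.radians(radius_arcmin / 60.0))


# Region from the sample query, read as center (165.7, 0.3) and a 60 arcmin radius.
inside = in_circle(166.2, 0.3, 165.7, 0.3, 60.0)
outside = in_circle(167.0, 0.3, 165.7, 0.3, 60.0)
```

Working with unit vectors turns the region test into a single dot product with no trigonometry per row once cx, cy, cz are stored, which may be one reason a catalog would keep those columns.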

  7. Zone algorithms
     • Pure SQL: can leverage the query optimizer of SQL Server
     • Divide sphere into zones
     • ZoneID: very simple hash on declination
     • Indexes built on ZoneID and right ascension enable very quick pre-filtering of match candidates
     • very well parallelized on multi-core machines
     • [Gray, Szalay & Nieto-Santisteban 2006, The Zones Algorithm for Finding Points-Near-a-Point or Cross-Matching Spatial Datasets]
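
The zone idea on this slide can be sketched in Python as a stand-in for the pure-SQL join, which is not shown in the transcript. The sketch assumes that ZoneID is floor((dec + 90) / zone height), uses a sorted list per zone in place of the database index on (ZoneID, RA), and ignores RA wraparound at 0/360 degrees and rigorous handling near the poles. All names and the default zone height are hypothetical.

```python
import bisect
import math
from collections import defaultdict


def zone_id(dec_deg, zone_height_deg):
    """Simple hash on declination: which horizontal stripe a point falls in."""
    return int(math.floor((dec_deg + 90.0) / zone_height_deg))


def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def build_zone_index(catalog, zone_height_deg):
    """catalog rows are (obj_id, ra, dec); result maps zone -> rows sorted by RA."""
    zones = defaultdict(list)
    for obj_id, ra, dec in catalog:
        zones[zone_id(dec, zone_height_deg)].append((ra, dec, obj_id))
    for rows in zones.values():
        rows.sort()
    return zones


def crossmatch(cat_a, cat_b, radius_deg, zone_height_deg=0.1):
    """Return (id_a, id_b) pairs separated by at most radius_deg."""
    index = build_zone_index(cat_b, zone_height_deg)
    matches = []
    for id_a, ra, dec in cat_a:
        # Widen the RA window with declination; capped to avoid dividing by ~0.
        widest_dec = min(abs(dec) + radius_deg, 89.9)
        ra_window = radius_deg / math.cos(math.radians(widest_dec))
        first = zone_id(dec - radius_deg, zone_height_deg)
        last = zone_id(dec + radius_deg, zone_height_deg)
        for zone in range(first, last + 1):
            rows = index.get(zone, [])
            # Pre-filter: binary search on RA inside the zone.
            lo = bisect.bisect_left(rows, (ra - ra_window,))
            hi = bisect.bisect_right(rows, (ra + ra_window, float("inf")))
            for ra_b, dec_b, id_b in rows[lo:hi]:
                # Exact test on the few remaining candidates.
                if separation_deg(ra, dec, ra_b, dec_b) <= radius_deg:
                    matches.append((id_a, id_b))
    return matches
```

The zone and RA window only narrow the candidate set; the exact distance test decides the match, so a slightly generous window costs a little time but does not add false matches.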
