The latest update from the Max Planck Institute for Astrophysics (MPA) presents significant advancements in their databases and astrophysical research methodologies. This includes the introduction of the milli-Millennium halo merger tree database, enhanced galaxy properties, and the incorporation of various environmental correlation analyses. Access methods to databases and future development plans are also detailed, such as improved SDSS mirror functionality and the application of PCA for spectral analysis. These efforts aim to unite observational data with theoretical models for galaxy formation.
Databases@MPA, access methods and plans. With contributions from JHU: Alex Szalay, Jan Vanderberg; MPA: Jeremy Blaizot, Jarle Brinchmann, Guinevere Kauffmann, Anja von der Linden, Ben Panter, Guo Qi, Volker Springel, Vivienne Wild. Databases @ MPA
Last year, Budapest • Presented milli-Millennium halo merger tree database • Requests: • More properties (lambda, ...) (not yet) • Galaxies (done) • Correlation with environment (galaxies in voids) (done) • Millennium • Why use databases? Ask Alex.
Current status • milli-Millennium • Galaxies added: merger trees, links to their parent halos • Density field at various smoothings • Updated web site (demo) • Millennium subset • Subset (~2%, 10x milli-Millennium) of halo and galaxy trees • z=0 density field • Millennium • Halo trees in database (proprietary) • SAM galaxies under way (settle on model etc.) • Density fields at all z will be added: 1056964608 rows • Durham • milli-Millennium mirror (Postgres) • Durham halo tree and galaxy catalogues
Other databases • ROSAT: source catalogues and RASS photons (~100 million) • SDSS Peripherals • SDSS_MPA (Brinchmann, Kauffmann, Tremonti et al.) • MOPED (Ben Panter) • SDSS_PCA (Vivienne Wild et al.) • GalICS (Jeremy Blaizot) • HEALPix all-sky maps (Alex Szalay, Tony Banday) • WMAP (3-year data soon!) • extinction maps • radio maps (Bonn) • ROSAT background (hopefully)
Access • Public: http://www.g-vo.org/mpasims • Local web apps to Millennium, BESTDR3 and peripherals: http://www.g-vo.org/sdssdr3/ • Public web browser queries are limited (1 min, 10,000 rows) • Local databases + web apps are less limited
Streaming • Ordinary query results are temporarily buffered in server memory • Streaming queries: faster, less limited (only a timeout) • Access: • IDL (with Ben Panter) • wget --http-user=*** --http-password=*** -O localfile.csv "http://www.g-vo.org/sdssdr3/DBQueryStream?SQL=select * from moped..agebin" • GUI asking for username/password • Interprets the CSV stream and turns it into IDL components • TOPCAT
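The wget call above can also be reproduced from a script. The following is a minimal Python sketch, assuming only what the slide shows: the DBQueryStream endpoint, an SQL parameter, and a CSV response. The helper names and the sample response columns are hypothetical; the real columns of moped..agebin are not given in the slides.

```python
import csv
import io
from urllib.parse import quote

# Endpoint taken from the wget example; everything else is illustrative.
STREAM_ENDPOINT = "http://www.g-vo.org/sdssdr3/DBQueryStream"


def build_stream_url(sql, endpoint=STREAM_ENDPOINT):
    """Return the streaming-query URL with the SQL text percent-encoded."""
    return endpoint + "?SQL=" + quote(sql, safe="")


def parse_csv_stream(text):
    """Turn a CSV response (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


url = build_stream_url("select * from moped..agebin")

# Hypothetical response body, used only to exercise the parser.
sample = "bin,age\n0,0.01\n1,0.1\n"
rows = parse_csv_stream(sample)
```

Percent-encoding the SQL avoids the shell-quoting problems that spaces and "*" cause on the command line. The resulting URL would still have to be fetched with the same username and password that wget is given.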
Plans: Millennium • Millennium: • Tune database • 750000000 halos • N x 1000000000 galaxies • 63 x 256^3 density field grid cells • More halo properties (shape, λ, ...) • More galaxy catalogues • different parameters • different algorithms (GalICS, Durham, ...) • Light cone mock catalogues • Galaxy spectra (+ PCA) • Links to SDSS mirror and peripherals • Proper metadata handling (à la SkyServer) • "SAM online" • Move web apps to MPA • Use JHU services, install CasJobs
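The density-field grid size quoted here agrees with the row count given for the density fields under Current status, assuming one database row per grid cell (an assumption; the slides do not state the table layout). A quick check:

```python
cells_per_grid = 256 ** 3   # 16,777,216 cells in one 256^3 grid
n_grids = 63                # factor quoted on the slide
total_rows = n_grids * cells_per_grid
print(total_rows)           # 1056964608
```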
Plans: SDSS mirror + peripherals • Make mirror web site public • Upgrade SDSS mirror to DR4 … • Stabilize, document, publish SDSS peripherals • Proper metadata handling • Links to Millennium • Personal databases: MyDB (à la SkyServer) • Add logos
Theory VO: spectra • Combine theory and observations • Example: query-by-example on theory spectra • Find similar spectra and, from these, the actual galaxy formation history • Chi-squared on all stored spectra? Slow, and requires storing all of them • Idea (not original, see HVO/JHU talks): use PCA to compress the data
PCA • Need training sample of theory spectra to create eigenspectra • Project all spectra • Store PCA amplitudes in DB • Provide web service: • Upload (observational) spectrum (IVOA SSA/SED) • Project onto theory eigenspectra • Use amplitudes as parameters in query for “nearby” amplitudes • Return corresponding theory spectra • Return corresponding galaxy formation histories, or their halos, or their environment …
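The projection and "nearby amplitudes" steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the service itself. It assumes orthonormal eigenspectra on a common wavelength grid and a complete spectrum with no gaps, and it uses a brute-force Euclidean search in place of a database index. All names and numbers are made up.

```python
import math


def project(spectrum, mean, eigenspectra):
    """PCA amplitudes: dot products of (spectrum - mean) with each
    eigenspectrum. Assumes the eigenspectra are orthonormal."""
    residual = [f - m for f, m in zip(spectrum, mean)]
    return [sum(r * e for r, e in zip(residual, eig)) for eig in eigenspectra]


def nearest(amplitudes, stored, k=1):
    """Brute-force k nearest stored entries (name, amplitudes) by
    Euclidean distance in amplitude space."""
    return sorted(stored, key=lambda item: math.dist(amplitudes, item[1]))[:k]


# Toy 4-pixel example with two orthonormal eigenspectra.
mean = [1.0, 1.0, 1.0, 1.0]
eigenspectra = [
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
]
# Stand-in for the PCA amplitudes stored in the database.
stored = [
    ("theory_A", [1.0, 0.0]),
    ("theory_B", [0.0, 1.0]),
    ("theory_C", [-1.0, 0.5]),
]
observed = [1.5, 1.4, 0.6, 0.5]
amps = project(observed, mean, eigenspectra)
matches = nearest(amps, stored, k=2)
```

Only the few amplitudes per spectrum need to be stored and compared, which is the point of the compression. The names returned in `matches` would then be used to look up the theory spectra and formation histories.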
Issues • Dealing with errors, gaps: “gappy PCA” (Connolly & Szalay) • Normalization: • incoming spectrum generally comes from a very different dataset and needs a common normalization • Incoming set will have gaps, errors • Ad hoc normalization possible (and works quite well) • Indexing of complex multi-dimensional point set for quick k-nearest-neighbours search (Voronoi? See Laszlo's work)
Normalized gappy PCA • Fit the normalization factor N at the same time as the PCA amplitudes a_i. Model: • Minimize (over a_i and N):
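The model and the quantity to minimize were shown as formulas on the slide and are not preserved in this transcript. As an illustration only, the sketch below assumes one plausible form: flux ≈ N · (mean + Σ a_i e_i), minimized as a weighted sum of squared residuals with weight 0 in gaps. Under that assumption the problem is linear in N and b_i = N · a_i, so it reduces to a small set of normal equations. The slide's actual formulation may differ, and the function names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def normalized_gappy_fit(flux, weights, mean, eigenspectra):
    """Fit N and a_i in the ASSUMED model flux ~ N * (mean + sum_i a_i * e_i).
    weights are per-pixel (0 marks a gap). Returns (N, [a_i])."""
    basis = [mean] + list(eigenspectra)  # unknowns: N, then b_i = N * a_i
    lhs = [[sum(w * p * q for w, p, q in zip(weights, bp, bq)) for bq in basis]
           for bp in basis]
    rhs = [sum(w * p * f for w, p, f in zip(weights, bp, flux)) for bp in basis]
    n_factor, *b = solve_linear(lhs, rhs)
    return n_factor, [bi / n_factor for bi in b]


# Toy data: true N = 2, a = (0.3, -0.2), with pixel 4 treated as a gap.
mean = [1.0, 2.0, 3.0, 2.0, 1.0, 1.0]
e1 = [0.5, 0.5, -0.5, -0.5, 0.0, 0.0]
e2 = [0.5, -0.5, 0.5, -0.5, 0.0, 0.0]
flux = [2.0 * (m + 0.3 * p - 0.2 * q) for m, p, q in zip(mean, e1, e2)]
flux[4] = 99.0                      # corrupted value inside the gap
weights = [1.0, 1.0, 1.0, 1.0, 0.0, 1.0]
n_fit, a_fit = normalized_gappy_fit(flux, weights, mean, [e1, e2])
```

Because the gap pixel carries zero weight, its corrupted value does not affect the fit, and the normalization is recovered together with the amplitudes rather than fixed beforehand.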
So far • Ran PCA on BC03 stochastic bursts (Vivienne) • On first GalICS+milli-Millennium spectra (Jeremy) • Projected SDSS spectra on both • Defined a PCA data model/schema • Stored PCAs in database • TOPCAT
PCA data model (RDB schema available)
milliMil-GalICS PC1 vs PC2 Voronoi tessellation
Issues for query-by-example • Overlap quite good, but good enough? • GalICS spread is smaller than SDSS. • BC03 comparable with SDSS, but different slope. • Systematics • Model: • physics very preliminary (see Blaizot & de Lucia?) • resolution effects • Preprocessing SDSS galaxies • Rebinning: different algorithms give comparable results • (slightly) wrong redshift? Can be easily simulated • Projection algorithm: normalization does not affect outcome • Observational systematics: use virtual telescope (+ virtual spectrograph) to test on the theory spectra. Easier to blow up the simulation than to shrink the observation cloud
Comments • Millennium database being used for science projects (Guo Qi) • SDSS peripherals used for science projects (see Vivienne’s talk, Ben Panter) • Use of MyDB for debugging and testing (Jeremy) • Please give comments, feedback.