1 / 10

ALMA Archive Operations

ALMA Archive Operations. Impact on the ARC Facilities. Data volume. 1 year nominal ALMA is 200 TB. we’ve planned for 200 TB during the ‘early’ years in total. no backups! The ARCs are the backups. Replication to ARCs using media if necessary and over network once possible. Deliverables.

miron
Télécharger la présentation

ALMA Archive Operations

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. ALMA Archive Operations • Impact on the ARC Facilities

  2. Data volume • 1 year nominal ALMAis 200 TB. • we’ve planned for200 TB during the ‘early’years in total. • no backups! The ARCs arethe backups. • Replication to ARCs using mediaif necessary and over network once possible.

  3. Deliverables • ARCs will receive the software and support for the procurement of the hardware and the Oracle licenses. We are also including the ARCs in the planning of the archive operational concepts under the assumption that both the hardware and the software are very similar. • Database is Oracle, if an ARC for whatever reason decides to deviate from this it has to carry the development and additional operational costs. Essentially this means that it is technically feasible but.... • Hardware is essentially the same story, although a bit more relaxed (in particular if we manage to replicate over the network). • The NGAS software plays an essential role in the operational concepts.

  4. High level concepts • SCO is hub for bulk andmeta-data. • OSF archive is hidden. Dataare first replicated to SCO andfrom there to the ARCs. • In general everything is replicatedto the ARCs; in practice part of the monitorand log data might be irrelevant. • Proposals are submitted to the SCO and replicatedto the ARCs. OT submission interface talks to SCO.

  5. Nominal Operations • Database replication will be done using Oracle streams replication technology to the various sites. This means that meta-data will be available at the ARCs within seconds. • Bulk data replication will be done using the NGAS mirroring service (network) or the NGAS cloning service (hard disks). • The NGAS archives at the ARCs are virtually independent from the SCO NGAS, i.e. they don’t share DB or any other resources, but they ‘know’ of each other. • Network transfer: During periods of average data rate the bulk data should arrive at the ARCs within a few minutes as well (network bandwidth limitation).

  6. nominal operations • Media transfer: Assuming that we would send media twice a week and that they get delivered within one week, the maximum delay would be 1.5 weeks after the observation. This has to be defined and implemented. • Important data could still be replicated through the network. • Access to data is always transparently possible, i.e. user accessing ARC can request data even if it has not yet been replicated.

  7. Hardware • Full blown ALMA ARC will consist of four 19” racks. • 3 x 8 NGAS servers (4 HU, 24 disks) • 3 NGAS disk handling and front-end servers (3 HU, 16 disks) • 2 database machines (3 HU, 16 disks) • Potentially additional disk array for database • Terminal server for all machines • Network switch

  8. hardware • This equipment requires sufficient cooling and good racks • One of the 4 HU NGAS servers more than 100 kg, i.e. one of the three racks will be approximately 1 ton. • Since one rack with current disk capacity holds about 1 year of ALMA data it should be possible to keep the full ALMA archive stable in terms of total space required by replacing disks with new double capacity disks after 2-3 years (no increase of data rate). • This not only saves space, but also power and maintenance.

  9. Deployment • Layout of the ARCs is essentially the same as for the SCO. • We just asked for prices for the first full installation of hardware at the OSF, which is very similar to 0.75 of an ARC: Price is about 135,000 $ including 110 TB of disk space. 8 x 24 slot machines, plus 14 x 16 slot machines. Only 35% of the slots filled. • 1TB disk ~ 300$ (really low dollar :-(()

  10. Prices • Totals per TB: 300 $/disk + 240 $/slot in computer = 540 $/TB including auxiliary machines (disk handling, DB, front-end). • No infrastructure included (racks, cooling, network, UPS...) • At the time when we have to procure the hardware for the ARCs this should have gone down to about 270 $/TB. We have to buy about 1 year worth of capacity (200 TB) initially and that should by that time fit in half of the number of machines/slots.

More Related