
TCD Site Report



Presentation Transcript


  1. TCD Site Report
     Stuart Kenny*, Stephen Childs, Brian Coghlan, Geoff Quigley

  2. TCD
     • Two roles:
       • UKI grid site
       • Grid-Ireland operations centre
         • 18 sites centrally managed by operations team (8 members, soon to be 7)
     • Responsible for TCD site and Grid-Ireland central services
     • Quattor deployed and managed
     • Extensive use of Xen VMs

  3. Hardware
     • Dell 2950 gateway host [16GB DRAM + 6TB RAID6]
       • Xen host (CE, UI, R-GMA MON, test WNs)
     • Dell 2950 SE host [16GB DRAM + 6TB RAID6]
     • 96 x Dell 1950 WNs [16GB DRAM + 500GB]
     • 50 x U/G lab Condor pool WNs
     • 8 x Dell 2950 central server hosts [16GB DRAM + 16TB RAID6]
       • host01: webserver + rt
       • host02: repository
       • host03: VOMS, myproxy, gLite WMS
       • host04: BDII, R-GMA, WMS
       • host05: monitoring server, oracle server
       • host06: portal servers
       • host07: datamgt servers
       • host08: alternate middleware
     • 8 x Dell 2950 redundant central server hosts [16GB DRAM + 16TB RAID6]
     • 1 GbE networking, with 3 x 10 GbE uplinks

  4. Storage
     • Grid-Ireland @ TCD already had some
       • Dell PowerEdge 2950 (2x Quad Xeon)
       • Dell MD1000 (SAS - JBOD)
     • After procurement data store has total
       • 8x Dell PE2950
       • 30x MD1000, each with 15x 1TB disks
         • ~11.6 TiB each after RAID6 and XFS format (~348 TiB total)
       • Dell Blade Chassis with 8x M600 blades
       • Dell tape library (24x Ultrium 4 tapes)
       • HP ExDS9100 with 4 capacity blocks of 82x 1TB disks and 4 blades
         • ~233 TiB total available for NFS/http export
     Storage Workshop - Geoff Quigley, Thurs 13:50
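The capacity figures on the storage slide can be cross-checked with a little arithmetic. The sketch below is an illustration only: it assumes each MD1000 holds a single RAID6 set spanning all 15 disks with no hot spare (the slide does not state the layout), and the helper name `raid6_usable_tib` is hypothetical.

```python
TB = 10**12   # decimal terabyte, as used for disk sizes
TIB = 2**40   # binary tebibyte, as used for the formatted figures


def raid6_usable_tib(disks: int, disk_tb: float) -> float:
    """Usable capacity of one RAID6 set in TiB (two disks' worth of parity)."""
    return (disks - 2) * disk_tb * TB / TIB


# Assumption: one 15-disk RAID6 set per MD1000 enclosure.
per_enclosure_before_fs = raid6_usable_tib(15, 1.0)

# Figure quoted on the slide, after XFS formatting.
per_enclosure_formatted = 11.6
total_formatted = 30 * per_enclosure_formatted

print(f"RAID6 usable per MD1000, before filesystem: {per_enclosure_before_fs:.2f} TiB")
print(f"Total across 30 enclosures: {total_formatted:.0f} TiB")
```

Under that assumption the RAID6 set comes to a little under 12 TiB before filesystem overhead, which is consistent with the quoted ~11.6 TiB per enclosure, and 30 enclosures at 11.6 TiB give the quoted ~348 TiB.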

  5. Infrastructure
     • Room needed upgrade
       • Another cooler
       • UPS maxed out
       • New high-current AC circuits added
       • 2x 3kVA UPS per rack acquired for Dell equipment
       • ExDS has 4x 16A 3Ø - 2 on room UPS, 2 raw
     • 10 GbE to move data!
     Storage Workshop - Geoff Quigley, Thurs 13:50

  6. Redundant Operations Centre
     • Aim is to keep up-to-date replicas of core server VMs to allow failover in case of network or hardware failures
     • Design decisions
       • Replicate storage “underneath” Xen VMs
       • Replicate at block level: avoid need for service-specific replication policies
       • Manual failover initially
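To illustrate why block-level replication avoids service-specific policies, here is a toy sketch that synchronises two in-memory "volumes" by comparing fixed-size blocks. It is not the mechanism used at TCD (the slide does not name one); the block size and the function name `replicate_blocks` are assumptions made for the example.

```python
from typing import List

BLOCK_SIZE = 4096  # hypothetical block size for the illustration


def replicate_blocks(primary: bytes, replica: bytearray,
                     block_size: int = BLOCK_SIZE) -> List[int]:
    """Copy every block that differs from primary into replica.

    Returns the indices of the blocks that were copied. The function
    never inspects what the blocks contain, so the same code works
    whatever service's data lives on the volume.
    """
    if len(primary) != len(replica):
        raise ValueError("volumes must be the same size")
    copied = []
    for index, offset in enumerate(range(0, len(primary), block_size)):
        source = primary[offset:offset + block_size]
        if replica[offset:offset + block_size] != source:
            replica[offset:offset + block_size] = source
            copied.append(index)
    return copied


# Usage: a 4-block volume where only block 2 has changed on the primary.
primary_volume = bytearray(4 * BLOCK_SIZE)
replica_volume = bytearray(primary_volume)
primary_volume[2 * BLOCK_SIZE] = 0xFF
print(replicate_blocks(bytes(primary_volume), replica_volume))
```

Because the copy works on raw blocks beneath the VM, a database, a web server and a VOMS server would all be replicated by the same logic, which is the point of the design decision on the slide.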

  7. Monitoring
     • A lot of work recently on monitoring configuration
       • Want to configure as much as possible from common Quattor templates
     • Nagios
       • Submitting local WLCG grid probes for G-I VOs
     • Lemon
     • Ganglia
     • Also used
       • Weathermap
       • Cacti
       • ASI (Security Day talk)
       • …

  8. Grid-Ireland Setup
     [Diagram. Labels: Site admins; EGEE SAM; GI SAM; Issue alarms; Get site status; Quattor templates; Monitoring server; TCD Site; Nagios NSCA; Nagios NRPE; GI Sites; gridui; Lemon Agent; Lemon Host Check]

  9. Lemon-Nagios Integration
     • Lemon service added
     • Additional lemon metrics added to hosts
     • Cron executes lemon-host-check
     • Output sent to nagios via nsca
     • Exception results in Lemon service failure
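The flow above can be sketched as a small helper that turns a list of Lemon exceptions into a Nagios passive check result. This is a minimal sketch, not the site's actual script: the output format of lemon-host-check is not given in the slides, so the function simply takes a list of exception names, and `nsca_line`, the service name "Lemon" and the example exception name are all hypothetical. The tab-separated line layout (host, service, return code, plugin output) is the one `send_nsca` reads on standard input for service checks.

```python
from typing import List

# Standard Nagios plugin return codes used in this sketch.
OK = 0
CRITICAL = 2


def nsca_line(host: str, service: str, exceptions: List[str]) -> str:
    """Build one passive service check result in send_nsca input format.

    Any Lemon exception marks the service CRITICAL, mirroring the rule
    that an exception results in Lemon service failure.
    """
    if exceptions:
        code = CRITICAL
        output = "Lemon exceptions: " + ", ".join(exceptions)
    else:
        code = OK
        output = "No Lemon exceptions"
    return f"{host}\t{service}\t{code}\t{output}\n"


# Usage with a hypothetical exception name.
print(nsca_line("gridui", "Lemon", ["exception.high_load"]), end="")
print(nsca_line("gridui", "Lemon", []), end="")
```

In a cron job, a line like this would be piped to `send_nsca` pointed at the monitoring server, so the Nagios side only needs a passive service definition per host.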

  10. Monitoring - Weathermap

  11. Active Security
     • Existing Grid security activities focused on prevention
       • Authentication, authorization
     • Active security focused on
       • Detection
       • Reaction
     • 3 components
       • Security monitoring
       • Alert Analysis
       • Control Engine
     Security Day – Stuart Kenny, Wed 10:15
