RACF: an overview

1. RACF: an overview
• Formed in the mid-1990s to provide centralized computing resources for the four RHIC experiments (BRAHMS, PHOBOS, STAR, PHENIX)
• Role was expanded in the late 1990s to act as the US Tier-1 computing center for the ATLAS experiment at the LHC
• Small but growing astrophysics presence (Daya Bay, LSST)
• Located in the Brookhaven Computing Facility
• 35 FTEs providing a full range of scientific computing services for more than 4000 users

2. RACF: setting the scale
• RHIC
  • 1200 Compute Servers (130 kHS06, 16k job slots)
  • 7 PB of Distributed Storage on Compute Nodes, up to 16 GB/s between compute servers and distributed storage servers
  • 4 Robotic Tape Libraries w/ 40 tape drives and 38k cartridge slots, 20 PB of active data
• ATLAS
  • 1150 Compute Servers (115 kHS06, 12k job slots)
  • 90 Storage Servers driving 8500 disk drives (10 PB), up to 18 GB/s observed in production between the compute and storage farms
  • 3 Robotic Tape Libraries w/ 30 tape drives and 26k cartridge slots, 7 PB of active data
• Magnetic Tape Archive
  • Data inventory of currently 27 PB managed by the High Performance Storage System (HPSS), the archive layer below dCache and xrootd
  • Up to 4 GB/s throughput between tape/HPSS and dCache/xrootd
• Network
  • LAN: 13 enterprise switches w/ 5800 active ports (750 10GE ports), 160 Gbps inter-switch bandwidth
  • WAN: 70 Gbps in production (20 Gbps to CERN and other ATLAS T1s, 10 Gbps dedicated for LHCONE) + 20 Gbps for US ATLAS T1/T2 traffic and up to 20 Gbps serving domestic and international data transfer needs

3. ATLAS Cloud support and R&D beyond Tier-1 Core Services
• Grid job submission
  • Deployment & operations of the grid job submission infrastructure and pandamover (to serve T2 input datasets)
  • Deployment & operations of AutoPyFactory (APF) for pilot submission in the US and other regions (see the sketch after this slide)
    • Includes work on the PanDA job wrapper (Jose C.) and local pilot submission for T3s
  • Condor-G performance improvements, in close collaboration w/ Condor developers
  • gLexec to run the analysis payload using the user proxy
    • Lead effort ATLAS-wide and part of the OSG Software Integration activities
  • CREAM Computing Element (CE) replacing the GT2-based CE, as part of the OSG Software Integration activities
• Primary GUMS (Grid Identity Mapping Service) developer (John Hover)
  • Used OSG-wide, incl. US ATLAS (~600 accounts) and US CMS (~250 accounts)
• Leading OSG Architecture (Blueprint), Native Packaging (RPMs replacing Pacman) / Configuration Management
• OSG Integration and Validation effort
• Support for T2 & T3 (DQ2 site services, FTS, LFC, network optimization & monitoring)
• Coordination of worldwide Frontier deployment (ended in Nov 2011)
• Worldwide ATLAS S/W installation and validation service (Yuri Smirnov)
• Participation in ATLAS computing R&D projects
  • Cloud computing
  • Federated Xrootd storage system
  • NoSQL database evaluation (e.g. Cassandra, now moving to Hadoop)
  • Storage system performance optimization: Linux & Solaris kernel, I/O driver, and file system tweaks
• Support for Tier-3 at BNL
  • User accounts, interactive services, fabric (all hardware components, OS, batch system), Xrootd, and PROOF (in collaboration w/ Sergey and Shuwei)
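The pilot-submission machinery referenced above (AutoPyFactory driving Condor-G) is not reproduced in the slides. As a rough illustration of the idea (a factory queues generic pilot jobs against a grid Computing Element, and the pilots later pull their real payloads from PanDA), here is a minimal sketch using the modern HTCondor Python bindings; the CE endpoint, proxy path, wrapper script, and pilot count are hypothetical placeholders, not APF's actual configuration.

```python
# Minimal sketch of pilot submission via Condor-G, in the spirit of AutoPyFactory.
# All endpoints, file names, and counts below are hypothetical placeholders.
import htcondor

# Describe a grid-universe pilot job; the pilot wrapper later pulls the real
# payload from PanDA, so the submit description itself stays generic.
pilot = htcondor.Submit({
    "universe": "grid",
    # CE endpoint syntax depends on the CE flavor (e.g. CREAM vs. GT2);
    # this string is only a placeholder.
    "grid_resource": "cream https://ce.example.org:8443/ce-cream/services/CREAM2 pbs atlas",
    "executable": "pilot_wrapper.sh",        # hypothetical pilot wrapper script
    "x509userproxy": "/tmp/x509up_pilot",    # factory (or, with gLexec, user) proxy
    "output": "pilot_$(Cluster)_$(Process).out",
    "error": "pilot_$(Cluster)_$(Process).err",
    "log": "pilot_factory.log",
})

schedd = htcondor.Schedd()
# Keep a modest number of pilots queued; a real factory would size this from
# the number of activated jobs reported by PanDA for the site.
schedd.submit(pilot, count=10)
```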

4. Reprocessing from 11/02 – 11/14 (MWT2 included 11/06)

5. Contribution to Simulation (Aug–Oct)
• ~1000 opportunistic job slots from NP/RHIC, ~2M CPU hours since August
• Chart: average number of fully utilized cores per site (3843, 1867, 1762, 1067, 896)

  6. Facility Operations during Hurricane Sandy

  7. Facility Operations During Hurricane Sandy

8. Configuration Management (Jason Smith et al.)

  9. Configuration management - Components

  10. Benefits
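Slides 8–10 are image-only in this transcript, so the specific configuration-management tooling used at the RACF is not named here. Purely as a tool-agnostic illustration of the underlying pattern (declare a desired state, converge nodes to it idempotently, and restart services only when something changed), here is a small Python sketch; all paths, file contents, and the service name are invented for the example.

```python
# Tool-agnostic sketch of the idempotent "converge to desired state" pattern
# that configuration-management systems implement; paths and contents are
# illustrative only, not the RACF's actual configuration.
import os
import subprocess

DESIRED_FILES = {
    # path -> (contents, mode); purely hypothetical examples
    "/etc/condor/condor_config.local": ("CONDOR_HOST = head.example.org\n", 0o644),
    "/etc/motd": ("Managed node - local edits will be overwritten\n", 0o644),
}

def ensure_file(path: str, contents: str, mode: int) -> bool:
    """Create or correct a file; return True if a change was made."""
    changed = False
    current = None
    if os.path.exists(path):
        with open(path) as fh:
            current = fh.read()
    if current != contents:
        with open(path, "w") as fh:
            fh.write(contents)
        changed = True
    if os.stat(path).st_mode & 0o777 != mode:
        os.chmod(path, mode)
        changed = True
    return changed

def converge() -> None:
    """Apply the desired state; restart the service only when something changed."""
    changed = [ensure_file(p, c, m) for p, (c, m) in DESIRED_FILES.items()]
    if any(changed):
        # Notify-style handler: restart only on change (service name hypothetical).
        subprocess.run(["systemctl", "restart", "condor"], check=False)

if __name__ == "__main__":
    converge()
```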

11. RACF and OSG
• RACF staff is heavily engaged in the Open Science Grid
• Major contributor to the Technology Investigation area and the architectural development of OSG's Fabric of Services
• Member of the Management Team and represents BNL on the OSG Council
• Committed to developing OSG as a National Computational Infrastructure, jointly with other providers like XSEDE
• ATLAS Tier-1 center fully integrated with OSG
• Provides opportunistic cycles to other OSG VOs

12. OSG continuing for another 5 years
• Besides the focus on physics and the momentum of the LHC, there is a broad spectrum of different science applications making use of OSG
• Very strong support from DOE and NSF to continue
• Extend support to more stakeholders – communities (e.g. nuclear physics and astrophysics) and scientists local to the campuses
• Partnership/participation in NSF as an XSEDE Service Provider
  • XSEDE is a comprehensive, expertly managed and evolving set of advanced heterogeneous high-end digital services, integrated into a general-purpose infrastructure
  • XSEDE is about increased user productivity
• Emergence of NSF Institutes – centers of excellence on particular cyberinfrastructure topics (e.g. Distributed High Throughput Computing (DHTC)) across all science domains
• Evolution of the DOE ASCR SciDAC program
  • Strong participant in the Extreme Scale Collaboratories initiative

13. R&D at the RACF: Cloud Computing (ATLAS-wide activity)
• Motivation
  • New and emerging paradigm in the delivery of IT services
  • Improved approach to managing and provisioning resources, allowing applications to easily adapt and scale to varied usage demands
  • A new, increasingly competitive market offers cost-effective computing resources; companies small and large already make extensive use of them
  • By providing "Infrastructure as a Service" (IaaS), clouds aim to efficiently share hardware resources for storage and processing without sacrificing flexibility in the services offered to applications
• BNL_CLOUD: standard production PanDA site with ~500 Virtual Machines (VMs)
  • Configured to use wide-area stage-in/out, so the same cluster can be extended transparently to a commercial cloud (e.g. Amazon) or other public academic clouds
  • Steadily running production jobs on auto-built VMs (see the provisioning sketch after this slide)
  • The key feature of the work has been to make all processes and configurations general and public, so they can be (re-)used outside BNL (e.g. at T2s to dynamically establish analysis facilities using beyond-pledge resources)
  • Standard, widely used technology (Linux, Condor, OpenStack, etc.) is used
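The slide mentions auto-built VMs on standard technology (Linux, Condor, OpenStack) but does not include the provisioning code. The sketch below, written against the present-day openstacksdk client purely for illustration, shows the general shape of booting a batch of worker VMs from a pre-built image; the cloud entry, image, flavor, and network names are hypothetical and not the actual BNL_CLOUD tooling.

```python
# Minimal sketch of booting batch-worker VMs on an OpenStack cloud.
# Cloud name, image, flavor, and network are hypothetical placeholders.
import openstack

def boot_workers(count: int = 5) -> None:
    conn = openstack.connect(cloud="bnl_cloud")       # entry from clouds.yaml (hypothetical)
    image = conn.compute.find_image("atlas-worker")   # pre-built image with OS, Condor, etc.
    flavor = conn.compute.find_flavor("m1.xlarge")
    network = conn.network.find_network("cloud-net")

    for i in range(count):
        server = conn.compute.create_server(
            name=f"atlas-worker-{i:03d}",
            image_id=image.id,
            flavor_id=flavor.id,
            networks=[{"uuid": network.id}],
            # user_data could carry a contextualization script that starts
            # Condor and points the VM at the central pool.
        )
        conn.compute.wait_for_server(server)
        print(f"{server.name} is active")

if __name__ == "__main__":
    boot_workers()
```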

14. Cloud Integration in U.S. ATLAS Facilities
• Flexible algorithms decide when to start and terminate running VMs (sketched below)
• Cloud hierarchies: programmatic scheduling of jobs on site-local -> other private -> commercial clouds, based on job priority and cloud cost
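The "flexible algorithms" themselves are not spelled out in the slide. The following sketch only illustrates the hierarchy idea (prefer the cheapest tier with free capacity, and let only sufficiently high-priority work spill over to paid commercial clouds); the tier names, capacities, costs, and priority threshold are invented for the example.

```python
# Illustrative sketch of hierarchy-aware placement: prefer the cheapest tier
# with free capacity, and let only high-priority work spill onto paid clouds.
# Tier names, capacities, and thresholds are invented for the example.
from dataclasses import dataclass
from typing import Optional

@dataclass
class CloudTier:
    name: str
    cost_per_hour: float   # relative cost; 0 for owned resources
    free_slots: int
    min_priority: int      # only jobs at/above this priority may use the tier

TIERS = [  # ordered from cheapest to most expensive
    CloudTier("site-local", 0.0, free_slots=0, min_priority=0),
    CloudTier("other-private", 0.0, free_slots=1, min_priority=0),
    CloudTier("commercial", 0.10, free_slots=10_000, min_priority=800),
]

def place_job(priority: int) -> Optional[CloudTier]:
    """Return the cheapest tier that can run the job, or None to keep it queued."""
    for tier in TIERS:
        if tier.free_slots > 0 and priority >= tier.min_priority:
            tier.free_slots -= 1
            return tier
    return None

# Example run with the toy numbers above:
print(place_job(priority=100).name)  # other-private (site-local is full)
print(place_job(priority=100))       # None: low-priority work queues rather than pay
print(place_job(priority=900).name)  # commercial: high priority justifies the cost
```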

15. Looking into Hadoop-based Storage Management
• Hiro, Shigeki Misawa, Tejas Rao, and Doug helping w/ tests
• Reasons why we are interested
  • The Internet industry is developing scalable storage management solutions much faster than we will ever be able to; we just have to make them work for us
  • HTTP-based data access works well for them, why shouldn't it for us? (see the sketch after this slide)
  • With ever-increasing storage capacity per drive we expect performance bottlenecks
  • With disk-heavy WNs we could provide many more spindles, which helps scale up I/O performance and improves resilience against failures
• Apache distribution (open source)
  • Several significant limitations (performance & resilience)
• Several commercial products
  • Free/unsupported downloads besides value-added/supported licensed versions
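As one concrete example of the HTTP-based access argued for above, the sketch below reads a file through Hadoop's standard WebHDFS REST interface using plain HTTP requests; the namenode host/port and file path are placeholders, and this is not the RACF's actual test setup.

```python
# Sketch of HTTP-based data access against HDFS via the WebHDFS REST API.
# Namenode host/port and the file path are hypothetical placeholders.
import requests

NAMENODE = "http://namenode.example.org:50070"  # default namenode HTTP port in Hadoop 1.x/2.x
PATH = "/atlas/example/dataset.root"

def webhdfs_status(path: str) -> dict:
    """Fetch file metadata (size, replication, ...) over HTTP."""
    url = f"{NAMENODE}/webhdfs/v1{path}"
    resp = requests.get(url, params={"op": "GETFILESTATUS"})
    resp.raise_for_status()
    return resp.json()["FileStatus"]

def webhdfs_read(path: str) -> bytes:
    """Read file contents; OPEN redirects to a datanode holding the data,
    and requests follows the redirect automatically."""
    url = f"{NAMENODE}/webhdfs/v1{path}"
    resp = requests.get(url, params={"op": "OPEN"}, allow_redirects=True)
    resp.raise_for_status()
    return resp.content

if __name__ == "__main__":
    status = webhdfs_status(PATH)
    print(f"{PATH}: {status['length']} bytes, replication {status['replication']}")
    data = webhdfs_read(PATH)
    print(f"read {len(data)} bytes over plain HTTP")
```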

  16. MapR (.com)

17. Performance vs. CPU Consumption
• Maxing out a 10 Gbps network interface … at the expense of 10% of the server CPU for Write and ~5% for Read operations
