The presentation discusses the history and development of the Illinois Tier-2 computing project within the Midwest Tier-2 (MWT2) consortium as of November 2012. Key milestones include the establishment of the T3gs cluster, successful pilot tests of ATLAS production jobs, and the integration of new hardware at the National Center for Supercomputing Applications (NCSA). With a focus on production operations, resource configurations, and shared campus cluster integration, the talk covers the successes and challenges of the project, along with future plans for expansion and collaboration.
Site Lightning Report: MWT2
Mark Neubauer, University of Illinois at Urbana-Champaign
US ATLAS Facilities Meeting @ UC Santa Cruz, Nov 14, 2012
Midwest Tier-2
The Team: Rob Gardner, Dave Lesny, Mark Neubauer, Sarah Williams, Ilija Vukotic, Lincoln Bryant, Fred Luehring
A three-site Tier-2 consortium
Midwest Tier-2
Focus of this talk: Illinois Tier-2
Tier-2 @ Illinois
History of the project:
• Fall 2007 onward: Development/operation of the T3gs cluster
• 08/26/10: Torre's US ATLAS IB talk
• 10/26/10: Tier2@Illinois proposal submitted to US ATLAS Computing Management
• 11/23/10: Proposal formally accepted
• 10/5/11: First successful test of ATLAS production jobs run on the Campus Cluster (CC)
  • Jobs read data from our Tier3gs cluster
Tier-2 @ Illinois
History of the project (cont.):
• 03/1/12: Successful T2@Illinois pilot
  • Squid proxy cache, Condor head node, job flocking from UC
• 4/4/12: First hardware into the Taub cluster
  • 16 compute nodes (dual X5650, 48 GB memory, 160 GB disk, IB) → 192 cores
  • 60 2-TB drives in a DDN array → 120 TB raw
• 4/17/12: perfSONAR nodes online
Illinois Tier-2
History of the project (cont.):
• 4/18/12: T2@Illinois in production (T2 on Taub)
Illinois Tier-2 Stable operation: Last two weeks
Illinois Tier-2 Last day on MWT2:
Why at Illinois?
(photos: NCSA Building, ACB, NPCF)
• National Center for Supercomputing Applications (NCSA)
• National Petascale Computing Facility (NPCF): Blue Waters
• Advanced Computation Building (ACB)
  • 7000 sq. ft. with 70" raised floor
  • 2.3 MW of power capacity
  • 250 kW UPS
  • 750 tons of cooling capacity
• Experience in HEP computing
Tier-2 @ Illinois
• Pros (ATLAS perspective)
  • Free building, power, cooling, and core infrastructure support, with plenty of room for future expansion
  • Pool of expertise, heterogeneous hardware
  • Bulk pricing, important given the DDD (Dell Deal Demise)
  • Opportunistic resources
• Challenges
  • Constraints on hardware, pricing, architecture, timing
• Deployed in a shared campus cluster (CC) in the ACB
  • "Taub" is the first instance of the CC
  • Tier2@Illinois on Taub is in production within MWT2
Tier-2 @ Illinois
Current CPU and disk resources:
• 16 compute nodes (taubXXX)
  • dual X5650, 48 GB memory, 160 GB disk, IB → 192 cores, ~400 job slots
• 60 2-TB drives in a Data Direct Networks (DDN) array → 120 TB raw, ~70 TB usable
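A rough sanity check on how these figures relate (the hyperthreading factor and the RAID/filesystem overhead below are assumptions for illustration, not stated on the slide):

```latex
% Cores: 16 nodes x 2 sockets x 6 cores (X5650) = 192 physical cores
% With hyperthreading (assumed), roughly twice that in schedulable slots
16 \times 2 \times 6 = 192 \ \text{cores} \;\Rightarrow\; \sim 384 \ \text{logical cores} \approx 400 \ \text{job slots}

% Storage: 60 drives x 2 TB = 120 TB raw; usable after RAID parity and
% filesystem overhead (assumed) is of order 70 TB
60 \times 2\,\text{TB} = 120\,\text{TB raw} \;\Rightarrow\; \sim 70\,\text{TB usable}
```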
Tier-2 @ Illinois
• Utility nodes / services (.campuscluster.illinois.edu):
  • Gatekeeper (mwt2-gt)
    • Primary schedd for the Taub condor pool
    • Flocks other jobs to the UC and IU Condor pools
  • Condor head node (mwt2-condor)
    • Collector and Negotiator for the Taub condor pool
    • Accepts flocked jobs from other MWT2 gatekeepers (see the sketch after this list)
  • Squid (mwt2-squid)
    • Proxy cache for CVMFS and Frontier for Taub (backup for IU/UC)
  • CVMFS replica server (mwt2-cvmfs)
    • CVMFS replica of the master CVMFS server
  • dCache s-node (mwt2-s1)
    • Pool node for GPFS data storage (installed; dCache deployment in progress)
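To make the pool layout concrete, here is a minimal monitoring sketch using the HTCondor Python bindings. It is illustrative only: it assumes the `htcondor` module is installed and reachable from a utility node, the full hostnames are assembled from the node names and domain quoted on the slide, and the queried attributes are standard ClassAd attributes rather than anything site-specific.

```python
# Minimal sketch: query the Taub pool for worker slots and for running jobs
# on the local gatekeeper's schedd. Hostnames follow the slide's naming;
# everything else is an illustrative default, not the site's actual tooling.
import htcondor

# Central manager of the Taub condor pool (mwt2-condor on the slide)
coll = htcondor.Collector("mwt2-condor.campuscluster.illinois.edu")

# Count execute slots advertised by the Taub worker nodes
slots = coll.query(htcondor.AdTypes.Startd, "true",
                   ["Name", "State", "Activity"])
claimed = sum(1 for s in slots if s.get("State") == "Claimed")
print("Taub pool slots: %d total, %d claimed" % (len(slots), claimed))

# Locate the gatekeeper's schedd (mwt2-gt) and list its running jobs.
# Jobs flocked out to UC/IU still show up here with a remote RemoteHost,
# so this is a simple local-side consistency check.
schedd_ad = coll.locate(htcondor.DaemonTypes.Schedd,
                        "mwt2-gt.campuscluster.illinois.edu")
schedd = htcondor.Schedd(schedd_ad)
for job in schedd.query("JobStatus == 2",
                        ["ClusterId", "ProcId", "RemoteHost"]):
    print(job.get("ClusterId"), job.get("ProcId"), job.get("RemoteHost"))
```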
Next CC Instance (to be named): Overview
• Mix of Ethernet-only and Ethernet + InfiniBand connected nodes
  • assume 50-100% will be IB-enabled
• Mix of CPU-only and CPU+GPU nodes
  • assume up to 25% of nodes will have GPUs
• New storage device and support nodes
  • added to the shared storage environment
  • allow for other protocols (SAMBA, NFS, GridFTP, GPFS)
• VM hosting and related services
  • persistent services and other needs directly related to use of compute/storage resources
Next CC Instance (basic configuration)
• Dell PowerEdge C8220, 2-socket Intel Xeon E5-2670
  • 8-core Sandy Bridge processors @ 2.60 GHz
  • 1 "sled" = 2 SB processors
  • 8 sleds in 4U = 128 cores
  (photo: Dell C8220 compute sled)
• Memory configuration options:
  • 2 GB/core, 4 GB/core, 8 GB/core
• Options:
  • InfiniBand FDR (GigE otherwise)
  • NVIDIA M2090 (Fermi GPU) accelerators
• Storage via DDN SFA12000
  • can add in 30 TB (raw) increments
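A quick reading of the per-chassis numbers; the per-sled memory totals are derived here from the stated per-core options and the 16 cores per sled, and are not quoted on the slide:

```latex
% per sled: 2 sockets x 8 cores = 16 cores; per 4U chassis: 8 sleds
2 \times 8 = 16\ \text{cores/sled}, \qquad 8 \times 16 = 128\ \text{cores per 4U chassis}
% memory options (derived): 2, 4, 8 GB/core -> 32, 64, 128 GB per sled
16 \times \{2, 4, 8\}\,\text{GB} = \{32, 64, 128\}\,\text{GB per sled}
```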
Summary and Plans
• New Tier-2 @ Illinois
  • Modest (currently) resource integrated into MWT2 and in production use
  • Cautious optimism: deploying a Tier-2 within a shared campus cluster has been a success
• Near-term plans
  • Buy into the 2nd campus cluster instance
    • $160k of FY12 funds with a 60/40 CPU/disk split
  • Continue dCache deployment
  • LHCONE @ Illinois due to turn on 11/20/12
  • Virtualization of Tier-2 utility services
  • Better integration into MWT2 monitoring