
HPC Activities at LBNL



Presentation Transcript


  1. HPC Activities at LBNL
     April 1, 2009
     Krishna Muriki, Jackie Scoggins
     Lawrence Berkeley National Laboratory

  2. Cluster Support at LBNL and UCB
     35 clusters in production (5800 processors)
     • Over 500 scientific users
     • 1400 compute nodes
     • Managed by 5.15 FTE (2.65 FTE PI cluster support, 2 FTE LRC, 0.5 FTE management)
     • Perceus supercluster infrastructure allows us to scale up cluster support
     • 33% increase in number of cores projected for FY09
     Recent clusters include:
     • John Chiang, Zack Powell, Inez Fung, Bill Collins (UCB) climate modeling: 128-processor InfiniBand cluster (Jan 2009)
     • Lab-wide cluster "Lawrencium": 198-node, 1584-core InfiniBand cluster (Dec 2008)

  3. Laboratory Research Computing: The Lawrencium Cluster Project
     Goal: Provide a large shared cluster to make high-performance parallel computing more accessible to LBNL researchers.
     Extend computing services to:
     • Meet intermittent need for compute cycles
     • Provide a resource for those who cannot purchase their own cluster
     • Provide additional compute cycles
     • Provide the ability to run larger-scale jobs for those who currently own smaller clusters
     • Provide an environment for scientists to benchmark applications in preparation for running on larger systems or applying for grants and supercomputing center allocations

  4. LRC At a Glance

  5. What's Next
     • Most of our PI-owned clusters have been optimized for tightly coupled parallel computation.
     • The new institutional cluster (16 Tflops, 1500 cores) is also tightly coupled.
     • But researchers also have 'serial' computing needs.
     • Can we use commercial 'cloud' services (such as Amazon) to meet those needs more cheaply and flexibly?
     Use cases:
     • Accommodating very long serial jobs
     • Absorbing demand peaks
     • Scaling up existing serial resources

  6. Service Model (envisioned)
     [Diagram: users submit to a scheduler, which routes jobs through queues Q1-Q5 to an interactive node, institutional serial and institutional parallel resources, PI-owned computing resources, cloud services, and other sites.]
     • Run jobs on the most productive, cost-effective resource
     • Augment local resources with cloud services
     • Transparently?
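     As a rough sketch of what this routing could look like from the user's side, the commands below submit a serial job and a parallel job to different queues with Moab's msub. The queue names (q_serial, q_parallel) and resource requests are hypothetical placeholders, not actual LRC queue names.

     # Hypothetical example: queue names and limits are placeholders.
     # A long-running serial job aimed at an institutional serial queue:
     > msub -q q_serial -l nodes=1:ppn=1,walltime=72:00:00 serial_job.sh

     # A tightly coupled parallel job aimed at an institutional parallel queue:
     > msub -q q_parallel -l nodes=12:ppn=8,walltime=00:30:00 parallel_job.sh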

  7. Moab Grid Suite Project
     [Diagram: users reach the login nodes of each cluster via ssh with OTP (e.g. "> ssh login-rr") or via the Moab Access Web Portal (MAP) at https://mapserver.lbl.gov/map; jobs submitted with msub go to the Moab & Torque server of Cluster 1 or Cluster 2, and job migration moves work between the two servers and their compute nodes.]
     • Meta-scheduling between clusters
     • One master with one or more slaves (grid), or
     • Peer-to-peer grid

  8. Moab Grid Suite Project
     • The user submits a job on the login node using msub, or through the web portal.
     • The job is sent to the scheduler.
     • The scheduler can be a master that migrates the job to a slave server, or a peer server that can migrate it to another peer.
     • All of this happens seamlessly to the user, unless the user selects a cluster at submission time (see the sketch after this slide).
     [Diagram: compute nodes on Cluster 1 and Cluster 2, with 19 free nodes shown available for the job.]
     Example submission:
     > msub myscript
     > cat myscript
     #PBS -l nodes=12:ppn=8
     #PBS -l walltime=00:30:00
     #PBS -N jackie
     cd $HOME
     mpirun -np 96 ./a.out
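     Since cluster selection at submission time is mentioned but not shown, here is a minimal sketch of how a user might pin the job to one cluster. It assumes Moab's partition resource syntax and a hypothetical partition name, cluster2; actual partition names would come from the site configuration.

     # Hypothetical sketch: 'cluster2' is a placeholder partition name.
     # Ask the grid scheduler to run the job only on Cluster 2:
     > msub -l partition=cluster2 myscript

     # Without a partition request, the master (or peer) scheduler is free
     # to migrate the job to whichever cluster has enough free nodes:
     > msub myscript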

  9. Moab Grid Suite Project
     Web-based Moab portals:
     • Moab Access Portal
       • Central MAP server allowing HTTPS connections
       • Users can: submit jobs, view jobs, create jobs, migrate data, view resources
     • Moab Workload Manager
       • Administrators can: manage resources, manage jobs, manage accounts/groups
     Remote access methods:
     • Moab to Moab: secret-key-based server authentication for the Moab grid server and clients
     • Users: access to a cluster site can be done via ssh or the web portal using a One-Time Password (OTP)
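     For users who log in with ssh and OTP rather than the portal, the same views are available through the standard Moab client commands; a brief sketch (the job ID is made up):

     > showq                # view active, idle, and blocked jobs
     > checkjob 12345       # detailed status of one job (12345 is a made-up job ID)
     > showbf               # see what resources are free for immediate use
     > msub myscript        # submit a job, as on the previous slide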

  10. HPC Activities at LBNL
      QUESTIONS?
      Contact information:
      • User Service Consultant: Krishna Muriki, (510) 486-4007, Kmuriki@lbl.gov
      • Systems and Scheduler: Jacqueline Scoggins, (510) 486-8651, Jscoggins@lbl.gov
