1 / 6

Worker Node Requirements

Worker Node Requirements. TCO – biggest bang for the buck Efficiency per $ important (ie cost per unit of work) Processor speed (faster is not necessarily better) HT enabled (or not) Memory per job slot Local disk I/O per job slot Networking. Dell R410 is the workhorse

cai
Télécharger la présentation

Worker Node Requirements

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Worker Node Requirements • TCO – biggest bang for the buck • Efficiency per $ important (ie cost per unit of work) • Processor speed (faster is not necessarily better) • HT enabled (or not) • Memory per job slot • Local disk I/O per job slot • Networking

  2. Dell R410 is the workhorse • X5660 cpu with HT (10-15% more work), 24 job slots • 48GB memory (but now should be 72G or more) • Multiple local drives (H200, three 1TB SATA 7200RPM) • 1Gb NICS • CC WN node would use IB for private interface • Good connection to local SE • Must have NAT access to public network (IB or 1Gb ?) • BNL tests show Dell CP6100 is bad for Atlas jobs

  3. SE Pool Nodes • dCache 1.9.12 with locality cache (like AGLT2) • Use GPFS for backing store • IB used to get to DDN • 10Ge to public network • Fast, non-firewalled access to UC and IU sites • WNs connect to SE via IB (dcap, direct access) • Dell R710 type node with 10Ge and IB • AGLT2 use 48GB • Small amount of local disk • How many pool nodes is not known and will grow

  4. Integration and Operation • Personnel on the CC who will help • Condor • Condor with Flocking from UC • Need local Condor master on public and IB networks • Non-firewall to UC • Could be on a VM or separate node (like an R410) • Squid • Local cache for CVMFS and Frontier • Public and IB • Can also serve UC and IU • Not a fast CPU, but needs memory and disk • Some CERN access for monitoring

  5. Networking Requirements • WAN improvements between UC, IU and UIUC in works • 10Ge public connection for SE pools • IB from pool nodes to DDN and WN to pool nodes • Condor master and Squid also need public (1Gb) • WNs can be private but need fast NAT access • Firewalling between UC, IU and UIUC need to be open

  6. Other Items of Interest • UC monitoring of UIUC nodes • Very fluid environment • Who has root • How can we make changes (work in progress) • One bad WN can spoil the whole lot

More Related