
CE + WN installation and configuration

CE + WN installation and configuration. Vanessa Hamar, Universidad de Los Andes – Mérida, Venezuela. 12th EELA Tutorial, Lima, 24-29 September 2007. Outline: What is a Computing Element (CE)? What is a Torque Server? What is a Worker Node?


Presentation Transcript


  1. CE + WN installation and configuration
     Vanessa Hamar, Universidad de Los Andes – Mérida, Venezuela
     12th EELA Tutorial, Lima, 24-29 September 2007

  2. Outline
     • What is a Computing Element (CE)?
     • What is a Torque Server?
     • What is a Worker Node?
     • How to install and configure a Computing Element with a Torque Server
     • How to install and configure a Worker Node with Torque

  3. What is a CE?
     • The CE is a service representing a computing resource.
     • Its main function is job management (job submission, job control, etc.).
     • For job submission, the CE can work in:
       • push model, where the job is pushed to a CE for execution;
       • pull model, where the CE asks the WMS for jobs.

  4. What is Torque?
     • TORQUE (Tera-scale Open-source Resource and QUEue manager) is a resource manager providing control over batch jobs and distributed compute resources.
     • The Torque system is composed of:
       • pbs_server, which provides the basic batch services such as receiving/creating a batch job or protecting the job against system crashes;
       • job_scheduler, which contains the site's policy used to decide which job must be executed;
       • pbs_mom, which places the job into execution and is also responsible for returning the job's output to the user.

  5. What is a Worker Node?
     • The Worker Node (WN) is a set of clients required to run jobs sent by the CE via the Local Resource Management System. It currently includes:
       • the gLite I/O Client,
       • the Logging and Bookkeeping Client,
       • the R-GMA Client,
       • the WMS Checkpointing library.

  6. Installing CE + Torque Server and WN + Torque

  7. Preliminary and common steps
     • Start from an installation of SLC 3.0.8.
     • Install the Java SDK.
     • Remove LAM and Postfix.
     • Check the hostname.
     • Install and configure the ntp daemon.
     • Install the X.509 host certificates in /etc/grid-security and check their file permissions.
     • Install the latest version of glite-yaim.
     • Install the middleware.

  8. Installing pre-requisites
     • Java is not included in the distribution. Install it separately (version >= 1.4.2_08):
       apt-get install j2sdk

  9. Installing pre-requisites
     • Depending on the package set you selected when installing the operating system, the lam package may be installed on your WN. If so, remove it:
       apt-get remove lam
     • There is a known installation conflict between the 'torque-clients' rpm and the 'postfix' mail client (Savannah bug #5509). If you are going to install Torque, uninstall the postfix package:
       apt-get remove postfix
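Before running the two removals above, it can help to see which of the conflicting packages are actually present. The following is a minimal sketch, assuming an RPM-based node; the helper name `conflicting_pkgs` is hypothetical and only filters a list of package names read from standard input.

```shell
# Hypothetical helper: print those package names from stdin that conflict
# with the Torque installation described above (lam and postfix).
conflicting_pkgs() {
    grep -x -e lam -e postfix || true   # "|| true": no match is not an error
}

# Intended use on a node (assumes rpm is available):
#   rpm -qa --qf '%{NAME}\n' | conflicting_pkgs
# Self-contained demonstration with a made-up package list:
printf '%s\n' bash lam ntp postfix | conflicting_pkgs
```

An empty result would suggest that neither removal is needed on that node.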

  10. Installing pre-requisites
     • Check the FQDN hostname. Ensure that the hostnames of your machines are correctly set by running:
       hostname -f
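The hostname check above can be scripted so that it fails loudly on a short name. This is a rough sketch only; `is_fqdn` is a hypothetical helper that applies a simple "contains a dot, no empty labels" rule, which is an assumption and not a full DNS validation.

```shell
# Hypothetical helper: succeed if the argument looks like a fully qualified
# host name (at least one dot, no empty labels).
is_fqdn() {
    case "$1" in
        .*|*.|*..*) return 1 ;;   # leading, trailing or doubled dot
        *.*)        return 0 ;;   # at least one dot separating labels
        *)          return 1 ;;   # short name such as "localhost"
    esac
}

# Fall back to the plain hostname if "hostname -f" fails on this system.
name="$(hostname -f 2>/dev/null || hostname 2>/dev/null || echo unknown)"
if is_fqdn "$name"; then
    echo "OK: $name is fully qualified"
else
    echo "WARNING: '$name' is not fully qualified"
fi
```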

  11. Installing pre-requisites
     • Time synchronization among all gLite nodes is mandatory. Install ntp if it is not already available on your system:
       apt-get install ntp
     • Add your time server to /etc/ntp.conf (you can use ntp-1.infn.it, IP 193.206.144.10):
       restrict <time_server_IP_address> mask 255.255.255.255 nomodify notrap noquery
       server <time_server_name>
     • Edit /etc/ntp/step-tickers, adding the hostname(s) of your time server(s).
     • If you are running a firewall, allow inbound communication on the NTP port:
       -A INPUT -s <NTP-serverIP-1> -p udp --dport 123 -j ACCEPT
     • Activate the ntpd service with the following commands:
       ntpdate <your ntp server name>
       service ntpd start
       chkconfig ntpd on
     • You can check ntpd's status with:
       ntpq -p
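As an illustration of the /etc/ntp.conf step, the sketch below prints the two configuration lines for one time server. The function name `ntp_conf_lines` is hypothetical; the server name and address are the ones mentioned in the slide.

```shell
# Hypothetical helper: print the two /etc/ntp.conf lines for one time server.
# $1 = server host name, $2 = server IP address
ntp_conf_lines() {
    printf 'restrict %s mask 255.255.255.255 nomodify notrap noquery\n' "$2"
    printf 'server %s\n' "$1"
}

# Example with the server mentioned in the slide. To apply it, append the
# output to /etc/ntp.conf as root (for instance with ">> /etc/ntp.conf").
ntp_conf_lines ntp-1.infn.it 193.206.144.10
```

Printing the lines first, instead of editing the file directly, makes it easy to review them before they reach the live configuration.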

  12. Installing pre-requisites
     • Install glite-yaim:
       apt-get install glite-yaim-core
       apt-get install glite-yaim-clients

  13. Installing pre-requisites
     • Request host certificates for the CE from a CA.
     • Copy the host certificate and key (hostcert.pem and hostkey.pem) into /etc/grid-security.
     • Change the permissions:
       chmod 644 hostcert.pem
       chmod 400 hostkey.pem
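The permissions set above can be verified afterwards. A minimal sketch follows; `has_mode` and `check_host_credentials` are hypothetical helper names, and the check relies only on the standard `find -perm` test.

```shell
# Hypothetical helper: succeed if file $1 has exactly the octal mode $2.
has_mode() {
    [ -n "$(find "$1" -prune -perm "$2" 2>/dev/null)" ]
}

# Hypothetical helper: check the two host credential files in directory $1
# against the modes given in the slide (644 for the certificate, 400 for the key).
check_host_credentials() {
    has_mode "$1/hostcert.pem" 644 || { echo "hostcert.pem: expected mode 644"; return 1; }
    has_mode "$1/hostkey.pem"  400 || { echo "hostkey.pem: expected mode 400"; return 1; }
    echo "host credentials look fine"
}
```

It would be called with the directory holding the certificates, for example `check_host_credentials /etc/grid-security` (the directory named in the preliminary steps).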

  14. Installing CE+Torque Server via apt
     • All site configuration values must be set in a site configuration file using key-value pairs.
     • This file is shared among all the different gLite node types, so edit it once and keep it in a safe place.
     • Copy the /opt/glite/yaim/examples/site-info.def template (from the glite-yaim-core package) to your reference directory for the installation (e.g. /root/siteinfo):
       cp /opt/glite/yaim/examples/site-info.def /root/siteinfo/site-info.def
     • A good syntax test for your site configuration file is to source it manually:
       source site-info.def
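The manual sourcing test can be wrapped so that it does not pollute the current shell. The sketch below is one possible way to do it; `check_site_info` is a hypothetical helper. Note that sourcing still executes whatever the file contains, only inside a subshell.

```shell
# Hypothetical helper: source file $1 in a subshell and report whether the
# shell accepted it. Variables set by the file do not leak into the caller.
check_site_info() {
    case "$1" in
        */*) f=$1 ;;
        *)   f=./$1 ;;      # "." searches PATH for names without a slash
    esac
    if ( . "$f" ) >/dev/null 2>&1; then
        echo "syntax OK: $1"
    else
        echo "problem while sourcing: $1"
        return 1
    fi
}

# Intended use:
#   check_site_info /root/siteinfo/site-info.def
```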

  15. Installing CE+Torque Server via apt • The configuration is stored in a directory structure which will be extended in the near future. Currently the following files are used: site-info.def and the vo.d directory.

  16. Installing CE+Torque Server via apt
     • The /root/siteinfo/vo.d directory
     • Each file name in this directory must be the lower-case version of a VO name defined in site-info.def. The matching file contains the definitions for that VO, which override those in site-info.def. For example:
       SW_DIR=$VO_SW_DIR/eela
       DEFAULT_SE=$CLASSIC_HOST
       STORAGE_DIR=$CLASSIC_STORAGE_DIR/eela
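The lower-casing rule for vo.d file names can be expressed in one line. A small sketch, where `vo_file_name` is a hypothetical helper name:

```shell
# Hypothetical helper: map a VO name to the file name expected under vo.d.
vo_file_name() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

vo_file_name EELA    # prints: eela
```

With the reference directory used in this tutorial, the file for a VO named EELA would therefore be /root/siteinfo/vo.d/eela.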

  17. Installing CE+Torque Server via apt
     • List the Worker Node hostnames, one per line, in wn-list.conf:
       vi /opt/glite/yaim/etc/wn-list.conf
       limaXX.ring.pucp.edu.pe
       limaXX.ring.pucp.edu.pe
       …..
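Since wn-list.conf is edited by hand, a quick check for accidentally repeated hosts may be useful. This is only a sketch; `duplicate_hosts` is a hypothetical helper, and treating duplicates as a mistake is an assumption rather than a documented YAIM requirement.

```shell
# Hypothetical helper: print host names that appear more than once in file $1.
# Blank lines are ignored.
duplicate_hosts() {
    grep -v '^[[:space:]]*$' "$1" | sort | uniq -d
}

# Intended use:
#   duplicate_hosts /opt/glite/yaim/etc/wn-list.conf
```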

  18. Installing CE+Torque Server via apt
     • Install the node:
       /opt/glite/yaim/bin/yaim -i -s /root/siteinfo/site-info.def -m glite-CE
     • Configure the node:
       /opt/glite/yaim/bin/yaim -c -s /root/siteinfo/site-info.def -n lcg-CE_torque -n MPI_CE -n BDII_site

  19. Installing CE+Torque Server via apt
     • If the installation completes successfully, the following components are installed:
       • gLite in /opt/glite
       • Condor in /opt/condor-x.y.z (where x.y.z is the current Condor version)
       • Globus in /opt/globus
       • Tomcat in /var/lib/tomcat5
       • Torque in /var/spool/pbs
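The list of installation directories lends itself to a quick post-install check. A minimal sketch, with `missing_dirs` as a hypothetical helper that prints every expected directory that is absent:

```shell
# Hypothetical helper: print each argument that is not an existing directory.
missing_dirs() {
    for d in "$@"; do
        [ -d "$d" ] || echo "$d"
    done
}

# Paths taken from the list above. The Condor directory is left out because
# its version suffix differs from site to site.
missing_dirs /opt/glite /opt/globus /var/lib/tomcat5 /var/spool/pbs
```

No output would suggest that all four locations exist; any path printed points at a component worth investigating.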

  20. Installing CE+Torque Server via apt
     • Edit /etc/ssh/sshd_config and add the following lines at the end:
       HostbasedAuthentication yes
       IgnoreUserKnownHosts yes
       IgnoreRhosts yes
     • Restart the server with:
       /sbin/service sshd restart
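If the sshd_config edit is scripted, it helps to make it idempotent so that repeated runs do not duplicate the lines. The sketch below shows one way; `ensure_line` and `add_hostbased_settings` are hypothetical helpers.

```shell
# Hypothetical helper: append line $2 to file $1 unless an identical line exists.
ensure_line() {
    grep -q -x -F -e "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Hypothetical wrapper: add the three host-based authentication settings to $1.
add_hostbased_settings() {
    ensure_line "$1" 'HostbasedAuthentication yes'
    ensure_line "$1" 'IgnoreUserKnownHosts yes'
    ensure_line "$1" 'IgnoreRhosts yes'
}

# Intended use (as root), followed by the sshd restart:
#   add_hostbased_settings /etc/ssh/sshd_config
```

One limitation of this sketch: sshd uses the first value it finds for each keyword, so an earlier uncommented line with a different value would still take precedence and has to be edited by hand.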

  21. Installing CE+Torque Server via apt
     • On the CE, generate an updated version of /etc/ssh/ssh_known_hosts by running:
       edg-pbs-shostsequiv
       edg-pbs-knownhosts
     • Copy that file to all the Worker Nodes.
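Copying the file to every Worker Node can be driven from the same host list used by YAIM. The following is a dry-run sketch under stated assumptions: `print_copy_cmds` is a hypothetical helper, and copying as root over scp is an assumption about the site setup. It only prints the commands and executes nothing.

```shell
# Hypothetical helper: print one scp command per host listed in file $1,
# copying file $2 to the same path on each host. Nothing is executed.
print_copy_cmds() {
    while IFS= read -r host; do
        [ -n "$host" ] || continue      # skip blank lines
        printf 'scp %s root@%s:%s\n' "$2" "$host" "$2"
    done < "$1"
}

# Intended use: review the output, then pipe it to sh to run the copies.
#   print_copy_cmds /opt/glite/yaim/etc/wn-list.conf /etc/ssh/ssh_known_hosts
```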

  22. Installing WN via apt
     • Install the node:
       /opt/glite/yaim/bin/yaim -i -s /root/siteinfo/site-info.def -m glite-WN -m glite-torque-client-config
     • Configure the node:
       /opt/glite/yaim/bin/yaim -c -s /root/siteinfo/site-info.def -n WN_torque

  23. References
     • https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide301
     • https://twiki.cern.ch/twiki/bin/view/LCG/YaimGuide310
