
Condor-G Making Condor Grid Enabled


Presentation Transcript


  1. Condor-G: Making Condor Grid Enabled

  2. Outline • Why use Condor-G • Globus Universe • GlideIn • Status & Future Work

  3. What is Condor-G? • Extensions to Condor to allow access to the Grid through Globus • Two Parts • Globus Universe • GlideIn

  4. Why Use Condor-G • Condor • Designed to run jobs within a single administrative domain • Globus • Designed to run jobs across many administrative domains • Condor-G • Combine the strengths of both

  5. Condor-G Helps Condor Users • Machines available to Condor users are limited • Local Condor Pool • Friendly Condor Pools (via Flocking) • Through Globus, many more machines become available to run your jobs

  6. Condor-G Helps Globus Users • Globus is primarily an infrastructure upon which to develop distributed applications • Command-line tools are limited • Some users don’t want to rewrite their applications to use Globus • Condor-G provides them a powerful interface to the Grid to run their existing applications

  7. Globus Universe • Advantages of using Condor as a front-end to Globus • Full-featured queuing service • Fault-tolerance • Credential Management

  8. Full-Featured Queue • Persistent queue • Many queue-manipulation tools • Set up job dependencies (DAGman) • E-mail notification of events • Log files
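
The job-dependency bullet on slide 8 can be illustrated with a small sketch. It is not DAGMan's input format or its scheduling algorithm; it only shows the idea that a job is submitted after the jobs it depends on. The job names and the `submission_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each key is a job, each value is the set of jobs
# that must finish before that job may be submitted.
dependencies = {
    "prepare": set(),
    "simulate_a": {"prepare"},
    "simulate_b": {"prepare"},
    "collect": {"simulate_a", "simulate_b"},
}


def submission_order(deps):
    """Return one valid order in which the jobs could be submitted."""
    return list(TopologicalSorter(deps).static_order())


order = submission_order(dependencies)
print(order)
```

Any order produced this way starts with `prepare` and ends with `collect`; the relative order of the two simulation jobs is not fixed because neither depends on the other.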

  9. Fault-Tolerance • Local Crash • Queue state kept on disk • Condor Master restarts other daemons • Remote Crash • Condor will resubmit jobs • Globus jobmanager enhanced to improve recoverability
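
A minimal sketch of the "queue state kept on disk" idea from slide 9, assuming a JSON file as the on-disk format. The file layout, the job states, and the helper names are all assumptions made for illustration; this is not how the Schedd actually stores its queue.

```python
import json
import os
import tempfile


def save_queue(path, jobs):
    """Write the queue to a temporary file, then rename it over the old one,
    so a crash in the middle of a write never leaves a half-written queue."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(jobs, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_path, path)


def load_queue(path):
    """Reload the queue after a restart; a missing file means an empty queue."""
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        return json.load(handle)


def jobs_needing_recovery(jobs):
    """Jobs that had not finished when the crash happened."""
    return [job["id"] for job in jobs if job["state"] != "done"]


with tempfile.TemporaryDirectory() as workdir:
    queue_file = os.path.join(workdir, "queue.json")
    save_queue(queue_file, [
        {"id": 1, "state": "done"},
        {"id": 2, "state": "running"},
        {"id": 3, "state": "idle"},
    ])
    # Simulated restart: nothing is kept in memory, only the file survives.
    recovered = load_queue(queue_file)
    pending = jobs_needing_recovery(recovered)

print(pending)
```

The write-then-rename step is a design choice of this sketch: the old queue file stays valid until the new one is complete.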

  10. Credential Management • Authentication in Globus is done with limited-lifetime X509 proxies • Proxy may expire before jobs finish executing • Condor can put jobs on hold and e-mail user to refresh proxy
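
The hold-and-notify behavior on slide 10 can be sketched as follows. The five-minute safety margin, the job dictionaries, and the function names are hypothetical; the slides do not say when Condor decides to hold a job, and no real proxy is inspected here.

```python
from datetime import datetime, timedelta


def proxy_needs_refresh(expires_at, now, safety_margin=timedelta(minutes=5)):
    """True when the proxy has expired or will expire within the margin.
    The margin is an assumption of this sketch."""
    return expires_at - now <= safety_margin


def hold_jobs_if_proxy_expiring(jobs, expires_at, now):
    """Return (jobs, message). Unfinished jobs are marked 'held' and a
    notification text is produced when the proxy is about to expire."""
    if not proxy_needs_refresh(expires_at, now):
        return jobs, None
    updated = [
        job if job["state"] == "done" else dict(job, state="held")
        for job in jobs
    ]
    message = "Proxy expires at {}; please refresh it.".format(
        expires_at.isoformat())
    return updated, message


now = datetime(2000, 1, 1, 12, 0)
jobs = [{"id": 1, "state": "done"}, {"id": 2, "state": "running"}]

# Proxy with three minutes left: unfinished jobs go on hold.
held_jobs, notice = hold_jobs_if_proxy_expiring(
    jobs, now + timedelta(minutes=3), now)

# Proxy with two hours left: nothing changes.
same_jobs, no_notice = hold_jobs_if_proxy_expiring(
    jobs, now + timedelta(hours=2), now)

print(held_jobs, notice)
```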

  11-15. How It Works (diagram built up over five slides; labels in order of appearance) • Personal Condor • Globus Resource • Schedd • LSF • 600 Globus jobs • GridManager • JobManager • User Job

  16. Globus Universe • Disadvantages • No matchmaking or dynamic scheduling of jobs • No job checkpoint or migration • No remote system calls

  17. Solution: GlideIn • Use the Globus Universe to run the Condor daemons on Globus resources • When the resources run these GlideIn jobs, they will join your Condor Pool • Submit your jobs as Standard or Vanilla Universe jobs and they will be matched and run on the Globus resources

  18-24. How It Works (GlideIn diagram built up over seven slides; labels in order of appearance) • 600 Condor jobs • Personal Condor • Globus Resource • Schedd • LSF • Collector • glide-ins • GridManager • JobManager • Startd • User Job

  25. GlideIn Concerns • What if a Globus resource kills my GlideIn? • That resource will disappear from your pool and your jobs will be rescheduled on other machines • What if all my jobs are done before a GlideIn runs? • If the glided-in Condor daemons are not matched with a job within 10 minutes, they terminate
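
The 10-minute rule on slide 25 amounts to an idle timer. The sketch below models it with plain numbers for timestamps; the `GlideIn` class, its method names, and the choice to terminate at exactly 10 minutes (rather than just after) are assumptions, not the behavior of the real glided-in daemons.

```python
IDLE_LIMIT_SECONDS = 10 * 60  # the "10 minutes" from slide 25


class GlideIn:
    """Hypothetical model of glided-in daemons that exit when left idle."""

    def __init__(self, started_at):
        self.last_match_at = started_at  # the timer starts at launch
        self.running = True

    def record_match(self, now):
        """A job was matched to this resource; restart the idle timer."""
        self.last_match_at = now

    def check(self, now):
        """Terminate if no job has been matched within the idle limit."""
        if now - self.last_match_at >= IDLE_LIMIT_SECONDS:
            self.running = False
        return self.running


idle = GlideIn(started_at=0)
still_up_early = idle.check(now=599)   # under the limit, keeps running
still_up_late = idle.check(now=600)    # limit reached, terminates

busy = GlideIn(started_at=0)
busy.record_match(now=300)             # matched after five minutes
busy_still_up = busy.check(now=700)    # only 400 s since the last match

print(still_up_early, still_up_late, busy_still_up)
```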

  26-33. (Untitled diagram built up over eight slides; labels in order of appearance) • personal Condor • Globus Grid • your workstation • LSF • PBS • Condor • Group Condor • 600 Condor jobs • glide-ins

  34. Current Status • First version of GridManager ready • Runs jobs using Globus GRAM • Stages executable and standard I/O using Globus GASS • Jobmanager changes will be folded into a future release of Globus • Credential management in progress

  35. Future Work • GridManager • Stage user jobs’ data files • Automatic GlideIn • Condor creates GlideIn jobs when more resources are needed • Matchmaking in Globus Universe • Use Globus GRIS to create ClassAds for Globus resources and match them to job ClassAds
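
The matchmaking item on slide 35 can be pictured with a toy example. Real ClassAds are expressions evaluated by Condor, and both sides of a match can state requirements; this sketch only lets the job state them, uses plain dictionaries and lambdas, and invents the site names and attributes. It does not query a GRIS.

```python
# Hypothetical resource ads, as they might be derived from resource information.
resource_ads = [
    {"name": "site-a", "arch": "INTEL", "memory_mb": 256},
    {"name": "site-b", "arch": "SPARC", "memory_mb": 512},
    {"name": "site-c", "arch": "INTEL", "memory_mb": 1024},
]

# Hypothetical job ad: a requirements predicate plus a rank used to pick
# the preferred resource among those that satisfy the requirements.
job_ad = {
    "owner": "alice",
    "requirements": lambda r: r["arch"] == "INTEL" and r["memory_mb"] >= 512,
    "rank": lambda r: r["memory_mb"],
}


def match(job, resources):
    """Return the highest-ranked resource that meets the job's requirements,
    or None when no resource qualifies."""
    candidates = [r for r in resources if job["requirements"](r)]
    if not candidates:
        return None
    return max(candidates, key=job["rank"])


best = match(job_ad, resource_ads)
print(best["name"])
```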

  36. Questions and Thank You!
