“… Since the early days of mankind the primary motivation for the establishment of communities has been the idea that by being part of an organized group the capabilities of an individual are improved. The great progress in the area of inter-computer communication led to the development of means by which stand-alone processing sub-systems can be integrated into multi-computer ‘communities’. …” Miron Livny, “Study of Load Balancing Algorithms for Decentralized Distributed Processing Systems,” Ph.D. thesis, July 1983.
Condor as a ... • … Grid • … window to the Grid • … manager of Grid resources • … a source of Grid technology
Main Condor capabilities • Management of large collections of distributively owned heterogeneous resources (CPU, storage, network, software) • Management of large (10K) collections of jobs. • Remote Execution • Remote I/O • Checkpointing • Matchmaking • System administration
Condor Deployment (that we know of) • More than 4000 CPUs world-wide • More than 1200 CPUs at UW • More than 200 CPUs at INFN • More than 800 CPUs in industry.
A Simple Scenario Study the behavior of F(x,y,z) for 20 values of x, 10 values of y and 3 values of z (20*10*3 = 600) • F takes on the average 3 hours to compute on a “typical” workstation (total = 1800 hours) • F requires a “moderate” (128MB) amount of memory • F performs “little” I/O - (x,y,z) is 15 MB and F(x,y,z) is 40 MB
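The totals in this scenario can be checked with a few lines of arithmetic. The sketch below uses only the numbers on the slide; the aggregate data volumes and the conversion to days are derived here and assume one workstation running around the clock.

```python
# Parameters taken directly from the scenario on the slide.
N_X, N_Y, N_Z = 20, 10, 3      # number of values for x, y and z
HOURS_PER_RUN = 3              # average time to compute one F(x,y,z)
INPUT_MB, OUTPUT_MB = 15, 40   # size of (x,y,z) and of F(x,y,z)

n_jobs = N_X * N_Y * N_Z                 # one job per (x,y,z) combination
total_hours = n_jobs * HOURS_PER_RUN     # sequential compute time
total_days = total_hours / 24            # assumes the machine never idles
total_input_mb = n_jobs * INPUT_MB       # derived, not stated on the slide
total_output_mb = n_jobs * OUTPUT_MB     # derived, not stated on the slide

print(f"jobs: {n_jobs}")
print(f"compute time: {total_hours} hours = {total_days} days")
print(f"input: {total_input_mb} MB, output: {total_output_mb} MB")
```

At 75 days of continuous computing, a single workstation needs about 2.5 months to finish the study.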
Step I - get organized! • Turn your workstation into a “Personal Condor” • Write a script that creates 600 input files, one for each (x,y,z) combination • Submit a cluster of 600 jobs to your personal Condor • Write a script that collects the data from the 600 output files • Go on a long vacation … (2.5 months)
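The two scripts and the job cluster from Step I could look roughly like the sketch below. It is an illustration, not the presenters' code: the file names `in.N`, `out.N`, `err.N`, `F.log`, the program name `compute_F`, and the one-line input format are all assumptions. The submit description relies on Condor's `$(Process)` macro, which numbers the jobs of a cluster from 0, so job N reads `in.N` on standard input and writes `out.N`.

```python
import itertools
from pathlib import Path


def make_inputs(xs, ys, zs, workdir):
    """Write one input file per (x, y, z) combination; return the file count."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    count = 0
    for i, (x, y, z) in enumerate(itertools.product(xs, ys, zs)):
        # in.<i> lines up with the $(Process) number of job i in the cluster.
        (workdir / f"in.{i}").write_text(f"{x} {y} {z}\n")
        count += 1
    return count


def make_submit_description(n_jobs, executable="compute_F"):
    """Return submit description text for one cluster of n_jobs jobs.

    'compute_F' is a hypothetical program name used for illustration.
    """
    lines = [
        f"executable = {executable}",
        "input      = in.$(Process)",   # standard input of job N
        "output     = out.$(Process)",  # standard output of job N
        "error      = err.$(Process)",  # standard error of job N
        "log        = F.log",           # shared event log for the cluster
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


def collect_outputs(n_jobs, workdir):
    """Return {job index: output text} for every job whose output exists."""
    workdir = Path(workdir)
    results = {}
    for i in range(n_jobs):
        out = workdir / f"out.{i}"
        if out.exists():
            results[i] = out.read_text()
    return results
```

A typical session would call `make_inputs(range(20), range(10), range(3), "run")`, write the text returned by `make_submit_description(600)` to a file in `run`, hand that file to `condor_submit`, and call `collect_outputs(600, "run")` once the jobs have finished. Jobs that have not produced output yet are simply absent from the returned dictionary.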
Your Personal Condor will … • … keep an eye on your jobs and will keep you posted on their progress • … implement your policy on when the jobs can run on your workstation • … implement your policy on the execution order of the jobs • … add fault tolerance to your jobs • … keep a log of your job activities
[Diagram labels: your workstation · personal Condor · 600 Condor jobs]
Step II - build a Grid • Install Condor on the machine next door. • Install Condor on the machines in the classroom. • Install Condor on the O2K in the basement. • Configure these machines to be part of your Condor pool/grid. • Go on a shorter vacation ...
[Diagram labels: your workstation · personal Condor · 600 Condor jobs · Group Condor]
Step III - Take advantage of your friends • Get permission from “friendly” Condor pools/Grids to access their resources • Configure your personal Condor to “flock” to these pools/grids • Reconsider your vacation plans ...
[Diagram labels: your workstation · personal Condor · 600 Condor jobs · Group Condor · friendly Condor]
Step IV - Think big! • Get access (account(s) + certificate(s)) to Globus-managed Grid resources • Submit 599 “To Globus” Condor glide-in jobs to your personal Condor • When all your jobs are done, remove any pending glide-in jobs • Take the rest of the afternoon off ...
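The shrinking vacation across Steps I to IV follows from simple wave arithmetic: with more machines, the 600 jobs finish in fewer 3-hour rounds. The sketch below is an idealized estimate, not a Condor feature. It assumes every machine is always available and as fast as the “typical” workstation, and it ignores queueing and data transfer. The 25-machine pool is a hypothetical size chosen for illustration, and reading Step IV as “your workstation plus 599 glide-ins” is an interpretation of the slide.

```python
import math


def ideal_makespan_hours(n_jobs, n_machines, hours_per_job=3):
    """Wall-clock hours when jobs run in waves of n_machines at a time."""
    if n_machines < 1:
        raise ValueError("need at least one machine")
    waves = math.ceil(n_jobs / n_machines)  # rounds of parallel execution
    return waves * hours_per_job


scenarios = [
    ("Step I: one workstation", 1),
    ("hypothetical 25-machine pool", 25),            # illustrative size only
    ("Step IV: workstation + 599 glide-ins", 600),   # interpretation of the slide
]
for label, machines in scenarios:
    hours = ideal_makespan_hours(600, machines)
    print(f"{label}: {hours} hours ({hours / 24:.1f} days)")
```

Under these assumptions one machine needs 1800 hours, 25 machines need 72 hours, and 600 machines need a single 3-hour round, which matches “the rest of the afternoon”.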
A “To-Globus” glide-in job will … • … transform itself into a Globus job, • … submit itself to a Globus-managed Grid resource, • … be monitored by your personal Condor, • … once the Globus job is allocated a resource, use a GSIFTP server to fetch Condor agents, start them, and add the resource to your personal Condor, • … vacate the resource before it is revoked by the remote scheduler
[Diagram labels: your workstation · personal Condor · 600 Condor jobs · 599 glide-ins · Group Condor · friendly Condor · Globus Grid (LSF, PBS, Condor)]
VizBench - send us your data and we will send you back a movie (an SC99 demo by NCSA)
Frame Rendering Managed and Powered by a Personal Condor A locally installed Personal Condor is used by the VizBench server to • manage, monitor and control the execution of frame rendering tasks, • manage local rendering resources and • locate remote and Grid resources that are capable and willing to render frames
[Diagram labels: VizBench Web Server · VizBench · Personal Condor · Condor jobs · Globus Gatekeeper (×2) · UW Condor · UNM Supercluster · BU O2K · NCSA Condor]
Grid Obstacles • Ownership Distribution (Sociology) • Customer Awareness (Education) • Size and Uncertainties (Robustness) • Technology Evolution (Portability) • Physical Distribution (Technology)
Condor High Throughput Computing • Visit us at http://www.cs.wisc.edu/condor