
Celebrating 20 Years of Condor: Distributed Computing Research

Join us in celebrating 20 years of the Condor Project, a distributed computing research initiative. Our team faces software engineering challenges in diverse environments and collaborates with academia and industry. We maintain a distributed production environment and educate students. Funding comes from DOE, NIH, NSF, and industry partners. The presentation covers download statistics, recent contributions, and research highlights.

wmcvey

Presentation Transcript


  1. Welcome!!! Condor Week 2006

  2. 1986-2006: Celebrating 20 years since we first installed Condor in our department

  3. The Condor Project (Established ’85)
     Distributed Computing research performed by a team of ~40 faculty, full-time staff and students who
     • face software/middleware engineering challenges in a UNIX/Linux/Windows/OS X environment,
     • are involved in national and international collaborations,
     • interact with users in academia and industry,
     • maintain and support a distributed production environment (more than 3800 CPUs at UW),
     • and educate and train students.
     Funding: DOE, NIH, NSF, Intel, Micron, Microsoft and the UW Graduate School

  4. [Diagram labels: Excellence, Support, Software, Functionality, Research]

  5. “… Since the early days of mankind the primary motivation for the establishment of communities has been the idea that by being part of an organized group the capabilities of an individual are improved. The great progress in the area of inter-computer communication led to the development of means by which stand-alone processing sub-systems can be integrated into multi-computer ‘communities’. …” Miron Livny, “Study of Load Balancing Algorithms for Decentralized Distributed Processing Systems,” Ph.D. thesis, July 1983.

  6. Main Threads of Activities
     • Distributed Computing Research: develop and evaluate new concepts, frameworks and technologies
     • The Open Science Grid (OSG): build and operate a national distributed computing and storage infrastructure
     • Keep Condor “flight worthy” and support our users
     • The NSF Middleware Initiative (NMI): develop, build and operate a national Build and Test facility
     • The Grid Laboratory Of Wisconsin (GLOW): build, maintain and operate a distributed computing and storage infrastructure on the UW campus

  7. Downloads per month [chart; labels: 900 X86/Linux, 600 X86/Windows]

  8. Condor-Users: Messages per month [chart; label: Condor Team Contributions]

  9. The past year:
     • Two Ph.D. students graduated:
       • Tevfik Kosar went to LSU
       • Sonny (Sechang) Son went to NetApp
     • Three staff members left to start graduate studies
     • Released Condor 6.6.9-.11
     • Released Condor 6.7.6-.18
     • Contributed to the formation of the Open Science Grid (OSG) consortium and the OSG Facility
     • Interfaced Condor with BOINC
     • Started the NSF-funded CondorDB project
     • Released Virtual Data Toolkit (VDT) 1.3.3-.10
     • Distributed five instances of the NSF Middleware Initiative (NMI) Build and Test facility

  10. The search for SUSY
     • Sanjay Padhi is a UW Chancellor Fellow who is working in the group of Prof. Sau Lan Wu at CERN
     • Using Condor technologies he established a “grid access point” in his office at CERN
     • Through this access point he managed to harness, in 3 months (12/05-2/06), more than 500 CPU years from the LHC Computing Grid (LCG), the Open Science Grid (OSG) and UW Condor resources

  11. GAMS/Grid + CPLEX
     • Work of Prof. Michael Ferris from the optimization group in our department
     • Commercial modeling system: abundance of real-life models to solve
     • Any model types allowed
     • Scheduling problems
     • Radiotherapy treatment planning
     • World trade (economic) models
     • New decomposition features facilitate use of grid/Condor solution
     • Mixed Integer Programs can be extremely hard to solve to optimality
     • MIPLIB has 13 unsolved examples

  12. Tool and expertise combined
     • Various decomposition schemes coupled with
     • Fastest commercial solver: CPLEX
     • Shared file system / condor_chirp for inter-process communication
     • Sophisticated problem domain branching and cuts
     • Takes over 1 year of computation and goes nowhere, but knowledge gained!
     • Adaptive refinement strategy
     • Dedicated resources
     • “Timtab2” and “a1c1s1” problems solved to optimality (using over 650 machines running tasks, each of which takes between 1 hour and 5 days)

  13. Function Shipping, Data Shipping, or maybe simply Object Shipping?

  14. Customer requests: Place y@S at L! System delivers.

  15. Basic step for y@SL • Allocate size(y) at L, • Allocate resources (disk bandwidth, memory, CPU, outgoing network bandwidth) on S • Allocate resources (disk bandwidth, memory, CPU, incoming network bandwidth) on L • Match S and L

  16. Or in other words, it takes two (or more) to Tango (or to place data)!

  17. When the “source” plays “nice” it “asks” for permission to place data at “destination” in advance

  18. [Diagram: S, L and a MatchMaker; labels: GridFTP Control, GridFTP Put File] S: “I am S and am looking for L to place a file.” L: “I am L and I have what it takes to place a file.” MatchMaker to both: “Match!”
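The exchange on this slide, where S and L each advertise themselves and a matchmaker pairs them, can be sketched as two-sided matching: each advertisement carries attributes plus a requirement on the other party, and a match needs both requirements to hold. The `Ad` class, the attribute names (`role`, `file_size`, `free_space`) and the greedy pairing loop are assumptions made for illustration; this is not the ClassAd language or the actual Condor matchmaker.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Ad:
    """Hypothetical advertisement: attributes plus a requirement on the peer."""
    attrs: Dict[str, Any]
    requirements: Callable[[Dict[str, Any]], bool]


def matches(a: Ad, b: Ad) -> bool:
    """A match is symmetric: each side must accept the other."""
    return a.requirements(b.attrs) and b.requirements(a.attrs)


def match_make(sources: List[Ad], destinations: List[Ad]) -> List[Tuple[str, str]]:
    """Greedily pair each source with the first acceptable free destination."""
    pairs = []
    free = list(destinations)
    for s in sources:
        for d in free:
            if matches(s, d):
                pairs.append((s.attrs["name"], d.attrs["name"]))
                free.remove(d)  # safe: we stop iterating right after
                break
    return pairs


# "I am S and am looking for L to place a file"
source = Ad(
    {"name": "S", "role": "source", "file_size": 500},
    lambda other: other["role"] == "destination" and other["free_space"] >= 500,
)
# "I am L and I have what it takes to place a file"
dest = Ad(
    {"name": "L", "role": "destination", "free_space": 1000},
    lambda other: other["role"] == "source" and other["file_size"] <= 800,
)
print(match_make([source], [dest]))
```

Keeping the requirements on both sides means the destination can refuse a file it does not want, which is the "it takes two" idea from the previous slides.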

  19. The SC’05 effort: Joint with the Globus GridFTP team

  20. Stork controls number of outgoing connections; Destination advertises incoming connections
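One way to picture the two limits on this slide is a scheduler that never runs more concurrent transfers than the smaller of its own outgoing cap and the destination's advertised incoming cap. The sketch below assumes that "take the minimum" rule; `ConnectionLimiter`, `run_transfers` and the `transfer` callback are hypothetical names, and nothing here is Stork's or GridFTP's actual interface.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ConnectionLimiter:
    """Context manager that caps concurrent transfers and records the peak."""

    def __init__(self, limit: int):
        self._sem = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        self._sem.acquire()  # blocks while all slots are in use
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.active -= 1
        self._sem.release()
        return False


def run_transfers(files, outgoing_limit, advertised_incoming, transfer):
    """Run transfer(f) for every file without exceeding either side's cap."""
    # Assumption: the effective cap is the smaller of the two limits.
    limiter = ConnectionLimiter(min(outgoing_limit, advertised_incoming))

    def one(f):
        with limiter:
            return transfer(f)

    with ThreadPoolExecutor(max_workers=max(1, len(files))) as pool:
        results = list(pool.map(one, files))
    return results, limiter.peak
```

For example, `run_transfers(["a", "b", "c"], 4, 2, str.upper)` returns the transformed names in order together with a peak concurrency that never exceeds 2.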

  21. A Master-Worker view of the same effort

  22. Master Files Worker For Workers

  23. When the “source” does not play “nice”, destination must protect itself

  24. NeST
     Manages storage space and connections for a GridFTP server with commands like:
     • ADD_NEST_USER
     • ADD_USER_TO_LOT
     • ATTACH_LOT_TO_FILE
     • TERMINATE_LOT

  25. GridFTP Chirp

  26. Thank you for building such a wonderful community
