
Highly Distributed Parallel Computing


Presentation Transcript


  1. Neil Skrypuch COSC 3P93 3/21/2007 Highly Distributed Parallel Computing

  2. Overview • a network of computers all working towards a similar goal • network consists of many nodes, few servers • nodes perform computing and send results to a server • servers distribute jobs • node machines do not communicate with each other (see the sketch below)
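A minimal sketch of the pattern the overview describes, using only Python's standard library: one server hands out jobs, any number of nodes pull work and push results back, and nodes never talk to each other. Everything here (port 8000, the function names, the squaring stand-in for real work) is an illustrative assumption, not part of any real HDPC system.

```python
#!/usr/bin/env python3
"""Run `python hdpc.py server` once, then `python hdpc.py node`
on any number of machines pointed at the server."""
import sys
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def run_server():
    jobs = list(range(100))            # pending work units
    results = {}                       # job id -> reported result

    def get_job():                     # a node asks for work
        return jobs.pop() if jobs else None

    def submit_result(job, value):     # a node reports back
        results[job] = value
        return True

    srv = SimpleXMLRPCServer(("0.0.0.0", 8000), allow_none=True)
    srv.register_function(get_job)
    srv.register_function(submit_result)
    srv.serve_forever()

def run_node():
    srv = ServerProxy("http://localhost:8000", allow_none=True)
    while (job := srv.get_job()) is not None:
        srv.submit_result(job, job * job)  # stand-in for real computation

run_server() if sys.argv[1:] == ["server"] else run_node()
```

Note that the node side is just a pull loop: because nodes initiate every exchange, the server needs no knowledge of who or where they are, which is what makes the pros on the following slides possible.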

  3. Pros

  4. Relatively Simple • don't need to worry about special interconnections • don't need to worry about cluster booting

  5. Non-Homogeneous Network • can work across different computer architectures, OSes, etc. • computers can be of varying speeds • doesn't require the fastest or most expensive computers • computers can be distributed anywhere in the world

  6. Infrastructure • infrastructure for HDPC already exists almost everywhere • anyone with a network of computers is already ready for HDPC • lots of programs already exist that take advantage of HDPC

  7. Expansion • expansion is painless • there are no special constraints on the “shape” of the network • not fast enough yet? keep adding more computers until it is

  8. Resilience to Failure • it doesn't matter if one or more nodes die • only the reliability of the central server(s) matters

  9. Cons

  10. Suitability • not all problems are suited to HDPC • highly communication bound problems are a poor fit for HDPC

  11. Server Dependence • central server dependence is a double-edged sword • if the central server becomes unavailable, everything grinds to a halt

  12. Network (In)security • how to verify if a client should be allowed to join the network? • protecting data sent over the network • verifying integrity and authenticity of data sent over the network

  13. Network (Un)reliability • a node that temporarily loses connectivity is useless until it reconnects

  14. Dealing With the Issues

  15. Server Dependence • the central server need not be a single server • server itself may be clustered • countless ways to cluster servers

  16. Clustering With a Database • allow nodes to talk directly to the database • cluster the database over multiple servers • multi-master replication • single-master replication • lots more... (see the sketch below)
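A sketch of the "let nodes talk directly to the database" idea: each node claims one unclaimed job inside a transaction, so the database itself acts as the dispatcher. Here sqlite3 stands in for whatever replicated or clustered database the servers would really run; the schema and the 100 demo jobs are assumptions for illustration only.

```python
import sqlite3

db = sqlite3.connect("jobs.db", isolation_level=None)   # manual transactions
db.execute("""CREATE TABLE IF NOT EXISTS jobs (
                  id INTEGER PRIMARY KEY,
                  claimed_by TEXT,
                  result TEXT)""")
db.executemany("INSERT OR IGNORE INTO jobs (id) VALUES (?)",
               [(i,) for i in range(100)])

def claim_job(node_id):
    """Atomically mark one unclaimed job as ours; None if none are left."""
    db.execute("BEGIN IMMEDIATE")            # take the write lock up front
    try:
        row = db.execute("SELECT id FROM jobs "
                         "WHERE claimed_by IS NULL LIMIT 1").fetchone()
        if row is not None:
            db.execute("UPDATE jobs SET claimed_by = ? WHERE id = ?",
                       (node_id, row[0]))
        db.execute("COMMIT")
        return row[0] if row else None
    except Exception:
        db.execute("ROLLBACK")
        raise

print(claim_job("node-1"))                   # -> 0 on a fresh database
```

The transaction is what matters: two nodes asking at once can never claim the same job, and the replication strategies listed above are then the database's problem rather than the application's.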

  17. Server Hierarchy • multiple tiers of servers may also be used • could be considered recursive HDPC • very similar to the tree architecture of supercomputers

  18. Lost Nodes • define a maximum amount of time to wait for a node's response • use redundancy • assume some nodes will always be lost • send duplicate jobs to multiple nodes simultaneously
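A sketch of both tactics from this slide: every job handed out carries a deadline, an unanswered job silently becomes available again once the deadline passes, and late or duplicate answers are harmless because they simply overwrite. The 300-second deadline and 100 demo jobs are assumed values for illustration.

```python
import time

DEADLINE = 300                        # seconds to wait for a node's answer
unassigned = set(range(100))          # jobs nobody is working on
pending = {}                          # job id -> time it was handed out
done = {}                             # job id -> result

def hand_out_job():
    now = time.time()
    for job, started in list(pending.items()):
        if now - started > DEADLINE:  # node went silent: reclaim the job
            del pending[job]
            unassigned.add(job)
    if not unassigned:
        return None
    job = unassigned.pop()
    pending[job] = now
    return job

def receive_result(job, value):
    done[job] = value                 # duplicates and stragglers overwrite
    pending.pop(job, None)
```

Sending the same job to several nodes at once needs no extra machinery here: whichever answer arrives first fills `done`, and the rest are ignored by construction.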

  19. Network (In)security • not as big an issue as one might think • encryption and public key infrastructures mitigate most confidentiality and authenticity concerns • redundancy is useful for both reliability and security
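One concrete way to cover the integrity and authenticity concerns raised on slide 12: each node shares a secret with the server, and every result travels with an HMAC tag, so forged or tampered results are rejected. A real deployment would more likely use TLS plus per-node keys; the single shared secret here is a simplifying assumption.

```python
import hashlib
import hmac
import json

SECRET = b"distributed to each node out of band"   # assumed shared key

def sign_result(result):
    """Serialize a result and attach its authentication tag."""
    payload = json.dumps(result).encode()
    tag = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload, tag

def verify_result(payload, tag):
    """True only if the payload came from a key holder, unmodified."""
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

payload, tag = sign_result({"job": 17, "value": 289})
assert verify_result(payload, tag)
assert not verify_result(payload + b" ", tag)       # tampering is caught
```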

  20. Work Buffering • taking larger portions of work at a time • temporary connectivity issues pose less of a problem this way • a node can continue working without talking to a central server for longer
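A sketch of work buffering: instead of one job per round trip, a node takes a batch and only talks to the server again once the batch is exhausted, so a brief connectivity loss mid-batch costs nothing. The in-memory `Server` class and the batch size of 50 are assumptions standing in for the real networked server.

```python
class Server:
    def __init__(self, njobs):
        self.jobs, self.results = list(range(njobs)), {}

    def get_jobs(self, n):                      # one round trip, n jobs
        batch, self.jobs = self.jobs[:n], self.jobs[n:]
        return batch

    def submit_results(self, answers):          # one round trip to report
        self.results.update(answers)

server = Server(1000)
while batch := server.get_jobs(50):
    # everything in this loop body runs with no server contact at all
    server.submit_results({job: job * job for job in batch})
```

The trade-off is the same one slide 18 deals with: a bigger buffer means fewer round trips but more work to re-issue if the node is lost mid-batch.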

  21. Where is HDPC Useful?

  22. Combinatorics • search • enumeration • generation

  23. Cryptography • brute force cipher cracking • gives a glimpse of the future, in terms of what the average person will be able to crack

  24. Artificial Intelligence • genetic algorithms • genetic programming • alpha-beta search

  25. Graphics • ray tracing • animation • fractal generation and calculation

  26. Simulation • weather and climate modeling • particle physics

  27. Guidelines for Suitability • most problems involving a large search tree are well suited to HDPC • anything that can be broken down into smaller, self-contained chunks is a good candidate for HDPC (see the sketch below)
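The "self-contained chunks" test made concrete: a search over a large space splits into ranges, and a node can finish a whole range with no further communication. The toy predicate and chunk size below are assumptions; a real job would carry real work.

```python
def make_chunks(total, size):
    """Split range(total) into independent [lo, hi) jobs."""
    return [(lo, min(lo + size, total)) for lo in range(0, total, size)]

def run_chunk(lo, hi):
    """One complete job: scan a range, return only the hits."""
    return [n for n in range(lo, hi) if n * n % 1_000_003 == 42]

# Each (lo, hi) pair below could be shipped to a different node; results
# combine by simple concatenation, with no coordination between nodes.
jobs = make_chunks(1_000_000, 100_000)
hits = [h for job in jobs for h in run_chunk(*job)]
print(len(jobs), "jobs,", len(hits), "hits")
```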

  28. How Well Does HDPC Work?

  29. Folding@Home • ~200,000 non-dedicated nodes • 240 TFLOPS • approximately 40 central servers, unknown speeds

  30. SETI@Home • ~200,000 non-dedicated nodes • 288 TFLOPS • 10 central servers, all relatively modest

  31. Blue Gene/L • currently the fastest supercomputer (November 2006 TOP500 list) • not HDPC • 65,536 dedicated nodes • 280 TFLOPS • cost about $100,000,000 US

  32. HDPC Works Well • typical speedup is close to linear • cost is substantially less than a comparable supercomputer • nodes can also be general purpose computers

  33. Why Does HDPC Work Well?

  34. Infrastructure Reuse • in general, new hardware investments are not necessary • creating new infrastructure is expensive and time consuming • it's easy to justify using things you already have for additional purposes • there are tons of idle CPUs at any given time, why not use them?

  35. Low Barrier to Entry • anyone with a couple of networked computers can start experimenting

  36. Painlessly Scalable • smooth curve upwards for both cost and performance

  37. Simpler to Program • doesn't require as much “thinking in parallel” compared to other approaches • thinking in parallel is hard and fundamentally different from thinking serially • pushes the heavy lifting onto the database instead of the application programmer

  38. Commodity Hardware is Fast • a typical desktop machine today is more powerful than a supercomputer from 15 years ago • and costs orders of magnitude less • and outputs much less heat • and takes up much less space • and consumes much less power

  39. The Future • supercomputers will become faster • HDPC will become faster still, as both the number of computers and their speed increase • supercomputers and HDPC will each fill their own separate niche

  40. Questions and Discussion

  41. References • http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats • http://www.boincstats.com/stats/project_graph.php?pr=sah • http://www.boincstats.com/stats/project_graph.php?pr=bo • http://www.itjungle.com/tlb/tlb033004-story04.html • http://setiathome.berkeley.edu/sah_status.html • http://fah-web.stanford.edu/serverstat.html • http://top500.org/list/2006/11/100
