Highly Distributed Parallel Computing

Presentation Transcript

  1. Neil Skrypuch COSC 3P93 3/21/2007 Highly Distributed Parallel Computing

  2. Overview • a network of computers all working towards a similar goal • network consists of many nodes, few servers • nodes perform computing and send results to a server • servers distribute jobs • node machines do not communicate with each other
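The division of labour described in this slide can be sketched in a few lines. This is a minimal, in-process illustration only: `JobServer`, `request_job`, `submit_result`, and `run_node` are hypothetical names, and a real deployment would put a network between the node and the server.

```python
from collections import deque


class JobServer:
    """Hands out jobs and collects results; nodes never talk to each other."""

    def __init__(self, jobs):
        self.pending = deque(jobs)   # (job_id, payload) pairs still to be done
        self.results = {}            # job_id -> result reported by a node

    def request_job(self):
        # Return the next (job_id, payload) pair, or None when no work remains.
        return self.pending.popleft() if self.pending else None

    def submit_result(self, job_id, result):
        self.results[job_id] = result


def run_node(server, compute):
    """A node's loop: fetch a job, compute, report back, repeat."""
    while (job := server.request_job()) is not None:
        job_id, payload = job
        server.submit_result(job_id, compute(payload))


server = JobServer([(i, n) for i, n in enumerate([10, 20, 30])])
run_node(server, lambda n: n * n)   # stand-in for real computation
print(server.results)
```

Each node only ever calls the server, which is what lets nodes be added or removed without touching the others.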

  3. Pros

  4. Relatively Simple • don't need to worry about special interconnections • don't need to worry about cluster booting

  5. Non-Homogeneous Network • can work across different computer architectures, OSes, etc • computers can be of varying speeds • doesn't require the fastest or most expensive computers • computers can be distributed anywhere in the world

  6. Infrastructure • infrastructure for HDPC already exists almost everywhere • anyone with a network of computers is already ready for HDPC • lots of programs already exist that take advantage of HDPC

  7. Expansion • expansion is painless • there are no special constraints on the “shape” of the network • not fast enough yet? keep adding more computers until it is

  8. Resilience to Failure • it doesn't matter if one or more nodes die • only the reliability of the central server(s) matters

  9. Cons

  10. Suitability • not all problems are suited to HDPC • highly communication-bound problems are a poor fit for HDPC

  11. Server Dependence • central server dependence is a double-edged sword • if the central server becomes unavailable, everything grinds to a halt

  12. Network (In)security • how to verify if a client should be allowed to join the network? • protecting data sent over the network • verifying integrity and authenticity of data sent over the network

  13. Network (Un)reliability • nodes temporarily losing connectivity may make them temporarily useless

  14. Dealing With the Issues

  15. Server Dependence • the central server need not be a single server • server itself may be clustered • countless ways to cluster servers

  16. Clustering With a Database • allow nodes to talk directly to the database • cluster the database over multiple servers • multi-master replication • single master replication • lots more...
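One way nodes might talk directly to a database is to claim jobs from a shared table. The sketch below uses Python's built-in `sqlite3` with an in-memory database purely for illustration; the `jobs` table, its columns, and the `claim_job` and `store_result` helpers are assumptions, and a clustered deployment would use a networked, replicated database instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload INTEGER, "
    "owner TEXT, result INTEGER)"
)
conn.executemany("INSERT INTO jobs (payload) VALUES (?)", [(3,), (4,)])
conn.commit()


def claim_job(conn, node):
    """Mark the oldest unowned job as belonging to `node` and return it."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE owner IS NULL "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The extra "owner IS NULL" check guards against another node
        # having claimed the same row in the meantime.
        cur = conn.execute(
            "UPDATE jobs SET owner = ? WHERE id = ? AND owner IS NULL",
            (node, row[0]),
        )
        return row if cur.rowcount == 1 else None


def store_result(conn, job_id, result):
    with conn:
        conn.execute("UPDATE jobs SET result = ? WHERE id = ?", (result, job_id))


job = claim_job(conn, "node-a")
store_result(conn, job[0], job[1] ** 2)
second = claim_job(conn, "node-b")
third = claim_job(conn, "node-c")   # nothing left to claim
rows = conn.execute("SELECT id, owner, result FROM jobs ORDER BY id").fetchall()
```

Keeping the job state in the database means that replicating or clustering the database also replicates the job queue.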

  17. Server Hierarchy • multiple tiers of servers may also be used • could be considered recursive HDPC • very similar to the tree architecture of supercomputers

  18. Lost Nodes • define a maximum amount of time to wait for a node's response • use redundancy • assume some nodes will always be lost • send duplicate jobs to multiple nodes simultaneously
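The timeout and duplicate-job ideas above can be combined in a small tracker. This is a sketch under stated assumptions: `JobTracker`, `timeout`, and `copies` are hypothetical names, and time is passed in explicitly as `now` so the behaviour is easy to follow.

```python
class JobTracker:
    """Gives each job to up to `copies` nodes and reissues it after a timeout."""

    def __init__(self, job_ids, timeout, copies=2):
        self.timeout = timeout
        self.copies = copies
        # job_id -> deadlines of the copies currently handed out
        self.outstanding = {job_id: [] for job_id in job_ids}
        self.done = {}

    def next_job(self, now):
        for job_id, deadlines in self.outstanding.items():
            # Copies whose deadline has passed are assumed lost.
            live = [d for d in deadlines if d > now]
            self.outstanding[job_id] = live
            if len(live) < self.copies:
                live.append(now + self.timeout)
                return job_id
        return None

    def complete(self, job_id, result):
        # The first result to arrive wins; later duplicates are ignored.
        if job_id in self.outstanding:
            del self.outstanding[job_id]
            self.done[job_id] = result


tracker = JobTracker(["a"], timeout=10, copies=2)
first = tracker.next_job(now=0)      # first copy, deadline 10
second = tracker.next_job(now=1)     # duplicate copy, deadline 11
third = tracker.next_job(now=2)      # both copies still live: nothing to hand out
retry = tracker.next_job(now=10.5)   # first copy timed out, so the job is reissued
tracker.complete("a", 42)
after = tracker.next_job(now=11)
```

The server never needs to know why a node went silent; an expired deadline is enough to hand the work to someone else.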

  19. Network (In)security • not as big an issue as one might think • encryption and public key infrastructures mitigate most confidentiality and authenticity concerns • redundancy is useful for both reliability and security
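Redundancy can double as a security check: if the same job goes to several nodes, the server can accept a result only when enough of them agree. The sketch below shows one possible rule; `accept_result` and `quorum` are hypothetical names, and it assumes results are hashable values that can be compared for equality.

```python
from collections import Counter


def accept_result(reports, quorum=2):
    """Return the value at least `quorum` nodes agree on, or None otherwise."""
    if not reports:
        return None
    value, votes = Counter(reports).most_common(1)[0]
    return value if votes >= quorum else None


agreed = accept_result([42, 42, 7])   # two honest nodes outvote one bad result
disputed = accept_result([42, 7])     # no agreement, so the job should be reissued
```

A single faulty or malicious node cannot get a wrong answer accepted on its own under this rule, although colluding nodes still could.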

  20. Work Buffering • taking larger portions of work at a time • temporary connectivity issues pose less of a problem this way • a node can continue working without talking to a central server for longer
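Work buffering amounts to fetching several jobs per round trip instead of one. A minimal sketch, with `take_batch`, `work_offline`, and `batch_size` as illustrative names and an in-memory queue standing in for the server:

```python
from collections import deque


def take_batch(pending, batch_size):
    """Remove and return up to `batch_size` jobs from the server's queue."""
    return [pending.popleft() for _ in range(min(batch_size, len(pending)))]


def work_offline(batch, compute):
    """Process a whole batch without contacting the server in between."""
    return [(job_id, compute(payload)) for job_id, payload in batch]


pending = deque((i, i) for i in range(10))
results = []
round_trips = 0
while pending:
    batch = take_batch(pending, batch_size=4)
    results.extend(work_offline(batch, lambda n: n + 1))
    round_trips += 1   # one fetch and one report per batch, not per job
```

With ten jobs and a batch size of four, the node contacts the server three times instead of ten, so a short network outage in the middle of a batch costs nothing.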

  21. Where is HDPC Useful?

  22. Combinatorics • search • enumeration • generation

  23. Cryptography • brute force cipher cracking • gives a glimpse of the future, in terms of what the average person will be able to crack

  24. Artificial Intelligence • genetic algorithms • genetic programming • alpha-beta search

  25. Graphics • ray tracing • animation • fractal generation and calculation

  26. Simulation • weather and climate modeling • particle physics

  27. Guidelines for Suitability • most problems involving a large search tree are well suited to HDPC • anything that can be broken down into smaller, self-contained chunks is a good candidate for HDPC
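As an illustration of a problem that breaks into self-contained chunks, consider counting primes in a range: each sub-range can be counted independently and the partial counts simply added. The problem choice and the names `split_range` and `count_primes` are assumptions made for this sketch.

```python
def split_range(start, stop, chunk_size):
    """Split [start, stop) into half-open chunks of at most chunk_size."""
    return [(lo, min(lo + chunk_size, stop))
            for lo in range(start, stop, chunk_size)]


def count_primes(lo, hi):
    """Count primes in [lo, hi) by trial division; needs no outside state."""
    def is_prime(n):
        if n < 2:
            return False
        i = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += 1
        return True

    return sum(1 for n in range(lo, hi) if is_prime(n))


chunks = split_range(0, 100, 30)
# Each chunk could be sent to a different node; here they run in sequence.
total = sum(count_primes(lo, hi) for lo, hi in chunks)
```

Because no chunk depends on another, the combined answer matches a single serial pass over the whole range.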

  28. How Well Does HDPC Work?

  29. Folding@Home • ~200,000 non-dedicated nodes • 240 TFLOPS • approximately 40 central servers, unknown speeds

  30. SETI@Home • ~200,000 non-dedicated nodes • 288 TFLOPS • 10 central servers, all relatively modest

  31. Blue Gene/L • currently the fastest supercomputer • not HDPC • 65,536 dedicated nodes • 280 TFLOPS • cost about $100,000,000 US
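The figures on the last three slides can be put side by side with a little arithmetic. The calculation below uses only the numbers quoted above; the node counts for the two volunteer projects are approximate, so the per-node values are rough.

```python
# (nodes, TFLOPS) as quoted on the slides; volunteer node counts are approximate.
systems = {
    "Folding@Home": (200_000, 240.0),
    "SETI@Home": (200_000, 288.0),
    "Blue Gene/L": (65_536, 280.0),
}

per_node = {}
for name, (nodes, tflops) in systems.items():
    per_node[name] = tflops * 1000 / nodes   # GFLOPS per node
    print(f"{name}: {per_node[name]:.2f} GFLOPS per node")

# Blue Gene/L cost per TFLOPS, from the quoted price of about $100,000,000 US.
blue_gene_cost_per_tflops = 100_000_000 / 280.0
print(f"Blue Gene/L: about ${blue_gene_cost_per_tflops:,.0f} per TFLOPS")
```

Each Blue Gene/L node delivers roughly three to four times the throughput of an average volunteer node, yet the volunteer networks reach a similar total simply by having more nodes.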

  32. HDPC Works Well • typical speedup is close to linear • cost is substantially less than a comparable supercomputer • nodes can also be general purpose computers

  33. Why Does HDPC Work Well?

  34. Infrastructure Reuse • in general, new hardware investments are not necessary • creating new infrastructure is expensive and time consuming • it's easy to justify using things you already have for additional purposes • there are tons of idle CPUs at any given time, why not use them?

  35. Low Barrier to Entry • anyone with a couple of networked computers can start experimenting

  36. Painlessly Scalable • smooth curve upwards for both cost and performance

  37. Simpler to Program • doesn't require as much “thinking in parallel” in comparison to other approaches • thinking in parallel is hard and fundamentally different from thinking serially • pushes the heavy lifting onto the database instead of the application programmer

  38. Commodity Hardware is Fast • a typical desktop machine today is more powerful than a supercomputer from 15 years ago • and costs orders of magnitude less • and outputs much less heat • and takes up much less space • and consumes much less power

  39. The Future • supercomputers will become faster • HDPC will become even faster than supercomputers • as both the number of computers and their speed increase • both supercomputers and HDPC will fill their own separate niches

  40. Questions and Discussion

  41. References • http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats • http://www.boincstats.com/stats/project_graph.php?pr=sah • http://www.boincstats.com/stats/project_graph.php?pr=bo • http://www.itjungle.com/tlb/tlb033004-story04.html • http://setiathome.berkeley.edu/sah_status.html • http://fah-web.stanford.edu/serverstat.html • http://top500.org/list/2006/11/100