
Case Studies


Presentation Transcript


  1. Experiencing Cluster Computing Case Studies Class 8

  2. Description • Download the source from http://www.sci.hkbu.edu.hk/tdgc/tutorial/ExpClusterComp/casestudy/casestudy.zip • Unzip the package • Follow the instructions from each example
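
One way to fetch and unpack the package from the command line, assuming wget and unzip are available on your system:

  % wget http://www.sci.hkbu.edu.hk/tdgc/tutorial/ExpClusterComp/casestudy/casestudy.zip
  % unzip casestudy.zip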

  3. Hello World

  4. Hello World • The sample program uses MPI and has each MPI process print "Hello world from process i of n", using the rank in MPI_COMM_WORLD for i and the size of MPI_COMM_WORLD for n. You can assume that all processes support output for this example. • Note the order in which the output appears. Depending on your MPI implementation, characters from different lines may be intermixed. A subsequent exercise (I/O master/slave) will show how to order the output. • You may want to use these MPI routines in your solution: MPI_Init, MPI_Comm_size, MPI_Comm_rank, MPI_Finalize
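
A minimal sketch of what such a program may look like; the distributed casestudy/helloworld/helloworld.c is the reference, and this sketch only illustrates the four routines listed above.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                  /* start up MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's rank (i) */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* number of processes (n) */
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();                          /* shut down MPI */
    return 0;
}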

  5. Hello World
Source
  casestudy/helloworld/helloworld.c
  casestudy/helloworld/Makefile
Compile and run
  % mpicc -o helloworld helloworld.c
  % mpirun -np 4 helloworld
Sample output
  Hello world from process 0 of 4
  Hello world from process 3 of 4
  Hello world from process 1 of 4
  Hello world from process 2 of 4

  6. Sending in a Ring

  7. Sending in a Ring • The sample program takes data from process zero and sends it to all of the other processes by passing it around a ring. That is, process i receives the data and sends it to process i+1, until the last process is reached. • Assume that the data consists of a single integer. Process zero reads the data from the user. • You may want to use these MPI routines in your solution: MPI_Send, MPI_Recv
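
A sketch of the ring logic; the distributed casestudy/ring/ring.c is the reference. The sketch assumes at least two processes and treats a negative integer as the signal to stop.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    do {
        if (rank == 0) {
            scanf("%d", &value);             /* process 0 reads from the user */
            MPI_Send(&value, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
        } else {
            MPI_Recv(&value, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, &status);
            if (rank < size - 1)             /* the last process does not forward */
                MPI_Send(&value, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
        }
        printf("Process %d got %d\n", rank, value);
    } while (value >= 0);                    /* a negative value ends the ring */
    MPI_Finalize();
    return 0;
}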

  8. Sending in a Ring

  9. Sending in a Ring
Source
  casestudy/ring/ring.c
  casestudy/ring/Makefile
Compile and run
  % mpicc -o ring ring.c
  % mpirun -np 4 ring
Sample Output
  10
  Process 0 got 10
  22
  Process 0 got 22
  -1
  Process 0 got -1
  Process 3 got 10
  Process 3 got 22
  Process 3 got -1
  Process 2 got 10
  Process 2 got 22
  Process 2 got -1
  Process 1 got 10
  Process 1 got 22
  Process 1 got -1

  10. Finding PI using MPI collective operations

  11. Finding PI using MPI collective operations • The method evaluates PI using the integral of 4/(1+x*x) between 0 and 1. The integral is approximated by a sum over n intervals. • The approximation to the integral in each interval is (1/n)*4/(1+x*x). • The master process asks the user for the number of intervals. • The master then broadcasts this number to all of the other processes. • Each process then adds up every size'th interval (x = rank/n, rank/n + size/n, ...). • Finally, the partial sums computed by each process are added together using a reduction.
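
A sketch in the spirit of the classic cpi example; the distributed casestudy/pi/pi.c is the reference, and evaluating each interval at its midpoint is an assumption of this sketch.

#include <stdio.h>
#include <math.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    const double PI25DT = 3.141592653589793238462643;
    int n, rank, size, i;
    double h, x, sum, mypi, pi;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    while (1) {
        if (rank == 0) {
            printf("Enter the number of intervals: (0 quits) ");
            scanf("%d", &n);
        }
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);    /* master broadcasts n */
        if (n == 0)
            break;
        h = 1.0 / (double) n;
        sum = 0.0;
        for (i = rank + 1; i <= n; i += size) {          /* every size'th interval */
            x = h * ((double) i - 0.5);
            sum += 4.0 / (1.0 + x * x);
        }
        mypi = h * sum;
        /* add the partial sums together on the master */
        MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("pi is approximately %.16f, Error is %.16f\n",
                   pi, fabs(pi - PI25DT));
    }
    MPI_Finalize();
    return 0;
}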

  12. Finding PI using MPI collective operations
Source
  casestudy/pi/pi.c
  casestudy/pi/Makefile
Sample Output:
  Enter the number of intervals: (0 quits) 100
  pi is approximately 3.1416009869231249, Error is 0.0000083333333318
  Enter the number of intervals: (0 quits) 1000
  pi is approximately 3.1415927369231262, Error is 0.0000000833333331
  Enter the number of intervals: (0 quits) 10000
  pi is approximately 3.1415926544231256, Error is 0.0000000008333325
  Enter the number of intervals: (0 quits) 100000
  pi is approximately 3.1415926535981269, Error is 0.0000000000083338
  Enter the number of intervals: (0 quits) 1000000
  pi is approximately 3.1415926535898708, Error is 0.0000000000000777
  Enter the number of intervals: (0 quits) 10000000
  pi is approximately 3.1415926535897922, Error is 0.0000000000000009

  13. Implementing Fairness using Waitsome

  14. Implementing Fairness using Waitsome • Write a program to provide fair reception of messages from all sending processes. Arrange the program to have all processes except process 0 send 100 messages to process 0. Have process 0 print out the messages as it receives them. Use nonblocking receives and MPI_Waitsome. Is the MPI implementation fair? • You may want to use these MPI routines in your solution: MPI_Waitsome, MPI_Irecv, MPI_Cancel
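
A sketch of one possible structure; the distributed casestudy/fairness/fairness.c is the reference. The sketch keeps one outstanding MPI_Irecv per sender, services whatever has arrived with MPI_Waitsome, and cancels the receives that remain once all 100*(size-1) messages are in.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, i, nrecv, ndone;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank == 0) {
        int *buf          = malloc((size - 1) * sizeof(int));
        int *indices      = malloc((size - 1) * sizeof(int));
        MPI_Request *req  = malloc((size - 1) * sizeof(MPI_Request));
        MPI_Status *stats = malloc((size - 1) * sizeof(MPI_Status));

        /* one outstanding nonblocking receive per sending process */
        for (i = 0; i < size - 1; i++)
            MPI_Irecv(&buf[i], 1, MPI_INT, i + 1, MPI_ANY_TAG,
                      MPI_COMM_WORLD, &req[i]);
        nrecv = 0;
        while (nrecv < 100 * (size - 1)) {
            MPI_Waitsome(size - 1, req, &ndone, indices, stats);
            for (i = 0; i < ndone; i++) {
                int idx = indices[i];
                printf("Msg from %d with tag %d\n",
                       stats[i].MPI_SOURCE, stats[i].MPI_TAG);
                nrecv++;
                /* repost the receive for this sender */
                MPI_Irecv(&buf[idx], 1, MPI_INT, idx + 1, MPI_ANY_TAG,
                          MPI_COMM_WORLD, &req[idx]);
            }
        }
        /* the reposted receives will never be matched: cancel them */
        for (i = 0; i < size - 1; i++) {
            MPI_Cancel(&req[i]);
            MPI_Wait(&req[i], MPI_STATUS_IGNORE);
        }
    } else {
        for (i = 0; i < 100; i++)
            MPI_Send(&i, 1, MPI_INT, 0, i, MPI_COMM_WORLD);  /* tag = message number */
    }
    MPI_Finalize();
    return 0;
}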

  15. Implementing Fairness using Waitsome
Source:
  casestudy/fairness/fairness.c
  casestudy/fairness/Makefile
Sample Output:
  Msg from 1 with tag 0
  Msg from 1 with tag 1
  Msg from 1 with tag 2
  Msg from 1 with tag 3
  Msg from 1 with tag 4
  …
  Msg from 2 with tag 21
  Msg from 1 with tag 55
  Msg from 2 with tag 22
  Msg from 1 with tag 56
  …

  16. Master/slave

  17. Master/slave • Message passing is well-suited to handling computations where a task is divided into subtasks, with most of the processes used to compute the subtasks and a few processes (often just one) managing them. The manager is called the "master" and the others the "workers" or the "slaves". • In this example, you will build an Input/Output master/slave system. This will allow you to relatively easily arrange for different kinds of input and output from the program, including • Ordered output (process 2 after process 1) • Duplicate removal (a single instance of "Hello world" instead of one from each process) • Input to all processes from a terminal

  18. Master/slave • This will be accomplished by dividing the processes in MPI_COMM_WORLD into two sets: • The master (who will do all of the I/O) and the slaves (who will do all of their I/O by contacting the master). • The slaves will also do any other computation that they might desire; for example, they might implement the Jacobi iteration. • The master should accept messages from the slaves (of type MPI_CHAR) and print them in rank order (that is, first from slave 0, then from slave 1, etc.). The slaves should each send 2 messages to the master. For simplicity, have each slave send the messages "Hello from slave 3" and "Goodbye from slave 3", with its own rank in place of 3. • You may want to use these MPI routines in your solution: MPI_Comm_split, MPI_Send, MPI_Recv
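
A hedged sketch; the distributed casestudy/io/io.c is the reference. Making the highest rank the I/O master is an assumption of this sketch (it is consistent with the sample output on the next slide, where slaves 0-2 appear with -np 4).

#include <stdio.h>
#include <string.h>
#include <mpi.h>

#define MSG_LEN 100

int main(int argc, char *argv[])
{
    int rank, size, i, j, srank;
    char msg[MSG_LEN];
    MPI_Comm slave_comm;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    /* split MPI_COMM_WORLD: the last rank is the I/O master, the rest are slaves */
    MPI_Comm_split(MPI_COMM_WORLD, rank == size - 1, 0, &slave_comm);

    if (rank == size - 1) {
        /* master: two rounds of messages, each received in slave-rank order */
        for (j = 0; j < 2; j++)
            for (i = 0; i < size - 1; i++) {
                MPI_Recv(msg, MSG_LEN, MPI_CHAR, i, 0, MPI_COMM_WORLD, &status);
                printf("%s\n", msg);
            }
    } else {
        MPI_Comm_rank(slave_comm, &srank);       /* rank within the slave group */
        sprintf(msg, "Hello from slave %d", srank);
        MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, size - 1, 0, MPI_COMM_WORLD);
        sprintf(msg, "Goodbye from slave %d", srank);
        MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, size - 1, 0, MPI_COMM_WORLD);
        /* any real computation (e.g. a Jacobi iteration) would go here,
           using slave_comm for slave-to-slave communication */
    }
    MPI_Comm_free(&slave_comm);
    MPI_Finalize();
    return 0;
}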

  19. Master/slave
Source
  casestudy/io/io.c
  casestudy/io/Makefile
Sample Output
  % mpicc -o io io.c
  % mpirun -np 4 io
  Hello from slave 0
  Hello from slave 1
  Hello from slave 2
  Goodbye from slave 0
  Goodbye from slave 1
  Goodbye from slave 2

  20. A simple output server

  21. A simple output server • Modify the previous example to accept three types of messages from the slaves. These types are • Ordered output (just like the previous exercise) • Unordered output (as if each slave printed directly) • Exit notification (see below) • The master continues to receive messages until it has received an exit message from each slave. For simplicity in programming, have each slave send the messages "Hello from slave 3", "Goodbye from slave 3", and "I'm exiting (3)", with its own rank in place of 3. • You may want to use these MPI routines in your solution: MPI_Comm_split, MPI_Send, MPI_Recv. Try the program with both the ordered and the unordered output modes.
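
A possible sketch; the distributed casestudy/io2/io2.c is the reference. The tag values used to distinguish the message types are hypothetical, the MPI_Comm_split-based slave communicator is omitted for brevity, and only the unordered and exit paths are shown (ordered output would buffer messages by source rank and flush them in rank order).

#include <stdio.h>
#include <string.h>
#include <mpi.h>

#define TAG_UNORDERED 1   /* hypothetical tag: print as soon as received */
#define TAG_EXIT      2   /* hypothetical tag: exit notification */

int main(int argc, char *argv[])
{
    int rank, size, exited = 0;
    char msg[100];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == size - 1) {
        /* master: keep receiving until an exit message has arrived from every slave */
        while (exited < size - 1) {
            MPI_Recv(msg, 100, MPI_CHAR, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            printf("%s\n", msg);             /* unordered output: print as received */
            if (status.MPI_TAG == TAG_EXIT)
                exited++;
        }
    } else {
        sprintf(msg, "Hello from slave %d", rank);
        MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, size - 1, TAG_UNORDERED, MPI_COMM_WORLD);
        sprintf(msg, "Goodbye from slave %d", rank);
        MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, size - 1, TAG_UNORDERED, MPI_COMM_WORLD);
        sprintf(msg, "I'm exiting (%d)", rank);
        MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, size - 1, TAG_EXIT, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}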

  22. A simple output server
Source
  casestudy/io2/io2.c
  casestudy/io2/Makefile
Sample Output
  % mpicc -o io2 io2.c
  % mpirun -np 4 io2
  Hello from slave 0
  Hello from slave 1
  Hello from slave 2
  Goodbye from slave 0
  Goodbye from slave 1
  Goodbye from slave 2
  I'm exiting (0)
  I'm exiting (2)
  I'm exiting (1)

  23. Benchmarking collective barrier

  24. Benchmarking collective barrier • The sample program measures the time it takes to perform an MPI_Barrier on MPI_COMM_WORLD. • It prints the size of MPI_COMM_WORLD and the time for each test, and makes sure that all participating processes are ready when the test begins. • How does the performance of MPI_Barrier vary with the size of MPI_COMM_WORLD?
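
A sketch of the timing loop; the distributed casestudy/barrier/barrier.c is the reference, and the number of repetitions is an arbitrary choice of this sketch.

#include <stdio.h>
#include <mpi.h>

#define NTRIALS 1000

int main(int argc, char *argv[])
{
    int rank, size, i;
    double t_start, t_per_barrier;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Barrier(MPI_COMM_WORLD);             /* make sure every process is ready */
    t_start = MPI_Wtime();
    for (i = 0; i < NTRIALS; i++)
        MPI_Barrier(MPI_COMM_WORLD);
    t_per_barrier = (MPI_Wtime() - t_start) / NTRIALS;

    if (rank == 0)
        printf("Barrier %d %f\n", size, t_per_barrier);  /* np and time per barrier */
    MPI_Finalize();
    return 0;
}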

  25. Benchmarking collective barrier
Source:
  casestudy/barrier/barrier.c
  casestudy/barrier/Makefile
Sample Output:
  % mpirun -np 1 barrier
  Kind     np   time (sec)
  Barrier   1   0.000000
  Barrier   5   0.000212
  Barrier  10   0.000258
  Barrier  15   0.000327
  Barrier  20   0.000401
  Barrier  40   0.000442

  26. Determining the amount of MPI buffering

  27. Determining the amount of MPI buffering • The sample program determines the amount of buffering that MPI_Send provides. That is, it determines how large a message can be sent with MPI_Send without a matching receive posted at the destination. • You may want to use these MPI routines in your solution: MPI_Wtime, MPI_Send, MPI_Recv

  28. Determining the amount of MPI buffering
Hint: Use MPI_Wtime to establish a delay until an MPI_Recv is called at the destination process. By timing the MPI_Send, you can detect when the MPI_Send was waiting for the MPI_Recv.
Source:
  casestudy/buflimit/buflimit.c
  casestudy/buflimit/Makefile
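
One way to act on this hint is sketched below; the distributed casestudy/buflimit/buflimit.c is the reference, and the delay length and the range of buffer sizes are assumptions of this sketch. If MPI_Send returns in far less time than the receiver's delay, the message was buffered; if it takes about as long as the delay, the send was waiting for the matching MPI_Recv.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define DELAY 1.0        /* seconds the receiver waits before posting MPI_Recv */

int main(int argc, char *argv[])
{
    int rank, partner, bufsize, found = 0;
    double t0, t_send;
    char *buf;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    partner = 1 - rank;                            /* run with exactly 2 processes */

    for (bufsize = 1024; bufsize <= (1 << 20); bufsize *= 2) {
        buf = malloc(bufsize);
        MPI_Barrier(MPI_COMM_WORLD);               /* both sides start together */
        if (rank == 0) {
            t0 = MPI_Wtime();
            MPI_Send(buf, bufsize, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
            t_send = MPI_Wtime() - t0;
            if (!found && t_send > DELAY / 2) {    /* the send waited for the receive */
                found = 1;
                printf("MPI_Send blocks with buffers of size %d\n", bufsize);
            }
            MPI_Recv(buf, bufsize, MPI_CHAR, partner, 0, MPI_COMM_WORLD, &status);
        } else {
            t0 = MPI_Wtime();
            while (MPI_Wtime() - t0 < DELAY)
                ;                                  /* delay before posting the receive */
            MPI_Recv(buf, bufsize, MPI_CHAR, partner, 0, MPI_COMM_WORLD, &status);
            MPI_Send(buf, bufsize, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
        }
        printf("%d received %d fr %d\n", rank, bufsize, partner);
        free(buf);
    }
    MPI_Finalize();
    return 0;
}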

  29. Determining the amount of MPI buffering
Sample Output:
  % mpirun -np 2 buflimit
  Process 0 on tdgrocks.sci.hkbu.edu.hk
  Process 1 on comp-pvfs-0-1.local
  0 received 1024 fr 1
  1 received 1024 fr 0
  0 received 2048 fr 1
  1 received 2048 fr 0
  0 received 4096 fr 1
  1 received 4096 fr 0
  0 received 8192 fr 1
  1 received 8192 fr 0
  0 received 16384 fr 1
  1 received 16384 fr 0
  0 received 32768 fr 1
  1 received 32768 fr 0
  MPI_Send blocks with buffers of size 65536
  0 received 65536 fr 1
  1 received 65536 fr 0

  30. Exploring the cost of synchronization delays

  31. Exploring the cost of synchronization delays • In this example, two processes are communicating with a third. • Process 0 is sending a long message to process 1, and process 2 is sending a relatively short message to process 1 and then to process 0. • The code is arranged so that process 1 has already posted an MPI_Irecv for the message from process 2 before receiving the message from process 0, but it also ensures that process 1 receives the long message from process 0 before receiving the message from process 2.

  32. Exploring the cost of synchronization delays • This is a seemingly complex communication pattern, but it can occur in an application due to timing variations on each processor. • If the message sent by process 2 to process 1 is short, but long enough to require a rendezvous protocol (the sender must wait until the matching receive is processed), there can be a significant delay before the short message from process 2 is received by process 1, even though the receive for that message has already been posted. • Explore the possibilities by considering various lengths of messages.
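
A sketch of the communication pattern; the distributed casestudy/bad/bad.c is the reference, and the message sizes and tags here are arbitrary assumptions. Process 2 times its two sends; the difference between the two times exposes the synchronization delay.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define LONGSIZE  (1 << 20)       /* assumed length of the long message */
#define SHORTSIZE 1024            /* assumed length of the short message */

int main(int argc, char *argv[])
{
    int rank;
    char *longbuf  = malloc(LONGSIZE);
    char *shortbuf = malloc(SHORTSIZE);
    double t0, t1, t2;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1)   /* post the receive for process 2's message before anything else */
        MPI_Irecv(shortbuf, SHORTSIZE, MPI_CHAR, 2, 1, MPI_COMM_WORLD, &req);
    MPI_Barrier(MPI_COMM_WORLD);

    if (rank == 1) {
        /* complete the long message from process 0 before the short one */
        MPI_Recv(longbuf, LONGSIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
        MPI_Wait(&req, &status);
    } else if (rank == 0) {
        MPI_Send(longbuf, LONGSIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(shortbuf, SHORTSIZE, MPI_CHAR, 2, 2, MPI_COMM_WORLD, &status);
    } else if (rank == 2) {
        t0 = MPI_Wtime();
        MPI_Send(shortbuf, SHORTSIZE, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
        t1 = MPI_Wtime();
        MPI_Send(shortbuf, SHORTSIZE, MPI_CHAR, 0, 2, MPI_COMM_WORLD);
        t2 = MPI_Wtime();
        printf("[2] Time for first send = %f, for second = %f\n", t1 - t0, t2 - t1);
    }
    free(longbuf);
    free(shortbuf);
    MPI_Finalize();
    return 0;
}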

  33. Exploring the cost of synchronization delays

  34. Exploring the cost of synchronization delays
Source
  casestudy/bad/bad.c
  casestudy/bad/Makefile
Sample Output
  % mpirun -np 3 maxtime
  [2] Litsize = 1, Time for first send = 0.000020, for second = 0.000009

  35. Graphics

  36. Graphics • A simple MPI example program that uses a number of procedures in the MPE graphics library. • The program draws lines and squares with different colors in graphics mode. • The user can select a region, and the program will report the selected coordinates.

  37. Graphics
Source:
  casestudy/graph/mpegraph.c
  casestudy/graph/Makefile

  38. GalaxSee

  39. GalaxSee • The GalaxSee program lets the user model a number of bodies in space moving under the influence of their mutual gravitational attraction. • It is effective for relatively small numbers of bodies (on the order of a few hundred), rather than the large numbers (over a million) currently being used by scientists to simulate galaxies. • GalaxSee allows the user to see the effects that various initial configurations (mass, velocity, spatial distribution, rotation, dark matter, and presence of an intruder galaxy) have on the behavior of the system.

  40. GalaxSee • Command line options: num_stars star_mass t_final do_display, where num_stars is the number of stars (integer), star_mass is the star mass (decimal), t_final is the final time for the model in Myears (decimal), and do_display is 1 to show a graphical display or 0 to run without one.

  41. GalaxSee
Source:
  casestudy/galaxsee/Gal_pack.tgz
Reference:
  http://www.shodor.org/master/galaxsee/

  42. Cracking RSA

  43. Cryptanalysis • Cryptanalysis is the study of how to compromise (defeat) cryptographic mechanisms, and cryptology is the discipline of cryptography and cryptanalysis combined. • To most people, cryptography is concerned with keeping communications private. Indeed, the protection of sensitive communications has been the emphasis of cryptography throughout much of its history.

  44. Encryption and Decryption • Encryption is the transformation of data into a form that is nearly impossible to read without the appropriate knowledge (a key; see below). Its purpose is to ensure privacy by keeping information hidden from anyone for whom it is not intended, even those who have access to the encrypted data. • Decryption is the reverse of encryption; it is the transformation of encrypted data back into an intelligible form.

  45. Cryptography • Today's cryptography is more than encryption and decryption. Authentication is as fundamental a part of our lives as privacy. • We use authentication throughout our everyday lives - when we sign our name to some document, for instance - and, as we move to a world where our decisions and agreements are communicated electronically, we need electronic techniques for providing authentication.

  46. Public-Key vs. Secret-Key Cryptography • A cryptosystem is simply an algorithm that can convert input data into something unrecognizable (encryption), and convert the unrecognizable data back to its original form (decryption). • To encrypt, feed input data (known as "plaintext") and an encryption key to the encryption portion of the algorithm. • To decrypt, feed the encrypted data (known as "ciphertext") and the proper decryption key to the decryption portion of the algorithm. The key is simply a secret number or series of numbers. Depending on the algorithm, the numbers may be random or may adhere to mathematical formulae.

  47. Public-Key vs. Secret-Key Cryptography • The drawback to secret-key cryptography is the necessity of sharing keys. • For instance, suppose Alice is sending email to Bob. She wants to encrypt it first so any eavesdropper will not be able to understand the message. But if she encrypts using secret-key cryptography, she has to somehow get the key into Bob's hands. If an eavesdropper can intercept a regular message, then an eavesdropper will probably be able to intercept the message that communicates the key.

  48. Public-Key vs. Secret-Key Cryptography • In contrast to secret-key cryptography stands public-key cryptography. In such a system there are two keys, a public key and its inverse, the private key. • In such a system, when Alice sends email to Bob, she finds his public key (possibly in a directory of some sort) and encrypts her message using that key. Unlike secret-key cryptography, though, the key used to encrypt will not decrypt the ciphertext. Knowledge of Bob's public key will not help an eavesdropper. To decrypt, Bob uses his private key. If Bob wants to respond to Alice, he will encrypt his message using her public key.

  49. The One-Way Function • The challenge of public-key cryptography is developing a system in which it is impossible (or at least intractable) to deduce the private key from the public key. • This can be accomplished by utilizing a one-way function. With a one-way function, given some input values, it is relatively simple to compute a result. But if you start with the result, it is extremely difficult to compute the original input values. In mathematical terms, given x, computing f(x) is easy, but given f(x), it is extremely difficult to determine x.

  50. RSA • The RSA cryptosystem is a public-key cryptosystem that offers both encryption and digital signatures (authentication). Ronald Rivest, Adi Shamir, and Leonard Adleman developed the RSA system in 1977 [RSA78]; RSA stands for the first letter in each of its inventors' last names.
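
The case-study source is not shown in these slides, but the arithmetic behind RSA, and behind attacking it by factoring, can be illustrated with a small self-contained sketch. The numbers below are textbook toy values (real keys are hundreds of digits long), and the brute-force factoring loop is only feasible because n is tiny.

#include <stdio.h>

/* modular exponentiation: (base^exp) mod m, for small operands */
static long powmod(long base, long exp, long m)
{
    long result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

int main(void)
{
    /* toy key: p = 61, q = 53, n = p*q = 3233, phi = 60*52 = 3120,
       public exponent e = 17, private exponent d = 2753 (17*2753 mod 3120 = 1) */
    long n = 3233, e = 17, d = 2753;
    long msg = 65, p;

    long cipher = powmod(msg, e, n);     /* encrypt with the public key (e, n)  */
    long plain  = powmod(cipher, d, n);  /* decrypt with the private key (d, n) */
    printf("plaintext %ld -> ciphertext %ld -> decrypted %ld\n", msg, cipher, plain);

    /* "cracking" the toy key: factor n by trial division; knowing p and q
       gives phi, from which the private exponent d can be recomputed */
    for (p = 2; n % p != 0; p++)
        ;
    printf("n = %ld factors as %ld * %ld\n", n, p, n / p);
    return 0;
}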
