
Sieve of Eratosthenes


Presentation Transcript


  1. Sieve of Eratosthenes by Fola Olagbemi

  2. Outline • What is the sieve of Eratosthenes? • Algorithm used • Parallelizing the algorithm • Data decomposition options for the parallel algorithm • Block decomposition (to minimize inter-process communication) • The program (returns count of primes) • References

  3. What is the sieve of Eratosthenes? • An algorithm devised by the Greek mathematician Eratosthenes to find all the prime numbers between two and another specified number. • It works by repeatedly eliminating the multiples of the smallest unmarked number (k) in the given interval, until the square of k exceeds the specified number.

  4. Algorithm • Create a list of the natural numbers in the specified range, e.g. 2, 3, 4, …, n, none of which is marked • Set k to 2 (the smallest number in the list), where k is the smallest unmarked number in the list • Repeat: • Mark all multiples of k between the square of k and n • Find the smallest unmarked number greater than k and set k to it • until the square of k exceeds n • The numbers that remain unmarked are prime.

  5. Parallelizing the algorithm • Step 3a (marking multiples of k) is the key parallel computation step • A reduction is needed in each iteration to determine the new value of k (process 0, the controller, may not hold the next k) • A broadcast is needed after the reduction to inform all processes of the new k • The goal in parallelizing is to reduce this inter-process communication as much as possible.

  6. Data decomposition options • Goal of data decomposition approach chosen – each process should have roughly the same number of values to work on (optimal load balancing) • Decomposition options: • Interleaved data decomposition • Block data decomposition: • Grouped • Distributed

  7. Data decomposition options (contd.) • Interleaved data decomposition (example with p processes): • Process 0 handles numbers 2, 2+p, 2+2p, … • Process 1 handles numbers 3, 3+p, 3+2p, … • Advantage • Easy to determine which process is responsible for the value at any array index i • Disadvantages • It can lead to significant load imbalance among processes (depending on how the array elements are divided up among them) • May require a significant amount of reduction/broadcast traffic (communication overhead)

  8. Data decomposition options (contd.) • Block data decomposition: • Grouped: the first few processes hold more values than the remaining processes • Distributed: block sizes alternate, e.g. the first process holds fewer values, the second more, the third fewer, and so on (this is the option used in the program) [Figure: grouped vs. distributed block assignments across Tasks 0–3]

  9. Data decomposition (contd.) • Grouped data decomposition: • Let r = n mod p (n = number of values in the list, p = number of processes) • If r = 0, every process gets a block of size n/p • Otherwise, the first r processes get blocks of size ceil(n/p) and the remaining processes get blocks of size floor(n/p) • Distributed data decomposition: • The first element controlled by process i is floor(i*n/p) • The last element controlled by process i is floor((i+1)*n/p) − 1

  10. Data decomposition (contd.) • To reduce communication (i.e. to cut the number of MPI_Reduce calls to one), all the primes used for sieving (those up to the square root of n) must be held by process 0. • This requirement places a limitation on the number of processes that can be used in this approach: process 0's block must be large enough to contain every value up to the square root of n.

  11. References • Michael J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill, 2004.

  12. Questions?
