This project proposal outlines a naive version of parallel matrix multiplication leveraging MPI (Message Passing Interface). We detail the step-by-step computation process, including how Processor0 reads input, distributes matrices, broadcasts data, and gathers results from all processors. Data generation is conducted in R, producing matrices of varying sizes (up to 4096x4096) with integer values ranging from -1000 to 1000. We also investigate performance characteristics, including superlinear speedup, while noting the trade-offs introduced by communication overhead.
CSE5304 Project Proposal • Parallel Matrix Multiplication • Tian Mi
A naive version with MPI • [Figure: the result matrix split into row blocks, one per processor P1 P2 … Pi … PN]
A naive version with MPI • [Figure: the piece of data each processor Pi multiplies]
A naive version with MPI • Processor0 reads the input file • Processor0 distributes one matrix (scatter) • Processor0 broadcasts the other matrix • All processors, in parallel, multiply their pieces of the data • Processor0 gathers the results • Processor0 writes the result to the output file
Data generation • Data generated in R with package “igraph” • Integers in the range [-1000, 1000] • Matrix sizes: 1024×1024, 2048×2048, 4096×4096
Result • Data size: 1024×1024 • [Figures: timing results]
Result • Data size: 2048×2048 • [Figures: timing results]
Result • Data size: 4096×4096 • [Figure: timing results]
Analysis • To observe superlinear speedup, computation must dominate the run time, which it does not yet at these sizes • Larger matrices and larger integer values would increase the computation per element • However, larger matrices or wider integers also increase communication time (broadcast, scatter, gather)
Cannon's algorithm: Example • http://www.vampire.vanderbilt.edu/education-outreach/me343_fall2008/notes/parallelMM_10_09.pdf
Cannon's algorithm • Still implementing and debugging • No results to share at present
Thank you • Questions & Comments?