CSE-700 Parallel Programming Assignment 6
CSE-700 Parallel Programming, Assignment 6. 박성우, POSTECH, Oct 19, 2007
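Species and Sequences [Diagram: a species with its sequences Sequence 1, Sequence 2, ..., Sequence n]
Ortholog [Diagram: sequence S in the Last Common Ancestor; by speciation, S1 in Human and S2 in Dog]
Paralog [Diagram: sequence S in Human; by duplication, S1 and S1' in Human]
Inparalog [Diagram: sequence S in the Last Common Ancestor; by speciation, Human and Chimpanzee; by duplication, S1'; sequences shown: S1', S1, S2]
Paralog - Outparalog (LCA = Last Common Ancestor) [Diagram: LCA, Human, Dog; sequences shown: S, S1, S2, S', S1', S2']
Coortholog [Diagram: Species A and Species B; sequences shown: S1', S1, S2, S2']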
Input
• Assume a total of n species S1, S2, ..., Sn
• For each pair of species {Si, Sj}
• Ortholog and paralog relations
• Thus n(n + 1)/2 ortholog/paralog files
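As a quick check on the file count, the sketch below enumerates species pairs for a small n. It assumes that the count n(n + 1)/2 includes the pairing of a species with itself; if only pairs of distinct species were meant, the count would be n(n - 1)/2 instead. The function name `species_pairs` is hypothetical and not part of the assignment.

```python
from itertools import combinations, combinations_with_replacement

def species_pairs(n, include_self=True):
    """Return index pairs (i, j) over species 1..n.

    include_self=True  -> pairs with i <= j (a species may pair with itself)
    include_self=False -> pairs with i < j  (distinct species only)
    """
    ids = range(1, n + 1)
    if include_self:
        return list(combinations_with_replacement(ids, 2))
    return list(combinations(ids, 2))

n = 4
# Assumption: one ortholog/paralog file per pair.
assert len(species_pairs(n)) == n * (n + 1) // 2
assert len(species_pairs(n, include_self=False)) == n * (n - 1) // 2
```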
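Seed Ortholog [Diagram: a cluster linking sequences Si and Sj across Species A and Species B, with score 1.0]
Invariant: No Two Seed Orthologs for Any Sequence [Diagram: across Species A and Species B, Si linked with score 1.0 to both Sj and Sk]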
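The invariant can be checked mechanically on one pair file. The sketch below is only an illustration under assumptions: each record is a hypothetical tuple (cluster, species, sequence, is_seed), and a violation is taken to mean that the same sequence is marked as a seed ortholog in more than one cluster. The function name and the sample sequence Sm are made up for the example.

```python
from collections import defaultdict

def sequences_with_multiple_seeds(records):
    """Return the (species, sequence) keys marked as seed in more than one cluster.

    records: iterable of (cluster, species, sequence, is_seed) tuples,
             all taken from ONE pair file (assumed layout, not the real format).
    """
    clusters_by_seq = defaultdict(set)
    for cluster, species, sequence, is_seed in records:
        if is_seed:
            clusters_by_seq[(species, sequence)].add(cluster)
    return sorted(key for key, clusters in clusters_by_seq.items()
                  if len(clusters) > 1)

# Si is a seed in one cluster only: invariant holds.
good = [(1, "A", "Si", True), (1, "B", "Sj", True),
        (2, "A", "Sm", True), (2, "B", "Sk", True)]
# Si is a seed against both Sj and Sk: invariant is violated.
bad = [(1, "A", "Si", True), (1, "B", "Sj", True),
       (2, "A", "Si", True), (2, "B", "Sk", True)]

assert sequences_with_multiple_seeds(good) == []
assert sequences_with_multiple_seeds(bad) == [("A", "Si")]
```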
Ortholog and Paralogs [Diagram: a cluster across Species A and Species B with score 1.0; sequences shown: Si, Sj, Si']
Output
• Assume a total of n species S1, S2, ..., Sn
• Ortholog and paralog relations among all these species
• In each cluster:
  • a seed ortholog from each pair of species
  • paralogs may be included
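Example of Cluster [1] [Diagram: species A, B, C, D; sequences shown: S1, S1', S2, S2', S3, S3', S4, S4']
Example of Cluster [2] [Diagram: species A, B, C, D; sequences shown: S1, S1', S2, S2', S3, S3', S4, S4']
Bad Clusters [1] [Diagram: species A, B, C, D, E; sequences shown: S1, S1', S2, S2', S3, S3', S4, S4', S5, S5']
Bad Clusters [2] [Diagram: species C, D, E; sequences shown: S3, S4, S4', S4'', S5, S6, S6']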
Input File Format
• Each line consists of:
  • Cluster number
  • Similarity score
  • Species name
  • Seed ortholog
  • Sequence name
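To make the five fields concrete, here is a minimal sketch of how one line might be held in memory. It is not the provided parser. The slides do not give the delimiter or the encoding of the seed ortholog field, so the whitespace separation and the "1"/"0" seed flag below are assumptions, as are the names `Record` and `parse_line` and the sample line.

```python
from typing import NamedTuple

class Record(NamedTuple):
    cluster: int      # cluster number
    score: float      # similarity score
    species: str      # species name
    is_seed: bool     # seed ortholog flag (assumed to be "1" or "0" in the file)
    sequence: str     # sequence name

def parse_line(line: str) -> Record:
    """Parse one line, assuming five whitespace-separated fields in slide order."""
    cluster, score, species, seed, sequence = line.split()
    return Record(int(cluster), float(score), species, seed == "1", sequence)

# Hypothetical sample line, not taken from a real input file.
rec = parse_line("1 1.0 A 1 Si")
assert rec.cluster == 1 and rec.is_seed and rec.sequence == "Si"
```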
Goal
• Implement ANY sequential algorithm
  • There is no definitive answer.
• Then parallelize it.
• A parser and an output module are provided.
  • no string comparison
  • all integer operations
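Since any sequential algorithm is acceptable, one possible starting point is a naive transitive merge of the pairwise clusters with a union-find structure. The sketch below is only a baseline under assumptions: sequences are already mapped to integer ids 0..num_sequences-1 (in the spirit of "all integer operations"), each pairwise cluster is a list of such ids, and all names are hypothetical. It does not try to detect or reject the bad clusters shown in the slides.

```python
def find(parent, x):
    """Return the representative of x, shortening paths as it goes (path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(parent, a, b):
    """Merge the sets containing a and b; the smaller root id becomes the root."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)

def merge_clusters(num_sequences, pairwise_clusters):
    """Merge pairwise clusters that share a sequence.

    pairwise_clusters: iterable of lists of integer sequence ids,
                       one list per cluster read from the pair files.
    Returns merged clusters with at least two members, sorted.
    """
    parent = list(range(num_sequences))
    for members in pairwise_clusters:
        for m in members[1:]:
            union(parent, members[0], m)
    groups = {}
    for s in range(num_sequences):
        groups.setdefault(find(parent, s), []).append(s)
    return sorted(g for g in groups.values() if len(g) > 1)

# Clusters {0,1} and {1,2} share sequence 1, so they merge; {3,4} stays separate.
assert merge_clusters(6, [[0, 1], [1, 2], [3, 4]]) == [[0, 1, 2], [3, 4]]
```

Only integer comparisons and array indexing are used, so no string comparison is needed once the ids are assigned.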