Segment alignment SEA. B89902010 鄭智懷 B89902037 黃敬強 B89902117 胡書瑜.
Introduction • Outline of the paper: As evolutionary distance increases, homologous proteins become hard to compare at the sequence level… We focus on the folds of the protein, PLSSs, LSSs…etc. • A new look at local structure prediction • Network matching problem
Introduction • LSS (Local Structure Segment): maximal structural units that are shared by proteins with different folds. • PLSS (Predicted Local Structure Segment): an LSS predicted by the “Nearest-Neighbor Method”.
Algorithm Given two networks of PLSSs, find one optimal path from the source to the sink in each network, such that the PLSSs along the two paths are most similar to each other. This does not follow the typical position-by-position alignment mode.
Algorithm Definition 1: first(s) and last(s) denote the first and last positions of a segment s. Example: first(eee) = 4, last(eee) = 6
Algorithm Definition 2: A segment s covers position i if first(s) ≤ i ≤ last(s). Example: first(eee) = 4 < 5 and last(eee) = 6 > 5, so segment “eee” covers 5.
Algorithm Definition 3: The set of PLSSs covering position i is denoted E(i). Example: E(17) = { “eeeeee”, “hhh”, “hhhhh” }, E(4) = { “eee”, “eeeeee” }. For any pair of positions i and j, their covering segments are considered in a combinatorial way (|E(i)| × |E(j)| combinations in total).
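Definitions 1 to 3 can be illustrated with a minimal Python sketch. The `Segment` type, the helper names, and every segment position except “eee” at 4..6 are assumptions made for illustration; they are chosen only so that E(4) and E(17) match the examples on the slide.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    label: str  # predicted local structure string, e.g. "eee"
    first: int  # first covered position (1-based)
    last: int   # last covered position (1-based, inclusive)


def covers(seg: Segment, i: int) -> bool:
    """Definition 2: seg covers i if first(seg) <= i <= last(seg)."""
    return seg.first <= i <= seg.last


def covering_set(segments: List[Segment], i: int) -> List[Segment]:
    """Definition 3: E(i), the segments covering position i."""
    return [s for s in segments if covers(s, i)]


# Hypothetical network: only "eee" at 4..6 comes from the slides,
# the other positions are invented for this example.
segments = [
    Segment("eee", 4, 6),
    Segment("eeeeee", 2, 7),
    Segment("eeeeee", 12, 17),
    Segment("hhh", 16, 18),
    Segment("hhhhh", 15, 19),
]

E4 = covering_set(segments, 4)
E17 = covering_set(segments, 17)
# Number of segment pairs considered for positions 4 and 17.
pair_count = len(E4) * len(E17)
```

With this toy network, E(4) holds two segments and E(17) holds three, so the pair of positions (4, 17) contributes 2 × 3 = 6 segment combinations.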
Algorithm Using the dynamic programming concept, we define V(i, j) as the maximum similarity score for transforming S1[1…i] into S2[1…j], computed by a recurrence over i and j.
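Algorithm Target: The similarity score of aligned positions i and j is ∆(iα, jβ), where α and β range over the segments covering i and j. • Take whichever score is higher over all possible segment choices. • If a deletion gap already exists at position (i-1, j), subtract the extension penalty. • Otherwise, take the total score at position (i-1, j) and subtract the penalty for opening a deletion gap. • g stands for the gap initiating penalty. h stands for the gap extension penalty.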
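The gap rules above follow the usual affine-gap pattern, which can be sketched in Python. This is a simplified illustration, not the recurrence from the paper: each cell is scored with the best ∆ over E(i) × E(j), and the sketch does not check that the chosen segments form a consistent path through each network. The function names, the label-string representation of segments, and `toy_delta` are all assumptions.

```python
NEG_INF = float("-inf")


def toy_delta(a: str, b: str) -> int:
    """Hypothetical segment similarity: +2 for the same state letter, else -1."""
    return 2 if a[0] == b[0] else -1


def align_score(E1, E2, delta, g, h):
    """Affine-gap score; E1[i-1] and E2[j-1] list the segments covering i and j.

    Assumes every position is covered by at least one segment.
    g is the gap initiating penalty, h the gap extension penalty.
    """
    m, n = len(E1), len(E2)
    V = [[NEG_INF] * (n + 1) for _ in range(m + 1)]  # best score overall
    D = [[NEG_INF] * (n + 1) for _ in range(m + 1)]  # ends in a deletion gap
    I = [[NEG_INF] * (n + 1) for _ in range(m + 1)]  # ends in an insertion gap
    V[0][0] = 0.0
    for i in range(1, m + 1):
        D[i][0] = max(D[i - 1][0] - h, V[i - 1][0] - g)
        V[i][0] = D[i][0]
    for j in range(1, n + 1):
        I[0][j] = max(I[0][j - 1] - h, V[0][j - 1] - g)
        V[0][j] = I[0][j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Extend an existing deletion gap (-h) or open a new one (-g).
            D[i][j] = max(D[i - 1][j] - h, V[i - 1][j] - g)
            I[i][j] = max(I[i][j - 1] - h, V[i][j - 1] - g)
            # Best similarity over all |E(i)| x |E(j)| segment pairs.
            best = max(delta(a, b) for a in E1[i - 1] for b in E2[j - 1])
            V[i][j] = max(V[i - 1][j - 1] + best, D[i][j], I[i][j])
    return V[m][n]


E1 = [["e"], ["e", "h"], ["h"]]  # covering sets for a 3-position sequence
E2 = [["e"], ["h"]]              # covering sets for a 2-position sequence
score = align_score(E1, E2, toy_delta, g=3, h=1)
```

Keeping separate `D` and `I` tables is what lets the sketch distinguish extending an open gap (cost h) from opening a new one (cost g).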
Complexity Assumption: In Sequence 1, the first vertex is covered by a1 PLSSs, the second by a2, the third by a3, …, the (m-1)th by am-1, and the mth by am. In Sequence 2, the first vertex is covered by b1 PLSSs, the second by b2, the third by b3, …, the (n-1)th by bn-1, and the nth by bn.
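Complexity The entry in row i and column j of the matrix takes ai × bj operations. So the total number of operations in the matrix is: First row: a1 × (b1 + b2 + … + bn). Second row: a2 × (b1 + b2 + … + bn). … Last row: am × (b1 + b2 + … + bn). Total: (a1 + a2 + … + am) × (b1 + b2 + … + bn).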
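The row-by-row count can be checked numerically with a short Python sketch. The function names and the coverage counts below are made up for illustration; the point is only that summing ai × bj over every cell equals the product of the two sums.

```python
def total_operations(a, b):
    """Sum a_i * b_j over every cell (i, j) of the m x n matrix."""
    return sum(ai * bj for ai in a for bj in b)


def total_operations_closed_form(a, b):
    """Same count, factored as (a_1 + ... + a_m) * (b_1 + ... + b_n)."""
    return sum(a) * sum(b)


a = [2, 3, 1]  # hypothetical coverage counts for sequence 1 (m = 3)
b = [1, 2]     # hypothetical coverage counts for sequence 2 (n = 2)
cell_by_cell = total_operations(a, b)
factored = total_operations_closed_form(a, b)
```

If every position is covered by at most c segments, the factored form gives an upper bound of c² × m × n operations.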