1 / 93

Genome Rearrangements, Synteny, and Comparative Mapping

Genome Rearrangements, Synteny, and Comparative Mapping. CSCI 4830: Algorithms for Molecular Biology Debra S. Goldberg. Turnip vs Cabbage. Share a recent common ancestor Look and taste different. Turnip vs Cabbage. Comparing mtDNA gene sequences yields no evolutionary information

georgio
Télécharger la présentation

Genome Rearrangements, Synteny, and Comparative Mapping

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Genome Rearrangements, Synteny, and Comparative Mapping CSCI 4830: Algorithms for Molecular Biology Debra S. Goldberg

  2. Turnip vs Cabbage • Share a recent common ancestor • Look and taste different

  3. Turnip vs Cabbage • Comparing mtDNA gene sequences yields no evolutionary information • 99% similarity between genes • These surprisingly identical gene sequences differed in gene order • This study helped pave the way to analyzing genome rearrangements in molecular evolution

  4. Turnip vs Cabbage: Different mtDNA Gene Order • Gene order comparison:

  5. Turnip vs Cabbage: Different mtDNA Gene Order • Gene order comparison:

  6. Turnip vs Cabbage: Different mtDNA Gene Order • Gene order comparison:

  7. Turnip vs Cabbage: Different mtDNA Gene Order • Gene order comparison:

  8. Before After Turnip vs Cabbage: Gene Order Comparison • Evolution is manifested as the divergence in gene order

  9. Transforming Cabbage into Turnip

  10. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 • Blocks represent conserved genes. Reversals 1 2 3 9 10 8 4 7 5 6 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 • Blocks represent conserved genes. • In the course of evolution or in a clinical context, blocks 1,…,10 could be misread as 1, 2, 3, -8, -7, -6, -5, -4, 9, 10.

  11. Types of Mutations

  12. Types of Rearrangements Inversion/Reversal 1 2 3 4 5 6 1 2 -5 -4 -3 6

  13. Types of Rearrangements Translocation 1 2 3 45 6 1 2 6 4 5 3

  14. Types of Rearrangements Fusion 1 2 3 4 5 6 1 2 3 4 5 6 Fission

  15. Genome rearrangements Mouse (X chrom.) • What are the similarity blocks and how to find them? • What is the architecture of the ancestral genome? • What is the evolutionary scenario for transforming one genome into the other? Unknown ancestor~ 75 million years ago Human (X chrom.)

  16. Why do we care?

  17. SKY (spectral karyotyping)

  18. 13 14 Robertsonian Translocation

  19. 13 14 Robertsonian Translocation • Translocation of chromosomes 13 and 14 • No net gain or loss of genetic material: normal phenotype. • Increased risk for an abnormal child or spontaneous pregnancy loss

  20. 9 22 Philadelphia Chromosome

  21. 9 22 Philadelphia Chromosome • A translocation between chromosomes 9 and 22 (part of 22 is attached to 9) • Seen in about 90% of patients with Chronic myelogenous leukemia (CML)

  22. Colon cancer

  23. Colon Cancer

  24. Comparative maps

  25. Waardenburg’s Syndrome: Mouse Provides Insight into Human Genetic Disorder • Characterized by pigmentary dysphasia • Gene implicated linked to human chromosome 2 • It was not clear where exactly on chromosome 2

  26. Waardenburg’s syndrome and splotch mice • A breed of mice (with splotch gene) had similar symptoms caused by the same type of gene as in humans • Scientists identified location of gene responsible for disorder in mice • Finding the gene in mice gives clues to where same gene is located in humans

  27. Reversals: Example p = 1 2 3 4 5 6 7 8 1 2 5 4 3 6 7 8

  28. Reversals: Example p = 1 2 3 4 5 6 7 8 1 2 5 4 3 6 7 8 1 2 5 4 6 3 7 8

  29. Reversals and Gene Orders • Gene order represented by permutation p: p = p1------ pi-1 pi pi+1 ------pj-1 pj pj+1 -----pn p1------ pi-1pj pj-1 ------pi+1 pipj+1 -----pn • Reversal r ( i, j ) reverses (flips) the elements from i to j inp r(i,j)

  30. Reversal Distance Problem • Goal: Given two permutations, find shortest series of reversals to transform one into another • Input: Permutations pand s • Output: A series of reversals r1,…rttransforming p into s, such that t is minimum • t - reversal distance between p and s • d(p, s) = smallest possible value of t, given p, s

  31. Sorting By Reversals Problem • Goal: Given a permutation, find a shortest series of reversals that transforms it into the identity permutation (1 2 … n ) • Input: Permutation p • Output: A series of reversals r1,… rt transforming p into the identity permutation such that t is minimum • min t =d(p ) = reversal distance of p

  32. Sorting by reversalsExample: 5 steps

  33. Sorting by reversalsExample: 4 steps What is the reversal distance for this permutation? Can it be sorted in 3 steps?

  34. Christos Papadimitrou and Bill Gates flip pancakes Pancake Flipping Problem • Chef prepares unordered stack of pancakes of different sizes • The waiter wants to sort (rearrange) them, smallest on top, largest at bottom • He does it by flipping over several from the top, repeating this as many times as necessary

  35. Sorting By Reversals: A Greedy Algorithm • Unsigned permutations • Example: permutation p = 1 2 3 6 4 5 • First three elements are already in order • prefix(p) = length of already sorted prefix • prefix(p) = 3 • Idea: increase prefix(p) at every step

  36. Doing so, p can be sorted 1 2 36 4 5 1 2 3 46 5 1 2 3 4 5 6 Number of steps to sort permutation of length n is at most (n – 1) Greedy Algorithm: An Example

  37. Greedy Algorithm: Pseudocode SimpleReversalSort(p) 1 fori 1 to n – 1 2 j position of element i in p(i.e., pj = i) 3 ifj ≠i 4 p p * r(i, j) 5 outputp 6 ifp is the identity permutation 7 return

  38. Analyzing SimpleReversalSort • SimpleReversalSort does not guarantee the smallest number of reversals and takes five steps on p = 6 1 2 3 4 5 : • Step 1: 1 6 2 3 4 5 • Step 2: 1 2 6 3 4 5 • Step 3: 1 2 3 6 4 5 • Step 4: 1 2 3 4 6 5 • Step 5: 1 2 3 4 5 6

  39. But it can be sorted in two steps: p = 6 1 2 3 4 5 Step 1: 5 4 3 2 1 6 Step 2: 1 2 3 4 5 6 So, SimpleReversalSort(p) is not optimal Optimal algorithms are unknown for many problems; approximation algorithms used Analyzing SimpleReversalSort(cont’d)

  40. Approximation Algorithms • These algorithms find approximate solutions rather than optimal solutions • The approximation ratio of an algorithm A on input p is: A(p) / OPT(p) where A(p) -solution produced by algorithm A OPT(p) - optimal solution of the problem

  41. Approximation Ratio / Performance Guarantee • Approximation ratio (performance guarantee) of algorithm A: max approximation ratio of all inputs of size n • For algorithm A that minimizes objective function (minimization algorithm): • max|p| = n A(p) / OPT(p) • For maximization algorithm: • min|p| = n A(p) / OPT(p)

  42. Adjacencies and Breakpoints p = p1p2p3…pn-1pn • A pair of elements pi and pi + 1are adjacent if pi+1 = pi + 1 • For example: p= 1 9 3 4 7 8 2 6 5 • (3, 4) or (7, 8) and (6,5) are adjacent pairs

  43. Breakpoints: An Example There is a breakpoint between any adjacent element that are non-consecutive: p = 1 9 3 4 7 8 2 6 5 • Pairs (1,9), (9,3), (4,7), (8,2) and (2,5) form breakpoints of permutation p • b(p) - # breakpoints in permutation p

  44. Adjacency & Breakpoints • An adjacency – consecutive • A breakpoint – not consecutive Extend π with π0 = 0 and πn+1 = n+1 adjacencies π = 5 6 2 1 3 4 0 5 6 2 1 3 4 7 breakpoints

  45. Extending Permutations • Add p0=0 and pn + 1=n+1 at ends of p Example: p = 1 9 3 4 7 8 2 6 5 Extending with 0 and 10 • p = 01 9 3 4 7 8 2 6 510 Note: A new breakpoint was created after extending

  46. Reversal Distance and Breakpoints • Each reversal eliminates at most 2 breakpoints. p = 2 3 1 4 6 5 0 2 3 1 4 6 5 7b(p) = 5 01 3 24 6 5 7b(p) = 4 0 1 2 3 4 6 57b(p) = 2 0 1 2 3 4 5 6 7 b(p) = 0 This implies: reversal distance ≥ #breakpoints / 2

  47. Sorting By Reversals: A Better Greedy Algorithm BreakPointReversalSort(p) 1whileb(p) > 0 2Among all possible reversals, choose reversalrminimizing b(p•r) 3p p • r(i, j) 4outputp 5return Problem: this algorithm may work forever

  48. Strips • Strip: an interval between two consecutive breakpoints in a permutation • Decreasing strip: elements in decreasing order (e.g. 6 5) • Increasing strip: elements in increasing order (e.g. 7 8) 01 9 4 3 7 8 2 5 6 10 • Consider single-element stripsdecreasingexcept strips 0 and n+1 are increasing

  49. Reducing the Number of Breakpoints Theorem 1: If permutation pcontains at least one decreasing strip, then there exists a reversal r which decreases the number of breakpoints (i.e. b(p •r) < b(p) )

  50. Things To Consider • Forp = 1 4 6 5 7 8 3 2 01 4 6 5 7 8 3 2 9 b(p) = 5 • Choose decreasing strip with the smallest element k in p (k = 2 in this case)

More Related