
Genome Rearrangement

Genome Rearrangement, Part I. By Ghada Badr. Genome, chromosome, gene, gene order: the entire complement of genetic material carried by an individual is called the genome. Each genome contains one or more DNA molecules, one per chromosome.


Presentation Transcript


  1. Genome Rearrangement By Ghada Badr Part I

  2. Genome, chromosome, gene, gene order The entire complement of genetic material carried by an individual is called the genome. Each genome contains one or more DNA molecules, one per chromosome.

  3. Genome, chromosome, gene, gene order A gene is a segment of DNA sequence with a specific function

  4. Genome, chromosome, gene, gene order Genes can be ordered by their DNA sequence location. DNA consists of two complementary strands twisted around each other to form a right-handed double helix. A sign (+/-) is usually used to indicate on which strand a gene is located. [Figure: genes A, C, D, F on the 5' to 3' strand and genes B, E on the complementary 3' to 5' strand.] Gene order: A -B C D -E F

  5. Genome, chromosome, gene, gene order The DNA molecule (chromosome) may be circular or linear. [Figure: chromosomes with genes labeled A, B, C, D, E, F, H, I, J, K.]

  6. Genome Rearrangement • The genome is structurally specific to each species, and it changes only slowly over time. Therefore genome comparison among different species can provide us with much evidence about evolution. • Genome rearrangements are an important aspect of the evolution of species. Even when the gene content of two genomes is almost identical, gene order can be quite different. Genome 1: A -B C D -E F. Genome 2: B -E F -D A C.

  7. Genome Rearrangement Gene order analysis on a set of organisms is a powerful technique for genomic comparison and phylogenetic inference.

  8. Genome Rearrangement • General definition of the problem: Given a set of genomes and a set of possible evolutionary events (operations), find a shortest set of events transforming (sorting) those genomes into one another. The variety of the problem comes from what "genome" means and which events are allowed. Since these events are rare, scenarios that minimize their number are more likely to be close to reality. Many models have been proposed.

  9. Genome Models • Genes (or blocks of contiguous genes) are a good example of homologous markers, segments of genomes, that can be found in several species. • The simplest possible model is: • The order of genes in each genome is known, • All the genomes share the same set of genes, • All genomes contain a single copy of each gene, and • All genomes consist of a single chromosome.

  10. Genome Models • Genomes can be modeled by permutations: each gene is assigned a unique number and is found exactly once in the genome. • Signed permutation: each gene is assigned a + or - sign to indicate the strand it resides on. • Unsigned permutation: used if the corresponding strand is unknown.

  11. Permutations • Genes (markers) are represented by integers 1, 2, ..., n, with a +/- sign to indicate the strand they lie on. • The order and orientation of the genes of one genome in relation to the other is represented by a signed permutation $\pi$. • $\pi = (\pi_1, \pi_2, \ldots, \pi_{n-1}, \pi_n)$ has size n over {-n, ..., -1, 1, ..., n}, such that for each i from 1 to n, exactly one of i or -i appears in $\pi$.
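The definition on this slide can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function name `is_signed_permutation` and the encoding of the genes A..F as 1..6 are assumptions made for illustration.

```python
def is_signed_permutation(pi):
    """True if, for each i in 1..n, exactly one of i or -i appears in pi."""
    n = len(pi)
    # Dropping the signs must leave every gene 1..n exactly once.
    return sorted(abs(x) for x in pi) == list(range(1, n + 1))


# Gene order "A -B C D -E F" from slide 4, with A..F encoded as 1..6 (assumed encoding).
genome = (1, -2, 3, 4, -5, 6)
print(is_signed_permutation(genome))       # a valid signed permutation
print(is_signed_permutation((1, -1, 3)))   # invalid: gene 1 twice, gene 2 missing
```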

  12. Permutations Identity permutation: • The identity permutation is $\iota_n = (1, 2, 3, \ldots, n)$. • When multiple genomes with the same gene content are compared, one of them is chosen as a base (reference), i.e., represented as $\iota_n$, and identical genes in all other genomes are given the same integer values.

  13. Permutations Sorted/unsorted permutation: • Sorting a permutation $\pi$ means applying operations to $\pi$ in order to change it into $\iota_n$. • If $\pi_1 = \pi_2$, we say that $\pi_1$ is sorted with respect to $\pi_2$. • If $\pi_1 \neq \pi_2$, we say that $\pi_1$ is unsorted with respect to $\pi_2$.

  14. Permutations Example: Mitochondrial Genomes of 6 Arthropoda (Fruit Fly, Mosquito, Silkworm, Locust, Tick, Centipede)
      $\pi_1$ = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
      $\pi_2$ = (1, 2, 3, 4, 5, 6, 8, 7, 9, -10, 11, 12, 13, 14, 15, 16, 17)
      $\pi_3$ = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 13, 15, 16, 17)
      $\pi_4$ = (1, 2, 3, 5, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
      $\pi_5$ = (1, 3, 4, 5, 6, 7, 8, 9, 10, 11, -2, 12, 13, 14, 15, 16, 17)
      $\pi_6$ = (1, 3, 4, 5, 6, 7, 8, 9, 10, 11, -2, 12, 16, 13, 14, 15, 17)

  15. Permutations Example: Mitochondrial Genomes of 6 Arthropoda (Fruit Fly, Mosquito, Silkworm, Locust, Tick, Centipede). The same six permutations $\pi_1, \ldots, \pi_6$ as on slide 14.

  16. Permutations Linear and circular permutation: • $\pi$ is linear when it represents a linear chromosome, or circular when it represents a circular chromosome. • When $\pi = (\pi_1, \pi_2, \ldots, \pi_{n-1}, \pi_n)$ is circular, let $\pi' = (-\pi_n, -\pi_{n-1}, \ldots, -\pi_2, -\pi_1)$. All permutations obtained by shifts on $\pi$ or $\pi'$, where $\mathrm{shift}(\pi, i) = (\pi_{n-i+1}, \pi_{n-i+2}, \ldots, \pi_{n-1}, \pi_n, \pi_1, \ldots, \pi_{n-i})$, are equivalent. Example: (-3,2,1,-4) and (-1,-2,3,4) are equivalent.
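The equivalence of circular permutations can be tested by enumerating every shift of the permutation and of its reversed, sign-flipped copy. This is a small Python sketch under that reading of the slide; the helper names `reflect`, `shift` and `circular_equivalent` are illustrative and not taken from the presentation.

```python
def reflect(pi):
    """Reverse the order and flip every sign (the circle read from the other strand)."""
    return tuple(-x for x in reversed(pi))


def shift(pi, i):
    """Move the last i values of pi to the front."""
    pi = tuple(pi)
    i %= len(pi)
    return pi[len(pi) - i:] + pi[:len(pi) - i]


def circular_equivalent(a, b):
    """True if b is some shift of a or of reflect(a)."""
    a, b = tuple(a), tuple(b)
    if len(a) != len(b):
        return False
    return any(shift(c, i) == b for c in (a, reflect(a)) for i in range(len(a)))


# The example from the slide.
print(circular_equivalent((-3, 2, 1, -4), (-1, -2, 3, 4)))
```

For the slide's example, reflecting (-3,2,1,-4) gives (4,-1,-2,3), and shifting that by three positions gives (-1,-2,3,4).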

  17. Permutations Points in permutations • For a given permutation $\pi = (\pi_1, \pi_2, \ldots, \pi_{n-1}, \pi_n)$, there is a point between each pair of consecutive values $\pi_i$ and $\pi_{i+1}$ in $\pi$. • If $\pi$ is linear: there are two additional points, one before $\pi_1$ and one after $\pi_n$. • If $\pi$ is circular: there is one additional point between $\pi_n$ and $\pi_1$. • pts($\pi$) = n+1 if $\pi$ is linear, and pts($\pi$) = n if $\pi$ is circular.

  18. Permutations Linear extension of a permutation: • For a given $\pi = (\pi_1, \pi_2, \ldots, \pi_{n-1}, \pi_n)$: • If $\pi$ is linear: a linear extension of $\pi$ is $\pi' = (0, \pi_1, \pi_2, \ldots, \pi_{n-1}, \pi_n, n+1)$. • If $\pi$ is circular: a linear extension of $\pi$ is $\pi' = (0, \pi_1, \pi_2, \ldots, \pi_{n-1}, n)$.

  19. Permutations • Example: $\pi = (4,8,9,7,6,5,1,3,2)$, $\pi' = (0,4,8,9,7,6,5,1,3,2,10)$, or, with a dot marking each point, $\pi' = (0.4.8.9.7.6.5.1.3.2.10)$. Then pts($\pi$) = 10. • Now we want to compare our genomes.
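The linear case of the extension and the point count can be sketched in a few lines of Python. This is an illustrative sketch only; the names `linear_extension` and `pts` mirror the slide's notation but the code itself is not from the presentation.

```python
def linear_extension(pi):
    """Frame a linear permutation of size n with 0 in front and n + 1 behind."""
    return (0, *pi, len(pi) + 1)


def pts(pi, circular=False):
    """Number of points: n + 1 for a linear permutation, n for a circular one."""
    return len(pi) if circular else len(pi) + 1


pi = (4, 8, 9, 7, 6, 5, 1, 3, 2)
ext = linear_extension(pi)
print(ext)
# Each point of a linear permutation is a gap between two neighbours of its extension.
print(pts(pi), len(ext) - 1)
```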

  20. Permutations - similarity/distance Problem: Given two genomes, how do we measure their similarity and/or distance? A related problem: Given two permutations, how do we measure their similarity and/or distance?

  21. Permutations - similarity/distance • A distance measure should be a metric on the set of genomes. • A metric d on a set S ($d: S \times S \to \mathbb{R}$) satisfies the following three axioms: • Positivity: for all s, t in S, $d(s,t) \geq 0$, and $d(s,t) = 0$ iff s = t. • Symmetry: for all s, t in S, $d(s,t) = d(t,s)$. • Triangle inequality: for all s, t, u in S, $d(s,u) \leq d(s,t) + d(t,u)$.

  22. Permutations - similarity/distance • Many measures of similarity between permutations are used in computational biology. • The first measures used (which will be useful later on) are: • Breakpoints (introduced by Sankoff and Blanchette (1997)) • Common intervals

  23. Permutations-distance - Breakpoints • When we analyze $\pi$ with respect to $\sigma$, each point in $\pi$ is either an adjacency or a breakpoint. • A point (pair of consecutive values) $(\pi_i, \pi_{i+1})$ in $\pi$ is an adjacency between $\pi$ and $\sigma$ when either $(\pi_i, \pi_{i+1})$ or $(-\pi_{i+1}, -\pi_i)$ are consecutive in $\sigma$. • If $\pi$ is linear: there is an adjacency before $\pi_1$ if $\pi_1$ is also the first value in $\sigma$, and an adjacency after $\pi_n$ if $\pi_n$ is also the last value in $\sigma$. • If $\pi$ is circular: we assume that $\pi_n$ is also the last value in $\sigma$, and $(\pi_n, \pi_1)$ is an adjacency if $\pi_1$ is also the first value in $\sigma$.

  24. Permutations-distance - Breakpoints • Breakpoint distance counts the lost adjacencies between genomes. • The breakpoint distance between $\pi$ and $\sigma$ is brp($\pi, \sigma$) = pts($\pi$) - adj($\pi, \sigma$), where pts($\pi$) is the number of points in $\pi$ and adj($\pi, \sigma$) is the number of adjacencies. • If $\pi$ is sorted ($\pi = \sigma$): $\pi$ has only adjacencies and no breakpoints (brp($\pi, \sigma$) = 0). • If $\pi$ is unsorted ($\pi \neq \sigma$): $\pi$ has at least one breakpoint (brp($\pi, \sigma$) > 0).
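For two linear signed permutations, the breakpoint distance can be computed directly from the adjacency rule. The Python sketch below is one possible reading of the definition, not the presenter's code: the name `breakpoint_distance`, the second permutation being called `sigma`, and the encoding of genes A..F as 1..6 are all assumptions. Framing both permutations with 0 and n + 1 turns the "first value" and "last value" conditions into ordinary adjacencies.

```python
def breakpoint_distance(pi, sigma):
    """Points of pi minus adjacencies between pi and sigma (linear, signed)."""
    n = len(pi)
    ext_pi = (0, *pi, n + 1)
    ext_sigma = (0, *sigma, n + 1)
    consecutive = set(zip(ext_sigma, ext_sigma[1:]))
    # (a, b) is an adjacency if (a, b) or (-b, -a) is consecutive in sigma.
    adjacencies = sum(
        (a, b) in consecutive or (-b, -a) in consecutive
        for a, b in zip(ext_pi, ext_pi[1:])
    )
    return (n + 1) - adjacencies


# Genome 1 "A -B C D -E F" and Genome 2 "B -E F -D A C" from slide 6 (assumed encoding A..F = 1..6).
genome1 = (1, -2, 3, 4, -5, 6)
genome2 = (2, -5, 6, -4, 1, 3)
print(breakpoint_distance(genome2, genome1))
```

In this example only the pair (-E, F) survives in both genomes, so 6 of the 7 points are breakpoints.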

  25. Permutations-distance - Breakpoints • Back to our example: $\pi = (4,8,9,7,6,5,1,3,2)$, $\pi' = (0,4,8,9,7,6,5,1,3,2,10)$, or, with a dot marking each point, $\pi' = (0.4.8.9.7.6.5.1.3.2.10)$, so pts($\pi$) = 10. What are the adjacencies and brp($\pi$)? Comparing with the extended identity (0.1.2.3.4.5.6.7.8.9.10), the adjacencies are (8,9), (7,6), (6,5), (3,2), so adj($\pi$) = 4 and brp($\pi$) = pts($\pi$) - adj($\pi$) = 10 - 4 = 6.
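The worked example treats the permutation as unsigned: a point is an adjacency when its two neighbours differ by exactly 1 in the extended permutation. A short Python sketch under that assumption reproduces the count; the function name `unsigned_breakpoints` is illustrative.

```python
def unsigned_breakpoints(pi):
    """Breakpoints of an unsigned linear permutation relative to the identity."""
    ext = (0, *pi, len(pi) + 1)
    points = list(zip(ext, ext[1:]))
    # Unsigned adjacency: neighbours that are consecutive integers, in either order.
    adjacencies = [(a, b) for a, b in points if abs(a - b) == 1]
    return len(points) - len(adjacencies), adjacencies


brp, adj = unsigned_breakpoints((4, 8, 9, 7, 6, 5, 1, 3, 2))
print(adj)  # [(8, 9), (7, 6), (6, 5), (3, 2)]
print(brp)  # 6
```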

  26. Permutations-distance - Breakpoints • Breakpoint distance is based on the notion of conserved adjacencies and can be defined on a set of more than two genomes. • It is easy to compute. • It fails to capture more global relations between genomes. • The first generalization of adjacencies is the notion of common intervals.

  27. Permutations-distance - Common Intervals • Common intervals: subsets of genes that appear consecutively together in two or more genomes, where the genes are the same in each interval but may not be in the same order or orientation. Example (circular chromosomes), using the six permutations $\pi_1, \ldots, \pi_6$ of slide 14: • If we compare the first 4 species, they share 6 adjacencies: {1,2}, {2,3}, {11,12}, {15,16}, {16,17}, {17,1}. • If we compare all 6 species, they share only 1 adjacency: {17,1}.
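The adjacency counts in this example can be reproduced by intersecting the sets of signed circular adjacencies of the permutations. The following Python sketch assumes that a pair (a, b) and its reverse complement (-b, -a) denote the same adjacency; the names `ARTHROPODA`, `canonical`, `circular_adjacencies` and `shared_adjacencies` are illustrative.

```python
ARTHROPODA = [
    (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17),
    (1, 2, 3, 4, 5, 6, 8, 7, 9, -10, 11, 12, 13, 14, 15, 16, 17),
    (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 13, 15, 16, 17),
    (1, 2, 3, 5, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17),
    (1, 3, 4, 5, 6, 7, 8, 9, 10, 11, -2, 12, 13, 14, 15, 16, 17),
    (1, 3, 4, 5, 6, 7, 8, 9, 10, 11, -2, 12, 16, 13, 14, 15, 17),
]


def canonical(a, b):
    """(a, b) on one strand is (-b, -a) on the other; keep one fixed representative."""
    return max((a, b), (-b, -a))


def circular_adjacencies(pi):
    """All signed adjacencies of a circular permutation, including (last, first)."""
    pi = list(pi)
    return {canonical(a, b) for a, b in zip(pi, pi[1:] + pi[:1])}


def shared_adjacencies(perms):
    """Adjacencies present in every permutation of perms."""
    return set.intersection(*(circular_adjacencies(p) for p in perms))


print(sorted(shared_adjacencies(ARTHROPODA[:4])))
print(sorted(shared_adjacencies(ARTHROPODA)))
```

Note that signs matter here: because gene 10 is inverted in the second permutation, the pairs {9,10} and {10,11} are not counted as shared.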

  28. Permutations-distance - Common Intervals • Common intervals: subsets of genes that appear consecutively together in two or more genomes, where the genes are the same in each interval but may not be in the same order or orientation. Example (circular chromosomes), using the six permutations $\pi_1, \ldots, \pi_6$ of slide 14: the six permutations are very similar. The genes in the interval [1,12] are the same in all of them, as are the genes in the intervals [3,6], [6,9], [9,11], and [12,17].
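Whether a gene set is a common interval can be checked by looking at the positions its genes occupy in each permutation. The Python sketch below is a simplification that treats the permutations as linear (no wrap-around from the last gene to the first) and ignores signs and order; the names `ARTHROPODA` and `is_common_interval` are illustrative.

```python
ARTHROPODA = [
    (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17),
    (1, 2, 3, 4, 5, 6, 8, 7, 9, -10, 11, 12, 13, 14, 15, 16, 17),
    (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 13, 15, 16, 17),
    (1, 2, 3, 5, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17),
    (1, 3, 4, 5, 6, 7, 8, 9, 10, 11, -2, 12, 13, 14, 15, 16, 17),
    (1, 3, 4, 5, 6, 7, 8, 9, 10, 11, -2, 12, 16, 13, 14, 15, 17),
]


def is_common_interval(genes, perms):
    """True if the genes occupy consecutive positions in every permutation."""
    genes = set(genes)
    if not genes:
        return False
    for pi in perms:
        positions = [i for i, x in enumerate(pi) if abs(x) in genes]
        # All genes must be present and their positions must form one unbroken run.
        if len(positions) != len(genes):
            return False
        if positions[-1] - positions[0] + 1 != len(genes):
            return False
    return True


for lo, hi in [(1, 12), (3, 6), (6, 9), (9, 11), (12, 17)]:
    print((lo, hi), is_common_interval(range(lo, hi + 1), ARTHROPODA))
print((1, 2), is_common_interval({1, 2}, ARTHROPODA))
```

The pair {1,2} is included as a counterexample: genes 1 and 2 are separated in the last two permutations.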

  29. Permutations-distance - Common Intervals • We can use common intervals as a measure of similarity between species. Disadvantage: none of these measures reflects rearrangement operations or explains what happened to the genome over time.

  30. Rearrangement operations (events) Back to our original problem: Given a set of genomes and a set of possible evolutionary events (operations), find a shortest set of events transforming those genomes into one another. What are the rearrangement events (operations)? These events can be applied to a single gene or to a group of genes (an interval).

  31. Rearrangement operations Example: Mitochondrial Genomes of 6 Arthropoda (Fruit Fly, Mosquito, Silkworm, Locust, Tick, Centipede) [Figure: gene orders over genes 1 to 17.]

  32. Rearrangement Operations Rearrangement operations affect gene order and gene content. There are various types. In the case of a single-chromosome genome: • Inversions • Transpositions • Reverse transpositions • Gene duplications • Gene loss. In the case of multiple-chromosome genomes we add: • Translocations • Fusions • Fissions
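The three order-changing single-chromosome operations can be sketched on signed permutations as follows. This is an illustrative Python sketch, not the presenter's implementation: the function names and the index conventions (half-open segment `pi[i:j]`, target index `k` measured in the original permutation and lying outside the segment) are assumptions. The transcript does not say which events the later example slides depict; the calls below only show that single operations suffice to produce the permutations of slide 14 from one another.

```python
def inversion(pi, i, j):
    """Reverse the segment pi[i:j] and flip the sign of every gene in it."""
    pi = list(pi)
    return tuple(pi[:i] + [-x for x in reversed(pi[i:j])] + pi[j:])


def _reinsert(pi, i, j, k, segment):
    """Remove pi[i:j] and put `segment` in front of the gene at original index k."""
    rest = pi[:i] + pi[j:]
    at = k if k <= i else k - (j - i)
    return tuple(rest[:at] + segment + rest[at:])


def transposition(pi, i, j, k):
    """Move the segment pi[i:j] elsewhere, keeping its order and signs."""
    pi = list(pi)
    return _reinsert(pi, i, j, k, pi[i:j])


def reverse_transposition(pi, i, j, k):
    """Move the segment pi[i:j] elsewhere and invert it on the way."""
    pi = list(pi)
    return _reinsert(pi, i, j, k, [-x for x in reversed(pi[i:j])])


pi1 = tuple(range(1, 18))
pi3 = transposition(pi1, 12, 13, 14)          # gene 13 moves behind gene 14
pi5 = reverse_transposition(pi1, 1, 2, 11)    # gene 2 moves behind gene 11, inverted
pi6 = transposition(pi5, 15, 16, 12)          # gene 16 moves in front of gene 13
pi2 = transposition(inversion(pi1, 9, 10), 6, 7, 8)  # invert gene 10, then swap 7 and 8
print(pi2, pi3, pi5, pi6, sep="\n")
```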

  33. Rearrangement Operations - Single Chro. Inversion

  34. Rearrangement Operations - Single Chro. Inversion

  35. Rearrangement Operations - Single Chro. Inversion

  36. Rearrangement Operations - Single Chro. Example: Mitochondrial Genomes of 6 Arthropoda (Fruit Fly, Mosquito, Silkworm, Locust, Tick, Centipede). An inversion.

  37. Rearrangement Operations - Single Chro. Transposition

  38. Rearrangement Operations - Single Chro. Transposition

  39. Rearrangement Operations - Single Chro. Transposition

  40. Rearrangement Operations - Single Chro. Example: Mitochondrial Genomes of 6 Arthropoda (Fruit Fly, Mosquito, Silkworm, Locust, Tick, Centipede). A transposition.

  41. Rearrangement Operations - Single Chro. Reverse Transposition

  42. Rearrangement Operations - Single Chro. Reverse Transposition

  43. Rearrangement Operations - Single Chro. Reverse Transposition

  44. Rearrangement Operations - Single Chro. Example: Mitochondrial Genomes of 6 Arthropoda (Fruit Fly, Mosquito, Silkworm, Locust, Tick, Centipede). A reverse transposition.

  45. Rearrangement Operations - Multiple Chro. Translocation

  46. Rearrangement Operations - Multiple Chro. Translocation

  47. Rearrangement Operations - Multiple Chro. Translocation

  48. Rearrangement Operations - Multiple Chro. Translocation

  49. Rearrangement Operations - Multiple Chro. Translocation
