
Lenka Kovářová, Supervisor: Milena Kovářová

28. 07. 2011. Informatics view of determining the relationship between organisms. Lenka Kovářová. Supervisor: Milena Kovářová. Imagine that you have found a new, so far unknown organism, and you want to know which species it is related to. What to do now?


Presentation Transcript


  1. 28. 07. 2011. Informatics view of determining the relationship between organisms. Lenka Kovářová. Supervisor: Milena Kovářová.

  2. Imagine that you have found a new, so far unknown organism, and you want to know which known species it is related to. What to do now? Ask the unknown organism: it would probably not answer. Find signs it shares with known organisms: that is not proof of their relationship. Find genomic similarity.

  3. How to compare genomes. Compare the whole DNA code: that is a large amount of data. Compare a selected chromosome: related species can have the same gene at different locations on different chromosomes. Compare one specific gene: you do not know where to find the selected gene in the genome of a so far unknown organism. Almost all eukaryotic organisms have mitochondrial DNA.

  4. Mitochondrion. Semi-autonomous organelle. Found in most eukaryotic cells. Described as cellular power plants. Has its own independent genome. Believed to be originally derived from endosymbiotic prokaryotes.

  5. Mitochondrial DNA. Mostly a circular DNA molecule.

  6. Mitochondrial inheritance. Mitochondria are normally inherited exclusively from the mother. Mitochondrial DNA is not highly conserved and has a rapid mutation rate. It is useful for studying the evolutionary relationships of organisms. Model of human migration based on mitochondrial DNA.

  7. Project. Try to determine the relationship between organisms based on mitochondrial DNA. Download the mitochondrial DNA of as many different organisms as possible. Write a program with a suitable algorithm to determine the similarity of DNA codes. Analyse the results.

  8. Taxonomy

  9. Vertebrata taxonomy

  10. Organisms. Streptophyta: Arabidopsis thaliana, Physcomitrella patens.

  11. Organisms. Fungi: Aspergillus niger, Ashbya gossypii, Saccharomyces cerevisiae.

  12. Organisms. Insecta: Apis mellifera, Acyrthosiphon pisum, Tribolium castaneum.

  13. Organisms. Deuterostomia: Ciona intestinalis, Saccoglossus kowalevskii, Strongylocentrotus purpuratus.

  14. Organisms. Vertebrata: Anolis carolinensis, Xenopus (Silurana) tropicalis, Danio rerio.

  15. Organisms. Aves: Gallus gallus, Meleagris gallopavo, Taeniopygia guttata.

  16. Organisms. Mammalia: Ornithorhynchus anatinus, Monodelphis domestica.

  17. Organisms. Carnivora: Ailuropoda melanoleuca, Canis familiaris.

  18. Organisms. Cetartiodactyla: Sus scrofa, Bos taurus. Perissodactyla: Equus caballus.

  19. Organisms. Glires: Oryctolagus cuniculus, Mus musculus, Rattus norvegicus.

  20. Organisms. Primates: Macaca mulatta, Pongo abelii, Pan troglodytes, Homo sapiens.

  21. Methodology. Get the mitochondrial DNA in FASTA format. Compare the similarity of the genomes. Find related organisms.
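
The first step, reading a sequence in FASTA format, can be sketched as follows. This is a minimal illustration, not the presentation's program: it assumes the usual FASTA layout (a header line starting with `>` followed by sequence lines), and the function name `parse_fasta` is hypothetical.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into a {header: sequence} dictionary.

    Assumes each record starts with a '>' header line and that the
    following lines hold the sequence until the next header.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines are joined later; case is normalised here.
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


if __name__ == "__main__":
    sample = ">seq1 example\nACGT\nacgt\n\n>seq2\nTTGA\n"
    for name, seq in parse_fasta(sample).items():
        print(name, len(seq))
```

Each sequence comes back as one string, which can be encoded to bytes before compression.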

  22. Methodology. Compress one DNA code alone and compress two DNA codes together. Compare the lengths of the compressed files. Compute the coefficient of similarity. In the formula, Compress(x) is the output of the compression algorithm applied to file x, |x| is the length of file x, and DNA1+DNA2 is the concatenation of DNA1 and DNA2 in this order.
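
The coefficient formula itself is not preserved in this transcript, so the sketch below substitutes the normalized compression distance, a commonly used compression-based measure. That choice is an assumption and may differ from the coefficient used in the presentation. Python's `zlib` (a Deflate implementation) stands in for the compressor, and the names `compressed_size`, `ncd` and `rank_relatives` are illustrative only.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """|Compress(x)|: length of the Deflate-compressed data."""
    return len(zlib.compress(data, 9))


def ncd(dna1: bytes, dna2: bytes) -> float:
    """Normalized compression distance (assumed stand-in coefficient).

    Values near 0 suggest very similar inputs; values near 1 suggest
    that the inputs share little compressible structure.
    """
    c1 = compressed_size(dna1)
    c2 = compressed_size(dna2)
    c12 = compressed_size(dna1 + dna2)  # DNA1+DNA2, concatenated in this order
    return (c12 - min(c1, c2)) / max(c1, c2)


def rank_relatives(unknown: bytes, known: dict) -> list:
    """Sort known organisms from most to least similar to the unknown one."""
    scores = [(ncd(unknown, seq), name) for name, seq in known.items()]
    return [name for _, name in sorted(scores)]
```

One caveat with this approach: Deflate back-references reach at most 32,768 bytes back, so for sequences longer than that window the second sequence cannot be matched against the start of the first.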

  23. Compression algorithm. DeflateStream. Combination of the LZ77 algorithm and Huffman coding. Compression is achieved in two steps: matching duplicate strings and replacing them with pointers, then replacing symbols with new, weighted symbols based on frequency of use.
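
A small sketch of Deflate in use, assuming Python's `zlib` module as a stand-in for whichever Deflate implementation the project used. The helper names `deflate` and `inflate` are hypothetical. A negative `wbits` value asks `zlib` for a raw Deflate stream without the zlib header and checksum.

```python
import zlib


def deflate(data: bytes, level: int = 9) -> bytes:
    """Compress data into a raw Deflate stream."""
    # wbits=-15: raw Deflate, 32 KiB window, no zlib header or checksum.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def inflate(stream: bytes) -> bytes:
    """Decompress a raw Deflate stream."""
    return zlib.decompress(stream, -15)


if __name__ == "__main__":
    repetitive = b"ACGT" * 500
    packed = deflate(repetitive)
    # Duplicate strings are replaced with back-references, so highly
    # repetitive input shrinks to a small fraction of its size.
    print(len(repetitive), len(packed))
    assert inflate(packed) == repetitive
```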

  24. Compression algorithm. DeflateStream. Lossless data compression algorithm. Combination of the LZ77 algorithm and Huffman coding. The stream is a series of blocks, each preceded by a 3-bit header. 1 bit, last-block-in-stream marker: 1 means this is the last block in the stream; 0 means there are more blocks to process after this one. 2 bits, encoding method used for this block: 00 is a raw section, between 0 and 65,535 bytes in length; 01 is a static Huffman compressed block, using a pre-agreed Huffman tree; 10 is a compressed block complete with the Huffman table supplied; 11 is reserved and must not be used.
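
The 3-bit block header can be inspected directly. The sketch below, with hypothetical helper names, produces a raw Deflate stream with Python's `zlib` and decodes the header of the first block. It assumes the bit order of the Deflate specification, in which header bits are packed starting from the least significant bit of the first byte.

```python
import zlib

BLOCK_TYPES = {
    0: "raw (stored) section",
    1: "static Huffman block",
    2: "block with supplied Huffman table",
    3: "reserved",
}


def raw_deflate(data: bytes, level: int = 9) -> bytes:
    """Compress into a raw Deflate stream (no zlib wrapper)."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def first_block_header(stream: bytes) -> tuple:
    """Return (is_last_block, block_type) for the first block."""
    first_byte = stream[0]
    is_last = bool(first_byte & 0b1)         # bit 0: last-block marker
    block_type = (first_byte >> 1) & 0b11    # bits 1-2: encoding method
    return is_last, block_type


if __name__ == "__main__":
    for level in (0, 9):
        is_last, block_type = first_block_header(raw_deflate(b"ACGT" * 100, level))
        print(level, is_last, BLOCK_TYPES[block_type])
```

Compression level 0 makes `zlib` emit stored blocks, which is a convenient way to see the 00 block type.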

  25. LZ77 algorithm. Duplicate string elimination within compressed blocks. If a duplicate series of bytes (a repeated string) is spotted, a back-reference to the previous location of that identical string is inserted instead. An encoded match to an earlier string consists of a length (3–258 bytes) and a distance (1–32,768 bytes). Relative back-references can be made across any number of blocks, as long as the distance stays within the last 32,768 bytes.
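
Duplicate string elimination can be illustrated with a deliberately naive tokenizer. This is a sketch under simplifying assumptions: it searches the window by brute force and always takes the longest match, whereas real Deflate encoders use hash chains and other heuristics. The names `lz77_tokens` and `lz77_decode` are hypothetical. Literals are emitted as integers and matches as `(length, distance)` tuples, using the limits from the slide.

```python
def lz77_tokens(data: bytes, window: int = 32768,
                min_len: int = 3, max_len: int = 258) -> list:
    """Greedy LZ77 tokenization: literals (int) or (length, distance)."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search over every earlier start inside the window.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_len and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_len:
            tokens.append((best_len, best_dist))  # back-reference
            i += best_len
        else:
            tokens.append(data[i])                # literal byte
            i += 1
    return tokens


def lz77_decode(tokens: list) -> bytes:
    """Rebuild the original bytes from literals and back-references."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            length, distance = token
            for _ in range(length):
                out.append(out[-distance])  # copy byte by byte; overlap is allowed
        else:
            out.append(token)
    return bytes(out)


if __name__ == "__main__":
    print(lz77_tokens(b"ACGTACGT"))  # four literals, then one match
```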

  26. Huffman coding. Replaces commonly used symbols with shorter representations and less commonly used symbols with longer representations. Unprefixed tree of non-overlapping intervals (no code is a prefix of another). The more likely a symbol is to be encoded, the shorter its bit sequence. A tree is created which contains space for 288 symbols: 0–255 represent the literal bytes/symbols 0–255; 256 is end of block; 257–285, combined with extra bits, give a match length of 3–258 bytes; 286 and 287 are not used.
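
How frequency translates into code length can be shown with a textbook Huffman construction. This is a simplified sketch, not Deflate's exact procedure: Deflate additionally uses canonical codes and limits code lengths to 15 bits, which is not modelled here. The function name `huffman_code_lengths` is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data: bytes) -> dict:
    """Return {symbol: code length in bits} for a plain Huffman code."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): 1}  # a lone symbol still needs one bit
    # Heap entries: (count, tie-breaker, symbols in this subtree).
    heap = [(count, idx, [sym]) for idx, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    next_id = len(heap)
    while len(heap) > 1:
        count1, _, syms1 = heapq.heappop(heap)
        count2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees puts every symbol in them one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (count1 + count2, next_id, syms1 + syms2))
        next_id += 1
    return lengths


if __name__ == "__main__":
    sample = b"A" * 8 + b"C" * 4 + b"G" * 2 + b"T" * 2
    for sym, bits in sorted(huffman_code_lengths(sample).items()):
        print(chr(sym), bits)
```

In the sample, the most frequent base receives a 1-bit code and the two rarest bases receive 3-bit codes.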

  27. Huffman coding. A match length code is always followed by a distance code. Based on the distance code read, further "extra" bits may be read to produce the final distance. The distance tree contains space for 32 symbols: 0–3 are distances 1–4; 4–5 are distances 5–8, 1 extra bit; 6–7 are distances 9–16, 2 extra bits; 8–9 are distances 17–32, 3 extra bits; ...; 26–27 are distances 8,193–16,384, 12 extra bits; 28–29 are distances 16,385–32,768, 13 extra bits; 30–31 are not used.
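
The distance table follows a regular pattern, so the rows elided by "..." can be computed rather than listed. The sketch below derives, for each distance code, the range of distances it covers and its number of extra bits. The closed-form expression is inferred from the table above, and the function name `distance_code_range` is hypothetical.

```python
def distance_code_range(code: int) -> tuple:
    """Return (lowest distance, highest distance, extra bits) for a code."""
    if not 0 <= code <= 29:
        raise ValueError("distance codes 30 and 31 are not used")
    if code < 4:
        # Codes 0-3 map directly to distances 1-4 with no extra bits.
        return code + 1, code + 1, 0
    extra_bits = code // 2 - 1
    # Each pair of codes splits a power-of-two interval into two halves.
    low = 1 + ((2 + code % 2) << extra_bits)
    high = low + (1 << extra_bits) - 1
    return low, high, extra_bits


if __name__ == "__main__":
    for code in range(30):
        low, high, extra = distance_code_range(code)
        print(f"code {code:2d}: distances {low}-{high}, {extra} extra bits")
```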

  28. Number of bases in the mitochondrial DNA genome

  29. Results

  30. Results. Homininae relationship

  31. Results. Aves relationship

  32. Results. Insecta relationship

  33. Results. Mammalia relationship

  34. Results. Imagine this is your unknown animal

  35. Conclusion. Many mitochondrial DNA codes were downloaded. The taxonomic relationship between them was found. Suitable algorithms for determining the relationship between organisms were researched. The demonstrated algorithm was chosen. An algorithm was implemented. The ability to determine the relationship between organisms using this algorithm was proven.

  36. Results. Thank you for your attention.
