Introduction to Bioinformatics

Presentation Transcript

  1. Introduction to Bioinformatics • Alexandra M Schnoes • Univ. California San Francisco • Alexandra.Schnoes@ucsf.edu

  2. What is Bioinformatics? • Intersection of Biology and Computers • Broad field • Often means different things to different people • Personal Definition: • The utilization of computation for biological investigation and discovery—the process by which you unlock the biological world through the use of computers.

  3. What does one do in Bioinformatics? (a small sample) • Our Lab: Understanding Protein (Enzyme) Function

  4. What does one do in Bioinformatics? (a small sample) • Discover new drug targets: computational docking • Atreya, C. E. et al. J. Biol. Chem. 2003;278:14092-14100 • Shoichet, B. K. Nature. 2004;432:862-865

  5. What does one do in Bioinformatics? (a small sample) • Systems Biology • sbw.kgi.edu/ • www.sbi.uni-rostock.de/research.html

  6. This lab: Nucleotide & Protein Informatics • Sequence analysis • Finding similar sequences • Multiple sequence alignment • Phylogenetic analysis

  7. SequenceStructureFunction

  8. Process of Evolution • Sequences change due to • Mutation • Insertion • Deletion

  9. Use Evolutionary Principles to Analyze Sequences • If sequences A and B are similar • A and B are likely evolutionarily related • If sequences A, B and C are all similar, but A and B are more similar to each other than either is to C • A and B are more closely related to each other than either is to C
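The comparison on slide 9 can be sketched with the simplest possible similarity measure, percent identity. This is only an illustration: the three sequences and the helper names `percent_identity` and `closest_pair` are invented for the example, and real analyses score similarity with substitution matrices rather than raw identity.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Fraction of identical positions in two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closest_pair(seqs: dict) -> tuple:
    """Return the names of the two most similar sequences."""
    return max(
        combinations(sorted(seqs), 2),
        key=lambda pair: percent_identity(seqs[pair[0]], seqs[pair[1]]),
    )


# Hypothetical toy sequences, made up for this sketch.
sequences = {
    "A": "TASSWSYIVE",
    "B": "TATSWSYIVE",
    "C": "TATSFSYLVG",
}

for first, second in combinations(sorted(sequences), 2):
    identity = percent_identity(sequences[first], sequences[second])
    print(first, second, f"{identity:.0%}")

print("Most closely related:", closest_pair(sequences))
```

Here A and B share more identical positions with each other than either does with C, so they would be inferred to be the more closely related pair.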

  10. Extremely Powerful Idea • Start with unknown sequence • Find what the unknown is similar to • Use information about the known to make predictions about the unknown

  11. How do you know when sequences are similar? • Align two sequences together and score their similarity • TASSWSYIVE / TATSFSYLVG • Use substitution matrices to score the alignment

  12. Substitution Matrices Give a Score for Each Mutation • Many different matrices available • BLOSUM matrices are standard in the field • Figure: BLOSUM62 scoring matrix (http://www.carverlab.org/images/)

  13. Scoring: Add up the Positional Scores • Score of 30: TASSWSYIVE / TATSFSYLVG • Score of 1: the same two sequences, aligned differently
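The score of 30 can be reproduced by summing BLOSUM62 values position by position. The sketch below hard-codes only the BLOSUM62 entries needed for this one ungapped alignment; the names `BLOSUM62_SUBSET`, `pair_score` and `alignment_score` are chosen for illustration and do not come from the slides.

```python
# Only the BLOSUM62 entries needed for this example, keyed by the
# alphabetically sorted residue pair. A full matrix has 20 x 20 entries.
BLOSUM62_SUBSET = {
    ("T", "T"): 5,
    ("A", "A"): 4,
    ("S", "T"): 1,
    ("S", "S"): 4,
    ("F", "W"): 1,
    ("Y", "Y"): 7,
    ("I", "L"): 2,
    ("V", "V"): 4,
    ("E", "G"): -2,
}


def pair_score(x: str, y: str) -> int:
    """Look up the substitution score for one aligned residue pair."""
    return BLOSUM62_SUBSET[tuple(sorted((x, y)))]


def alignment_score(top: str, bottom: str) -> int:
    """Sum the positional scores of an ungapped alignment."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    return sum(pair_score(x, y) for x, y in zip(top, bottom))


print(alignment_score("TASSWSYIVE", "TATSFSYLVG"))  # 30
```

Identical residues score highest (Y/Y is worth 7), conservative substitutions such as I/L still score positively, and the unfavourable E/G pair subtracts from the total.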

  14. Additional issues… • Gaps (insertions/deletions) • Have scoring penalties for opening and continuing a gap • Without a gap: TASSWSYIVE / TATSFLVG • With a gap: TASSWSYIVE / TATSF--LVG
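A minimal sketch of how opening and continuing a gap can be penalised, applied to the gapped alignment on slide 14. The penalty values (11 to open a gap, 1 per gapped position) are assumptions picked for illustration, not values taken from the slides, and `gapped_score` is a hypothetical helper name. Only the BLOSUM62 entries needed here are included.

```python
# BLOSUM62 entries for the residue pairs in this example only.
SCORES = {
    ("T", "T"): 5, ("A", "A"): 4, ("S", "T"): 1, ("S", "S"): 4,
    ("F", "W"): 1, ("I", "L"): 2, ("V", "V"): 4, ("E", "G"): -2,
}

GAP_OPEN = 11    # assumed penalty, charged once when a gap starts
GAP_EXTEND = 1   # assumed penalty, charged for every gapped position


def gapped_score(top: str, bottom: str) -> int:
    """Score an alignment that may contain '-' gap characters."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    in_gap = False
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            if not in_gap:
                score -= GAP_OPEN   # first position of a new gap
            score -= GAP_EXTEND     # every gapped position, including the first
            in_gap = True
        else:
            score += SCORES[tuple(sorted((x, y)))]
            in_gap = False
    return score


print(gapped_score("TASSWSYIVE", "TATSF--LVG"))  # 6
```

Under these assumed penalties, the eight aligned residue pairs contribute 19 and the two-position gap costs 13. Charging more to open a gap than to extend it favours one long gap over several short ones.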

  15. How do we find similar sequences? • Start at the National Center for Biotechnology Information • http://www.ncbi.nlm.nih.gov/

  16. How do we find similar sequences? • Nucleotide Sequence Databases

  17. How do we find similar sequences? • Protein Sequence Databases

  18. How do we find similar sequences? • Similarity Search: BLAST • Basic Local Alignment Search Tool
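To illustrate what "local alignment" means, here is a sketch of the Smith-Waterman algorithm, the exact dynamic-programming method that BLAST approximates with faster heuristics. This is not how BLAST itself is implemented. The scoring values (match +2, mismatch -1, gap -2) and the example sequences are assumptions made for the illustration.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (best score, aligned part of a, aligned part of b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below 0, which lets an
    # alignment start anywhere in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a 0 is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


# Hypothetical sequences that share only a short internal region.
score, top, bottom = smith_waterman("AAAAWSYIVEKKKK", "GGWSYIVEPP")
print(score, top, bottom)
```

Only the shared region is reported; the unrelated flanking residues are ignored.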

  19. BLAST is very quick but … • Only local alignments • Alignments aren’t great • Only pair-wise alignments

  20. Want better alignments … • Multiple alignment • Multiple sequences • Better signal to noise • More Sequences = Better alignment • More accurate reflection of evolution • ClustalW • Commonly used • Easy to use

  21. Visualize the Multiple Alignment

  22. Use the Alignment to Calculate Evolutionary Distances • See ‘how close’ sequences are to each other • Best way to tell what is ‘most similar’ • Can calculate a simple tree from ClustalW • Taubenberger et al., Nature 437, 889-893, 2005
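As a rough sketch of turning an alignment into distances and a simple tree, the example below computes p-distances (the fraction of differing positions) and clusters them with UPGMA. UPGMA is used here only because it is short to write; it is a stand-in and not necessarily the method ClustalW applies. The aligned sequences and the names `seqA`, `seqB` and `seqC` are invented for illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions, skipping columns where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(aligned: dict) -> str:
    """Cluster sequences by average linkage and return the tree topology in Newick format."""
    sizes = {name: 1 for name in aligned}  # cluster label -> number of leaves
    dist = {
        frozenset(pair): p_distance(aligned[pair[0]], aligned[pair[1]])
        for pair in combinations(aligned, 2)
    }
    while len(sizes) > 1:
        # Merge the two closest clusters.
        x, y = min(combinations(sorted(sizes), 2), key=lambda p: dist[frozenset(p)])
        merged = f"({x},{y})"
        nx, ny = sizes.pop(x), sizes.pop(y)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for other in sizes:
            dist[frozenset((merged, other))] = (
                nx * dist[frozenset((x, other))] + ny * dist[frozenset((y, other))]
            ) / (nx + ny)
        sizes[merged] = nx + ny
    return next(iter(sizes)) + ";"


# Hypothetical multiple alignment, made up for this sketch.
alignment = {
    "seqA": "TASSWSYIVE",
    "seqB": "TATSWSYIVE",
    "seqC": "TATSF-YLVG",
}

print(upgma(alignment))  # ((seqA,seqB),seqC);
```

The two sequences with the smallest distance are joined first, so they end up as nearest neighbours in the tree.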

  23. Caveats! • In reality • Sequences (even parts of sequences) can evolve at different rates • We don’t have a good understanding of the relationship between sequence and function • High sequence identity does not always mean the same function • Getting good alignments and good trees can be very hard

  24. Bioinformatics: Sequence Analysis • Start with unknown sequence • Find similar sequences • Create alignment • Create phylogenetic tree • Use information about knowns to make predictions about unknown

  25. Mini Virus Intro • Often considered ‘not alive’ • Extremely small (much smaller than a cell) • Cellular parasites • Have a genome but can only reproduce inside a host cell

  26. Different Viruses • RNA & DNA viruses • Both single and double-stranded

  27. Different Viruses • RNA & DNA viruses • Both single and double-stranded • Influenza Virus

  28. Influenza Virus (flu) • Small genome: 8 RNA molecules • Evolves quickly: antigenic drift, antigenic shift

  29. Influenza Virus (flu) • Sequencing • Genomic RNA → (Reverse Transcriptase) → DNA → (Sequencing) → Nucleotide Sequence
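The reverse-transcription step can be pictured as base-pairing each RNA nucleotide with its DNA complement. The sketch below is a simplified model of that idea, not of the laboratory protocol; the function name `rna_to_cdna` and the six-base input are hypothetical, and the input is not a real influenza sequence.

```python
# Base-pairing rules used when DNA is synthesised from an RNA template.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def rna_to_cdna(rna: str) -> str:
    """Return the complementary DNA strand of an RNA template, written 5' to 3'."""
    try:
        complement = [RNA_TO_DNA_COMPLEMENT[base] for base in rna.upper()]
    except KeyError as err:
        raise ValueError(f"unexpected RNA base: {err.args[0]!r}") from None
    # The new strand is antiparallel to the template, so reverse it
    # to report it in the conventional 5'-to-3' direction.
    return "".join(reversed(complement))


print(rna_to_cdna("AUGGCU"))  # AGCCAT
```

The resulting DNA is what the sequencing step then reads to obtain the nucleotide sequence.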

  30. Influenza Pandemics • 1918 Flu • Killed an estimated 50-100 million people worldwide • Considered one of the deadliest pandemics • Killed many of the young and healthy • Influenza A, subtype H1N1 • Thought to have derived from avian influenza • Recently reconstructed from recovered human samples • Considerable ethical debate

  31. Avian Influenza • Current fear of pandemic • High mortality rate (including young and healthy) • Current concern is Influenza A, subtype H5N1 • Still transmitted to humans only by contact with birds • Now present in Asia and in Eastern and Western Europe

  32. This lab: Nucleotide & Protein Informatics • Sequence analysis • Finding similar sequences • Multiple sequence alignment • Phylogenetic analysis