
Genomics for Librarians

Genomics for Librarians. Stuart M. Brown, Ph.D. Director, Research Computing, NYU School of Medicine. A Genome Revolution in Biology and Medicine. We are in the midst of a "Golden Era" of biology


Presentation Transcript


  1. Genomics for Librarians Stuart M. Brown, Ph.D. Director, Research Computing, NYU School of Medicine

  2. A Genome Revolution in Biology and Medicine • We are in the midst of a "Golden Era" of biology • The Human Genome Project has produced a huge storehouse of data that will be used to change every aspect of biological research and medicine • The revolution is about treating biology as an information science, not about specific biochemical technologies.

  3. The Human Genome Project

  4. The job of the biologist is changing • As more biological information becomes available and laboratory equipment becomes more automated ... • The biologist will spend more time on computers, experimental design, and data analysis (and less time doing tedious lab biochemistry) • Biology will become a more quantitative science (think how the periodic table affected chemistry)

  5. A review of some basic genetics

  6. DNA • 4 bases (G, C, T, A) • base pairs G--C T--A • genes • non-coding regions
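The base-pairing rule on this slide (G pairs with C, T pairs with A) can be illustrated with a minimal Python sketch. The function name `reverse_complement` and the example sequence are illustrative choices, not part of the original talk; the sketch assumes input made up only of the four bases listed above.

```python
# Watson-Crick pairing from the slide: G--C and T--A.
COMPLEMENT = {"G": "C", "C": "G", "T": "A", "A": "T"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite DNA strand, read 5' to 3'.

    Each base is swapped for its pairing partner, and the result is
    reversed because the two strands run in opposite directions.
    Raises KeyError for any character other than G, C, T, or A.
    """
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    print(reverse_complement("GATTACA"))
```

Because each base has exactly one partner, knowing one strand is enough to reconstruct the other, which is why sequence databases store only one strand.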

  7. Decoding Genes

  8. What is Bioinformatics? • The use of information technology to collect, analyze, and interpret biological data. • An ad hoc collection of computing tools that are used by molecular biologists to manage research data. • Computational algorithms • Database schema • Statistical methods • Data visualization tools

  9. Genomics • What is Genomics? • An operational definition: • The application of high-throughput automated technologies to molecular biology. • A philosophical definition: • A holistic or systems approach to the study of information flow within a cell.

  10. Genomics makes LOTS of data! • Investigators need complex databases just to manage their own experiments • Biologists need to know how to do data mining to answer even simple questions in these huge data sets • Librarians understand the challenges of storing and searching large amounts of data

  11. New Biology => New Librarians? How do Genomics and Bioinformatics overlap or interact with Library Science? • The NCBI (Natl. Center for Biotechnology Information), the home of GenBank, is part of the National Library of Medicine • We store and organize genes like journal articles: accession number, annotation, etc. • A big part of bioinformatics involves keyword searches and SQL queries in relational databases
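To make the catalog analogy concrete, here is a small sketch of a keyword search over sequence records keyed by accession number, using Python's built-in `sqlite3` module. The table name `sequence_record`, its columns, and the second row are hypothetical and invented for illustration; this is not GenBank's actual schema. The first row reuses the accession and description that appear in the BLAST slide of this talk.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical record table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequence_record ("
        " accession TEXT PRIMARY KEY,"   # unique ID, like a call number
        " organism TEXT,"
        " annotation TEXT)"              # free-text description
    )
    conn.executemany(
        "INSERT INTO sequence_record VALUES (?, ?, ?)",
        [
            # Accession and description taken from the BLAST slide.
            ("BE588357.1", "Bos taurus", "194087 BARC 5BOV Bos taurus cDNA 5'"),
            # Made-up placeholder row, not a real database entry.
            ("DEMO000001.1", "Homo sapiens", "placeholder record for illustration"),
        ],
    )
    return conn


def keyword_search(conn: sqlite3.Connection, keyword: str) -> list:
    """Return accession numbers whose annotation contains the keyword."""
    rows = conn.execute(
        "SELECT accession FROM sequence_record"
        " WHERE annotation LIKE ? ORDER BY accession",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(keyword_search(build_demo_db(), "cDNA"))
```

The accession number plays the role of a primary key, much as a call number does in a library catalog, while the annotation field is what a keyword search actually scans.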

  12. Bioinformatics is Not Library Science • We are NOT cataloging a set of known information • Programming and complex algorithms - pattern matching, string matching, biostatistics • Data mining and multi-dimensional visualization tools • Uncertainty of the data and constant revision of the “known” • Genes are guesses based on complex algorithms, not books on the shelf

  13. Raw Genome Data:

  14. BLAST Similarity Search

      >gb|BE588357.1|BE588357 194087 BARC 5BOV Bos taurus cDNA 5'.
      Length = 369

      Score = 272 bits (137), Expect = 4e-71
      Identities = 258/297 (86%), Gaps = 1/297 (0%)
      Strand = Plus / Plus

      Query: 17  aggatccaacgtcgctccagctgctcttgacgactccacagataccccgaagccatggca 76
                 |||||||||||||||| | ||| | ||| || ||| | ||||  ||||| |||||||||
      Sbjct: 1   aggatccaacgtcgctgcggctacccttaaccact-cgcagaccccccgcagccatggcc 59

      Query: 77  agcaagggcttgcaggacctgaagcaacaggtggaggggaccgcccaggaagccgtgtca 136
                 |||||||||||||||||||||||| | || ||||||||| | ||||||||||| ||| ||
      Sbjct: 60  agcaagggcttgcaggacctgaagaagcaagtggagggggcggcccaggaagcggtgaca 119

      Query: 137 gcggccggagcggcagctcagcaagtggtggaccaggccacagaggcggggcagaaagcc 196
                  |||||||| | || | ||||||||||||||| ||||||||||| || ||||||||||||
      Sbjct: 120 tcggccggaacagcggttcagcaagtggtggatcaggccacagaagcagggcagaaagcc 179

      Query: 197 atggaccagctggccaagaccacccaggaaaccatcgacaagactgctaaccaggcctct 256
                 ||||||||| | |||||||| |||||||||||||||||| ||||||||||||||||||||
      Sbjct: 180 atggaccaggttgccaagactacccaggaaaccatcgaccagactgctaaccaggcctct 239

      Query: 257 gacaccttctctgggattgggaaaaaattcggcctcctgaaatgacagcagggagac 313
                 || || ||||| ||  ||||||||||| | |||||||||||||||||| ||||||||
      Sbjct: 240 gagactttctcgggttttgggaaaaaacttggcctcctgaaatgacagaagggagac 296
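The "Identities" and "Gaps" figures in this BLAST report are simple column counts over the aligned sequences. The sketch below recomputes them for the first row of the alignment (Query 17-76 against Sbjct 1-59). The function names `alignment_stats` and `match_line` are illustrative and not part of BLAST; the two sequences are copied from the slide.

```python
# First aligned row from the slide; "-" marks a gap in the subject.
QUERY_ROW = "aggatccaacgtcgctccagctgctcttgacgactccacagataccccgaagccatggca"
SBJCT_ROW = "aggatccaacgtcgctgcggctacccttaaccact-cgcagaccccccgcagccatggcc"


def alignment_stats(query: str, subject: str) -> tuple:
    """Return (identities, columns, gaps) for two equal-length aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    pairs = list(zip(query, subject))
    # A column is an identity when both rows carry the same base.
    identities = sum(q == s and q != "-" for q, s in pairs)
    # A column is a gap when either row carries a dash.
    gaps = sum(q == "-" or s == "-" for q, s in pairs)
    return identities, len(pairs), gaps


def match_line(query: str, subject: str) -> str:
    """Rebuild the row of '|' marks that BLAST prints between the sequences."""
    return "".join(
        "|" if q == s and q != "-" else " " for q, s in zip(query, subject)
    )


if __name__ == "__main__":
    identities, columns, gaps = alignment_stats(QUERY_ROW, SBJCT_ROW)
    print(f"{identities}/{columns} identical columns, {gaps} gap")
    print(QUERY_ROW)
    print(match_line(QUERY_ROW, SBJCT_ROW))
    print(SBJCT_ROW)
```

Summing the identical columns over all five rows gives the 258 reported in the header, and 258 of 297 columns is 86% when the percentage is truncated to a whole number, which agrees with the figure shown on the slide.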

  15. Multiple Alignment

  16. Protein domains (Pattern analysis)

  17. Clustering (Phylogenetics)

  18. UCSC

  19. The Challenge of New Data Types (Genomics) • Gene expression microarrays • thousands of genes, imprecise measurements • huge images, private file formats • Proteomics • high-throughput Mass Spec • protein chips: protein-protein interactions • Genotyping • thousands of alleles, thousands of individuals • Regulatory Networks

  20. Biological Information

  21. Microarray Technology

  22. Spot your own Chip (plans available for free from Pat Brown’s website) • Robot spotter • Ordinary glass microscope slide

  23. cDNA spotted microarrays

  24. Goal of Microarray experiments • Microarrays are a very good way of identifying a bunch of genes involved in a disease process • Differences between cancer and normal tissue • Tuberculosis infected vs resistant lung cells • Mapping out a pathway • Co-regulated genes • Finding function for unknown genes • Involved in these processes

  25. Proteomics • Identify all of the proteins in an organism • Potentially many more than genes due to alternative splicing and post-translational modifications • Quantitate in different cell types and in response to metabolic/environmental factors • Protein-protein interactions

  26. Yeast Proteome • Jeong H, Mason SP, A.-L. Barabasi, Nature 411 (2001) 40-41

  27. Human Genetic Variation • Every human has essentially the same set of genes • But there are different forms of each gene -- known as alleles • blue vs. brown eyes • genetic diseases such as cystic fibrosis or Huntington’s disease are caused by dysfunctional alleles

  28. Alleles are created by mutations in the DNA sequence of one person - which are passed on to their descendants

  29. High-Throughput Genotyping

  30. Relate genes to Organisms • Diseases • OMIM: Human Genetic Disease • Metabolic and regulatory pathways • KEGG • Cancer Genome Project

  31. Human Alleles • The OMIM (Online Mendelian Inheritance in Man) database at the NCBI tracks all human mutations with known phenotypes. • It contains a total of about 2,000 genetic diseases [and another ~11,000 genetic loci with known phenotypes - but not necessarily known gene sequences] • It is designed for use by physicians: • can search by disease name • contains summaries from clinical studies

  32. Training "computer savvy" scientists • Know the right tool for the job • Get the job done with tools available • Network connection is the lifeline of the scientist • Jobs change, computers change, projects change, scientists need to be adaptable

  33. Why teach genomics in undergraduate (or Medical) education? • Demand for trained graduates from the biomedical industry • Bioinformatics is essential to understand current developments in all fields of biology • We need to educate an entire new generation of scientists, health care workers, etc. • Use bioinformatics to enhance the teaching of other subjects: genetics, evolution, biochemistry

  34. Genomics in Medical Education “The explosion of information about the new genetics will create a huge problem in health education. Most physicians in practice have had not a single hour of education in genetics and are going to be severely challenged to pick up this new technology and run with it.” Francis Collins

  35. Long Term Implications • A "periodic table for biology" will lead to an explosion of research and discoveries - we will finally have the tools to start making systematic analyses of biological processes (quantitative biology). • Understanding the genome will lead to the ability to change it - to modify the characteristics of organisms and people in a wide variety of ways

  36. Bioinformatics: A Biologist's Guide to Biocomputing and the Internet • Essentials of Medical Genomics • Stuart M. Brown, Ph.D. • stuart.brown@med.nyu.edu • www.med.nyu/rcr

  37. www.GenomicsHelp.com
