
Lab 1: Bioinformatics and Genomics Internet Resources




  1. Lab 1: Bioinformatics and Genomics Internet Resources
     • Summary
     • Definition of bioinformatics: the many computational tools that help us understand biological data, especially the structure and function of genomic data. See also the much more definitive discussion at http://bioinformaticsweb.net/
     • Guided tour of NCBI, a major integrated repository of sequence data, analysis results, and tools
     • Learn to use NCBI through Exercise 1.2 and the homework assignment

  2. What is bioinformatics?
     • Integrated use of computers and databases to store, analyze, and interpret biological data, especially high-throughput, high-dimensional data.
     • Primarily thought of in the context of molecular genetic data (DNA sequence, mapping, expression, etc.).
     • [Is this really new (revolutionary) or just an evolutionary event in data processing?] Breeders have used computation on large biological datasets for more than 50 years.

  3. The amount of data has changed! We need to change how we view and communicate complex data: reduce it to human brain-size chunks, but without distorting the message

  4. Types of high throughput biological data
     • Systematic genome sequence and the annotation thereof
     • RNA and protein expression data
     • Protein-protein interaction data
     • Metabolomic data
     • Scientific literature-mining software output

  5. Types of high throughput biological data
     • “Simple” data and analysis:
       - Genome sequences
       - Protein domain information
     • Both are static and non-context-dependent
     • “More complex” datasets and analysis:
       - Genome functional annotation
       - Expression patterns of protein and RNA
       - Protein function and activity
       - Physiological function at molecular/cellular/tissue levels
       - Phenotype
     • These patterns are context-dependent and may change
     • They are also interdependent: interactions are possible

  6. Databanks in Molecular Biology and Genomics
     • Many specialized databases; many are accessible from genome browsers
     • The big three genome browsers: NCBI, ENSEMBL, UCSC
     • ENSEMBL browser: http://www.ensembl.org/index.html
     • UCSC genome browser: http://genome.ucsc.edu; click on “genome browser”

  7. Databanks in Molecular Biology and Genomics
     • NCBI Tour: http://www.ncbi.nlm.nih.gov/
     • Site map: Overview, Databases/Tools, Science Primer, Human Genome, Entrez* data model, Data Submission, Education
     • Databases/Tools: Nucleotide, Protein, Gene, Homologene, OMIM
     • Human Genome: Map Viewer
     • BLAST tools (next week)

  8. NCBI Tour: Home Page has changed

  9. NCBI Tour: Overview
     • Databases and Tools -> Literature DB -> PubMed or -> OMIM -> search for “Huntington's”
     • Databases and Tools -> Tools for Data mining -> Entrez -> 2 searches:
       a. Growth hormone
       b. GH1
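The two Entrez searches above can also be expressed as query URLs for NCBI's E-utilities. The following is a minimal sketch, not part of the lab: it only builds the URLs and sends nothing over the network. The choice of the `gene` database, the `retmax` value, and the helper name `esearch_url` are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL for one Entrez database (no request is sent).

    `esearch_url` is a hypothetical helper; db="gene" below is an assumed
    choice, since the slide searches across Entrez in the web interface.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# The two searches from the tour: a descriptive name and a gene symbol.
for term in ("growth hormone", "GH1"):
    print(esearch_url("gene", term))
```

Pasting either printed URL into a browser should return the matching record IDs; comparing the two result sets mirrors the point of the exercise, which is that a descriptive name and an official symbol retrieve different sets of records.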

  10. NCBI Tour: Overview
      • Human Genome Res. -> Guide to Online Resources -> Browse your genome -> chromosome 21 -> chromosome 11

  11. GenBank Flat File
      • GenBank (http://www.ncbi.nlm.nih.gov/genbank/) is the databank that holds most of the primary sequence data, presented as a flat file
      • Flat file contents:
        - Locus line (accession number, length, type, sub-directory, release date)
        - Definition line
        - Accession number
        - Keywords
        - Source
        - References to sequence
        - Features table: Source; Gene; CDS (coding sequence); miscellaneous features (sig_peptide, polyA_signal, polymorphism)
        - Base count
        - Sequence
      • XM_004915 is gone now. Try NM_214163
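To make the flat file layout concrete, here is a minimal sketch that splits a record into its top-level sections and reads the locus line. The record `TOY_RECORD` is made up for illustration and is not a real GenBank entry; the function names are hypothetical, and a real analysis would normally rely on an established parser rather than this hand-rolled one.

```python
# A made-up, heavily shortened record that imitates the flat file layout.
TOY_RECORD = """\
LOCUS       TOY00001                  24 bp    mRNA    linear   MAM 01-JAN-2000
DEFINITION  Made-up example record, not a real GenBank entry.
ACCESSION   TOY00001
KEYWORDS    .
SOURCE      hypothetical organism
FEATURES             Location/Qualifiers
     source          1..24
     CDS             1..24
ORIGIN
        1 atggccaagt tcgatctgaa ataa
//
"""


def parse_flat_file(text):
    """Group lines under their top-level keyword (a word starting in column 1)."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if line and not line[0].isspace():
            current, _, rest = line.partition(" ")
            sections[current] = [rest.strip()]
        elif current is not None:
            sections[current].append(line.strip())  # indented continuation line
    return sections


def parse_locus(locus_text):
    """Split the locus line into its whitespace-separated fields."""
    name, length, _unit, mol_type, topology, division, date = locus_text.split()
    return {"name": name, "length": int(length), "type": mol_type,
            "topology": topology, "division": division, "date": date}


def sequence(origin_lines):
    """Join the sequence lines, dropping position numbers and spaces."""
    return "".join(ch for line in origin_lines for ch in line if ch.isalpha())


sections = parse_flat_file(TOY_RECORD)
print(parse_locus(sections["LOCUS"][0]))
print(sections["FEATURES"][1:])
print(sequence(sections["ORIGIN"]))
```

The design relies on one property of the format: top-level keywords start in the first column, while feature keys and continuation lines are indented, so a single pass over the lines is enough to recover the sections listed above.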

  12. Ensembl Gene View http://useast.ensembl.org/Homo_sapiens/Info/Index

  13. Additional bioinformatics tool overviews: UCSC genome browser
      • UCSC: http://genome.ucsc.edu/
      • Emphasis is on genome tracks of data: at any locus or larger region, you can look at different sets of data to compare gene predictions and other features
      - I use UCSC only for a quick look at genes and the conservation of their flanking DNA sequence among species. Examples: IL1B, IGF1, HPRT or HOXA5
      - UCSC also integrates ENCODE data, which we will discuss later in functional genomics
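For a quick look at the example genes, browser links can be generated rather than typed by hand. This is a rough sketch under stated assumptions: the `hg38` assembly is an assumed default, `ucsc_url` is a hypothetical helper, and passing a bare gene symbol as `position` may lead to a search results page rather than directly to the locus.

```python
from urllib.parse import urlencode

# Track display page of the UCSC genome browser.
UCSC_TRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def ucsc_url(position, db="hg38"):
    """Build a browser link for a gene symbol or a chr:start-end range.

    db="hg38" is an assumption; substitute the assembly you are working with.
    """
    return f"{UCSC_TRACKS}?{urlencode({'db': db, 'position': position})}"


# The example genes named on the slide.
for gene in ("IL1B", "IGF1", "HPRT", "HOXA5"):
    print(gene, ucsc_url(gene))
```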

  14. What are you interested in?
      Finding available information about:
      • A gene's function?
      • A gene's structure?
      • A gene's location?
      • A gene's expression pattern?
      • Published papers on the gene?
      • Similar genes (and evolutionary relationships) in other species?
      • Context of the gene in the genome (its “neighborhood”)?

  15. Examples of bioinformatic analyses
      Starting material:
      • Accession number: from a publication on the gene of interest, use GenBank nucleotide or Entrez search
      • Gene or protein name: text search of the entire NCBI website at Entrez (covers GenBank, PubMed, OMIM, UniGene, etc.)
      • Disease (human): text search of OMIM
      • Sequence data: begin with the sequence of your clone and compare it against the sequence information available at the NCBI website (BLAST against GenBank nr/nt or human genome + transcript) to identify the sequence if possible. If the gene is already cloned and sequenced, you can determine its human and mouse location (Gene, UniGene, MapViewer, etc.) and find out what is known about it (OMIM, PubMed)
      • Protein structure: begin with the link to the structure in PDB (see tutorial): http://www.ornl.gov/sci/techresources/Human_Genome/posters/chromosome/pdb.shtml
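The starting points above amount to a lookup table from what you have in hand to where you search first. A small sketch of that idea follows; the dictionary keys and the function name `where_to_start` are illustrative choices, and the resource lists simply restate the slide.

```python
# What you start with -> resources to try first, as listed on the slide.
STARTING_POINTS = {
    "accession number": ["GenBank nucleotide", "Entrez search"],
    "gene or protein name": ["Entrez text search"],
    "disease (human)": ["OMIM text search"],
    "sequence data": ["BLAST", "Gene", "UniGene", "MapViewer", "OMIM", "PubMed"],
    "protein structure": ["PDB"],
}


def where_to_start(kind):
    """Return the suggested resources for one kind of starting material."""
    key = kind.strip().lower()
    if key not in STARTING_POINTS:
        raise KeyError(f"unknown starting material: {kind!r}")
    return STARTING_POINTS[key]


print(where_to_start("Disease (human)"))
print(where_to_start("sequence data"))
```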

  16. Exercise on use of Genome Browsers
      • Tutorial on OMIM: http://www.ornl.gov/sci/techresources/Human_Genome/posters/chromosome/omim.shtml

  17. Exercise on use of Genome Browsers

  18. Exercise on use of Genome Browsers (5 pts)
      Your assignment: due Wednesday, August 31, before class starts
      1. Select a disease and gene by the end of today
      2. Give the disease/gene information to Dr. Tuggle
      3. Research this gene using the resources described above
      4. Answer the questions in the Exercise; send your answers by email to Dr. Tuggle
      - You are allowed to cut and paste text from the websites, but clearly indicate the question and the answer
      - A minimal effort will receive minimal points. Make sure I know you visited these sites and that you learned something
      - You don’t have to use all three browsers and you don’t have to compare them (delete question f)
