NCBI’s Advanced Resources and other Genetics Websites Michele R. Tennant, Ph.D., M.L.I.S. Health Science Center Libraries/ U.F. Genetics Institute PCB3063, General Genetics March 2010 firstname.lastname@example.org
Today’s Session • Your term papers • NCBI databases and tools • Structures, MapViewer, dbSNP, Entrez Gene • Genetics website – GeneReviews
Project Poster • Poster content: • 1/3 text, 2/3 figures and tables with legends (55 pts) • Style: • Choose an authorized house style; includes figures, tables, references, etc (10 pts) • Poster presentation: • Questions about content and references (15 pts) • Peer grading: • Team members grade team members (15 pts)
Your Poster • Text – No more than a sentence or two describing the disease and its symptoms • Rest of text should cover all aspects of genetics • Mode of inheritance • Map location and gene structure • What protein is encoded by gene; normal function of protein • Types of mutations and what they do to protein; how do they cause disease? • Etc. • For all of the genetics - evidence for this knowledge; experiments, explanations, etc • Figures and tables can help here
Your Poster • Important feature of your text – How do the researchers know what they know? What is the evidence? What sorts of experiments were done, and how do the results explain what is known?
Poster • References • Use 10-12 peer reviewed journal articles • Each source must be cited within the text (or figure/table legends) and in the references section at the end • Follow “house style” rules for citing in the text, in the references, and even for naming the “references” section
Poster • Figures and Tables: Photographs, diagrams, tables, etc. • Do not just reproduce tables – compile, explain, add information • Each must have a “legend” written in your own words describing what it depicts • Use “house style” for labeling, numbering, referring to in text, etc.
Oral Defense • Please be prepared to review the full content of your poster and to engage in discussion about the hereditary basis of your disorder with Drs. Miyamoto and Tennant, the graduate/ undergraduate TAs, and any faculty/student visitors. • You should obviously know your poster, but also the background research (e.g., articles, NCBI resources, associated lecture materials, and group discussions) that is essential to understanding your human genetic disorder.
House Style • Every journal has its own rules for “style” • Format for citations (within text and in the “references” of poster) • Format for tables and figures, and their legends • Find rules on the journal’s website, or by looking in recent print issues • Rules often called “Instructions for authors/ contributors”; “Style”, etc.
House Style • For this assignment – DO NOT follow style concerning: • “Sections” of the poster. Don’t divide into abstract, intro, materials and methods, etc
NCBI’s Structures • Includes MMDB - Molecular Modeling Database with over 75,000 entries • Experimentally determined biopolymer structures obtained from the Protein Data Bank (NMR, X-ray crystallography) • To view structures, you need to download viewer software (Cn3D 4.1)
What you can do with Structures • Retrieve and view 3D molecular structures – example – find the structure for human leptin • Annotate whole or partial structures • Search on a pattern • Use Sequence or Structure alignments to generate structure hypotheses • Links to other resources
www.library.health.ufl.edu Click on “Databases” from HSCL Website
Or choose “Structure” from the “Popular Resources” link Choose “Structure” from the dropdown and then click on “search” to reach the Structure page
Create search in search box Click here to open structure record
PDB Number Citation and abstract Click here to view structure Download Cn3D 4.1 to view structures Domain information
Default view – secondary structure coloration – blue/coil; green/helix; orange/pleated sheet Structure Viewer Sequence/ Alignment Viewer
Annotation • Use “Style-Edit Global Style” to label/change the ENTIRE structure • To annotate one part of the structure, use “Style – Annotate” • Example - annotate one section of the leptin structure – change view to ball and stick, add protein sidechains, remove helix objects, change color to blue, label every third amino acid with one letter code.
Portion of structure to be annotated must be highlighted prior to annotation
Once you have completed your annotations, go to “File”, then “Export as png”. This saves the image as a PNG file, which you can open in nearly any graphics program
Pattern Searching • Identifies particular amino acid sequence patterns; may give insights into function • Example - Retrieve the HIV-1 Reverse Transcriptase record 1REV; find all positions in the sequence in which lysine is separated by 2-4 residues from a histidine • Patterns are listed using Prosite syntax: • K-x(2,4)-H • You can find the rules for syntax at http://us.expasy.org/prosite/prosuser.html#meth1
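To see what a Prosite-style pattern is asking for, it can help to translate it into a regular expression. The sketch below is only an illustration and is not how Cn3D itself works: the function names and the toy sequence are made up for this example (it is not the 1REV sequence), and only a small subset of Prosite syntax is handled. The lysine/histidine pattern is written here in canonical Prosite form, K-x(2,4)-H.

```python
import re

# One Prosite element: x, a single residue, [ABC] (any of), or {ABC} (none of),
# optionally followed by a repeat count such as (3) or (2,4).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple Prosite pattern such as 'K-x(2,4)-H' into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        symbol, low, high = match.groups()
        if symbol == "x":                      # x means "any residue"
            token = "."
        elif symbol.startswith("{"):           # {ABC} means "anything except A, B, C"
            token = "[^" + symbol[1:-1] + "]"
        else:                                  # a single residue or an [ABC] set
            token = symbol
        if low and high:
            token += "{%s,%s}" % (low, high)
        elif low:
            token += "{%s}" % low
        parts.append(token)
    return "".join(parts)


def find_pattern(pattern, sequence):
    """Return (1-based start position, matched text) for every start position that matches."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")  # lookahead allows overlaps
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical toy sequence, used only to exercise the pattern.
toy_sequence = "MKAAHGGKTTTTHKCH"
hits = find_pattern("K-x(2,4)-H", toy_sequence)
print(hits)
```

In the toy sequence the final lysine is not reported, because only one residue separates it from the following histidine and the pattern requires two to four.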
Global Style • You can change the style of the whole molecule • You can also change the style of the whole molecule “except” a part that is already highlighted • This is especially useful when you have found a feature and you want to make it very visible
Changes all residues to the user selected color EXCEPT those that were highlighted Remove secondary structure elements DON’T click in sequence window, or highlighted residues will also turn the user selected color
Highlight within a Certain Distance • Close proximity may indicate molecular interaction; researcher may highlight all residues within a particular distance of another entity • Example – highlight all residues within 5 Angstroms of the Mg found in record 1REV.
Highlight entity of interest In this example, all amino acid residues within 5 Angstroms of the Mg will be identified
Highlights the residues within 5 Angstroms of the Mg in both the Structure and Sequence Viewers
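The distance highlight amounts to a simple geometric test: a residue is selected if any of its atoms lies within the cutoff of the chosen entity. A minimal sketch of that idea follows. The coordinates and residue labels are invented for illustration and are not taken from record 1REV, and the function name is an assumption of this example, not part of Cn3D.

```python
import math


def residues_within(atoms, center, cutoff=5.0):
    """Return the residues that have at least one atom within `cutoff` Angstroms of `center`.

    `atoms` is a list of (residue_label, (x, y, z)) pairs, one entry per atom.
    """
    selected = set()
    for residue, coordinates in atoms:
        if math.dist(coordinates, center) <= cutoff:
            selected.add(residue)
    return sorted(selected)


# Hypothetical coordinates in Angstroms (not real structure data).
mg_position = (0.0, 0.0, 0.0)
toy_atoms = [
    ("res1", (1.0, 2.0, 2.0)),   # 3.0 Angstroms from the Mg
    ("res1", (4.0, 4.0, 4.0)),   # farther away, but res1 already qualifies
    ("res2", (3.0, 4.0, 0.0)),   # exactly 5.0 Angstroms, so it is included
    ("res3", (6.0, 0.0, 0.0)),   # 6.0 Angstroms, outside the cutoff
]
nearby = residues_within(toy_atoms, mg_position, cutoff=5.0)
print(nearby)
```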
MapViewer • One of NCBI’s Genome resources • Displays multiple types of maps – cytogenetic, sequence, genetic linkage, radiation hybrid • Covers human and many model organisms, including: • Danio rerio • Pan troglodytes • Zea mays
Questions Answered by MapViewer • Where does a particular gene reside in a particular organism? • Which identified genes are on chromosome 19, and in what order? • Which genes are in region R of chromosome 6, and what’s the corresponding sequence? • Show the cytogenetic and sequence maps for the same region of a chromosome; align the maps based on shared markers. • What is the distance between two genes?
MapViewer is not an Entrez database, so you can’t find it in the dropdown menu – you need to click “Maps and Markers” Then choose “MapViewer” from “Quick Links”
This part of the results page shows you on which chromosome EDN1 resides Search in MapViewer using your official gene symbol and human
Make your choice from the reference assembly; this is the assembly that was completed by the public endeavor to sequence the human genome These links will take you to useful databases, depending on the datatype – RefSeq, Entrez Gene, etc Start with the Genes_seq map for Part C
Can now see introns, exons, untranslated regions Zoom in for close-up of gene model Divert to SNP to find variation info for Part C Arrow indicates “plus” or “minus” strand Can link to OMIM and other useful information
SNPs • SNPs = single nucleotide polymorphisms • SNPs within genes can cause differences in protein product • SNPs near genes can be used to create maps • Find SNP records with Entrez SNP • Entrez “SNP” records also include small indels (insertions/deletions) and small repeat regions
SNPs/MapViewer • SNPs can be retrieved from the SNP database page; excellent “limits” • SNPs can also be retrieved from MapViewer when viewing the Genes_Seq map as the Master Map
Changing the radio button to “in gene region” will show all SNPs recorded for the gene and nearby (including untranslated regions and introns)
This table is a record of the SNPs that have been reported. It includes the exon or other element in which the SNP was found, the contig position number, links to sequence records, the kind of SNP, the nucleotide change, the amino acid substitution if appropriate, the position in the codon, and the position in the protein
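The link between a nucleotide change and an amino acid substitution can be worked through with the standard genetic code. The sketch below is an illustration only: the function name is made up, and the codons are generic textbook-style examples rather than entries from any particular dbSNP record. It uses coding-strand DNA codons and the standard translation table.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, codon_position, new_base):
    """Apply a single-base change to a codon and report its effect on the protein.

    `codon_position` is 1, 2, or 3. Returns
    (mutated codon, original amino acid, new amino acid, kind of change).
    """
    mutated = codon[:codon_position - 1] + new_base + codon[codon_position:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        kind = "synonymous"      # same amino acid, protein unchanged
    elif after == "*":
        kind = "nonsense"        # codon becomes a stop codon
    else:
        kind = "missense"        # a different amino acid is encoded
    return mutated, before, after, kind


print(classify_snp("GAG", 2, "T"))   # change at the second codon position
print(classify_snp("GAG", 3, "A"))   # change at the third codon position
print(classify_snp("TAC", 3, "A"))   # change that introduces a stop codon
```

The three calls show why the table reports both the codon position and the substitution: the same codon can yield a changed or an unchanged protein depending on which base varies.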
Back to MapViewer Click on the X to remove the map Click on the name of a map to get information about that map Click on the arrow to make this map the Master Map Add different types of maps through Maps and Options
Can search on map regions – band numbers (9p23 – 9p11); markers (D6S2068 – D6S1497); numerical positions (centimorgans 12 – 14; base pairs 1.1M – 1.5M)
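A base-pair region such as 1.1M to 1.5M is shorthand for a numeric interval on the sequence map. The sketch below shows one way to expand the shorthand and test whether a gene overlaps the region. The helper names and the gene coordinates are hypothetical, and this is not a description of how MapViewer parses its input.

```python
from decimal import Decimal

# Multipliers for the K (thousand) and M (million) base-pair suffixes.
SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_bp(text):
    """Convert a position such as '1.1M', '1.5 M', or '250000' to a base-pair integer."""
    text = text.strip().upper().replace(" ", "")
    if text and text[-1] in SUFFIXES:
        return int(Decimal(text[:-1]) * SUFFIXES[text[-1]])
    return int(Decimal(text))


def overlaps_region(gene_start, gene_end, region_start, region_end):
    """Return True if the gene interval overlaps the region (all positions in base pairs)."""
    low, high = parse_bp(region_start), parse_bp(region_end)
    return gene_start <= high and gene_end >= low


# Hypothetical gene coordinates, for illustration only.
print(parse_bp("1.1M"))
print(overlaps_region(1_250_000, 1_260_000, "1.1M", "1.5M"))
```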
Entrez Gene • Pulls together information from multiple data domains • Search by names, symbols, accessions, publications, GO terms, chromosome numbers, E.C. numbers, etc. • Links to sequence, structure, etc records • One record per gene per organism
Entrez Gene • Example – search for the Entrez Gene record for presenilin 1, using the official gene symbol we discussed last time
Search using gene symbol Could have searched under any of these aliases (unlike GenBank where you would have to try them all)
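The same kind of search can also be expressed as a URL for NCBI's E-utilities, which may be handy if you want to repeat it for several genes. The sketch below only builds the URL and does not contact NCBI. Several things here are assumptions of this example rather than part of the session: the function name, the use of PSEN1 as the official symbol for presenilin 1, and the `[sym]` and `[orgn]` field tags, which you should confirm against the Entrez Gene help pages before relying on them.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_search_url(symbol, organism="human"):
    """Build an ESearch URL that looks up a gene symbol in the Gene database.

    The [sym] and [orgn] field tags are assumed here; check the Gene help
    documentation for the current list of search fields.
    """
    term = f"{symbol}[sym] AND {organism}[orgn]"
    query = urlencode({"db": "gene", "term": term, "retmode": "json"})
    return EUTILS_ESEARCH + "?" + query


url = gene_search_url("PSEN1")
print(url)
```

Pasting the printed URL into a browser should return the matching Gene record identifiers, which is a quick way to check that a symbol resolves to a single record for the organism.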
Note – this is the record view that you need to print for Part C – the ENTIRE record (record spills over onto the following pages of this handout) Gene model Official gene symbol as determined by the HUGO Gene Nomenclature Committee (HGNC)
Summary of protein, function and disease-causing mutations; from RefSeq record Links to PubMed records that provide evidence of function – any researcher can add these