
Sequence variations




  1. Sequence variations: SNPs, mutations, haplotypes • http://www.ebi.ac.uk/mutations/

  2. How much variation is there in the human genome? • 3000 Mb with a SNP every 1 kb = 3 million SNPs • Underestimate? • Rare ones might be the most interesting • 40,000 genes x 100 variants = 4 million • Slow build-up • Bottlenecks in human population history!
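The two back-of-envelope estimates on this slide can be checked with a few lines of arithmetic. This is only a sketch; the function and constant names are made up for illustration, and the input figures are the slide's own round numbers.

```python
# Round figures taken from the slide; names are illustrative only.
GENOME_SIZE_BP = 3_000 * 1_000_000   # 3000 Mb expressed in base pairs
SNP_SPACING_BP = 1_000               # one SNP every 1 kb


def snps_from_spacing(genome_bp: int, spacing_bp: int) -> int:
    """Estimate the SNP count from genome size and average SNP spacing."""
    return genome_bp // spacing_bp


def variants_from_genes(n_genes: int, variants_per_gene: int) -> int:
    """Estimate the variant count from a per-gene average."""
    return n_genes * variants_per_gene


print(snps_from_spacing(GENOME_SIZE_BP, SNP_SPACING_BP))  # 3000000
print(variants_from_genes(40_000, 100))                   # 4000000
```

Both estimates land in the same range of a few million, which is why the slide asks whether they underestimate the true number once rare variants are counted.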

  3. Variations • SNPs (Single Nucleotide Polymorphisms) • Indels, dinucleotide mutations • Mutations, polymorphisms • Chromosomal rearrangements: inversions, translocations, indels

  4. Mutations: levels of biological data • DNA • RNA • Polypeptide • Protein structure • Protein function • Protein interaction pathways • Cellular dynamics • Tissue interactions • Organism (phenotype) • Population dynamics

  5. Uses of sequence variations • Gene structure & function • Diagnosis of inherited disorders • Evolution & ecology • Transplantation • Genetic mapping • Tissue typing • Forensics • Epidemiology • Insurance evaluation? • Association studies • Carrier screening • Pharmacogenetics • Users: scientists, biomed students, healthcare professionals, "general public"

  6. Data sources • Nucleotide sequences • Amino acid sequences • EMBL/GenBank/DDBJ • Non-redundant human sequence • SWISS-PROT • SNPs • Central variation databases • Sequence alignments • Population studies • Direct submissions • Literature / Publishers • Single Locus Databases • Blood Cells, Molecules and Diseases • HGVS

  7. DNA Mutation Checker v.2 • Bio::LiveSeq • Bio::Variation • http://bio.perl.org/

  8. SNPs • dbSNP: main repository • HGVbase: clean subset • TSC: verified SNPs, allele frequency project • National SNP projects: Japan, China, ...

  9. SNPs • dbSNP #29: 2,673,925 (414,853 masked) • HGVbase #13: 1,451,426 • TSC #10: 1,389,655 (1,062,212 mapped)

  10. HGVbase • Human Genome Variation database, http://hgvbase.cgb.ki.se/ • Formerly HGBASE • Three-party collaboration between Tony Brookes (KI), Heikki Lehväslaiho (EBI) and Peer Bork (EMBL) • Text and homology searches • Distributions: SQL dump, XML, flat file, FASTA

  11. SNP synchronization • Ensembl • dbSNP • HGVbase

  12. HGBASE update • Assays • Flanking sequence retrieval • Effects on predicted genes • Chromosomal locations • Similarity scoring • Haplotypes • WOW extensions • http://hgbase.cgb.ki.se/

  13. Mutation distribution

  14. Mutation numbering options • Reference sequence • Numbering schema: -1, +1 • Coding region, cDNA DB entry • Coding region, gDNA DB entry • Genomic gene seq
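The "-1 +1" on this slide suggests a numbering schema with no position 0: the first base of the coding region is +1 and the base just before it is -1. A minimal sketch of that conversion follows, assuming 0-based indices into the reference database entry; the function names and the `start_index` parameter are hypothetical, not taken from the presentation.

```python
def to_coding_number(index: int, start_index: int) -> int:
    """Convert a 0-based index in the reference entry to a coding-region
    number, where start_index is the 0-based index of position +1.
    Assumption: the schema skips 0, so numbering runs ..., -2, -1, +1, +2, ...
    """
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset


def from_coding_number(number: int, start_index: int) -> int:
    """Inverse of to_coding_number; rejects the non-existent position 0."""
    if number == 0:
        raise ValueError("there is no position 0 in this numbering schema")
    return start_index + (number - 1 if number > 0 else number)


# Example: the coding region starts at index 10 of the entry.
print(to_coding_number(10, 10))   # 1
print(to_coding_number(9, 10))    # -1
print(from_coding_number(2, 10))  # 11
```

The same functions apply whether the reference is a cDNA or a genomic entry; only `start_index` changes, which is one reason the choice of reference sequence has to be stated alongside every variant.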

  15. HGVS • Human Genome Variation Society • http://www.hgvs.org/ • Formerly HUGO MDI • Society journal: Human Mutation • “The Society aims to foster discovery and characterization of genomic variations including population distribution and phenotypic associations. We will promote collection, documentation and free distribution of genomic variation information and associated clinical variations and endeavor to foster the development of the necessary methodology and informatics.”

  16. WOW: Waystation Office Warehouse • A human sequence variation database emphasizing data quality and a broad spectrum of data sources • Jamie Cutticia, Dick Cotton, Heikki Lehväslaiho, Tony Brookes

  17. WOWarehouse plans • Expansion of the HGBASE design • Use of the Ensembl framework • Data flow from WayStation • Novel mutations • Direct parsing • Existing resources (SRS) • Need to get the data in quickly • Haplotype & genotype descriptions • Phenotype description

  18. WOW structure (diagram) • Components: submitter, WayStation, editor, Warehouse, LSDBs, other sources, PubMed, Human Mutation, interfaces • Flows: submission, ID, peer review, updates, downloads, correction requests

  19. Reference Sequence • Strive to use genomic coordinates • Use Ensembl to visualise all variants in genomic context • Ensembl is now using NCBI genome builds => only one, up-to-date reference sequence • Easy way to transform gene coordinates into genomic coordinates
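As an illustration of what transforming gene coordinates into genomic coordinates involves, here is a minimal sketch. It assumes that "gene coordinates" means 1-based positions in a spliced transcript and that exons are given as 1-based inclusive genomic intervals; the function name and data layout are hypothetical and this is not the Ensembl API.

```python
def transcript_to_genomic(pos, exons, strand=1):
    """Map a 1-based transcript position to a 1-based genomic coordinate.

    exons  -- list of (start, end) genomic intervals, 1-based inclusive
    strand -- 1 for the forward strand, -1 for the reverse strand
    """
    if pos < 1:
        raise ValueError("transcript positions start at 1")
    # Walk the exons in transcription order: ascending on the forward
    # strand, descending on the reverse strand.
    ordered = sorted(exons, reverse=(strand == -1))
    remaining = pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            if strand == 1:
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the transcript")


exons = [(100, 109), (200, 209)]             # two 10 bp exons
print(transcript_to_genomic(11, exons))      # 200
print(transcript_to_genomic(11, exons, -1))  # 109
```

Storing variants in genomic coordinates, as the slide recommends, means this mapping is done once at submission time instead of every time variants from different gene-level references are compared.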

  20. Haplotype representation • Haplotype = list of Marker/AlleleIDs & HaplotypeIDs. • No ordering of IDs in Haplotype definition – taken care of by Marker definition. • No reference haplotypes • Genotype: >2 Haplotypes
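The representation described on this slide can be sketched as a small data model: a haplotype holds an unordered set of allele IDs plus IDs of other haplotypes, and any ordering is derived from the marker definitions. All class, field, and function names below are assumptions made for illustration; the slide does not specify a schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    marker_id: str
    position: int          # genomic position; the source of ordering


@dataclass(frozen=True)
class Allele:
    allele_id: str
    marker_id: str         # the marker this allele belongs to
    base: str


@dataclass(frozen=True)
class Haplotype:
    haplotype_id: str
    allele_ids: frozenset = frozenset()      # unordered by design
    haplotype_ids: frozenset = frozenset()   # nested haplotypes


def ordered_alleles(hap, haplotypes, alleles, markers):
    """Flatten nested haplotypes and order the alleles by marker position."""
    collected, seen, stack = set(), set(), [hap]
    while stack:
        current = stack.pop()
        if current.haplotype_id in seen:     # guard against cycles
            continue
        seen.add(current.haplotype_id)
        collected.update(current.allele_ids)
        stack.extend(haplotypes[h] for h in current.haplotype_ids)
    return sorted((alleles[a] for a in collected),
                  key=lambda a: markers[a.marker_id].position)


markers = {"m1": Marker("m1", 100), "m2": Marker("m2", 50)}
alleles = {"a1": Allele("a1", "m1", "A"), "a2": Allele("a2", "m2", "G")}
h1 = Haplotype("h1", allele_ids=frozenset({"a1"}))
h2 = Haplotype("h2", allele_ids=frozenset({"a2"}),
               haplotype_ids=frozenset({"h1"}))
haplotypes = {"h1": h1, "h2": h2}
print([a.allele_id for a in ordered_alleles(h2, haplotypes, alleles, markers)])
# ['a2', 'a1']
```

Keeping the haplotype itself unordered means a marker can be remapped to a new genome build without touching any haplotype record.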

  21. Haplotypes • Chr21, 6 chromosomes • David Cox (Patil et al., Science) • Whole human genome coming • Chr22, >200 individuals • Ian Dunham, in preparation • Haplotype blocks! • HapMap (Eric Lander, NIH)

  22. Phenotype • Pragmatic approach! • Ideas: • Based on extended GO terminology developed at Jackson Laboratory • Phenotype = modifier + traits • OMIM compatible • US NLM anatomy vocabulary subset?
