Rhesus genome annotations

Rob Norgren, Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center


Presentation Transcript


  1. Rhesus genome annotations • Rob Norgren • Department of Genetics, Cell Biology and Anatomy • University of Nebraska Medical Center

  2. Conventional Approach to GeneChip Production • Sequence millions of ESTs • Obtain finished genomic sequences • Cluster redundant ESTs • Align EST clusters with genomic sequences • Extract the last 571 bp of sequence from each transcript - probe selection region (PSR) • Choose 11 to 16 probes that tile across the PSR
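The last two steps of the conventional approach (extracting the PSR and tiling probes across it) can be sketched in Python. This is an illustration only, not the vendor's probe-selection method: the function names `extract_psr` and `tile_probes` are hypothetical, the 25-base probe length is an assumption (the slide gives no probe length), and probes are simply spaced evenly, whereas a real design would also screen candidates for uniqueness and hybridization behavior.

```python
from typing import List, Tuple


def extract_psr(transcript: str, psr_length: int = 571) -> str:
    """Return the last `psr_length` bases (3' end) of a transcript.

    Transcripts shorter than `psr_length` are returned whole.
    """
    return transcript[-psr_length:]


def tile_probes(psr: str, n_probes: int = 11,
                probe_length: int = 25) -> List[Tuple[int, str]]:
    """Pick up to `n_probes` evenly spaced probes across the PSR.

    Returns (start_offset, probe_sequence) pairs. The probe length of 25
    is an assumed value, and even spacing is a simplification.
    """
    if len(psr) < probe_length:
        return []  # PSR too short to hold a single probe
    last_start = len(psr) - probe_length
    if n_probes == 1 or last_start == 0:
        return [(0, psr[:probe_length])]
    step = last_start / (n_probes - 1)
    # A set removes duplicate starts that rounding can produce on short PSRs.
    starts = sorted({round(i * step) for i in range(n_probes)})
    return [(s, psr[s:s + probe_length]) for s in starts]


# Toy transcript (made-up sequence, 1,000 bases long).
transcript = "ACGT" * 250
psr = extract_psr(transcript)
probes = tile_probes(psr, n_probes=11)
```

Raising `n_probes` to 16 gives the upper end of the range named on the slide; the first probe always starts at offset 0 of the PSR and the last one ends at its final base.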

  3. Problems with the conventional approach for a rhesus macaque GeneChip • Insufficient ESTs to cover most genes • Little finished genomic sequence (in 2005)

  4. Strategy for targeted amplification of rhesus genes • Identify the terminal exon and flanking sequence for every human gene • Design primers and amplify from monkey genomic DNA • Obtain the rhesus PSR sequences • [Figure: terminal exon and poly A site, with the PSR flanked by primers F and R. PSR: probe selection region; F: forward primer; R: reverse primer]

  5. Other sources for rhesus GeneChip PSRs • Preliminary Baylor genomic sequences: in silico approach - aligned human PSRs with preliminary rhesus genomic sequence • ESTs

  6. Rhesus GeneChip • Available in March 2005 • Novel design • Whole genome expression array - 52,024 probesets for 47,000 transcripts • Probesets cover 17,093 well-annotated genes (16 probes/probeset) • Probesets were designed for 1,099 well-annotated genes not present on the U133+2.0 human GeneChip.

  7. Rhesus Genome • Draft published in Science on April 13, 2007 • “The rhesus macaque genome assembly is a draft DNA sequence, and it contains many gaps.”

  8. What does a “draft” rhesus genome mean? • 26,907 protein-coding genes for humans • 24,038 protein-coding genes for rhesus macaques • Sounds good, but is misleading. • 19,450 well-annotated protein-coding genes for humans • 8,744 well-annotated protein-coding genes for rhesus macaques • What does “well annotated” mean? • No “hypothetical” genes • Only genes with “good” gene symbols. No “Locs”.
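The “well annotated” criterion above can be expressed as a simple filter. The sketch below is an assumption about how such a filter might look, not the procedure used to produce the counts on the slide: it treats a “Loc” as a placeholder symbol of the form `LOC` followed by digits, treats any description containing the word “hypothetical” as a hypothetical gene, and uses made-up example records. The names `is_well_annotated` and `count_well_annotated` are hypothetical.

```python
import re
from typing import Iterable, Tuple

# Placeholder symbols of the form "LOC" + digits (assumed reading of "Locs").
LOC_SYMBOL = re.compile(r"^LOC\d+$")


def is_well_annotated(symbol: str, description: str) -> bool:
    """True if the gene has a real symbol and is not labeled hypothetical."""
    if not symbol or LOC_SYMBOL.match(symbol):
        return False
    return "hypothetical" not in description.lower()


def count_well_annotated(genes: Iterable[Tuple[str, str]]) -> int:
    """Count (symbol, description) records that pass the filter."""
    return sum(1 for symbol, description in genes
               if is_well_annotated(symbol, description))


# Toy records for illustration only; not taken from any real annotation file.
toy_genes = [
    ("GENEA", "example kinase"),
    ("LOC123456", "hypothetical protein LOC123456"),
    ("GENEB", "hypothetical protein"),
    ("GENEC", "example transporter"),
]
well_annotated_count = count_well_annotated(toy_genes)
```

On the toy records, two of the four genes pass: one is rejected for its placeholder symbol and one for its “hypothetical” description.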

  9. Problems with GeneChip annotations • Affymetrix relies on NCBI annotations; hence, many probesets are not annotated with “real” gene symbols • Stopgap solution: http://www.unmc.edu/rheusgenechip • A permanent solution requires full and complete annotation of the rhesus genome at NCBI.

  10. What can go wrong at the genome sequencing center? • Large gaps • Small gaps • Misassemblies • Sequencing errors

  11. What can go wrong with ab initio annotations? • Incorrect assignment of pseudogene status • Failure to identify genes • Incorrect gene models (some exons right, some wrong) • Incomplete gene models

  12. Consequences of non-annotated genes • A large number of databases depend on NCBI for their annotations. Example: Affymetrix GeneChips • Errors and omissions are propagated to dependent databases • Users are frustrated when they see “Locs” instead of a proper gene symbol • Users can BLAST each probeset consensus sequence or ask their bioinformatics personnel to establish gene identity, but this wastes time and effort.

  13. How to correct annotations • Annotations must be acceptable to NCBI; if they are not, corrections will not propagate to dependent databases. • Some gene annotations can be corrected by manual inspection. • Some gene annotations can be corrected by using human ortholog-based gene models rather than ab initio approaches. • Some gene annotations can only be corrected by additional sequencing. • And some gene annotations require a trip to Hell...

  14. Defensins - the gene family from Hell • Large family of genes • Orthologs poorly conserved - positive selection? • Will require focused sequencing and annotation • May require publication before NCBI annotates most of the rhesus defensins

  15. Acknowledgements • Jeff Kittrell • Joel Goodsell • Audrey Gomel • NCRR/NIH
