The ENCODE project's goal is to annotate all evidence-based gene features across the human genome at high accuracy. The work is a collaboration among institutions in the US and Europe, including the University of California Santa Cruz, Yale University, and the Sanger Institute. About 20 annotators follow HAVANA guidelines to manually annotate protein-coding loci with their isoforms, non-coding loci with transcript evidence, and pseudogenes. They are supported by a computational alignment pipeline and the Otterlace annotation software, with data from partner groups imported via DAS, and annotations are verified experimentally by RT-PCR, RACE, and sequencing.
DAS for ENCODE data coordination
Felix Kokocinski, WTSI
Project Overview
Partners:
• University of California Santa Cruz, USA
• Washington University St. Louis, USA
• Broad Inst. of MIT and Harvard, USA
• Yale University, USA
• HAVANA & EnsEMBL, Sanger Institute, UK
• University of Lausanne, CH
• Centre for Genomic Regulation, ES
• Spanish Nat. Cancer Res. Centre, ES
Goal: Annotate all evidence-based gene features at a high accuracy across the human genome
• protein-coding loci with isoforms
• nc loci with transcript evidence
• pseudogenes
Manual Genome Annotation
• ~20 annotators working according to HAVANA guidelines
• computational pipeline for alignments
• Otterlace software
• input from partner groups, import of data source via DAS
• verification with RT-PCR, RACE & sequencing
Data Exchange using DAS
[Diagram. Labels: Distributed Annotation Sources; GenTrack tracking system; Otterlace ann. software; Perl API; Update Scripts; Source Adaptors; WWW interface; exper. ver. issues; high prior. issues]
GenTrack Annotation Tracking
• extension of open-source RoR ticketing system Redmine (www.redmine.org)
• data import via DAS
• modules for analyzing and flagging data
• www.sanger.ac.uk/gentrack
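The "data import via DAS" step can be illustrated with a minimal sketch of how a DAS `features` request for a genomic segment is formed. The server name `das.example.org` and the source name `hypothetical_source` are placeholders, not GenTrack's actual endpoints, and the helper `das_features_url` is an illustrative name rather than part of any GenTrack or Otterlace API. The sketch only builds the URL and makes no network call.

```python
from urllib.parse import urlencode


def das_features_url(server, source, segment_id, start, stop):
    """Build a DAS 'features' request URL for one segment.

    Assumes the usual DAS layout <server>/das/<source>/features with a
    'segment' parameter of the form id:start,stop.
    """
    # Keep ':' and ',' unescaped so the segment stays human-readable.
    query = urlencode({"segment": f"{segment_id}:{start},{stop}"}, safe=":,")
    return f"{server.rstrip('/')}/das/{source}/features?{query}"


# Hypothetical server and source names, for illustration only.
url = das_features_url("http://das.example.org", "hypothetical_source", "1", 1000, 2000)
print(url)
```

A client would fetch this URL and receive an XML document listing the features that overlap the requested segment.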
GenTrack: Workflow
• Entry points:
  • List of all genes & transcripts in region
  • High-priority loci
  • Loci with specific tags
• Identify problem, compare in Otterlace
• Resolve by
  • Changing annotation or
  • Disbelieving other source
• Note decision
DAS Specifics
Format: Specialized 1.53E
<type-id> from sequence ontology (exon: SO:0000147)
<method> (havana_manual_annotation)
<type-category> Evidence code describing the type of method (inferred from RT-PCR experiment (ECO:0000109))
<note>
- key=value pairs
- parent, lastmod [req] (LASTMOD=2006-04-07T15:15:58+0100)
- transcripttype, etc. [opt]
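As a rough illustration of these conventions, the sketch below parses a single DAS feature and checks the required note keys. The sample XML is invented for this example: the feature and parent identifiers, coordinates, and server URL are hypothetical, and only the SO, ECO, method, and LASTMOD values come from the slide. It assumes the type id and category are carried as attributes of `<TYPE>`, as in the general DAS features response, and the helper name `parse_features` is illustrative.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Invented sample response; identifiers and coordinates are hypothetical.
SAMPLE = """<DASGFF>
  <GFF version="1.0" href="http://das.example.org/das/hypothetical_source/features">
    <SEGMENT id="1" start="1000" stop="2000">
      <FEATURE id="exon_0001" label="exon_0001">
        <TYPE id="SO:0000147" category="inferred from RT-PCR experiment (ECO:0000109)">exon</TYPE>
        <METHOD id="havana_manual_annotation">havana_manual_annotation</METHOD>
        <START>1200</START>
        <END>1350</END>
        <ORIENTATION>+</ORIENTATION>
        <NOTE>PARENT=transcript_0001</NOTE>
        <NOTE>LASTMOD=2006-04-07T15:15:58+0100</NOTE>
        <NOTE>TRANSCRIPTTYPE=protein_coding</NOTE>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""

REQUIRED_NOTE_KEYS = {"parent", "lastmod"}  # marked [req] on the slide


def parse_features(xml_text):
    """Return one dict per FEATURE, with notes split into key=value pairs."""
    features = []
    for feat in ET.fromstring(xml_text).iter("FEATURE"):
        notes = {}
        for note in feat.findall("NOTE"):
            key, sep, value = (note.text or "").partition("=")
            if sep:  # ignore free-text notes that are not key=value
                notes[key.strip().lower()] = value.strip()
        missing = REQUIRED_NOTE_KEYS - notes.keys()
        if missing:
            raise ValueError(f"{feat.get('id')}: missing notes {sorted(missing)}")
        type_el = feat.find("TYPE")
        features.append({
            "id": feat.get("id"),
            "type_id": type_el.get("id"),
            "type_category": type_el.get("category"),
            "method": feat.find("METHOD").get("id"),
            "start": int(feat.findtext("START")),
            "end": int(feat.findtext("END")),
            "lastmod": datetime.strptime(notes["lastmod"], "%Y-%m-%dT%H:%M:%S%z"),
            "notes": notes,
        })
    return features


features = parse_features(SAMPLE)
print(features[0]["type_id"], features[0]["method"], features[0]["lastmod"].isoformat())
```

Parsing `lastmod` into a timezone-aware timestamp makes it straightforward to compare against the time of a previous import and pick up only features that changed since then.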
Thanks
Jennifer Harrow, Steve Searle, Adam Frankish, Bronwen Aken, Toby Hunt, James Gilbert, Tim Hubbard, Anacode, Andy Jenkinson, Steve Trevanion, Jonathan Warren, Redmine.org, Paul Bevan, ENCODE partners, Jody Clements