
Querying SGN Database for Unigenes: Protein Sequences and Classification on TOM1 Array

This presentation outlines how to query the SGN database with all spot positions from the TOM1 array to retrieve the corresponding Lycopersicon unigenes and their predicted protein sequences. The unigene sequences are BLASTed against the latest Arabidopsis protein models, GenBank nr hits are used for manual classification, and possible Arabidopsis orthologs and in-paralogs are identified. Each spot is also queried separately for its 5' and 3' reads, and the two resulting classifications are combined into a classification of the spots on the array.

agatha

Presentation Transcript


  1. SGN Database, TOM1 Array

     Use all spot positions (e.g. 1.2.3.4) to query the SGN database for:
       - Unigenes
       - Predicted protein sequences

     Unigene List:

       >sgn|U225498 Lycopersicon Combined #3 [5 ESTs aligned]
       arabidopsis/peptide: At1g67090.2 68414.m07630 ribulose bisphosphate carboxylase small chain ...
       genbank/nr: gi|3914578|sp|O22573|RBS3_FRIAG Ribulose bisphosphate carboxylase small chain 3, chloroplast precursor (RuBisCO small subunit 3) ...
       >sgn|U225499 Lycopersicon Combined #3 [6 ESTs aligned]
       arabidopsis/peptide: At1g42960.1 68408.m04528 expressed protein (evalue: 3.2e-28, score=121.7)
       genbank/nr: gi|18400762|ref|NP_564482.1| expressed protein [Arabidopsis thaliana] gi|25373224|pir||B96497
       >sgn|U225500 Lycopersicon Combined #3 [35 ESTs aligned]
       arabidopsis/peptide: At1g69410.1 68408.m07313 Eukaryotic initiation factor 5A -related similar to eukaryotic initiation factor 5A
       genbank/nr: gi|20138708|sp|Q9AXQ6|IF51_LYCES Eukaryotic translation initiation factor 5A-1...
       TCCCAGTAGTCCCA.....
       ...

     The unigene list contains:
       i) Arabidopsis BLAST search result
       ii) GenBank nr BLAST search result
       iii) Sequence of the unigene

     Analyses:
       - INTERPRO: domains
       - INPARANOID: possible Arabidopsis orthologs and in-paralogs
       - Sequences were used to BLAST against the latest Arabidopsis protein models
       - GenBank nr hits were used for manual classification

     Result: classification of the unigenes represented on the TOM1 array

  2. SGN Database, TOM1 Array

     Query the SGN database with each spot position (e.g. 1.2.3.4) separately to obtain the 5' unigene and the 3' unigene:
       - Unigene corresponding to the 5' read
       - Unigene corresponding to the 3' read

     Classification of the unigenes represented on the TOM1 array gives:
       - Spot classification based on the 5' read
       - Spot classification based on the 3' read

     Combine the results: classification of the spots represented on the TOM1 array
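The two slides can be illustrated with a minimal Python sketch. It assumes the unigene list is available as text in the format shown on slide 1, and it parses each record's unigene ID, EST count, and Arabidopsis hit. The slides do not say how the 5' and 3' classifications are combined, so `combine_spot` uses a purely hypothetical rule (agreement keeps the class, a single available read is used as-is, disagreement is flagged as "ambiguous"). The function names, the regular expressions, and the class labels are assumptions made for illustration, not part of SGN or the original workflow.

```python
import re

# Header of one record, e.g. "sgn|U225498 Lycopersicon Combined #3 [5 ESTs aligned]"
HEADER_RE = re.compile(r"sgn\|(U\d+)\s+(.*?)\s+\[(\d+) ESTs aligned\]")
# Arabidopsis peptide hit, e.g. "arabidopsis/peptide: At1g67090.2"
ARA_RE = re.compile(r"arabidopsis/peptide:\s+(At[1-5CM]g\d{5}\.\d+)")


def parse_unigenes(text):
    """Split a unigene list into records and extract the annotated fields."""
    records = []
    # Each record starts with ">sgn|"; anything before the first one is skipped.
    for chunk in re.split(r">(?=sgn\|)", text)[1:]:
        header = HEADER_RE.match(chunk)
        if header is None:
            continue  # not a recognisable record header
        ara = ARA_RE.search(chunk)
        records.append({
            "unigene": header.group(1),
            "build": header.group(2),
            "est_count": int(header.group(3)),
            "arabidopsis_hit": ara.group(1) if ara else None,
        })
    return records


def combine_spot(class_5p, class_3p):
    """Combine the 5' and 3' classifications of one spot.

    Hypothetical rule (not taken from the slides): identical classes are kept,
    a single available class is used, conflicting classes become "ambiguous".
    """
    if class_5p is None and class_3p is None:
        return "unclassified"
    if class_5p is None:
        return class_3p
    if class_3p is None:
        return class_5p
    return class_5p if class_5p == class_3p else "ambiguous"
```

A spot whose 5' and 3' unigenes fall into different classes would then be reported as "ambiguous" and could be reviewed manually, in keeping with the manual classification step described on slide 1.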
