Phytome

A Data Analysis Pipeline presented by Jason Phillips. Phytome. High Level Flow Chart. Retrieve Unigenes. Translate Unigenes. Families. Main Outline. Unigenes (Where'd they come from, where'd they go?) Translation (methods and procedures) Building Families (the power of together-ness).

Presentation Transcript


  1. Phytome: A Data Analysis Pipeline, presented by Jason Phillips

  2. High Level Flow Chart: Retrieve Unigenes → Translate Unigenes → Families

  3. Main Outline • Unigenes (Where'd they come from, where'd they go?) • Translation (methods and procedures) • Building Families (the power of together-ness)

  4. phytome» Unigene • What are? • Where from? • Nine Species • Arabidopsis, a special case • Storage

  5. phytome» Unigene » What Are? Combined ESTs that overlap

  6. phytome» Unigene » Where From? • TIGR • Other sources?

  7. phytome» Unigene » Nine Species

  8. phytome» Unigene » Arabidopsis Highly annotated... Highly sequenced... Highly translated...

  9. phytome» Unigene » Storage
     species    count
     -------   ------
     ghir       24350
     mcry        8455
     osat       60778
     hann       20520
     mtru       36976
     lesc       31012
     ljap       11025
     lsat       21960
     atha       27170
     -------   ------
     total     242246
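
The total on the storage slide can be re-derived from the per-species counts. The following is a minimal sketch; the dictionary name `unigene_counts` is chosen here for illustration and is not part of the Phytome pipeline.

```python
# Per-species unigene counts, copied from the storage slide.
unigene_counts = {
    "ghir": 24350,
    "mcry": 8455,
    "osat": 60778,
    "hann": 20520,
    "mtru": 36976,
    "lesc": 31012,
    "ljap": 11025,
    "lsat": 21960,
    "atha": 27170,
}

# Sum across all nine species.
total = sum(unigene_counts.values())
print(f"species: {len(unigene_counts)}, total: {total}")
# prints "species: 9, total: 242246"
```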

  10. phytome» Translation • Methods: Estwise, Estscan, FrameFinder • Procedure • Numbers

  11. phytome» Translation » methods • HOMOLOGIES via BLAST: EST-WISE (sprot + trembl) • AB INITIO: ESTSCAN, FRAMEFINDER

  12. phytome» Translation » procedure • EST-WISE (Mac OSX Cluster) • blast swiss prot: 10.3 hours, 35 nodes (~15 days) • blast trembl: 35.7 hours, 35 nodes (~52 days) • ESTSCAN (Mustard) • FrameFinder (Mustard)
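
The parenthesized figures on this slide appear to be the single-processor equivalent of each run, that is, wall-clock hours multiplied by node count. That reading is an assumption, but the arithmetic is consistent with it, as this small sketch shows (`cpu_days` is a hypothetical helper name).

```python
def cpu_days(wall_hours, nodes):
    """Convert a parallel run into equivalent single-node days.

    Assumes perfect scaling: total work = wall-clock hours * nodes.
    """
    return wall_hours * nodes / 24.0

# Figures from the slide: hours of wall-clock time on 35 nodes.
swissprot_days = cpu_days(10.3, 35)
trembl_days = cpu_days(35.7, 35)

print(f"swiss prot: ~{swissprot_days:.0f} days")
print(f"trembl:     ~{trembl_days:.0f} days")
```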

  13. phytome» Translation » numbers Unigenes: 242,246
      method        translated   not translated
      -----------   ----------   --------------
      ESTWISE          151,830           90,416
      ESTSCAN          226,988           15,258
      FRAMEFINDER      242,242                4

  14. phytome» Families • Relationships • Clustering • Numbers

  15. phytome» Families » Relationships Blast everything against everything: sequences against a blastable db of sequences
      query       sbjct       e-value
      ---------   ---------   -------
      mtru302     ljap4523    1e-29
      mtru302     lesc25072   1e-26
      mtru302     hann20270   5e-24
      osat59606   osat59606   1e-157
      osat59606   osat4002    1e-96
      osat59606   atha25166   1e-88
      ...         ...         ...

  16. phytome» Families » Relationships But we have 4 sets of sequences!
      sequence set   blast program   sequences
      ------------   -------------   ---------
      nucleotides    tblastx           242,246
      estwise        blastp            151,830
      estscan        blastp            226,988
      framefinder    blastp            242,242
      Which method do we trust?

  17. phytome» Families » Relationships 4 data sets...4 family interpretations. BLAST OFF!
      tb: ~3 days, 28 nodes (~84 days)
      ew: ~1/4 day, 21 nodes (~5 days)
      es: ~1/4 day, 21 nodes (~5 days)
      ff: ~1/4 day, 21 nodes (~5 days)

  18. phytome» Families » Relationships BLAST RESULTS
      Method   size     no blast   no trans   attrition
      ------   ------   --------   --------   ---------
      tb       242246        153          0         153
      ew       151830         22      90416       90438
      ff       242242      24563          4       24567
      es       226988       1345      15258       16603
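
In the BLAST results table, the attrition column equals "no blast" plus "no trans", and size plus "no trans" gives back the 242,246 unigenes for every method. A minimal sketch that re-checks both relationships follows; the variable and function names are illustrative only.

```python
TOTAL_UNIGENES = 242246

# method -> (size, no_blast, no_trans), copied from the slide.
blast_results = {
    "tb": (242246, 153, 0),
    "ew": (151830, 22, 90416),
    "ff": (242242, 24563, 4),
    "es": (226988, 1345, 15258),
}

def attrition(no_blast, no_trans):
    """Sequences lost either to failed translation or to having no BLAST hit."""
    return no_blast + no_trans

attrition_by_method = {}
for method, (size, no_blast, no_trans) in blast_results.items():
    # Every unigene is either translated (size) or not (no_trans).
    assert size + no_trans == TOTAL_UNIGENES
    attrition_by_method[method] = attrition(no_blast, no_trans)
    print(f"{method}: attrition {attrition_by_method[method]}")
```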

  19. phytome» Families » Clustering TRIBE MCL gene evalue

  20. phytome» Families » Clustering TRIBE MCL gene evalue

  21. phytome» Families » Clustering BLAST hits → tribe mcl → families
      query       sbjct       evalue
      ---------   ---------   ------
      atha7499    atha8483    6e-78
      atha7499    atha7503    4e-90
      osat23081   atha10704   8e-78
      osat23081   osat36667   8e-78
      atha1072    atha5059    2e-68
      atha1072    lsat15421   2e-60
      atha1072    lsat21190   1e-102
      atha1072    atha5059    9e-54
      ...         ...         ...

      fam id   member
      ------   ---------
      ...      ...
      4035     atha7499
      4035     atha7503
      4035     atha8483
      4036     atha10704
      4036     osat23081
      4036     osat36667
      4037     atha1072
      4037     atha5059
      4037     lsat15421
      4037     lsat21190
      ...      ...
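
To illustrate how pairwise hits become families, here is a rough sketch that groups the slide's sample hits by connected components. This is a deliberately simpler stand-in, not the Markov clustering that TRIBE-MCL performs: it ignores e-values and would merge any two genes linked by a chain of hits. On this small sample it happens to give the same three groups shown on the slide. The function name `build_families` is hypothetical.

```python
from collections import defaultdict

# (query, subject) pairs from the slide's sample hits; e-values are ignored here.
hits = [
    ("atha7499", "atha8483"),
    ("atha7499", "atha7503"),
    ("osat23081", "atha10704"),
    ("osat23081", "osat36667"),
    ("atha1072", "atha5059"),
    ("atha1072", "lsat15421"),
    ("atha1072", "lsat21190"),
    ("atha1072", "atha5059"),
]

def build_families(pairs):
    """Group genes into connected components with a union-find structure."""
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for query, subject in pairs:
        root_q, root_s = find(query), find(subject)
        if root_q != root_s:
            parent[root_s] = root_q

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return [sorted(members) for members in groups.values()]

families = build_families(hits)
for members in families:
    print(" ".join(members))
```

A real run would feed the full all-against-all hit table, with e-values as edge weights, to TRIBE-MCL instead.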

  22. phytome» Families » Clustering blast results (tb, ew, es, ff) → TRIBE MCL → families (tb, ew, es, ff)

  23. phytome» Families » Clustering Let's look at some histograms!

  24. What should we do next round?