Utilize Proteome Analyst, an accurate web-based tool that uses machine learning to filter biological data, make predictions on protein function and location, and accelerate protein research.
Proteome Analyst: Accelerating Protein Research www.cs.ualberta.ca/~bioinfo
DNA Sequence
1 cctcgcccgc ctgccgcctt tttgtgcgcg tgtgagtgtg ggccccagcg tgccctcccg
61 ggggtgggtt ccgggcggaa ggcggaggcc cggcgcgcag cccgccgccc gcctgcccgc
121 ggaccgggga gccggggtgc ttggagcggg ggacgccagg cgtgggctgg cggcgggacc
181 aggaggagga ggaggaggag gaggagagcg cgggctggcg cttgcccggg cgcagtcggc
241 ggggaccgag tcgtacttcc tgtgcgaaag gcggcccgac cctaaccgcc accccctccc
301 cctgtctccc tctctgaacc cgcccattgg gggtaggaca ctcagccgtc accgctcgct
361 ctgctggccg ctacctgcag caagataggg ccgccatcgc cgggcgacga cgaggaggag
421 gcggccgccg cagccggggc ccccgccgcc gccggagcga caggtgattt ggcttctgca
481 cagttaggag gagcaccaaa ccgatgggag gttttgtcag ccacacctac aactataaaa
541 gatgaagctg gtaatctagt ccagattcca agtgctgcta cttcaagtgg gcagtatgtt
601 cttccccttc agaatttgca gaatcaacaa atattttccg ttgcaccagg atcagattca
661 tcaaatggta cagtgtccag tgttcaatat caagtgatac cacagatcca gtcagcagat
721 ggtcagcagg ttcaaattgg tttcacaggc tcttcagata atgggggtat aaatcaagaa
781 agcagtcaaa ttcagatcat tcctggctct aatcaaacct tacttgcctc tggaacacct
841 tctgctaaca tccagaatct cataccacag actggtcaag tccaggttca gggagttgca
901 attggtggtt catcttttcc tggtcaaacc caagtagttg ctaatgtgcc tcttggtctg
961 ccaggaaata ttacgtttgt accaatcaat agtgtcgatc tagattcttt gggactctcg
1021 ggcagttctc agacaatgac tgcaggcatt aatgccgacg gacatttgat aaacacagga
1081 caagctatgg atagttcaga caattcagaa aggactggtg agcgggtttc tcctgatatt
1141 aatgaaacta atactgatac agatttattt gtgccaacat cctcttcatc acagttgcct
1201 gttacgatag atagtacagg tatattacaa caaaacacaa atagcttgac tacatctagt
Protein Sequence
>UniProt/Swiss-Prot|P30613|KPYR_HUMAN
MSIQENISSLQLRSWVSKSQRDLAKSILIGAPGGPAGYLRRASVAQLTQELGTAFFQQQQ
LPAAMADTFLEHLCLLDIDSEPVAARSTSIIATIGPASRSVERLKEMIKAGMNIARLNFS
HGSHEYHAESIANVREAVESFAGSPLSYRPVAIALDTKGPEIRTGILQGGPESEVELVKG
SQVLVTVDPAFRTRGNANTVWVDYPNIVRVVPVGGRIYIDDGLISLVVQKIGPEGLVTQV
ENGGVLGSRKGVNLPGAQVDLPGLSEQDVRDLRFGVEHGVDIVFASFVRKASDVAAVRAA
LGPEGHGIKIISKIENHEGVKRFDEILEVSDGIMVARGDLGIEIPAEKVFLAQKMMIGRC
NLAGKPVVCATQMLESMITKPRPTRAETSDVANAVLDGADCIMLSGETAKGNFPVEAVKM
QHAIAREAEAAVYHRQLFEELRRAAPLSRDPTEVTAIGAVEAAFKCCAAAIIVLTTTGRS
AQLLSRYRPRAAVIAVTRSAQAARQVHLCRGVFPLLYREPPEAIWADDVDRRVQFGIESG
KLRGFLRVGDLVIVVTGWRPGSGYTNIMRVLSIS
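The protein record above is in FASTA form: a header line starting with ">" followed by the residues. As a rough illustration of how such a record might be read before it is passed to a prediction tool, here is a minimal Python sketch. The function name `parse_fasta` is hypothetical, the example uses only the first 120 residues from the slide, and splitting the header on "|" assumes the three-field layout shown here; none of this is taken from Proteome Analyst itself.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined without breaks
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# First two sequence lines of the record on the slide (truncated for brevity).
example = """>UniProt/Swiss-Prot|P30613|KPYR_HUMAN
MSIQENISSLQLRSWVSKSQRDLAKSILIGAPGGPAGYLRRASVAQLTQELGTAFFQQQQ
LPAAMADTFLEHLCLLDIDSEPVAARSTSIIATIGPASRSVERLKEMIKAGMNIARLNFS
"""

header, sequence = parse_fasta(example)[0]
# Assumes the header has exactly three "|"-separated fields, as on the slide.
database, accession, entry_name = header.split("|")
print(accession, entry_name, len(sequence))
```

Keeping the header and the residues separate makes it easy to look a protein up by accession while handing only the sequence to downstream analysis.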
Annotation
• Knowledge of the DNA and protein sequences greatly accelerates lab research to discover protein function
• Annotation is time- and resource-intensive
• Human annotators are the bottleneck
Sequence Database Growth
[Figure: number of protein sequences (0 to 2,000,000) by year, 1986 to 2004, comparing unannotated protein sequences (GenPept) with human-annotated protein sequences (SwissProt).]
Proteome Analyst
Proteome Analyst (PA):
• is a free, Web-based tool
• uses machine learning to make predictions, and can explain those predictions
• is very accurate, as measured by metrics such as precision and recall
Goal: Filter vast amounts of biological data and make meaningful predictions about the function and location of proteins, thereby accelerating protein research.
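Since accuracy is stated in terms of precision and recall, a small worked example may help. For one class (for instance, one predicted location), precision is the fraction of proteins predicted to be in that class that really belong to it, and recall is the fraction of proteins truly in that class that the predictor found. The Python sketch below computes both; the function name `precision_recall` is hypothetical and the labels are invented toy data, not Proteome Analyst output.

```python
def precision_recall(predicted, actual, label):
    """Compute (precision, recall) for one class label."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == label and a == label)  # true positives
    fp = sum(1 for p, a in pairs if p == label and a != label)  # false positives
    fn = sum(1 for p, a in pairs if p != label and a == label)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented toy labels for five proteins (illustration only).
predicted = ["nucleus", "nucleus", "cytoplasm", "nucleus", "membrane"]
actual = ["nucleus", "cytoplasm", "cytoplasm", "nucleus", "nucleus"]

precision, recall = precision_recall(predicted, actual, "nucleus")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy data, two of the three "nucleus" predictions are correct and two of the three true nuclear proteins are found, so both values are 2/3. Reporting the two numbers together matters because a predictor can trade one for the other.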