Protein Analysis (Beyond BLAST)
Protein Analysis (Beyond BLAST)
• Basic Protein Data (MW, pI)
• Generalized Features
• Multiple Sequence Alignments
• Fingerprints, Profiles, Domains
• Using Structure
Multiple Sequence Alignments: Comparison of Methods

DIALIGN
At     368  LY-------- ---------- --AEVAYHNF APPHVTKNSY FAAILGHNNN
PY54   383  AY-------- ---------- --EKWFRTDP RWAKCDEDVF FSELLGHD--
VHML   273  HHyqrvshts hgefsfrlpg hlCRIAFHEF RHNGESKAAF RSRVLGHSGG
BB003  298  IY-------- ---------- --CKFSYLAF APKNMEMNYW ITKVLGHEPN
N15    390  AY-------- ---------- --EMFFRVDP RWKNVDEDVF FMEILGHD--
Ecto   304  SH-------- ---------- --RTFk---- --NNCSINIW LTKTLLHE--

At     398  DLETSLSYMT YTL------- ---------- ---------- ----------
PY54   411  DPDTQLAYKQ FKL------- ---VNFNPKW TPNISDENPR LAALQELDND
VHML   323  DKSTQNHYEG FELdskveti gvvDMGQNEA DKSYNKQL-- LKHLEQYDAT
BB003  328  DITTAFHYNR YVL------- ---DNLDDKA DNSLLTLL-- NQRIYTYVRR
N15    418  DENTQLHYKQ FKL------- ---ANFSRTW RPEVGDENTR LVALQKLDDE
Ecto   326  ALDTSIFYSR FRI------- ---DKCStnr gewaf----- ----------

ClustalW
PY54   FRTDPRWAKCDEDVFFSELLGHD--DPDTQLAYKQFKLVNFNPKWTPNISDENPRLAALQ  456
N15    FRVDPRWKNVDEDVFFMEILGHD--DENTQLHYKQFKLANFSRTWRPEVGDENTRLVALQ  452
Ecto   K------NNCSINIWLTKTLLHE--ALDTSIFYSRFRIDKCS------------------  342
VHML   R-----HNGESKAAFRSRVLGHSGGDKSTQNHYEGFELDSKVETIG--------------  343
At     AP-----PHVTKNSYFAAILGHNNNDLETSLSYMTYTLPEDR------------------  414
BBB03  AP-----KNMEMNYWITKVLGHEPNDITTAFHYNRYVLDNLD------------------  344
Multiple Sequence Alignments
• The method of choice depends upon the goal of the alignment
• For more information: Thompson et al., 1999. A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Research 27:2682-90.
Multiple Sequence Alignment

Protein/Position within Sequence    Sequence
YNZ5_YEAST/135-152                  RLCYNCNETGHISKDCPK
O65639/100-117                      SGCYNCGELGHISKDCGI
Q94821/1600-1617                    KGCFNCGEEGHQSRECTK
BYR3_SCHPO/17-34                    PRCYNCGENGHQARECTK
O44758/570-587                      RGCHNCGEEGHISKECDK
GLH1_CAEEL/262-279                  RGCFNCGEQGHRSNECPN
O96068/51-68                        KGCFKCGEEGHMSRECPQ
HEXP_LEIMA/43-60                    TTCFRCGEEGHMSRECPN
O46363/5-22                         VTCYKCGEAGHMSRECPK
HEXP_LEIMA/196-213                  RKCYKCGESGHMSRECPS
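One way to see how an alignment like this feeds a pattern search is to reduce it to its invariant columns. The sketch below is a minimal illustration, not the method used in the presentation: it takes the ten 18-residue segments from the slide, finds the columns where every sequence carries the same residue, and writes them out as a simple regular expression with "." for variable positions. The function names conserved_columns and to_regex are hypothetical and were chosen for this example only.

```python
import re

# Aligned segments copied from the slide; each is 18 residues with no gaps.
SEGMENTS = [
    "RLCYNCNETGHISKDCPK",  # YNZ5_YEAST/135-152
    "SGCYNCGELGHISKDCGI",  # O65639/100-117
    "KGCFNCGEEGHQSRECTK",  # Q94821/1600-1617
    "PRCYNCGENGHQARECTK",  # BYR3_SCHPO/17-34
    "RGCHNCGEEGHISKECDK",  # O44758/570-587
    "RGCFNCGEQGHRSNECPN",  # GLH1_CAEEL/262-279
    "KGCFKCGEEGHMSRECPQ",  # O96068/51-68
    "TTCFRCGEEGHMSRECPN",  # HEXP_LEIMA/43-60
    "VTCYKCGEAGHMSRECPK",  # O46363/5-22
    "RKCYKCGESGHMSRECPS",  # HEXP_LEIMA/196-213
]


def conserved_columns(seqs):
    """Return {column index: residue} for columns identical in every sequence."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    conserved = {}
    for i, column in enumerate(zip(*seqs)):
        if len(set(column)) == 1:
            conserved[i] = column[0]
    return conserved


def to_regex(seqs):
    """Build a regex: invariant residues are kept, other columns become '.'."""
    conserved = conserved_columns(seqs)
    return "".join(conserved.get(i, ".") for i in range(len(seqs[0])))


pattern = to_regex(SEGMENTS)
print(pattern)  # ..C..C.E.GH....C..

# By construction the pattern matches every segment it was derived from.
print(all(re.fullmatch(pattern, s) for s in SEGMENTS))  # True
```

A strict pattern like this only records fully invariant columns. Positions that tolerate a small set of residues (for example D or E in the column before the last cysteine) are lost, which is one reason profiles and hidden Markov models are used for more sensitive searches.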
Pattern Searches: Ultimate Goal (Atwood, 2000, Int. J. Biochem. Cell Biol. 32:139-55)
Pattern Searches: 3 Levels (Atwood, 2000, Int. J. Biochem. Cell Biol. 32:139-55)
Pattern Searches: Hidden Markov Model (Atwood, 2000, Int. J. Biochem. Cell Biol. 32:139-55)