
Comparing Protein Sequences

Tutorial 4. Comparing Protein Sequences. Today’s menu: PAM and BLOSUM score matrices, PSI-BLAST, PHI-BLAST. PAM matrices are based on global alignments of closely related proteins. PAM1 is the matrix calculated from comparisons of sequences with no more than 1% divergence.

zeheb


Presentation Transcript


  1. Tutorial 4: Comparing Protein Sequences. Today’s menu: • PAM and BLOSUM score matrices • PSI-BLAST • PHI-BLAST

  2. PAM & BLOSUM. PAM matrices are based on global alignments of closely related proteins. PAM1 is the matrix calculated from comparisons of sequences with no more than 1% divergence; the other PAM matrices are extrapolated from PAM1. BLOSUM matrices are based on local alignments. BLOSUM62 is the matrix calculated from comparisons of sequences with at most 62% identity in the blocks. All BLOSUM matrices are based on observed alignments; they are not extrapolated from comparisons of closely related proteins.
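The extrapolation step for PAM matrices can be illustrated with a small sketch. It assumes the usual Dayhoff construction, in which the mutation probability matrix for n PAM units is the PAM1 probability matrix multiplied by itself n times (the published score matrices are then log-odds values derived from those probabilities). The 3-letter alphabet and the numbers in `TOY_PAM1` are hypothetical and chosen only to keep the example readable; they are not real PAM values.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def extrapolate(pam1, n):
    """Return the n-step mutation probability matrix (pam1 to the n-th power)."""
    size = len(pam1)
    # Start from the identity matrix: zero evolutionary steps.
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, pam1)
    return result


# Hypothetical toy alphabet of 3 residues: each residue has a 1% chance
# of changing per step, which mimics the "1% divergence" of PAM1.
TOY_PAM1 = [
    [0.990, 0.005, 0.005],
    [0.005, 0.990, 0.005],
    [0.005, 0.005, 0.990],
]

toy_pam250 = extrapolate(TOY_PAM1, 250)
for row in toy_pam250:
    print([round(p, 3) for p in row])
```

After many steps the probability of staying unchanged drops well below 0.99, which is why high-numbered PAM matrices suit divergent sequences.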

  3. Use Recommendations, from closely related to highly divergent sequences: PAM100 ~ BLOSUM90 (closely related); PAM120 ~ BLOSUM80; PAM160 ~ BLOSUM60; PAM200 ~ BLOSUM52; PAM250 ~ BLOSUM45 (highly divergent).

  4. Example • Query: >ADRM1_HUMAN (a glycosylated plasma membrane protein which promotes cell adhesion) • Database: nr, restricted to human sequences • BLAST program: BLASTP • Matrices: PAM30, BLOSUM45
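For readers using the standalone BLAST+ tools rather than the web form, the same comparison could be set up roughly as follows. This is a sketch under assumptions: the file names are hypothetical, the human restriction is expressed here as an Entrez query on a remote search (the slides do not say how it was applied), and the function only builds the command lines without running them.

```python
import shlex


def blastp_command(query_fasta, matrix, out_file):
    """Build (but do not run) a blastp command line for one scoring matrix."""
    return [
        "blastp",
        "-query", query_fasta,            # FASTA file with the ADRM1_HUMAN sequence
        "-db", "nr",
        "-remote",                        # search on the NCBI servers
        "-entrez_query", "Homo sapiens[Organism]",  # assumed way to limit hits to human
        "-matrix", matrix,
        "-out", out_file,
    ]


# Hypothetical file names, one run per matrix.
commands = [
    blastp_command("adrm1_human.fasta", matrix, "adrm1_" + matrix + ".txt")
    for matrix in ("PAM30", "BLOSUM45")
]
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping everything except `-matrix` identical makes any difference between the two hit lists attributable to the scoring matrix.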

  5. What differences do we observe? • With BLOSUM45 we found both related and divergent sequences. • With PAM30 we found only related sequences. [Figure panels: BLOSUM45, PAM30]

  6. With BLOSUM45 we can discover interesting relations between proteins. Example hit: Mucin-13, a glycosylated membrane protein that protects the cell by binding to pathogens. [Figure panels: PAM30, BLOSUM45]

  7. Using different scoring matrices can produce slightly different alignments. [Figure panels: with PAM30, with BLOSUM45]

  8. The same alignment can be resolved in several ways, especially when using a matrix for highly divergent sequences (BLOSUM45). [Figure: alternative alignments of the same region]
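The idea that one pair of sequences can have several equally good alignments can be made concrete with a small dynamic-programming sketch. It is a simplification, not what BLAST does: it uses global (Needleman-Wunsch style) alignment and a toy scoring scheme (match +1, mismatch -1, gap -1) instead of local alignment with BLOSUM45, and the function name `count_optimal_alignments` is made up for this example.

```python
def count_optimal_alignments(a, b, match=1, mismatch=-1, gap=-1):
    """Return (best global score, number of distinct alignments reaching it)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]
    count[0][0] = 1
    # First column and first row: only gaps, so exactly one way each.
    for i in range(1, rows):
        score[i][0], count[i][0] = i * gap, 1
    for j in range(1, cols):
        score[0][j], count[0][j] = j * gap, 1
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            candidates = [
                (score[i - 1][j - 1] + pair, count[i - 1][j - 1]),  # align a[i-1] with b[j-1]
                (score[i - 1][j] + gap, count[i - 1][j]),           # gap in b
                (score[i][j - 1] + gap, count[i][j - 1]),           # gap in a
            ]
            best = max(s for s, _ in candidates)
            score[i][j] = best
            # Add up the ways of every move that reaches the best score.
            count[i][j] = sum(c for s, c in candidates if s == best)
    return score[-1][-1], count[-1][-1]


print(count_optimal_alignments("AA", "A"))    # the single A can pair with either A
print(count_optimal_alignments("AGC", "AC"))  # only one best placement of the gap
```

With flatter scoring schemes, such as matrices tuned for divergent sequences, ties between alternative moves tend to become more common, so the number of co-optimal alignments grows.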

  9. PSI-BLAST: Position Specific Iterative BLAST. We will analyze the following uncharacterized archaeal protein: >gi|2501594|sp|Q57997|Y577_METJA PROTEIN MJ0577 MSVMYKKILYPTDFSETAEIALKHVKAFKTLKAEEVILLHVIDEREIKKRDIFSLLLGVAGLNKSVEEFENELKNKLTEEAKNKMENIKKELEDVGFKVKDIIVVGIPHEEIVKIAEDEGVDIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVKRKNS

  10. E-value threshold for the initial BLAST search (default: 10). E-value threshold for inclusion in PSI-BLAST iterations (default: 0.005).
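The role of the two thresholds can be sketched as a simple filter: hits at or below the search threshold are reported, but only hits at or below the stricter inclusion threshold contribute to the profile used in the next iteration. The hit names and E-values below are hypothetical placeholders, not results from the MJ0577 search, and `split_hits` is an illustrative helper rather than part of any BLAST tool.

```python
def split_hits(hits, report_evalue=10.0, inclusion_evalue=0.005):
    """Separate reported hits from those included in the next PSI-BLAST profile.

    hits is a list of (name, evalue) pairs.
    """
    reported = [hit for hit in hits if hit[1] <= report_evalue]
    included = [hit for hit in reported if hit[1] <= inclusion_evalue]
    return reported, included


# Hypothetical hits, for illustration only.
example_hits = [
    ("hit_A", 1e-40),   # strong hit: reported and included
    ("hit_B", 0.004),   # just below the inclusion threshold: included
    ("hit_C", 0.5),     # reported, but too weak to enter the profile
    ("hit_D", 25.0),    # above the search threshold: not reported at all
]

reported, included = split_hits(example_hits)
print([name for name, _ in reported])
print([name for name, _ in included])
```

Loosening the inclusion threshold pulls more distant sequences into the profile but raises the risk that an unrelated sequence contaminates later iterations.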

  11. The query itself; orthologous sequences in two other archaeal species; other homologous sequences.

  12. Is MJ0577 a filament protein? Is MJ0577 a cationic amino acid transporter? Is MJ0577 a universal stress protein?

  13. PHI-BLAST: Pattern Hit Initiated BLAST. Example pattern: A-T-X-[AVG]-R-S

  14. Pattern symbols: [ ] = groups the amino acids that can occur at a given position; ( ) = gives the number of repeats when a residue (or group of residues) is repeated; - = separates positions; x = any amino acid.
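One way to get a feel for these symbols is to translate a pattern into a regular expression. The sketch below handles only the symbols listed above (brackets, a single repeat count, the hyphen separator, and `x` for any residue); the full PROSITE syntax has more features that are deliberately left out, and `pattern_to_regex` is a name invented for this example.

```python
import re

# One pattern element: a single residue or a bracketed group,
# optionally followed by a repeat count in parentheses.
ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Za-z])(?:\((\d+)\))?")


def pattern_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError("unsupported pattern element: " + repr(element))
        residue, repeat = m.groups()
        if residue in ("x", "X"):
            residue = "."                        # x stands for any amino acid
        if repeat:
            residue += "{" + repeat + "}"        # (2) becomes {2}
        parts.append(residue)
    return "".join(parts)


print(pattern_to_regex("A-T-x-[AVG]-R-S"))
print(pattern_to_regex("[LIVM](2)-D-E-A-D-[RKEN]-x-[LI]"))
```

A malformed element, such as two positions written without a hyphen between them, raises `ValueError` instead of being silently accepted.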

  15. Making a pattern. Aligned fragments: …LIDEADKTT… …IMDEADEFL… …LLDEADKCL… …ILDEADRIL… …VVDEADNFI… …LVDEADKGI… …LMDEADEFL… …MLDEADRSI… …LIDEADKML… …MLDEADNWI… …LVDEADRFL… Resulting pattern: [LIVM](2)-D-E-A-D-[RKEN]-x-[LI]
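The step from aligned fragments to a pattern can be sketched by collecting the residues seen in each column. The fragments below are copied from the slide; the regular expression is a hand translation of the slide's pattern. Running the check shows that the first fragment (LIDEADKTT) ends in T, which the final `[LI]` position does not allow, so either the pattern was deliberately kept narrower than the examples or the fragment was transcribed imperfectly; the slides do not say which.

```python
import re

# The eleven aligned fragments shown on the slide.
FRAGMENTS = [
    "LIDEADKTT", "IMDEADEFL", "LLDEADKCL", "ILDEADRIL", "VVDEADNFI", "LVDEADKGI",
    "LMDEADEFL", "MLDEADRSI", "LIDEADKML", "MLDEADNWI", "LVDEADRFL",
]

# Set of residues observed at each of the nine aligned positions.
columns = [set(residues) for residues in zip(*FRAGMENTS)]
for position, residues in enumerate(columns, start=1):
    print(position, "".join(sorted(residues)))

# Hand translation of [LIVM](2)-D-E-A-D-[RKEN]-x-[LI] into regex syntax.
SLIDE_PATTERN = re.compile(r"[LIVM]{2}DEAD[RKEN].[LI]")

# Fragments that the slide's pattern does not cover.
unmatched = [f for f in FRAGMENTS if SLIDE_PATTERN.fullmatch(f) is None]
print("not matched:", unmatched)
```

Columns with one residue become fixed letters, columns with a few residues become bracket groups, and a column as varied as position 8 is written as `x`.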

  16. Example >gi|71154193|sp|P0A9P6|DEAD_ECOLI Cold-shock DEAD box protein A (ATP-dependent RNA helicase deaD) MAEFETTFADLGLKAPILEALNDLGYEKPSPIQAECIPHLLNGRDVLGMAQTGSGKTAAFSLPLLQNLDP ELKAPQILVLAPTRELAVQVAEAMTDFSKHMRGVNVVALYGGQRYDVQLRALRQGPQIVVGTPGRLLDHL KRGTLDLSKLSGLVLDEADEMLRMGFIEDVETIMAQIPEGHQTALFSATMPEAIRRITRRFMKEPQEVRI QSSVTTRPDISQSYWTVWGMRKNEALVRFLEAEDFDAAIIFVRTKNATLEVAEALERNGYNSAALNGDMN QALREQTLERLKDGRLDILIATDVAARGLDVERISLVVNYDIPMDSESYVHRIGRTGRAGRAGRALLFVE NRERRLLRNIERTMKLTIPEVELPNAELLGKRRLEKFAAKVQQQLESSDLDQYRALLSKIQPTAEGEELD LETLAAALLKMAQGERTLIVPPDAPMRPKREFRDRDDRGPRDRNDRGPRGDREDRPRRERRDVGDMQLYR IEVGRDDGVEVRHIVGAIANEGDISSRYIGNIKLFASHSTIELPKGMPGEVLQHFTRTRILNKPMNMQLL GDAQPHTGGERRGGGRGFGGERREGGRNFSGERREGGRGDGRRFSGERREGRAPRRDDSTGRRRFGGDA The DEAD box pattern: [LIVM](2)-D-E-A-D-[RKEN]-x-[LI]
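To see where the DEAD box pattern falls in this protein, the pattern can be searched for as a regular expression. The sketch below uses only an excerpt of the DEAD_ECOLI sequence above (the stretch beginning KRGTLDLSKL), so the reported offset is relative to that excerpt rather than to the full protein, and the regex is a hand translation of the pattern.

```python
import re

# Excerpt of the DEAD_ECOLI sequence shown on the slide (not the full protein).
EXCERPT = "KRGTLDLSKLSGLVLDEADEMLRMGFIEDVETIMAQIPEGHQTALFSATMPEAIRRITRRFMKEPQEVRI"

# Hand translation of [LIVM](2)-D-E-A-D-[RKEN]-x-[LI] into regex syntax.
DEAD_BOX = re.compile(r"[LIVM]{2}DEAD[RKEN].[LI]")

match = DEAD_BOX.search(EXCERPT)
if match is None:
    print("pattern not found in excerpt")
else:
    # Offsets are 0-based and relative to the excerpt.
    print(match.group(), "at excerpt offset", match.start())
```

PHI-BLAST uses such a pattern occurrence in the query as the anchor, so only database sequences that also contain the pattern are considered.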
