
Understanding PSSM and PSI-BLAST: Scoring Approaches in Homology Detection

This text explores the significance of Position-Specific Scoring Matrices (PSSMs) and PSI-BLAST in detecting homologs based on specific gene products. It details how PSSMs derive scoring matrices by assessing amino acid substitution frequency and accommodating insertions and deletions (INDELs). The iterative nature of PSI-BLAST is emphasized, showcasing how it refines the scoring matrix by filtering through significant hits, ultimately aiding in improved accuracy for homology searches across large protein databases. Learn how these powerful tools enhance protein sequence alignment.


Presentation Transcript


  1. Expect value (E-value) • The expected number of hits with an equivalent or better score that would be found by random chance in a database of the size searched.
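The dependence on database size can be illustrated with the Karlin-Altschul estimate, E = K · m · n · e^(−λS), where m is the query length, n is the database length in residues and S is the raw alignment score. The sketch below is only an illustration: the function name `expect_value` is hypothetical, and the default `lam` and `k` are the values commonly quoted for ungapped BLOSUM62 scoring, assumed here rather than taken from the slides.

```python
import math

def expect_value(raw_score, query_len, db_len, lam=0.318, k=0.134):
    """Karlin-Altschul estimate: E = K * m * n * exp(-lambda * S).

    lam and k are assumed, illustrative parameters (commonly quoted for
    ungapped BLOSUM62); real searches use values fitted to the scoring system.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)

# The same raw score becomes less significant as the database grows.
for db_len in (1_000_000, 100_000_000):
    e = expect_value(raw_score=80, query_len=250, db_len=db_len)
    print(f"database residues = {db_len:>11,}   E = {e:.3g}")
```

For a fixed score, E grows linearly with the size of the search space, which is why the same alignment can be significant in a small database and insignificant in a large one.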

  2. Conserved domains • Domain: a sequence of amino acids that typically folds into a stable tertiary structure. • Many proteins are multi-domain.

  3. BLAST to PSI-BLAST • BLAST uses a scoring matrix derived from a large number of proteins. • What if you want to find homologs based upon a specific gene product? • Develop a position-specific scoring matrix (PSSM).

  4. PSSM

            M  F  W  Y  G  A  P  V  I  L  C  R  K  E  N  D  Q  S  T  H
        M   5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
        G   1  0  0  0  1  0  0  0  0  1  0  0  0  1  0  0  1  0  0  0
        A   1  0  0  0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0  0
        S   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  2  0
        F   0  4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

     Rows are positions in the query (M G A S F); columns are the amino acids observed at that position. Determine the frequency of each substitution at each position and convert it to a log-odds score.
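The conversion from counts to log-odds scores can be sketched as follows. This is a minimal illustration, not the procedure PSI-BLAST itself uses: it assumes a uniform background frequency of 1/20 for every amino acid and a flat pseudocount of 1 per residue so that unobserved residues do not produce log(0). The names `COLUMN_COUNTS` and `log_odds_row` are hypothetical.

```python
import math

AMINO_ACIDS = "MFWYGAPVILCRKENDQSTH"  # column order used on the slide

# Observed residue counts at each query position (M, G, A, S, F),
# taken from the five-sequence example on the slide.
COLUMN_COUNTS = [
    {"M": 5},
    {"M": 1, "G": 1, "L": 1, "E": 1, "Q": 1},
    {"M": 1, "A": 4},
    {"S": 3, "T": 2},
    {"F": 4, "Y": 1},
]

def log_odds_row(counts, pseudocount=1.0):
    """Convert one column of counts into log2(observed / background) scores.

    Assumptions: uniform background (1/20) and a flat pseudocount.
    """
    background = 1.0 / len(AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    row = {}
    for aa in AMINO_ACIDS:
        freq = (counts.get(aa, 0) + pseudocount) / total
        row[aa] = math.log2(freq / background)
    return row

pssm = [log_odds_row(counts) for counts in COLUMN_COUNTS]

# A residue seen more often than the background expects scores positive;
# a residue never seen at that position scores negative.
print(f"position 1, M: {pssm[0]['M']:+.2f}")
print(f"position 1, W: {pssm[0]['W']:+.2f}")
```

Real implementations use database-derived background frequencies and more careful pseudocounts, so the absolute numbers differ, but the sign pattern is the same.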

  5. PSSM INDEL

            M  F  W  Y  G  A  P  V  I  L  C  R  K  E  N  D  Q  S  T  H
        M   5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
        G   1  0  0  0  1  0  0  0  0  1  0  0  0  1  0  0  1  0  0  0
        A   1  0  0  0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0  0
        S   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  2  0
        F   0  4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

     Indel 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0

     Can include a score for permitting insertions and deletions. Perhaps this position is at a turn, where INDELs are common.

  6. PSSM • In evaluating (scoring) alignments, PSSM approaches typically: • Reward matches to columns that have conserved amino acids. • Penalize mismatches in columns with a conserved amino acid more than mismatches in a variable column.
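The two behaviors above can be checked on the five-position example from slide 4. The sketch below scores ungapped, equal-length sequences by summing one PSSM entry per position. It assumes a uniform background and a flat pseudocount of 1, and `column_scores` and `score_ungapped` are hypothetical helper names.

```python
import math

AMINO_ACIDS = "MFWYGAPVILCRKENDQSTH"

# Residue counts per query position from the slide 4 example.
COLUMN_COUNTS = [
    {"M": 5},                                  # fully conserved column
    {"M": 1, "G": 1, "L": 1, "E": 1, "Q": 1},  # variable column
    {"M": 1, "A": 4},
    {"S": 3, "T": 2},
    {"F": 4, "Y": 1},
]

def column_scores(counts, pseudocount=1.0):
    """Log-odds scores for one column (assumed uniform background)."""
    background = 1.0 / len(AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {
        aa: math.log2((counts.get(aa, 0) + pseudocount) / total / background)
        for aa in AMINO_ACIDS
    }

PSSM = [column_scores(counts) for counts in COLUMN_COUNTS]

def score_ungapped(sequence, pssm=PSSM):
    """Sum the position-specific score of each residue (no gaps allowed)."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    return sum(column[residue] for column, residue in zip(pssm, sequence))

consensus = score_ungapped("MGASF")
conserved_mismatch = score_ungapped("WGASF")  # W replaces the conserved M
variable_mismatch = score_ungapped("MLASF")   # L replaces G in the variable column

print(f"consensus          {consensus:.2f}")
print(f"conserved mismatch {conserved_mismatch:.2f}")
print(f"variable mismatch  {variable_mismatch:.2f}")
```

In this toy matrix, L was observed in the second column as often as G, so the substitution there costs nothing, while replacing the fully conserved M lowers the total score noticeably. A general-purpose matrix would assign each substitution the same score at every position.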

  7. PSI-BLAST • Input a single query sequence. • The program executes a BLAST run. • It takes the significant hits and incorporates the matches into a PSSM. • Sequences >98% similar are not included (to avoid biasing the PSSM).

  8. Power of approach: • PSI-BLAST is iterative. • Each round takes the best hits and uses them to improve the scoring matrix.
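The iteration described on slides 7 and 8 can be sketched as a toy loop over a small in-memory database. This is not PSI-BLAST: it assumes ungapped sequences of equal length, a uniform background, a flat pseudocount, a raw score threshold instead of an E-value cutoff, and a first round scored with a PSSM built from the query alone rather than a general substitution matrix. The 98% filter is applied against the query only. All function names are hypothetical.

```python
import math

AMINO_ACIDS = "MFWYGAPVILCRKENDQSTH"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumed uniform background

def build_pssm(aligned, pseudocount=1.0):
    """Log-odds PSSM from equal-length, ungapped sequences."""
    pssm = []
    for column in zip(*aligned):
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2((column.count(aa) + pseudocount) / total / BACKGROUND)
            for aa in AMINO_ACIDS
        })
    return pssm

def score(sequence, pssm):
    return sum(column[residue] for column, residue in zip(pssm, sequence))

def identity(a, b):
    return sum(x == y for x, y in zip(a, b)) / len(a)

def iterative_search(query, database, threshold, max_identity=0.98, max_rounds=5):
    """Search, rebuild the PSSM from the hits, and repeat until it stops changing."""
    profile_members = [query]
    hits = []
    for _ in range(max_rounds):
        pssm = build_pssm(profile_members)
        hits = [s for s in database if score(s, pssm) >= threshold]
        # Near-duplicates of the query are reported but kept out of the PSSM
        # so that they do not bias it.
        new_members = [query] + [s for s in hits if identity(s, query) <= max_identity]
        if new_members == profile_members:
            break  # converged: no change in the sequences feeding the PSSM
        profile_members = new_members
    return hits

database = ["MGASF", "MGATF", "MLATF", "WWWWW"]
print(iterative_search("MGASF", database, threshold=3.0, max_rounds=1))
print(iterative_search("MGASF", database, threshold=3.0))
```

With a single round only the two closest sequences pass the threshold. Once `MGATF` has been folded into the PSSM, the more distant `MLATF` also scores above it, which is the gain in sensitivity that iteration is meant to provide.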

  9. The original BLAST search had 84 hits.

  10. The PSSM will skew towards this region
