
Expect value (E-value)

Expect value (E-value): the expected number of hits, of equivalent or better score, found by random chance in a database of the size searched. Conserved domains: a domain is a sequence of amino acids that typically folds into a stable tertiary structure; many proteins are multi-domain. From BLAST to PSI-BLAST.

jud

Presentation Transcript


  1. Expect value (E-value) • The expected number of hits, of equivalent or better score, that would be found by random chance in a database of the size searched.
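
The definition above can be made concrete with the standard Karlin-Altschul relations, E = m · n · 2^(−S′) for a bit score S′ and E = K · m · n · e^(−λS) for a raw score S. The following is a minimal sketch, not BLAST's actual implementation: the function names are hypothetical, λ and K must be supplied by the caller, and the lengths m and n are used as given (BLAST itself applies edge-effect corrections to obtain effective lengths).

```python
import math


def evalue_from_bit_score(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least bit_score.

    Uses E = m * n * 2**(-S'), with m the query length and n the
    total database length in residues.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def evalue_from_raw_score(raw_score, query_len, db_len, lam, k):
    """Same quantity from a raw score: E = K * m * n * exp(-lambda * S).

    lam and k are the Karlin-Altschul parameters of the scoring system;
    no default values are assumed here.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


def pvalue_from_evalue(evalue):
    """Probability of at least one chance hit, assuming Poisson-distributed hits."""
    return -math.expm1(-evalue)


# The same bit score is less significant in a larger database:
# doubling the database size doubles the E-value.
e_small_db = evalue_from_bit_score(50.0, 300, 1_000_000)
e_large_db = evalue_from_bit_score(50.0, 300, 2_000_000)
print(e_small_db, e_large_db)
```

This is why an E-value is always tied to "a database of the size searched": the score alone does not determine it.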

  2. Conserved domains • Domain: a sequence of amino acids that typically folds into a stable tertiary structure. • Many proteins are multi-domain.

  3. BLAST to PSI-BLAST • BLAST uses a scoring matrix derived from a large number of proteins. • What if you want to find homologs of a specific gene product? • Develop a position-specific scoring matrix (PSSM).

  4. PSSM
          M  F  W  Y  G  A  P  V  I  L  C  R  K  E  N  D  Q  S  T  H
       M  5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       G  1  0  0  0  1  0  0  0  0  1  0  0  0  1  0  0  1  0  0  0
       A  1  0  0  0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       S  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  2  0
       F  0  4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
     Determine the frequency of each substitution, then convert it to a log-odds score.
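
The frequency-to-log-odds step on this slide can be sketched as follows. The counts are read off the slide's matrix (five positions, five aligned sequences each). Everything else is an assumption made for illustration: a uniform background frequency of 1/20, a flat pseudocount of 1 so that unobserved residues do not give log(0), and the function name `log_odds_pssm`. PSI-BLAST itself uses more elaborate sequence weighting and pseudocounts.

```python
import math

AMINO_ACIDS = "MFWYGAPVILCRKENDQSTH"  # column order used on the slide

# Observed residue counts per alignment position, as read from the slide
# (zero counts omitted). Each position sums to 5 aligned sequences.
COUNTS = [
    {"M": 5},
    {"M": 1, "G": 1, "L": 1, "E": 1, "Q": 1},
    {"M": 1, "A": 4},
    {"S": 3, "T": 2},
    {"F": 4, "Y": 1},
]


def log_odds_pssm(counts, pseudocount=1.0, background=None):
    """Convert per-position residue counts into log2-odds scores."""
    if background is None:
        # Assumption: every amino acid is equally likely by chance.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    pssm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(AMINO_ACIDS)
        row = {}
        for aa in AMINO_ACIDS:
            freq = (column.get(aa, 0) + pseudocount) / total
            row[aa] = math.log2(freq / background[aa])
        pssm.append(row)
    return pssm


pssm = log_odds_pssm(COUNTS)
# A residue seen in every sequence scores positive; an unseen one scores negative.
print(round(pssm[0]["M"], 2), round(pssm[0]["A"], 2))
```

A positive score means the residue is seen at that position more often than chance would predict; a negative score means less often.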

  5. PSSM with indels
          M  F  W  Y  G  A  P  V  I  L  C  R  K  E  N  D  Q  S  T  H
       M  5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       G  1  0  0  0  1  0  0  0  0  1  0  0  0  1  0  0  1  0  0  0
       A  1  0  0  0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0  0
       S  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  2  0
       F  0  4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
     Indel 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
     A PSSM can include a score for permitting insertions and deletions. Perhaps this position is at a turn, where indels are common.

  6. PSSM • In evaluating (scoring) alignments, PSSM approaches typically: • Reward matches to columns that have conserved amino acids. • Penalize mismatches in columns with a conserved amino acid more heavily than mismatches in a variable column.
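
The two scoring rules above can be illustrated with a deliberately tiny example. The two-column PSSM below is hypothetical: its values are invented only to show the asymmetry and are not derived from any real alignment. The helper `score_alignment` is likewise an illustrative name, and it handles ungapped alignments only.

```python
# Hypothetical two-column PSSM with made-up, log-odds-like scores.
# Column 0 is conserved (almost always M); column 1 is variable.
# "default" is the score given to any residue not listed in "scores".
PSSM = [
    {"scores": {"M": 6.0}, "default": -4.0},
    {"scores": {"G": 1.0, "L": 1.0, "E": 1.0}, "default": -1.0},
]


def score_alignment(pssm, sequence):
    """Sum the per-position scores of an ungapped alignment to the PSSM."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    return sum(
        column["scores"].get(residue, column["default"])
        for column, residue in zip(pssm, sequence)
    )


best = score_alignment(PSSM, "MG")                # matches in both columns
conserved_miss = score_alignment(PSSM, "AG")      # mismatch in the conserved column
variable_miss = score_alignment(PSSM, "MK")       # mismatch in the variable column
print(best, conserved_miss, variable_miss)
```

With these made-up values, a mismatch in the conserved column lowers the total far more than a mismatch in the variable column, which is the behaviour the slide describes.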

  7. PSI-BLAST • Input a single query sequence. • The program executes a BLAST run. • It takes the significant hits and incorporates the matches into a PSSM. • Sequences more than 98% similar are not included, to avoid biasing the PSSM.

  8. Power of the approach: • PSI-BLAST is iterative. • Each round takes the best hits and uses them to improve the scoring matrix.
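
The iterate-and-refine loop can be sketched with a toy model. This is not how PSI-BLAST is implemented. It assumes equal-length, ungapped sequences; a fixed score cutoff instead of an E-value threshold; a uniform background with a flat pseudocount; a first round scored with a PSSM built from the query alone rather than a general substitution matrix; and the 98% filter applied as percent identity to the query. The database and all function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequencies


def build_pssm(sequences, pseudocount=1.0):
    """Log2-odds PSSM from equal-length, ungapped sequences."""
    denom = len(sequences) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(len(sequences[0])):
        column = [seq[pos] for seq in sequences]
        pssm.append({
            aa: math.log2((column.count(aa) + pseudocount) / denom / BACKGROUND)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, sequence):
    """Ungapped score of a sequence against the PSSM."""
    return sum(column[residue] for column, residue in zip(pssm, sequence))


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def psi_blast_toy(query, database, threshold=3.0, max_rounds=5):
    """Return the list of hits found in each round until the hit set stops changing."""
    pssm = build_pssm([query])
    rounds = []
    for _ in range(max_rounds):
        hits = [seq for seq in database if score(pssm, seq) >= threshold]
        if rounds and hits == rounds[-1]:
            break  # converged: no new hits
        rounds.append(hits)
        # Near-identical hits are left out of the PSSM to avoid biasing it.
        members = [query] + [h for h in hits if identity(query, h) <= 0.98]
        pssm = build_pssm(members)
    return rounds


DATABASE = ["MGASF", "MLASF", "MGASY", "MLASY", "WKPCD"]  # made-up peptides
rounds = psi_blast_toy("MGASF", DATABASE)
for number, hits in enumerate(rounds, start=1):
    print(f"round {number}: {hits}")
```

In this toy run, MLASY differs from the query at two positions and falls below the cutoff in the first round. Once the first-round hits have contributed L at position 2 and Y at position 5 to the PSSM, it is picked up in the second round, which is the gain in sensitivity that iteration provides.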

  9. The original BLAST search had 84 hits.

  10. The PSSM will skew towards this region.
