
The Basic Local Alignment Search Tool (BLAST)

The Basic Local Alignment Search Tool (BLAST). Rapid database search tool (1990). Idea: (1) Search for high-scoring segment pairs.

swain

Presentation Transcript


  1. The Basic Local Alignment Search Tool (BLAST) Rapid database search tool (1990). Idea: (1) Search for high-scoring segment pairs

  2. The Basic Local Alignment Search Tool (BLAST) A Y W T Y I V A L T – Q V R Q Y E A T S I L C I V M I Y S R A - Q Y R Y W R Y Most local alignments contain highly conserved sections without gaps.

  3. The Basic Local Alignment Search Tool (BLAST) A Y W T Y I V A L T – Q V R Q Y E A T S I L C I V M I Y S R A - Q Y R Y W R Y -> Search for high-scoring segment pairs (HSPs), i.e. gap-free local alignments.

  4. The Basic Local Alignment Search Tool (BLAST)

  5. The Basic Local Alignment Search Tool (BLAST) A Y W T Y I V A L T – Q V R Q Y E A T S I L C I V M I Y S R A - Q Y R Y W R Y Advantages: (a) speed; (b) a statistical theory of HSPs exists.

  6. The Basic Local Alignment Search Tool (BLAST) Rapid database search tool (1990). Idea: (1) Search for high-scoring segment pairs (2) Use word pairs as seeds

  7. Pair-wise sequence alignment T W L M H C A Q Y I C I M X H X C X T H Y (1) Search for word pairs of length 3 with score > T and use them as seeds.

  8. Pair-wise sequence alignment A naïve algorithm would have a complexity of O(l1 * l2). Solution: preprocess the query sequence: • Compile a list of all words that have a score > T when aligned to a word in the query.

  9. Pair-wise sequence alignment A naïve algorithm would have a complexity of O(l1 * l2). Solution: preprocess the query sequence: • Compile a list of all words that have a score > T when aligned to a word in the query. Complexity: O(l1) • Organize the words in an efficient data structure (tree) for fast look-up
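The preprocessing step on slides 8 and 9 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the BLAST implementation: `toy_score` (match +5, mismatch -1) is a hypothetical stand-in for a real substitution matrix, the threshold value 8 is arbitrary, and a Python dict replaces the tree mentioned on the slide. The function names are invented for this example.

```python
from collections import defaultdict
from itertools import product

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_score(a, b):
    # Hypothetical scores; a real search would use a substitution matrix.
    return 5 if a == b else -1


def word_score(w1, w2):
    # Gap-free score of two equal-length words.
    return sum(toy_score(a, b) for a, b in zip(w1, w2))


def build_neighborhood(query, w=3, threshold=8):
    """Map every word scoring > threshold against a query word
    to the list of query positions where that happens."""
    table = defaultdict(list)
    for pos in range(len(query) - w + 1):
        qword = query[pos:pos + w]
        for letters in product(ALPHABET, repeat=w):
            cand = "".join(letters)
            if word_score(qword, cand) > threshold:
                table[cand].append(pos)
    return table


def find_seeds(table, subject, w=3):
    """Scan the subject once and return (query_pos, subject_pos) seeds."""
    seeds = []
    for spos in range(len(subject) - w + 1):
        for qpos in table.get(subject[spos:spos + w], ()):
            seeds.append((qpos, spos))
    return seeds


table = build_neighborhood("TWLMHCAQYIC")
seeds = find_seeds(table, "IMHCTHY")
```

With these toy scores a word passes the threshold only if it differs from a query word in at most one position, so the table size grows linearly with the query length, and each database word is handled with a single look-up.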

  10. The Basic Local Alignment Search Tool (BLAST) Rapid database search tool (1990). Idea: (1) Search for high-scoring segment pairs (2) Use word pairs as seeds (3) Extend seed alignments until the score drops below a threshold value

  11. Pair-wise sequence alignment T W L M H C A Q Y I C I M X H X C X T H Y Extend seeds until score drops by X.

  12. Pair-wise sequence alignment T W L M H C A Q Y I C I X M X H X C X T X H X Y Extend seeds until score drops by X.
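A minimal sketch of the gap-free extension step, assuming "drops by X" means that the running score has fallen more than X below the best score seen so far. The scoring function (match +5, mismatch -4), the default `x_drop` value, and the function names are assumptions made for this illustration.

```python
def toy_score(a, b):
    # Hypothetical scores; a real search would use a substitution matrix.
    return 5 if a == b else -4


def extend_one_direction(query, subject, qi, si, step, x_drop):
    """Extend without gaps from (qi, si) in direction step (+1 or -1).
    Returns (best score gain, number of positions kept)."""
    running = best = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        running += toy_score(query[qi], subject[si])
        length += 1
        if running > best:
            best, best_len = running, length
        elif best - running > x_drop:
            break  # score dropped by more than X: stop extending
        qi += step
        si += step
    return best, best_len


def extend_seed(query, subject, qpos, spos, w=3, x_drop=8):
    """Extend a seed of length w in both directions.
    Returns (score, (query_start, subject_start, length))."""
    seed = sum(toy_score(query[qpos + k], subject[spos + k]) for k in range(w))
    right, rlen = extend_one_direction(query, subject, qpos + w, spos + w, 1, x_drop)
    left, llen = extend_one_direction(query, subject, qpos - 1, spos - 1, -1, x_drop)
    return seed + left + right, (qpos - llen, spos - llen, w + llen + rlen)
```

A larger `x_drop` lets the extension bridge a short mismatching stretch, while a smaller one stops earlier and is faster.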

  13. Pair-wise sequence alignment The algorithm is not guaranteed to find the best segment pair (heuristic), but it works well in practice!

  14. The Basic Local Alignment Search Tool (BLAST) New BLAST version (1997) • Two-hit strategy

  15. Pair-wise sequence alignment W L M H C A Q Y A R V I M X H X C X T H W AX R X v X Search for two word pairs on the same diagonal; use a lower threshold T.
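The two-hit idea can be sketched as a filter over a list of word hits. This is an illustrative sketch only: hits are assumed to be `(query_pos, subject_pos)` tuples, the diagonal is taken as `subject_pos - query_pos`, and the `window` parameter (maximum distance between the two hits) and its default are assumptions for the example.

```python
from collections import defaultdict


def two_hit_seeds(hits, w=3, window=40):
    """Return pairs of non-overlapping hits that lie on the same
    diagonal and are at most `window` positions apart."""
    by_diag = defaultdict(list)
    for qpos, spos in hits:
        by_diag[spos - qpos].append(qpos)

    pairs = []
    for diag, qs in by_diag.items():
        last = None  # most recent accepted hit on this diagonal
        for q in sorted(qs):
            if last is None:
                last = q
                continue
            dist = q - last
            if dist < w:
                continue  # overlaps the previous hit: ignore it
            if dist <= window:
                pairs.append(((last, last + diag), (q, q + diag)))
            last = q
    return pairs
```

Only hit pairs returned by this filter would then be extended, which is why a lower word threshold T becomes affordable: single chance hits are discarded cheaply.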

  16. The Basic Local Alignment Search Tool (BLAST) New BLAST version (1997) • Two-hit strategy • Gapped BLAST • Position-Specific Iterative BLAST (PSI-BLAST)

  17. The Basic Local Alignment Search Tool (BLAST)

  18. Multiple sequence alignment
      1aboA  1 .NLFVALYDfvasgdntlsitkGEKLRVLgynhn..............gE
      1ycsB  1 kGVIYALWDyepqnddelpmkeGDCMTIIhrede............deiE
      1pht   1 gYQYRALYDykkereedidlhlGDILTVNkgslvalgfsdgqearpeeiG
      1ihvA  1 .NFRVYYRDsrd......pvwkGPAKLLWkg.................eG
      1vie   1 .drvrkksga.........awqGQIVGWYctnlt.............peG

      1aboA 36 WCEAQt..kngqGWVPSNYITPVN......
      1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP......
      1pht  51 WLNGYnettgerGDFPGTYVEYIGrkkisp
      1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd.....
      1vie  28 YAVESeahpgsvQIYPVAALERIN......

  19. Multiple sequence alignment First question: how to score multiple alignments? Possible scoring scheme: Sum-of-pairs score

  20. Multiple sequence alignment Multiple alignment implies pairwise alignments: 1aboA 36 WCEAQt..kngqGWVPSNYITPVN...... 1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP...... 1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp 1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd..... 1vie 28 YAVESeahpgsvQIYPVAALERIN......

  21. Multiple sequence alignment Multiple alignment implies pairwise alignments: 1aboA 36 WCEAQt..kngqGWVPSNYITPVN...... 1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP...... 1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp 1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd..... 1vie 28 YAVESeahpgsvQIYPVAALERIN......

  22. Multiple sequence alignment Multiple alignment implies pairwise alignments: 1aboA 36 WCEAQt..kngqGWVPSNYITPVN...... 1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP......

  23. Multiple sequence alignment Multiple alignment implies pairwise alignments: 1aboA 36 WCEAQtkngqGWVPSNYITPVN 1ycsB 39 WWWARlndkeGYVPRNLLGLYP

  24. Multiple sequence alignment Multiple alignment implies pairwise alignments: 1aboA 36 WCEAQt..kngqGWVPSNYITPVN...... 1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP...... 1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp 1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd..... 1vie 28 YAVESeahpgsvQIYPVAALERIN......

  25. Multiple sequence alignment Multiple alignment implies pairwise alignments: 1aboA 36 WCEAQt..kngqGWVPSNYITPVN...... 1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP...... 1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp 1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd..... 1vie 28 YAVESeahpgsvQIYPVAALERIN......

  26. Multiple sequence alignment Multiple alignment implies pairwise alignments: 1aboA 36 WCEAQt..kngqGWVPSNYITPVN...... 1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp

  27. Multiple sequence alignment Multiple alignment implies pairwise alignments: 1aboA 36 WCEAQt..kngqGWVPSNYITPVN...... 1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp

  28. Multiple sequence alignment Multiple alignment implies pairwise alignments: 1aboA 36 WCEAQt..kngqGWVPSNYITPVN...... 1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP...... 1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp 1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd..... 1vie 28 YAVESeahpgsvQIYPVAALERIN......

  29. Multiple sequence alignment Multiple alignment implies pairwise alignments: use the sum of the scores of these pairwise alignments. 1aboA 36 WCEAQt..kngqGWVPSNYITPVN...... 1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP...... 1pht 51 WLNGYnettgerGDFPGTYVEYIGrkkisp 1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd..... 1vie 28 YAVESeahpgsvQIYPVAALERIN......
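A minimal sketch of the sum-of-pairs score. The match, mismatch and gap values are hypothetical placeholders for a substitution matrix and gap penalty, gaps are written as `-`, and a column where both sequences have a gap is scored 0, which mirrors dropping such columns from the implied pairwise alignment (as on slide 23).

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    # Hypothetical scores; gap-gap columns do not count.
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sum_of_pairs(alignment):
    """Sum the scores of all pairwise alignments implied by the rows."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows must have the same length")
    total = 0
    for row1, row2 in combinations(alignment, 2):
        total += sum(pair_score(a, b) for a, b in zip(row1, row2))
    return total
```

For n rows this sums n(n-1)/2 pairwise scores, so scoring a given alignment is cheap; the hard part is searching for the alignment that maximizes it.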

  30. Multiple sequence alignment Goal: Find the multiple alignment with maximum score!

  31. Multiple sequence alignment • Needleman-Wunsch scoring scheme can be generalized from pair-wise to multiple alignment • Multidimensional search space instead of two-dimensional matrix!

  32. Multiple sequence alignment

  33. Multiple sequence alignment Complexity: For three sequences of lengths l1, l2, l3: O(l1 * l2 * l3). For n sequences (average length l): O(l^n). Exponential complexity!
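To get a feeling for these numbers, the size of the dynamic-programming search space can be counted directly. This is a small illustrative sketch with invented function names. It also reports the number of predecessor cells per cell (2^n - 1), a factor that the O(l^n) estimate on the slide leaves out.

```python
def dp_cells(lengths):
    """Number of cells in the n-dimensional dynamic-programming matrix
    (one extra row/column per dimension for the empty prefix)."""
    cells = 1
    for length in lengths:
        cells *= length + 1
    return cells


def predecessors_per_cell(n):
    """Each cell is reached from every non-empty subset of the n
    sequences advancing by one position."""
    return 2 ** n - 1


for n in (2, 3, 5, 10):
    print(n, dp_cells([100] * n), predecessors_per_cell(n))
```

Even for sequences of only 100 residues, adding one sequence multiplies the number of cells by about 100, which is why the optimal solution quickly becomes infeasible.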

  34. Multiple sequence alignment • Needleman-Wunsch scoring scheme can be generalized from pair-wise to multiple alignment • Optimal solution not feasible:

  35. Multiple sequence alignment • Needleman-Wunsch scoring scheme can be generalized from pair-wise to multiple alignment • Optimal solution not feasible: • -> Heuristics necessary

  36. Multiple sequence alignment (A) Carrillo and Lipman (MSA) Find the sub-space of the dynamic-programming matrix in which the optimal path can be found

  37. Multiple sequence alignment (B) Stoye, Dress (DCA) • Divide search space into small sub-spaces • Calculate optimal alignment for sub-spaces • Concatenate sub-alignments

  38. Multiple sequence alignment (B) Stoye, Dress (DCA)

  39. Multiple sequence alignment (B) Stoye, Dress (DCA)

  40. Multiple sequence alignment Progressive alignment. Carry out a series of pair-wise alignments

  41. Multiple sequence alignment Most popular way of constructing multiple alignments: Progressive alignment. Carry out a series of pair-wise alignments

  42. Multiple sequence alignment WCEAQTKNGQGWVPSNYITPVN WWRLNDKEGYVPRNLLGLYP AVVIQDNSDIKVVPKAKIIRD YAVESEAHPGSFQPVAALERIN WLNYNETTGERGDFPGTYVEYIGRKKISP

  43. Multiple sequence alignment WCEAQTKNGQGWVPSNYITPVN WWRLNDKEGYVPRNLLGLYP AVVIQDNSDIKVVPKAKIIRD YAVESEAHPGSFQPVAALERIN WLNYNETTGERGDFPGTYVEYIGRKKISP Align most similar sequences

  44. Multiple sequence alignment WCEAQTKNGQGWVPSNYITPVN WW--RLNDKEGYVPRNLLGLYP- AVVIQDNSDIKVVP--KAKIIRD YAVESEASFQPVAALERIN WLNYNEERGDFPGTYVEYIGRKKISP

  45. Multiple sequence alignment WCEAQTKNGQGWVPSNYITPVN WW--RLNDKEGYVPRNLLGLYP- AVVIQDNSDIKVVP--KAKIIRD YAVESEASVQ--PVAALERIN------ WLN-YNEERGDFPGTYVEYIGRKKISP

  46. Multiple sequence alignment WCEAQTKNGQGWVPSNYITPVN WW--RLNDKEGYVPRNLLGLYP- AVVIQDNSDIKVVP--KAKIIRD YAVESEASVQ--PVAALERIN------ WLN-YNEERGDFPGTYVEYIGRKKISP Align sequence to alignment

  47. Multiple sequence alignment WCEAQTKNGQGWVPSNYITPVN- WW--RLNDKEGYVPRNLLGLYP- AVVIQDNSDIKVVP--KAKIIRD YAVESEASVQ--PVAALERIN------ WLN-YNEERGDFPGTYVEYIGRKKISP Align alignment to alignment

  48. Multiple sequence alignment WCEAQTKNGQGWVPSNYITPVN-------- WW--RLNDKEGYVPRNLLGLYP-------- AVVIQDNSDIKVVP--KAKIIRD------- YAVESEA---SVQ--PVAALERIN------ WLN-YNE---ERGDFPGTYVEYIGRKKISP

  49. Multiple sequence alignment WCEAQTKNGQGWVPSNYITPVN-------- WW--RLNDKEGYVPRNLLGLYP-------- AVVIQDNSDIKVVP--KAKIIRD------- YAVESEA---SVQ--PVAALERIN------ WLN-YNE---ERGDFPGTYVEYIGRKKISP Rule: “once a gap - always a gap”
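The "once a gap - always a gap" rule can be illustrated with a small merge routine. This is a sketch under assumptions: the column-wise alignment of the two blocks is taken as given in `ops` (how it is computed, for example by profile alignment, is not shown), and the operation codes `M`, `A`, `B` are invented for this example. Existing columns are copied unchanged, so gaps introduced in earlier steps are never removed; new gaps are only added as whole columns.

```python
def merge_blocks(block_a, block_b, ops):
    """Merge two existing alignments column by column.

    ops is a string over:
      'M' - align the next column of A with the next column of B
      'A' - next column of A faces a new gap column in B
      'B' - next column of B faces a new gap column in A
    """
    i = j = 0
    columns = []
    for op in ops:
        if op in "MA":
            col_a = [row[i] for row in block_a]
            i += 1
        else:
            col_a = ["-"] * len(block_a)
        if op in "MB":
            col_b = [row[j] for row in block_b]
            j += 1
        else:
            col_b = ["-"] * len(block_b)
        columns.append(col_a + col_b)
    # Transpose the list of columns back into rows.
    return ["".join(row) for row in zip(*columns)]


merged = merge_blocks(["AC-G", "ACTG"], ["AG"], "MAAM")
```

In the usage line, the gap already present in the first row of block A survives the merge, and the single sequence in block B receives two new gap positions.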

  50. Multiple sequence alignment Order of pair-wise profile alignments determined by phylogenetic tree based on pair-wise similarity values (guide tree)
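One simple way to derive an alignment order from pairwise similarities is greedy average-linkage clustering. The sketch below uses that technique as an assumption; the slides do not say which tree-building method is used. The sequence names and similarity values in the usage lines are hypothetical.

```python
from itertools import combinations


def guide_tree(names, similarity):
    """Build a guide tree as nested tuples by repeatedly merging the
    two most similar clusters (average linkage).

    similarity maps frozenset({a, b}) to a pairwise similarity value."""
    clusters = [(name, [name]) for name in names]  # (tree, members)

    def avg_sim(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(similarity[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_sim(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical similarity values for four sequences.
sim = {frozenset(p): 0.2 for p in combinations(["s1", "s2", "s3", "s4"], 2)}
sim[frozenset(("s1", "s2"))] = 0.9
sim[frozenset(("s3", "s4"))] = 0.8
tree = guide_tree(["s1", "s2", "s3", "s4"], sim)
```

Reading the nested tuples from the inside out gives the order of the pair-wise and profile alignments: the most similar sequences are aligned first, and the resulting blocks are merged last.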
