Multiple Sequence Alignment (MSA) is a fundamental technique in bioinformatics for arranging DNA, RNA, and protein sequences to identify regions of similarity due to evolutionary, structural, or functional relationships. MSAs can establish hypotheses about positional homology and provide a concise summary of sequence data, highlighting dissimilarities among sequences. Various approaches to MSA include manual, automatic, and combined methods, with tools like CLUSTAL being widely used. Alignment tasks can vary in complexity depending on insertions or deletions present in the sequences.
Sequence Alignment • A way of arranging the primary sequences of DNA, RNA, or protein (amino acids) to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
Goals • To establish a hypothesis of positional homology between bases/amino acids. • To generate a concise, information-rich summary of sequence data. • Sometimes used to illustrate the dissimilarity between a group of sequences. • Alignments can be treated as models that can be used to test hypotheses.
Sequence Alignment • Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. • Gaps (symbol “-”) are inserted between the residues so that residues with identical or similar characters are aligned.
Taxon A  GGGAATCTAGGACTATACCGGATCTA
Taxon B  GGGAATCTA--ACTATA--GGATCTA
Taxon C  GGG--TCTAGGACTATACCGGAT--A
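The matrix view can be made concrete with a minimal sketch. It stores the three taxa from the slide as equal-length rows and reads the alignment column by column. The function names (`columns`, `gapped_columns`, `is_consistent`) are hypothetical helpers introduced for illustration, not part of any alignment tool.

```python
# Toy alignment copied from the slide: one row per taxon, "-" marks a gap.
alignment = {
    "Taxon A": "GGGAATCTAGGACTATACCGGATCTA",
    "Taxon B": "GGGAATCTA--ACTATA--GGATCTA",
    "Taxon C": "GGG--TCTAGGACTATACCGGAT--A",
}


def columns(aln):
    """Return the alignment columns as tuples, one character per taxon."""
    rows = list(aln.values())
    # Every row of an alignment matrix must have the same length.
    assert len({len(row) for row in rows}) == 1, "rows must have equal length"
    return list(zip(*rows))


def gapped_columns(aln):
    """0-based indices of columns that contain at least one gap."""
    return [i for i, col in enumerate(columns(aln)) if "-" in col]


def is_consistent(col):
    """True if all non-gap characters in the column are identical."""
    return len({ch for ch in col if ch != "-"}) <= 1


print(len(columns(alignment)))     # 26 columns
print(gapped_columns(alignment))   # [3, 4, 9, 10, 17, 18, 23, 24]
print(all(is_consistent(c) for c in columns(alignment)))  # True
```

In this toy case every column agrees once gaps are ignored, so the three rows differ only by indels.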
Alignment can be easy or difficult • Easy • Difficult, due to insertions or deletions (indels)
Protein Alignment may be guided by Tertiary Structure Interactions [Figure: Homo sapiens DjlA protein; Escherichia coli DjlA protein]
Multiple Sequence Alignment: Approaches • Three main approaches to alignment: • Manual • Automatic • Combined
Manual Alignment • Might be carried out because: • The alignment is easy. • There is some extraneous information (e.g. structural) to guide it. • Automated alignment methods have encountered the local minimum problem. • The output of an automated alignment method can be “improved”.
Automatic Alignment: Progressive Approach • Devised by Feng and Doolittle in 1987. • Essentially a heuristic method and as such is not guaranteed to find the ‘optimal’ alignment. • Requires (n-1) + (n-2) + ... + 1 = n(n-1)/2 pairwise alignments as a starting point. • Most successful implementation is CLUSTAL.
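The pairwise-alignment count is simply the number of unordered pairs among n sequences. The short sketch below checks that the running sum and the closed form n(n-1)/2 agree. The sequence labels and the function name `pairwise_count` are placeholders chosen for illustration.

```python
from itertools import combinations


def pairwise_count(n):
    """Pairwise alignments needed for n sequences: (n-1) + (n-2) + ... + 1."""
    return sum(range(1, n))


# Placeholder labels; any five sequences would give the same count.
labels = ["seqA", "seqB", "seqC", "seqD", "seqE"]
pairs = list(combinations(labels, 2))  # every unordered pair exactly once

n = len(labels)
print(len(pairs), pairwise_count(n), n * (n - 1) // 2)  # 10 10 10
```

The count grows quadratically: 5 sequences need 10 pairwise alignments, while 100 sequences need 4950.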
Overview of ClustalW Procedure
• Quick pairwise alignment: calculate distance matrix
Hbb_Human  1   -
Hbb_Horse  2  .17   -
Hba_Human  3  .59  .60   -
Hba_Horse  4  .59  .59  .13   -
Myg_Whale  5  .77  .77  .75  .75   -
• Neighbor-joining tree (guide tree)
[Figure: guide tree with leaves Hbb_Human, Hbb_Horse, Hba_Human, Hba_Horse, Myg_Whale]
• Progressive alignment following guide tree
[Figure annotation: alpha-helices]
1  PEEKSAVTALWGKVN--VDEVGG
2  GEEKAAVLALWDKVN--EEEVGG
3  PADKTNVKAAWGKVGAHAGEYGA
4  AADKTNVKAAWSKVGGHAGEYGA
5  EHEWQLVLHVWAKVEADVAGHGQ
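To illustrate how a distance matrix determines the order in which sequences are merged, the sketch below clusters the five globin sequences using the distances from the slide. Note the assumption: it uses plain average-linkage (UPGMA-style) clustering as a simpler stand-in, not the neighbor-joining method the slide attributes to ClustalW, so it shows only the general idea of a guide tree. All function names are hypothetical.

```python
from itertools import combinations

names = ["Hbb_Human", "Hbb_Horse", "Hba_Human", "Hba_Horse", "Myg_Whale"]

# Lower triangle of the distance matrix from the slide (row i, column j < i).
lower = [
    [],
    [0.17],
    [0.59, 0.60],
    [0.59, 0.59, 0.13],
    [0.77, 0.77, 0.75, 0.75],
]


def dist(i, j):
    """Distance between leaves i and j (0-based indices into names)."""
    if i == j:
        return 0.0
    if i < j:
        i, j = j, i
    return lower[i][j]


def cluster_distance(a, b):
    """Average distance over all leaf pairs drawn from clusters a and b."""
    return sum(dist(i, j) for i in a for j in b) / (len(a) * len(b))


def merge_order(n):
    """Repeatedly join the two closest clusters; return the joins in order."""
    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b))
    return merges


merges = merge_order(len(names))
for a, b in merges:
    print([names[i] for i in a], "+", [names[i] for i in b])
```

With these distances the two alpha-globins are joined first (distance .13), then the two beta-globins (.17), then the two globin pairs, and finally the whale myoglobin. A progressive aligner would align sequences and sub-alignments in an order of this kind.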