This paper presents advanced methodologies for sequence alignment using Hidden Markov Models (HMMs). The introduction explores the mathematical foundation that allows HMMs to model insertions and deletions consistently, addressing the challenges of combining disparate forms of data such as sequences and 3D structures. Detailed algorithms including the Forward, Viterbi, and Baum-Welch methods are discussed, showcasing model training on initially unaligned sequences. Results highlight findings on consensus length, structural alignments, and sequence identity, emphasizing the ability of HMMs to perform sensitive fold recognition in difficult alignments.
Multiple alignment using hidden Markov models November 21, 2001 Kim Hye Jin Intelligent Multimedia Lab marisan@postech.ac.kr
Outline • Introduction • Methods and algorithms • Results • Discussion IM lab
Introduction | Why HMM? • Mathematically consistent description of insertions and deletions • Theoretical insight into the difficulties of combining disparate forms of information, e.g. sequences / 3D structures • Possible to train models from initially unaligned sequences
Methods and algorithms | HMMs • State transitions: the state sequence is a first-order Markov chain, and each state is hidden (match / insert / delete states) • Symbol emission
Methods and algorithms | HMMs [Figure: profile HMM architecture with match, insert, and delete states]
Methods and algorithms | HMMs • Replacing arbitrary scores with probabilities relative to a consensus • Model M consists of N states S1…SN • Observed sequence O consists of T symbols O1…OT from an alphabet • aij : the probability of a transition from Si to Sj • bj(x) : the probability of emitting symbol x from state Sj
Methods and algorithms | HMMs [Figure: example HMM paths for the sequence ACCY]
Methods and algorithms | HMMs • Forward algorithm • Computes P(O | M) as a sum over all state paths, rather than a maximum
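The "sum rather than a maximum" point can be sketched as follows — a minimal toy 2-state HMM with made-up numbers, not the paper's profile model:

```python
import numpy as np

# Hypothetical toy HMM: N = 2 states, 2-symbol alphabet.
pi = np.array([0.6, 0.4])            # initial state probabilities
a  = np.array([[0.7, 0.3],
               [0.4, 0.6]])          # a_ij : transition from S_i to S_j
b  = np.array([[0.9, 0.1],
               [0.2, 0.8]])          # b_j(x) : emission of symbol x from S_j

def forward(obs):
    """P(O | M): sum over ALL state paths (Viterbi takes a max instead)."""
    alpha = pi * b[:, obs[0]]              # initialise with the first symbol
    for x in obs[1:]:
        alpha = (alpha @ a) * b[:, x]      # sum over predecessor states
    return float(alpha.sum())

p = forward([0, 1, 1])                     # likelihood of one toy observation
```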
Methods and algorithms | HMMs • Viterbi algorithm • Finds the most likely path through the model • The path is recovered by following the back pointers
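The back-pointer idea can be sketched on the same hypothetical toy model (made-up numbers, not the paper's profile HMM):

```python
import numpy as np

pi = np.array([0.6, 0.4])
a  = np.array([[0.7, 0.3],
               [0.4, 0.6]])
b  = np.array([[0.9, 0.1],
               [0.2, 0.8]])

def viterbi(obs):
    """Most likely state path: max over predecessors, then trace back pointers."""
    delta = pi * b[:, obs[0]]
    back = []
    for x in obs[1:]:
        trans = delta[:, None] * a            # trans[i, j] = delta_i * a_ij
        back.append(trans.argmax(axis=0))     # best predecessor of each state
        delta = trans.max(axis=0) * b[:, x]
    path = [int(delta.argmax())]              # best final state
    for ptr in reversed(back):                # follow the back pointers
        path.append(int(ptr[path[-1]]))
    return list(reversed(path)), float(delta.max())

path, p = viterbi([0, 1, 1])
```

Note that p here is the probability of the single best path, necessarily no larger than the forward algorithm's sum over all paths.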
Methods and algorithms | HMMs • Baum-Welch algorithm • A variation of the forward algorithm • Starts from a reasonable guess for the initial model, then calculates a score for each sequence in the training set using the EM algorithm • Local optima problem: • forward algorithm / Viterbi algorithm • Baum-Welch algorithm
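One EM re-estimation step can be sketched for a generic small HMM — a minimal forward-backward sketch with hypothetical numbers, not the paper's profile-HMM trainer:

```python
import numpy as np

def baum_welch_step(pi, a, b, seqs):
    """Re-estimate (pi, a, b) from expected counts via forward-backward."""
    N = len(pi)
    P = np.zeros(N); A = np.zeros_like(a); B = np.zeros_like(b)
    for obs in seqs:
        T = len(obs)
        alpha = np.zeros((T, N)); beta = np.zeros((T, N))
        alpha[0] = pi * b[:, obs[0]]                       # forward pass
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ a) * b[:, obs[t]]
        beta[-1] = 1.0                                     # backward pass
        for t in range(T - 2, -1, -1):
            beta[t] = a @ (b[:, obs[t + 1]] * beta[t + 1])
        like = alpha[-1].sum()                             # P(O | M)
        P += alpha[0] * beta[0] / like                     # expected starts
        for t in range(T - 1):                             # expected transitions
            A += alpha[t][:, None] * a * (b[:, obs[t + 1]] * beta[t + 1]) / like
        for t in range(T):                                 # expected emissions
            B[:, obs[t]] += alpha[t] * beta[t] / like
    return (P / P.sum(),
            A / A.sum(axis=1, keepdims=True),
            B / B.sum(axis=1, keepdims=True))

pi = np.array([0.6, 0.4])
a  = np.array([[0.7, 0.3], [0.4, 0.6]])
b  = np.array([[0.9, 0.1], [0.2, 0.8]])
pi2, a2, b2 = baum_welch_step(pi, a, b, [[0, 1, 1], [0, 0, 1]])
```

Each EM step is guaranteed not to decrease the training-set likelihood, but — as the slide notes — it can converge to a local optimum.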
Methods and algorithms | HMMs • Simulated annealing • Helps escape local optima by admitting suboptimal alignments • kT = 0 : the standard Viterbi training procedure • kT is lowered during training
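The role of kT can be illustrated with a single temperature-weighted choice — a hypothetical sketch of the sampling idea, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def anneal_choice(scores, kT):
    """Pick an index from unnormalised path scores at temperature kT.

    kT = 0 recovers the deterministic Viterbi argmax; larger kT samples
    suboptimal choices more often, and lowering kT during training
    gradually freezes the alignment into the best path found.
    """
    if kT == 0:
        return int(np.argmax(scores))          # standard Viterbi choice
    logp = np.log(scores) / kT                 # sharpen or flatten by 1/kT
    p = np.exp(logp - logp.max())              # subtract max for stability
    p /= p.sum()
    return int(rng.choice(len(scores), p=p))
```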
Methods and algorithms | ClustalW [Figure: ClustalW alignment output]
Methods and algorithms | ClustalX [Figure: ClustalX alignment output]
Results • len : consensus length of the alignment • ali : the number of structurally aligned sequences • %id : the percentage sequence identity • Homo : the number of homologues identified in and extracted from SwissProt 30 • %id (homologues) : the average percentage sequence identity in the set of homologues
Results [Table: alignment results]
Discussion • HMM • A consistent theory for insertion and deletion penalties • EGF : even fairly difficult alignments are handled well • ClustalW • Progressive alignment • Disparities between the sequence identity of the structures and the sequence identity of the homologues • Score correlates poorly with alignment quality
Discussion • The ability of HMMs to perform sensitive fold recognition is apparent