Maximum likelihood (cont.)

Maximum likelihood(cont.)

The maximum likelihood criterion • The optimal tree is that which would be most likely to give rise to the observed data (under a given model of evolution)

Stepwise view • Propose a tree with branch lengths • Consider the first character • Sum the probability of the data arising via each possible history

Worked example A G Non-change = ¼ +¾e-t Change = ¼ - ¼e-t 0.02 0.01 A A 0.03 0.01 0.02 A G L =0.25(¼ +¾e-0.02) (¼ +¾e-0.02) (¼ +¾e-0.03) (¼ - ¼e-0.01) (¼ - ¼e-0.01)

Sum over all other combinations = 7.09 x 10-3 • AA = 5.87232x10-6 • AG = 7.064163x10-3 • AC = 4.43719x10-8 • AT = 4.43719x10-8 • GA = 1.1204x10-12 • GG = 2.36063x10-5 • GC = 1.1204x10-12 • GT = 1.1204x10-12 • CA = 1.1204x10-12 • CG = 1.78372x10-7 • CC = 1.48277x10-12 • CT = 1.1204x10-12 • TA = 1.1204x10-12 • TG = 1.78372x10-7 • TC = 1.1204x10-12 • TT = 1.48277x10-10

Likelihood scores • Raw likelihood of the data at this site given this tree and branch lengths and model = 0.25(7.09 x 10-3) • Log-likelihood = -6.334787983

What does this number mean? • -6.334787983 = The log-likelihood of this character’s tip values given: • This tree topology • These branch lengths • The model of molecular evolution

Stepwise view • Propose a tree with branch lengths • Consider the first character • Sum the probability of the data arising via each possible history • Multiply across all sites

Multiplying across sites To make it easier, we can lump characters with the same “pattern” lnL = [lnL (0000)]N(0000) + [lnL (0001)]N(0001) + [lnL (0010)]N(0010) + [lnL (0100)]N(0100) + [lnL (0111)]N(0111) + [lnL (0011)]N(0011) + [lnL (0101]N(0101)) + [lnL (0110)]N(0110)

Stepwise view • Propose a tree with branch lengths • Consider the first character • Sum the probability of the data arising via each possible history • Multiply across all sites • Find the branch lengths that yield the highest likelihood for the entire data set

What branch lengths should we assume? • Under the principle of maximum likelihood, we use the set of branch lengths that maximize the likelihood • Once we find those branch lengths, the likelihood score is taken as being the likelihood of the data given this tree topology

Stepwise view • Propose a tree with branch lengths • Consider the first character • Sum the probability of the data arising via each possible history • Multiply across all sites • Find the branch lengths that yield the highest likelihood for the entire data set • Search among trees (!)

More complicated (realistic) models for DNA • Allow deviation from equiprobable base frequencies • HKY85; F81; GTR • Allow two substitution types (ti and tv) • K2P; HKY85 • Allow for six substitution types • GTR

9 6 6 3 5 4 2 1 Relationship among models

Site-to-site rate heterogeneity • The easiest is to assume that all characters have the same intrinsic rate of evolution – but this is unrealistic • What can we do instead?

Long Branch Attraction 0.1 B True tree 0.1 0.4 0.2 A 0.1 C 0.2 D Data patterns Frequency 0011 0.0682 0101 0.0522 0110 0.1242 Parsimony will be positively misleading

Long Branch Attraction 0.1 B True tree 0.1 0.4 0.2 A 0.1 C 0.2 D Parsimony estimate

Why does parsimony fail? • Parsimony is “blind” to branch lengths • It cannot see cases in which the data are better explained by rapid evolution and unequal branch lengths • ML can (if you have the right model of evolution)

Relationship between MP and ML • One argument - MP is inherently nonparametric  No direct comparison possible • MP is an ML model that makes particular assumptions

Parsimony-like likelihood model(see Lewis 1998 for more) • Estimate branch-length independently for each character (or force lengths to be equal) • Only sum over maximum likelihood ancestral states

Why use MP • The model is less realistic, but: • We can do more thorough searches and data exploration (computational efficiency) • Robust results will usually still be supported

Why use ML • The model is explicit • We can statistically compare alternative models of molecular evolution • We can conduct parametric statistical tests • (Even the most complex model is still unrealistically simple)

Relationship between MP and ML • One argument - MP is inherently nonparametric  No direct comparison possible • MP is an ML model that makes particular assumptions

Maximum likelihood (cont.)

Maximum likelihood (cont.)

Presentation Transcript

Maximum likelihood estimation

Maximum Likelihood Estimation

Maximum Likelihood Detection

Maximum likelihood (ML)

Maximum Likelihood

Maximum Likelihood

4. Maximum Likelihood

Maximum Likelihood

Maximum Likelihood Estimation

Maximum Likelihood Estimation

Maximum Likelihood

Maximum likelihood

Maximum likelihood decoding

Maximum Likelihood Estimation

Maximum likelihood (cont.)

Maximum Likelihood

Maximum Likelihood

Maximum Likelihood

Maximum Likelihood Estimation

Maximum Likelihood Estimation

Maximum Likelihood Estimate

Maximum Likelihood Detection