Figs, Wasps, Gophers, and Lice: A Computational Exploration of Coevolution. Ran Libeskind-Hadas, Department of Computer Science, Harvey Mudd College
The Cophylogeny Problem. From Hafner MS and Nadler SA, Phylogenetic trees support the coevolution of parasites and their hosts. Nature 1988, 332:258-259
Obligate Mutualism of Figs and Fig Wasps. [Figure label: ovipositor.] From Cophylogeny of the Ficus Microcosm, A. Jackson, 2004
Indigobirds and Finches • High level of host specificity (e.g. eggs and mouth markings) www.indigobirds.com
Cophylogeny Reconstruction. [Figure: host tree.]
Problem Instance. [Figure: a host tree and a parasite tree, with nodes labeled a, b, c, d, e, and the tip associations between them.]
Input and Possible Solutions. [Figure: the input trees and two possible solutions, nodes labeled a, b, c, d, e.]
Event Cost Model. [Figures: each event type highlighted in turn on the two solutions.]
• cospeciation
• duplication
• host switch
• loss
Event Cost Model. One solution: Cost = cospeciation + host switch + loss. The other solution: Cost = duplication + cospeciation + 3 * loss.
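Some typical costs: duplication + 2, cospeciation + 0, loss + 2, host switch + 3. The solution with a duplication, a cospeciation, and three losses has Cost = 2 + 0 + 3 * 2 = 8; the solution with a cospeciation, a host switch, and one loss has Cost = 0 + 3 + 2 = 5.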
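The cost arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the names `EVENT_COSTS` and `total_cost` are hypothetical, and the event lists are read off the slide's two example solutions rather than produced by any reconstruction algorithm.

```python
from collections import Counter

# Per-event costs as shown on the slide.
EVENT_COSTS = {"cospeciation": 0, "duplication": 2, "host switch": 3, "loss": 2}

def total_cost(events, costs=EVENT_COSTS):
    """Sum the cost of every event in a candidate solution."""
    counts = Counter(events)
    return sum(costs[event] * n for event, n in counts.items())

# The two example solutions, written as lists of the events they use.
with_switch = ["cospeciation", "host switch", "loss"]
with_duplication = ["duplication", "cospeciation", "loss", "loss", "loss"]

print(total_cost(with_switch))       # 5
print(total_cost(with_duplication))  # 8
```

Changing the entries of `EVENT_COSTS` changes which solution is cheaper, which is why the choice of costs matters.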
How hard is this problem?
• If host switches are not permitted, we can find optimal solutions in “next-to-no-time” (time proportional to the number of nodes in the trees)…
• … but host switches shouldn’t be ignored: they are quite common…
• … and with host switches, this problem is computationally hard. How hard?
• Let’s take a short aside on “hardness”…
A Short Aside on “Hard” Problems: Snowplows of Northern Minnesota. [Figure: the towns Burrsburg, Frostbite City, Tundratown, Shiversville, and Freezeapolis.]
“Hard” Problems: Snowplows of Northern Minnesota. Greed? Brute force?
“Greed” isn’t always good! [Figure: Temptingville, with locations A, B, C, D, E, F.]
“Hard” Problems: The Travelling Salesperson Problem. [Figure: New York, San Francisco, Moscow, Paris, and Claremont, with edge weights 1342, 2142, 742, 242, 442, 2642, and 1942.] Brute force? Greed?
“Hard” Problems: The Travelling Salesperson Problem. [Figure: Claremont, Montclare, Montclear, and Clearmont, with edge weights 1, 1, 2, 2, and 1; the next build adds an edge of weight 1042.]
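To make the contrast between greed and brute force concrete, here is a minimal Python sketch on a four-city instance. The slide's figure does not survive in this transcript, so the assignment of the weights 1, 1, 1, 2, 2, and 1042 to particular edges is an assumption made for illustration, and the helper names (`greedy_tour`, `brute_force_tour`) are hypothetical.

```python
from itertools import permutations

CITIES = ["Claremont", "Montclare", "Montclear", "Clearmont"]

# Assumed edge weights: the slide's numbers, placed on edges by guesswork.
DIST = {
    frozenset({"Claremont", "Montclare"}): 1,
    frozenset({"Montclare", "Montclear"}): 1,
    frozenset({"Montclear", "Clearmont"}): 1,
    frozenset({"Claremont", "Montclear"}): 2,
    frozenset({"Montclare", "Clearmont"}): 2,
    frozenset({"Clearmont", "Claremont"}): 1042,
}

def dist(a, b):
    return DIST[frozenset({a, b})]

def tour_cost(tour):
    """Cost of visiting the cities in order and returning to the start."""
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))

def greedy_tour(start):
    """Always move to the nearest unvisited city."""
    tour, unvisited = [start], set(CITIES) - {start}
    while unvisited:
        nearest = min(unvisited, key=lambda city: dist(tour[-1], city))
        tour.append(nearest)
        unvisited.remove(nearest)
    return tour

def brute_force_tour(start):
    """Try every ordering of the remaining cities and keep the cheapest."""
    others = [city for city in CITIES if city != start]
    return min(([start, *rest] for rest in permutations(others)), key=tour_cost)

print(tour_cost(greedy_tour("Claremont")))       # 1045
print(tour_cost(brute_force_tour("Claremont")))  # 6
```

Under these assumed weights, greed follows the cheap edges and is then forced onto the 1042 edge to get home, while brute force finds a tour of cost 6. Brute force, however, examines (n - 1)! orderings, which is hopeless for large n.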
n^2 versus 2^n on the Fast-O-Matic, which performs 10^9 operations/sec:
• n = 10: n^2 = 100 (< 1 sec); 2^n = 1024 (< 1 sec)
• n = 30: n^2 = 900 (< 1 sec); 2^n ≈ 10^9 (1 sec)
• n = 50: n^2 = 2500 (< 1 sec); 2^n ≈ 10^15 (13 days)
• n = 70: n^2 = 4900 (< 1 sec); 2^n ≈ 10^21 (about 37 thousand years)
Computers double in speed every 2 years. Let’s just wait 10 years! Five doublings give a 32-fold speedup: about 37 thousand years -> still more than 1,000 years!
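The running times in this comparison can be checked with a short Python sketch. It assumes, as the slide does, a machine performing 10^9 operations per second; the function names are hypothetical, and a year is taken as 365.25 days.

```python
OPS_PER_SEC = 10**9
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

def seconds_needed(operations, ops_per_sec=OPS_PER_SEC):
    """Time to perform the given number of operations."""
    return operations / ops_per_sec

def readable(seconds):
    """Express a duration in the largest unit that fits."""
    if seconds < 1:
        return "< 1 sec"
    if seconds < SECONDS_PER_DAY:
        return f"{seconds:.1f} sec"
    if seconds < SECONDS_PER_YEAR:
        return f"{seconds / SECONDS_PER_DAY:.1f} days"
    return f"{seconds / SECONDS_PER_YEAR:,.0f} years"

for n in (10, 30, 50, 70):
    quadratic = readable(seconds_needed(n**2))
    exponential = readable(seconds_needed(2**n))
    print(f"n = {n}: n^2 -> {quadratic}; 2^n -> {exponential}")

# Effect of waiting 10 years if speed doubles every 2 years (5 doublings).
speedup = 2 ** (10 // 2)
print(readable(seconds_needed(2**70, OPS_PER_SEC * speedup)))
```

The final line shows how little a constant-factor hardware speedup helps against exponential growth.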
Snowplows and Travelling Salesperson Revisited! NP-complete problems: Travelling Salesperson Problem, Snowplow Problem, Protein Folding, Cophylogeny Problem! Tens of thousands of other known problems go in this cloud!!
“I can’t find an efficient algorithm. I guess I’m too dumb.” Cartoon from “Computers and Intractability: A Guide to the Theory of NP-completeness” by M. Garey and D. Johnson
“I can’t find an efficient algorithm because no such algorithm is possible!” Cartoon from “Computers and Intractability: A Guide to the Theory of NP-completeness” by M. Garey and D. Johnson
“I can’t find an efficient algorithm, but neither can all these famous people.” Cartoon from “Computers and Intractability: A Guide to the Theory of NP-completeness” by M. Garey and D. Johnson
Coping with NP-completeness…
• Brute force
• Ad hoc heuristics
• Meta-heuristics
• Approximation algorithms
A Meta-heuristic Approach
• Fix a timing for the host tree: a relative ordering of the speciation events
• All host switches occur “horizontally” in time
• We can solve the problem optimally for a given timing using Dynamic Programming
Genetic Algorithm
• Host tree and three different possible orderings of the speciation events.
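To illustrate what a "timing" is, the following Python sketch enumerates every ordering of a host tree's speciation events in which each ancestor speciates before its descendants. The tree `HOST_TREE` and the function `timings` are hypothetical examples, not the tree from the slide. A genetic algorithm would sample and recombine such orderings rather than list them all, since their number grows very quickly with tree size.

```python
# Hypothetical host tree: each internal node maps to its two children.
# Names that never appear as keys (a, b, c, d, e) are tips.
HOST_TREE = {
    "r": ("u", "v"),
    "u": ("u1", "a"),
    "u1": ("b", "c"),
    "v": ("d", "e"),
}

def timings(children, root):
    """Yield every ordering of internal nodes with parents before children."""
    def extend(order, frontier):
        if not frontier:
            yield tuple(order)
            return
        for node in sorted(frontier):
            # Once a node speciates, its internal children become eligible.
            eligible = {c for c in children[node] if c in children}
            yield from extend(order + [node], (frontier - {node}) | eligible)
    yield from extend([], {root})

for ordering in timings(HOST_TREE, "r"):
    print(" < ".join(ordering))
```

For this small tree there are exactly three valid timings, differing only in where the speciation at `v` falls relative to those at `u` and `u1`.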
What Jane does… Gopher/louse pair: 8 tips on the gopher tree, 10 tips on the louse tree. Best solutions found are listed here… along with total cost.
But perhaps those “seemingly good” solutions of cost 11 are no better than random… In “Stats” mode, we can generate random tip mappings or entirely random parasite trees. Here, we ran 50 trials with random tip mappings. The red dashed line shows the cost of the best solution found for our original dataset, and the blue histogram shows the costs for the 50 random trials. In this case, none of the random trials resulted in solutions of cost 11 or less!
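The logic of this randomization test can be sketched in Python. This is not Jane's implementation: the shuffle below is one plausible way to randomize tip mappings, the function names are hypothetical, and the solver that turns each random mapping into a cost is deliberately left out.

```python
import random

def random_tip_mapping(tip_map, rng):
    """Shuffle which host tip each parasite tip is associated with (assumed scheme)."""
    parasites = list(tip_map)
    hosts = list(tip_map.values())
    rng.shuffle(hosts)
    return dict(zip(parasites, hosts))

def fraction_at_or_below(observed_cost, random_costs):
    """Fraction of random trials whose cost is no worse than the observed cost."""
    hits = sum(1 for cost in random_costs if cost <= observed_cost)
    return hits / len(random_costs)
```

In use, one would draw 50 mappings with `random_tip_mapping`, solve each with the reconstruction algorithm, and pass the resulting costs to `fraction_at_or_below` along with the observed cost of 11. A fraction of 0, as in the run described above, suggests the observed solution is better than chance would explain.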