Evolution & Design Principles in Biology: a consequence of evolution and natural selection
Rui Alves, University of Lleida, ralves@cmb.udl.es
Course website: http://web.udl.es/usuaris/pg193845/Bioinformatics_2009/
Part I: Molecular Evolution
Theory of Evolution
• Evolution is the theory that allows us to understand how organisms came to be the way they are.
• In probabilistic terms, it is likely that all living beings today originated from a single type of cell.
• These cells divided and occupied ecological niches, where they adapted to the new environments through natural selection.
How did the first cell create different cells? Neutral Mutation (e.g. by error in genome replication)
How did the first cell create different cells? Deleterious Mutation (e.g. by error in genome replication)
How did the first cell create different cells? Advantageous Mutation (e.g. by error in genome replication)
Why Sex?
• Asexual reproduction is quicker and easier, and yields more offspring per individual.
• Sex may limit harmful mutations:
  • Asexual: all offspring get all mutations.
  • Sexual: mutations are distributed randomly; individuals carrying the most harmful ones tend not to reproduce.
• Sex can generate beneficial gene combinations:
  • Adaptation to a changing environment
  • Adaptation to all aspects of a constant environment
  • Can separate beneficial mutations from harmful ones
  • Sample a larger space of gene combinations
What drives cells to adapt? A new niche, or new conditions in an old niche
What drives cells to adapt? A new (better-adapted) mutation
How do New Genes and Proteins appear?
• Genes (proteins) are built by combining domains.
• New proteins may appear either by intradomain mutation or by combining existing domains of other proteins.
[Figure labels: … Cell Division … Cell Division]
The Coalescent
• This model of cellular evolution has implications for molecular evolution.
• Coalescent theory: a retrospective model of population genetics that traces all alleles of a gene in a sample from a population to a single ancestral copy shared by all members of the population, known as the most recent common ancestor.
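The retrospective idea can be illustrated with a toy simulation. This is a minimal sketch, not part of the original slides: it assumes a haploid population of constant size with non-overlapping generations (a Wright-Fisher-style assumption), and the function name and parameters are hypothetical.

```python
import random


def generations_to_mrca(sample_size, population_size, rng):
    """Trace sampled lineages backward until they share a single ancestor.

    Assumption (not from the slides): each generation, every remaining
    lineage picks its parent uniformly at random from a population of
    constant size. Lineages that pick the same parent coalesce.
    """
    lineages = sample_size
    generations = 0
    while lineages > 1:
        # The set keeps one entry per distinct parent, so lineages that
        # chose the same parent are merged automatically.
        parents = {rng.randrange(population_size) for _ in range(lineages)}
        lineages = len(parents)
        generations += 1
    return generations


rng = random.Random(0)  # fixed seed so the run is repeatable
times = [generations_to_mrca(10, 100, rng) for _ in range(200)]
mean_time = sum(times) / len(times)
print(f"mean generations back to the most recent common ancestor: {mean_time:.1f}")
```

Every run terminates with a single ancestral copy, which is the point of the model: looking backward, a finite sample always coalesces.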
Why is the coalescent the de facto standard today? Alternatives?
• Current sequences have evolved from the same original sequence (coalescent).
• Current sequences have converged to a similar sequence from multiple origins of life.
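Convergence or divergence: which is more likely?
[Figure: the sequence ACDEFGHIKLMNPQRSTVWY (labelled 20) shown with A EDYAHIKLMNPQRGTVWY (labelled 20); two panels, Convergence and Divergence, relating residues AAi and AAk]
Back-of-the-envelope support for divergence
Back-of-the-envelope support for ?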
About the mutational process
• Point mutations:
  • Transitions (A↔G, C↔T) are more frequent than transversions (all other substitutions).
  • In mammals, the CpG dinucleotide is frequently mutated to TG or CA (possibly because most CpG dinucleotides are methylated at the C residue).
• Microsatellites frequently increase or decrease in size (possibly due to polymerase slippage during replication).
• Gene and genome duplications (complete or partial) may lead to:
  • pseudogenes: functionless copies of genes that rapidly accumulate (mostly deleterious) mutations; useful for estimating mutation rates
  • new genes after functional diversification
• Chromosomal rearrangements (inversions and translocations) may lead to:
  • meiotic incompatibilities and speciation
• Estimated mutation rates:
  • Human nuclear DNA: 3-5×10^-9 per year
  • Human mitochondrial DNA: 3-5×10^-8 per year
  • RNA and retroviruses: ~10^-2 per year
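The transition/transversion distinction can be made concrete with a short sketch. The function names and the two example sequences are hypothetical and only illustrate the classification rule stated above (purine to purine or pyrimidine to pyrimidine is a transition; everything else is a transversion).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for two different DNA bases."""
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"      # A<->G or C<->T
    return "transversion"        # any purine<->pyrimidine change


def count_substitutions(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences."""
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a, seq_b):
        # Skip identical positions and anything that is not a plain base (e.g. gaps).
        if a != b and a in "ACGT" and b in "ACGT":
            counts[classify_substitution(a, b)] += 1
    return counts


# Hypothetical example sequences, not taken from the slides.
example = count_substitutions("ACGTACGT", "GCGTATGA")
print(example)
```

In this made-up pair, A→G and C→T are transitions and T→A is a transversion.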
So what if we accept the coalescent model?
Aligned sequences:
A1 TSRISEIRR
A2 TSRISEIRR
A3 TSRISEIRR
A4 TSRISEIRR
A5 TSRISEIRR
A6 TSRISEIRR
A7 PSRISEIRR
A8 PKRISEVRR
A9 PKRISEVRR
A10 PQRISAIQR
A11 PQRISAIQR
A12 PQRISTIQR
A13 PQRISTIQR
A14 ASHLHNLQR
A15 TKHLQELQRE
A16 TKHLQELQRE
A17 TKHLQELQRE
A18 SKHLHELQRD
A19 PKNLHELQKD
A20 SKRLHEVQSE
Identical sequences grouped:
A1-6 TSRISEIRR
A7 PSRISEIRR
A8-9 PKRISEVRR
A10-11 PQRISAIQR
A12-13 PQRISTIQR
A14 ASHLHNLQR
A15-17 TKHLQELQR
A18 SKHLHELQR
A19 PKNLHELQK
A20 SKRLHEVQS
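The grouping step on this slide (collapsing identical sequences into one entry) can be sketched as follows. The helper name is hypothetical, and comparing only the first 9 columns is an assumption made here because the grouped list on the slide shows 9 residues per sequence, while A15 to A20 are listed with 10.

```python
RAW = """
A1 TSRISEIRR
A2 TSRISEIRR
A3 TSRISEIRR
A4 TSRISEIRR
A5 TSRISEIRR
A6 TSRISEIRR
A7 PSRISEIRR
A8 PKRISEVRR
A9 PKRISEVRR
A10 PQRISAIQR
A11 PQRISAIQR
A12 PQRISTIQR
A13 PQRISTIQR
A14 ASHLHNLQR
A15 TKHLQELQRE
A16 TKHLQELQRE
A17 TKHLQELQRE
A18 SKHLHELQRD
A19 PKNLHELQKD
A20 SKRLHEVQSE
"""

# Parse "name sequence" pairs, keeping the order shown on the slide.
sequences = dict(line.split() for line in RAW.strip().splitlines())


def group_identical(seqs, width):
    """Group sequence names by their first `width` residues."""
    groups = {}
    for name, seq in seqs.items():
        groups.setdefault(seq[:width], []).append(name)
    return groups


groups = group_identical(sequences, width=9)  # width=9 is an assumption
for seq, names in groups.items():
    print(",".join(names), seq)
```

Under that assumption the 20 sequences collapse into the same 10 groups listed on the slide.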
So what if we accept the coalescent model?
A1-6 TSRISEIRR
A7 PSRISEIRR
A8-9 PKRISEVRR
A10-11 PQRISAIQR
A12-13 PQRISTIQR
A14 ASHLHNLQR
A15-17 TKHLQELQR
A18 SKHLHELQR
A19 PKNLHELQK
A20 SKRLHEVQS
[Figure: tree over these groups, with node labels A1-6, A7, A'1-7, A10-11, A12-13 and A'10-13]
So what if we accept the coalescent model?
A'1-7 (p-t)SRISEIRR
A8-9 PKRISEVRR
A'10-13 PQRIS(a-t)IQR
A14 ASHLHNLQR
A15-17 TKHLQELQR
A18 SKHLHELQR
A19 PKNLHELQK
A20 SKRLHEVQS
Numbers shown under the alignment columns: 4 3 3 2 4 5 3 2 3
The study of sequence alignments can give information about the evolution of the different organisms!
Phylogenetic tree reconstruction, overview
• Computational challenge: there is an enormous number of different topologies even for a relatively small number of sequences:
  • 3 sequences: 1
  • 4 sequences: 3
  • 5 sequences: 15
  • 10 sequences: 2,027,025
  • 20 sequences: 221,643,095,476,699,771,875
• Consequence: most tree construction algorithms are heuristic methods not guaranteed to find the optimal topology.
• Input data for two major classes of algorithms:
  1. Distance matrix as input; examples: UPGMA, neighbor-joining
  2. Multiple alignment as input; examples: parsimony, maximum likelihood
• Distance matrix methods use distances computed from pairwise or multiple alignments as input.
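The counts listed above agree with the standard formula for the number of unrooted bifurcating tree topologies on n labelled sequences, (2n-5)!! = 1·3·5·…·(2n-5). The slide does not state the formula, so the sketch below is offered only as a way to reproduce the numbers; the function name is hypothetical.

```python
def unrooted_topologies(n_sequences):
    """Number of unrooted bifurcating topologies: (2n-5)!! for n >= 3."""
    if n_sequences < 3:
        raise ValueError("need at least 3 sequences")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (empty product for n = 3).
    for k in range(3, 2 * n_sequences - 4, 2):
        count *= k
    return count


for n in (3, 4, 5, 10, 20):
    print(n, unrooted_topologies(n))
```

The growth is faster than exponential, which is why exhaustive search over topologies is impractical beyond a handful of sequences.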
Building phylogenetic trees of proteins
[Figure: Genome 1, Genome 2, Genome 3, … each containing Proteins A, B, C and D, shown in the orders A B C D; D C A B; B C D A]
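Distance based phylogenetic trees
Alignment:
A1 ACTDEEGGGGSRGHI…
A2 A-TEEDGGAASRGHI…
A3 ACFDDEGGGGSRGHL…
…
Pairwise distances over the columns shown: A1 vs A2: 5 substitutions; A1 vs A3: 3 substitutions; A2 vs A3: 8 substitutions
[Figure: resulting tree connecting A1, A2 and A3, with branch lengths 5 and 3]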
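The substitution counts on this slide can be reproduced by counting mismatched columns between each pair of aligned sequences. This is a minimal sketch using only the 15 columns visible on the slide (the sequences are truncated there); counting a gap against a residue as one difference is an assumption, and the function name is hypothetical.

```python
from itertools import combinations

# Only the columns visible on the slide; the real sequences continue ("…").
alignment = {
    "A1": "ACTDEEGGGGSRGHI",
    "A2": "A-TEEDGGAASRGHI",
    "A3": "ACFDDEGGGGSRGHL",
}


def count_differences(seq_a, seq_b):
    """Number of alignment columns where the two sequences differ.

    Assumption: a gap ('-') opposite a residue counts as one difference.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


distances = {
    (p, q): count_differences(alignment[p], alignment[q])
    for p, q in combinations(alignment, 2)
}
print(distances)
```

A distance matrix like this one is the input that methods such as UPGMA or neighbor-joining start from.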
Maximum likelihood phylogenetic trees
Alignment:
ACTDEEGGGGSRGHI…
A-TEEDGGAASRGHI…
ACFDDEGGGGSRGHL…
…
Probability of aa substitution:
      A       -       E       D       …
A     1       0.01    0.2     0.09    …
-     0.01    1       0.0001  0.0001  …
E     0.2     0.0001  1       0.5
D     0.09    0.0001  0.5     1
…
Maximum likelihood phylogenetic trees
Alignment:
ACTDEEGGGGSRGHI…
A-TEEDGGAASRGHI…
ACFDDEGGGGSRGHL…
…
p(1,2): A1 and A2, 5 substitutions
p(1,3): A1 and A3, 3 substitutions
p(2,3): A2 and A3, 8 substitutions
p(2,3)>p(1,2)>p(1,3)
[Figure: candidate trees relating A1, A2 and A3]
Statistical evaluation of trees: bootstrapping
[Figure: example tree with nodes numbered 1 to 8]
• Motivation: some branching patterns in a tree may be uncertain for statistical reasons (short sequences, small number of mutational events).
• Goal of bootstrapping: to assess the statistical robustness of each edge of the tree.
• Note that each edge divides the leaf nodes into two subsets. For instance, edge 7–8 divides the leaves into subsets {1,2,3} and {4,5}. However, is this short edge statistically robust?
• Method: generate trees from resampled versions of the input data as follows:
  • Randomly modify the input MSA by eliminating some columns and replacing them with copies of existing ones. This results in duplication of columns.
  • Compute a tree for each modified input MSA.
  • For each edge of the tree derived from the real MSA, determine the fraction of trees derived from modified MSAs that contain an edge dividing the leaves into the same subsets. This fraction is called the bootstrap value. Edges with low bootstrap values (e.g. <0.9) are considered unreliable.
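The resampling step of the method above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure of any particular phylogenetics package: columns are drawn with replacement until the resampled alignment has the original length, and tree building is replaced by a hypothetical callback `supports_split` that answers whether a replicate supports a given grouping. The toy callback used here (A1 closer to A3 than to A2) is only a stand-in for a real tree comparison.

```python
import random


def resample_columns(alignment, rng):
    """Return a bootstrap replicate: columns drawn with replacement."""
    names = list(alignment)
    n_columns = len(alignment[names[0]])
    picks = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


def bootstrap_value(alignment, supports_split, n_replicates, rng):
    """Fraction of replicates for which `supports_split` returns True."""
    hits = sum(
        1 for _ in range(n_replicates)
        if supports_split(resample_columns(alignment, rng))
    )
    return hits / n_replicates


def differences(seq_a, seq_b):
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def a1_closer_to_a3(aln):
    # Toy stand-in for "the replicate tree contains the same split".
    return differences(aln["A1"], aln["A3"]) < differences(aln["A1"], aln["A2"])


msa = {
    "A1": "ACTDEEGGGGSRGHI",
    "A2": "A-TEEDGGAASRGHI",
    "A3": "ACFDDEGGGGSRGHL",
}
rng = random.Random(42)  # fixed seed so the run is repeatable
replicate = resample_columns(msa, rng)
support = bootstrap_value(msa, a1_closer_to_a3, 500, rng)
print(replicate)
print(f"support: {support:.2f}")
```

In a real analysis the callback would rebuild a tree from each replicate and check for the edge of interest; the resampling itself stays the same.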
Other Trees
• Use genomes
• Use enzymomes
• Use whatever group of molecules is important for a given function
Outline • What are design principles • How to study design principles • Examples
What are design principles?
• Recurrent qualitative or quantitative rules that are observed in similar types of systems as a solution to a given functional problem
• Exist at different levels
[Figure: examples: an operon (Gene 1, Gene 2, Gene 3) and nuclear targeting sequences]
How can design principles emerge in molecular biology?
• Intelligent design? Not a scientific hypothesis; off the table.
• Evolution? Makes sense, but how could such regularities emerge?
Climbing down Mount Improbable
• Over time, edged stones would accumulate on the slope.
• Smooth, round stones accumulate at the bottom.
Design principles:
- Smooth, roundish rocks roll down the mountain.
- Edged, flat rocks don't.
Design principles in molecular biology • Similarly, if a topology or set of parameters has appeared through mutation and it can be shown to create a molecular network that functionally outperforms all other possible alternatives in a given set of conditions, one can talk about a design principle for the system under those conditions. [sensu engineering]
Index of talk • How to identify design principles • Design principles in: • Gene expression • Metabolic networks • Signal transduction • Development • Design principles, what are they good for? • Summary
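First step, define the alternatives
[Figure: a gene under a negative (−) regulator versus a gene under a positive (+) regulator; two versions of the pathway X0, X1, X2, X3]
First step, define the alternatives
[Figure: pathway X0, X1, X2, X3 and a plot of X3 against time t] How strong should the feedback be?
Then, create models for each alternative
[Figure: a gene under a negative (−) regulator versus a gene under a positive (+) regulator]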
Finally: • Compare the dynamic behavior of the models for the two or more alternatives with respect to physiologically relevant criteria.
Then, create models for each alternative
[Figure: two versions of the pathway X0, X1, X2, X3]
The demand theory for gene expression
• Are there situations where positive regulation of gene expression outperforms negative regulation of gene expression, and vice versa?
[Figure: a gene under a negative (−) regulator versus a gene under a positive (+) regulator]
Regulating gene expression has principles
• Positive regulator:
  • More effective when the gene product is in demand for a large fraction of the life cycle.
  • Less noise sensitive if the signal is low.
• Negative regulator:
  • More effective when the gene product is in demand for a small fraction of the life cycle.
  • Less noise sensitive if the signal is high.
[Figure: a gene under a negative (−) regulator versus a gene under a positive (+) regulator]
References: Genetics 149:1665; PNAS 103:3999; PNAS 104:7151; Nature 405:590
Negative overall feedback is a design principle in metabolic biosynthesis
• Negative overall feedback:
  • More effective in coupling production to demand.
  • More robust to fluctuations.
[Figure: pathway X0, X1, X2, X3]
References: Bioinformatics 16:786; Biophysical J. 79:2290
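The comparison of alternatives can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration and is not the model from the cited papers: a linear chain X1 → X2 → X3 with first-order steps, demand modelled as first-order consumption of X3, and overall feedback modelled as Hill-type inhibition of the first step by X3. The alternative without feedback uses a constant supply chosen so that both designs share the same steady state at the nominal demand of 1.

```python
def simulate(demand, feedback, v_max=2.0, K=1.0, n=4, k=1.0, dt=0.01, t_end=80.0):
    """Euler-integrate the toy pathway; return (X3, production rate) at t_end.

    All parameter values and rate laws are illustrative assumptions.
    """
    def supply(x3):
        if feedback:
            return v_max / (1.0 + (x3 / K) ** n)  # X3 inhibits the first step
        return v_max / 2.0                        # constant input, no feedback

    x1 = x2 = x3 = 1.0  # steady state of both designs when demand = 1
    for _ in range(int(t_end / dt)):
        dx1 = supply(x3) - k * x1
        dx2 = k * x1 - k * x2
        dx3 = k * x2 - demand * x3
        x1 += dt * dx1
        x2 += dt * dx2
        x3 += dt * dx3
    return x3, supply(x3)


# Double the demand for the end product and compare the two designs.
x3_fb, production_fb = simulate(demand=2.0, feedback=True)
x3_no, production_no = simulate(demand=2.0, feedback=False)
print(f"with feedback:    X3 = {x3_fb:.3f}, production = {production_fb:.3f}")
print(f"without feedback: X3 = {x3_no:.3f}, production = {production_no:.3f}")
```

In this toy setting, production rises when demand rises only in the design with feedback, and the end product X3 falls less, which is the kind of physiologically relevant criterion the comparison is meant to test.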