What is a QUASI-SPECIES? By Ye Dan U062281A USC3002 Picturing the World through Mathematics
Definition • Quasi-: widely used prefix indicating “almost”, “seemingly”, “nearly”, etc. • Species: ? Biological: a class of individuals characterized by a certain phenotypic behavior. Chemical: an ensemble of equal, identical molecules. The term is complicated and loosely defined.
Definition • An ensemble of “nearly” identical molecules? • Preliminary understanding: a cluster of closely related but non-identical molecular species
Why quasi-species? • 1970s: Manfred Eigen and Peter Schuster, chemical theory for the origin of life • Assuming RNA as the first biological replicator – base-pairing • Dynamics of chemical and spontaneous reproduction of RNA molecules
Why quasi-species? • RNA replication • Basis of all life • Occurred initially as spontaneous chemical reproduction of simple molecules at a very slow rate, subject to high error rates.
Why quasi-species – Errors? • Random events lead to mutations • Mismatching in base-pairing • The result: not an absolutely homogeneous population of RNA molecules, but a mixture of RNA molecules with different nucleotide sequences, i.e. a QUASI-SPECIES
Chemical Kinetics • Selection: molecules have different replication rates depending on their sequence (the faster, the fitter) • Mutation: an offspring sequence differs from its parent in certain positions by ‘point mutation’
Chemical Kinetics • n different RNA sequences (length l) with populations v1, v2, …, vn • replication rates a1, a2, …, an • probability that replication of i results in j (i, j = 1, 2, …, n): Qji • No error: Qii • Mutation: Qji with j ≠ i
Chemical Kinetics • Mathematical formulation (DE) • populations v1, v2, …, vn • replication rates a1, a2, …, an • probability that replication of i results in j: Qji • growth rate: $\frac{dv_i}{dt} = \sum_{j=1}^{n} a_j Q_{ij} v_j$
The rate of growth of one variant depends not only on itself but also on all other variants. In the long run, there is no fixation of the fastest-growing sequence. The population reaches an equilibrium that contains a whole ensemble of mutants with different replication rates – the quasi-species.
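The equilibrium behaviour described on this slide can be checked numerically. What follows is a minimal sketch, not the author's code: it assumes the growth law dv_i/dt = Σ_j a_j Q_ij v_j (with Q_ij the probability that replication of j yields i), uses a hypothetical three-variant example with made-up rates and mutation probabilities, and renormalises the population after every Euler step so that only relative frequencies are tracked.

```python
def simulate_quasispecies(a, Q, v0, dt=0.01, steps=20000):
    """Euler-integrate dv_i/dt = sum_j a[j] * Q[i][j] * v[j].

    The vector is rescaled to sum to 1 after each step, so the result
    is the long-run relative frequency of each variant.
    """
    n = len(a)
    v = list(v0)
    for _ in range(steps):
        growth = [sum(a[j] * Q[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [v[i] + dt * growth[i] for i in range(n)]
        total = sum(v)
        v = [x / total for x in v]
    return v

# Hypothetical example: variant 0 replicates twice as fast as the others.
a = [2.0, 1.0, 1.0]
# Q[i][j]: probability that copying j produces i (each column sums to 1).
Q = [[0.90, 0.05, 0.05],
     [0.05, 0.90, 0.05],
     [0.05, 0.05, 0.90]]

# Start from a pure population of the fastest variant.
freqs = simulate_quasispecies(a, Q, [1.0, 0.0, 0.0])
print(freqs)
```

Even though the run starts with only the fastest replicator, the slower variants settle at non-zero frequencies, which is the point the slide makes about there being no fixation.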
(A more precise) Definition • Quasi-species: the equilibrium distribution of sequences that is formed by this mutation and selection • Quasi-species, not any individual mutant sequence, is the target of selection • Guided mutation
Sequence Space & Fitness Landscape • Given a length, all possible variants • Distance between two sequences is the Hamming distance • No. of dimensions = length of the sequence • 4 possibilities in each dimension: A, U, C, G (RNA bases) • One more dimension: reproduction rate, i.e. fitness • Selection pressure determines the fitness landscape
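To make the geometry of sequence space concrete, here is a small sketch under stated assumptions: the four-letter RNA alphabet A, U, C, G and the helper names `hamming` and `shell_size` are illustrative choices, not taken from the slides. It computes the Hamming distance and counts how many sequences sit at each distance from a given one.

```python
from math import comb

BASES = "AUCG"  # four-letter RNA alphabet (assumption for this sketch)

def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(s, t))

def shell_size(l, k):
    """Number of sequences at Hamming distance exactly k from a fixed one.

    Choose which k of the l positions differ, then one of the 3 other
    bases at each of those positions.
    """
    return comb(l, k) * (len(BASES) - 1) ** k

l = 5
distance = hamming("AUCGA", "AUGGA")
shells = [shell_size(l, k) for k in range(l + 1)]
print(distance, shells, sum(shells))
```

The shell sizes add up to 4^l, the total number of points in the sequence space for length l.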
Quasi-species and Evolution • Quasi-species: a small cloud in sequence space that wanders over the fitness landscape and searches for peaks • Evolution: destabilization of the existing quasi-species upon a change of the fitness landscape – new peaks • Hill-climbing under the guidance of natural selection • Mutations along the way are guided
Error Threshold • Error-free replication: evolution stops • Error rate too high: population unable to maintain any genetic information, evolution impossible • Error rate must be below a critical threshold value
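Error Threshold • Error rate (p): per-base probability of making a mistake • Mutation term: $Q_{ij} = p^{H_{ij}} (1 - p)^{l - H_{ij}}$ (written for a two-letter alphabet), where $H_{ij}$ is the Hamming distance between variants i and j (no. of bases in which the two strains differ) • Error-free replication: $Q_{ii} = (1 - p)^{l}$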
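The link between the per-base error rate and the mutation probabilities can be sketched in code. This example assumes a binary alphabet ("0"/"1") so that Q_ij = p^H_ij (1 - p)^(l - H_ij) is a proper probability distribution. With four bases each specific substitution would instead carry probability p/3. The function name `mutation_matrix` is hypothetical.

```python
from itertools import product

def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(s, t))

def mutation_matrix(l, p):
    """Return (sequences, Q) for all binary sequences of length l.

    Q[i][j] = p**H * (1 - p)**(l - H), with H the Hamming distance
    between sequence i and sequence j.
    """
    seqs = ["".join(bits) for bits in product("01", repeat=l)]
    Q = [[p ** hamming(si, sj) * (1 - p) ** (l - hamming(si, sj)) for sj in seqs]
         for si in seqs]
    return seqs, Q

seqs, Q = mutation_matrix(l=3, p=0.01)
# Each column sums to 1: every replication event yields some sequence.
column_sums = [sum(Q[i][j] for i in range(len(seqs))) for j in range(len(seqs))]
print(Q[0][0], column_sums)
```

The diagonal entry is the probability of copying the whole sequence without error, (1 - p)^l.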
Error Threshold (Math again…) • Assume a population of sequences of length l consists of • a fast-replicating variant v1, the wild type, with replication rate a1 • its mutant distribution v2 with a lower average replication rate a2. • q: the per-base accuracy of replication (q = 1 − p). • Prob(the whole sequence is replicated without error) = $Q = q^{l}$
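Error Threshold (Math again…) • (Neglecting the small probability that erroneous replication of a mutant gives rise to a wild-type sequence) the ratio $v_1 / v_2$ converges to $\frac{a_1 Q - a_2}{a_1 (1 - Q)}$ (consider $\frac{dv_1}{dt} = a_1 Q v_1$ and $\frac{dv_2}{dt} = a_1 (1 - Q) v_1 + a_2 v_2$, with $Q = q^{l}$)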
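The limiting ratio can be checked against a direct simulation. The sketch below assumes the two-class model dv1/dt = a1 Q v1, dv2/dt = a1 (1 - Q) v1 + a2 v2 with Q = q^l (back mutation neglected, as on the slide), for which the ratio v1/v2 settles at (a1 Q - a2) / (a1 (1 - Q)). The parameter values are made up for illustration.

```python
def steady_state_ratio(a1, a2, q, l):
    """Closed-form limit of v1/v2 for the two-class model (needs a1*q**l > a2)."""
    Q = q ** l
    return (a1 * Q - a2) / (a1 * (1.0 - Q))

def simulate_ratio(a1, a2, q, l, dt=0.01, steps=20000):
    """Euler-integrate the two-class model and return v1/v2 at the end."""
    Q = q ** l
    v1, v2 = 1.0, 1.0
    for _ in range(steps):
        dv1 = a1 * Q * v1
        dv2 = a1 * (1.0 - Q) * v1 + a2 * v2
        v1 += dt * dv1
        v2 += dt * dv2
        total = v1 + v2  # rescale to avoid overflow; the ratio is unchanged
        v1 /= total
        v2 /= total
    return v1 / v2

# Hypothetical parameters: wild type replicates twice as fast as the mutants.
predicted = steady_state_ratio(a1=2.0, a2=1.0, q=0.99, l=50)
simulated = simulate_ratio(a1=2.0, a2=1.0, q=0.99, l=50)
print(predicted, simulated)
```

The two numbers agree once the simulation has run long enough, which supports reading the formula as the equilibrium composition of wild type versus mutant cloud.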
Error Threshold (Math again…) • $a_1 Q > a_2$ is required in order to maintain the wild type in the population • Recall $Q = q^{l}$: there must be a critical q value, $q_{crit} = (a_2 / a_1)^{1/l}$, where $a_1 q^{l} = a_2$
Error Threshold (Math again…) A condition limiting the maximum length of the RNA sequence! From $a_1 q^{l} > a_2$, i.e. $l < \frac{\ln(a_1 / a_2)}{-\ln q} \approx \frac{\ln(a_1 / a_2)}{1 - q}$
Error Threshold (Math again…) • An approximation for the upper genome length l that can be maintained by a given error rate: $l_{max} \approx 1/p$ • Facts: • Viral RNA replication (little proof-reading mechanism involved): $p \approx 10^{-4}$; $l \approx 10^{4}$ • Human genome: $p \approx 10^{-9}$; $l \approx 3 \times 10^{9}$
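As a worked example of the threshold arithmetic, the sketch below assumes the condition a1 q^l > a2 with q = 1 - p, which gives l < ln(a1/a2) / (-ln q) ≈ ln(a1/a2) / p. The function names are illustrative, and the superiority a1/a2 = e is chosen only so that the bound reduces to 1/p.

```python
import math

def critical_accuracy(a1, a2, l):
    """Per-base accuracy q at which a1 * q**l equals a2."""
    return (a2 / a1) ** (1.0 / l)

def max_length_exact(a1, a2, p):
    """Largest l with a1 * (1 - p)**l > a2 (treated as a real number)."""
    return math.log(a1 / a2) / -math.log(1.0 - p)

def max_length_approx(a1, a2, p):
    """Same bound using -ln(1 - p) ~ p, valid for small p."""
    return math.log(a1 / a2) / p

# Assumed superiority a1/a2 = e, so ln(a1/a2) = 1 and the bound is about 1/p.
a1, a2 = math.e, 1.0
viral = max_length_approx(a1, a2, 1e-4)   # RNA virus error rate from the slide
human = max_length_approx(a1, a2, 1e-9)   # human error rate from the slide
exact_viral = max_length_exact(a1, a2, 1e-4)
q_crit = critical_accuracy(a1, a2, 10_000)
print(viral, exact_viral, human, q_crit)
```

With these assumptions the viral bound comes out near the 10^4 bases quoted on the slide, and the human bound near 10^9, the same order of magnitude as the 3 × 10^9 figure.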
App. on Viral Quasi-species • Consider viral dynamics and the basic reproductive ratio within the quasi-species concept • Eliminate the fittest virus mutants by increasing the mutation rate with a drug • Drive the whole virus population to extinction by a further increase of the mutation rate
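The first of these ideas, losing the fittest variant by raising the mutation rate, can be illustrated with a short sketch. It assumes the two-class model of the error-threshold slides (wild type with rate a1, mutant cloud with rate a2, no back mutation), under which the equilibrium wild-type fraction works out to (a1 Q - a2) / (a1 - a2) with Q = (1 - p)^l, or zero once a1 Q falls below a2. The parameters are hypothetical, and the sketch does not model extinction of the whole population.

```python
def wild_type_frequency(a1, a2, p, l):
    """Equilibrium fraction of wild type in the two-class model.

    Follows from v1/v2 = (a1*Q - a2) / (a1*(1 - Q)) with Q = (1 - p)**l;
    beyond the error threshold the wild type is lost and the result is 0.
    """
    Q = (1.0 - p) ** l
    return max(0.0, (a1 * Q - a2) / (a1 - a2))

# Hypothetical virus: genome length 100, wild type twice as fit as mutants.
error_rates = [0.0, 0.001, 0.003, 0.005, 0.007, 0.01]
fractions = [wild_type_frequency(2.0, 1.0, p, 100) for p in error_rates]
for p, f in zip(error_rates, fractions):
    print(p, f)
```

In this example the wild type disappears once p exceeds roughly ln(2)/100, which is the error threshold for the chosen parameters.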
Some fancier Mathematics • Consider the standard equation for a dynamic (bacterial/viral) population: $\frac{d\mathbf{v}}{dt} = W \mathbf{v} - f \mathbf{v}$ • Vector $\mathbf{v}$ represents the population sizes of the individual sequences • Matrix $W$, with $W_{ij} = a_j Q_{ij}$, contains the replication rates and mutation probabilities • $f$ (unspecific degradation or dilution flow) is any function of $\mathbf{v}$ that keeps the total population at a constant size. It can be $f = \sum_i a_i v_i / \sum_i v_i$
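Some fancier Mathematics • Equilibrium of $\frac{d\mathbf{v}}{dt} = W \mathbf{v} - f \mathbf{v}$ (W the selection-mutation matrix, $W_{ij} = a_j Q_{ij}$): $W \mathbf{v} = \lambda \mathbf{v}$ • Largest eigenvalue $\lambda_{max}$: max. average replication rate • Eigenvector $\mathbf{v}_{max}$ (corresponding to $\lambda_{max}$): the quasi-species • Normalize $\mathbf{v}_{max}$ so that $\sum_i v_i = 1$; it describes the exact population structure of the quasi-species - each mutant i has a frequency $v_i$ • $\lambda_{max}$ can be understood as the fitness of the quasi-species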
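The eigenvalue picture can be tried out on a toy example. The sketch below is illustrative only: it assumes the selection-mutation matrix has entries W_ij = a_j Q_ij, uses plain power iteration (a swapped-in numerical method, not something the slides prescribe) and a made-up three-variant system.

```python
def dominant_eigenpair(W, iterations=1000):
    """Power iteration for a matrix with positive entries.

    Returns (lam, x) where x is the dominant eigenvector scaled so that
    its entries sum to 1, and lam is the matching eigenvalue estimate.
    """
    n = len(W)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        y = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(y)               # equals the eigenvalue once x has converged
        x = [yi / lam for yi in y]
    return lam, x

# Hypothetical replication rates and mutation probabilities.
a = [2.0, 1.0, 1.0]
Q = [[0.90, 0.05, 0.05],
     [0.05, 0.90, 0.05],
     [0.05, 0.05, 0.90]]
W = [[a[j] * Q[i][j] for j in range(3)] for i in range(3)]

lam, x = dominant_eigenpair(W)
mean_rate = sum(ai * xi for ai, xi in zip(a, x))
print(lam, x, mean_rate)
```

The normalised eigenvector gives the frequency of each variant, and the eigenvalue coincides with the population's average replication rate at equilibrium, which is why it can be read as the fitness of the quasi-species.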
A Brief Review • Quasi-species – produced by errors in the self-replication of molecules; a well-defined (equilibrium) distribution of mutants generated by the mutation-selection process; target of selection • Chemical kinetics; mathematical framework • The fitness landscape, and the implications for evolution • Error threshold and application • Fitness and exact structure of the quasi-species as eigenvalue and eigenvector of the selection-mutation matrix
The End Questions?