Chapter 4: Stochastic Processes Poisson Processes and Markov Chains

Chapter 4: Stochastic Processes Poisson Processes and Markov Chains Presented by Vincent Buhr

Overview • The Homogeneous Poisson Process • The Poisson and Binomial Distributions • The Poisson and Gamma Distributions • The Pure-Birth Process • Finite Markov Chains • Modeling

The Poisson Process • A sequence of events occurring during a time interval forms a homogeneous Poisson process if two conditions are met: • (1) The occurrence of any event in the time interval (a,b) is independent of the occurrence of any event in the time interval (c,d), provided the two intervals do not overlap • (2) There is a constant λ > 0 such that for any sufficiently small time interval (t,t+h), the probability that exactly one event occurs in the interval is independent of t and equals λh+o(h), while the probability of two or more events is o(h) • Condition 2 ensures that, for small intervals, the probability of an event occurring within an interval is approximately proportional to the length of the interval • Under these two conditions, the number of events N that occur up to any time t has a Poisson distribution with parameter λt
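The two conditions can be explored numerically. The sketch below is an illustration rather than part of the original slides: it assumes a discretized approximation in which (0,t) is cut into steps of length h and each step independently contains one event with probability λh. The function name and the parameter values are hypothetical.

```python
import random

def simulate_counts(lam, t, h, trials, seed=0):
    """Count events in (0, t) using independent steps of length h.

    Each step contains one event with probability lam * h, which mimics
    conditions 1 and 2 when h is small (the o(h) terms are ignored).
    """
    rng = random.Random(seed)
    steps = round(t / h)
    p = lam * h
    return [sum(rng.random() < p for _ in range(steps)) for _ in range(trials)]

counts = simulate_counts(lam=2.0, t=3.0, h=0.01, trials=2000)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print(f"sample mean {mean:.2f}, sample variance {var:.2f}, lambda*t = {2.0 * 3.0:.2f}")
```

A Poisson distribution has mean and variance both equal to its parameter, so both sample statistics should land close to λt = 6.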

The Poisson Process (cont) • P_j(t) is the probability that N=j at time t • At time 0 the value of N is necessarily 0, so P_0(0)=1 and P_j(0)=0 for all j>0 • At time t+h, N=0 only if no events occur in the interval (0,t) and none occur in the interval (t,t+h); by independence, P_0(t+h) is the product of these two probabilities, which is equation (4.2) • At time t+h, N=1 can happen in one of two ways: either N=1 at time t and no events occur in the interval (t,t+h), or N=0 at time t and one event occurs in the interval (t,t+h); summing the two cases gives equation (4.3)
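The two equations this slide refers to did not survive in the text. Taking the probability of no event in (t,t+h) to be 1 - λh + o(h), the two cases described on the slide can be written as follows. This is a reconstruction; the numbering is inferred from the later references to (4.2) and (4.3).

```latex
\begin{align}
P_0(t+h) &= P_0(t)\,\bigl(1 - \lambda h + o(h)\bigr) \tag{4.2}\\
P_1(t+h) &= P_1(t)\,\bigl(1 - \lambda h + o(h)\bigr)
          + P_0(t)\,\bigl(\lambda h + o(h)\bigr) \tag{4.3}
\end{align}
```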

The Poisson Process (cont) • Finally, for j>1, N=j can occur at time t+h in one of three ways: • N=j at time t and no events occur in the interval (t,t+h) • N=j-1 at time t and one event occurs in the interval (t,t+h) • N=j-k at time t, where k > 1, and k events occur in the interval (t,t+h) • These possibilities give equation (4.4), which looks a lot like (4.3) • The only difference is an additional o(h) term contributed by the third case, so (4.4) can be used for all j > 0
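The equation for general j is missing from the slide text. Combining the three cases listed, and using the fact that two or more events in (t,t+h) have probability o(h), it can be reconstructed as below (the tag (4.4) follows the slide's own reference).

```latex
\begin{equation}
P_j(t+h) = P_j(t)\,\bigl(1 - \lambda h + o(h)\bigr)
         + P_{j-1}(t)\,\bigl(\lambda h + o(h)\bigr) + o(h),
\qquad j = 1, 2, 3, \ldots \tag{4.4}
\end{equation}
```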

The Poisson Process (cont) • After subtracting P_0(t) from both sides of (4.2) and P_j(t) from both sides of (4.4), dividing through by h, and letting h → 0, we get two differential equations: (4.5) for P_0(t) and (4.6) for P_j(t) with j > 0 • (4.5) has an exponential solution containing an arbitrary constant C • Since we know that P_0(0)=1 we can infer that C=1, which gives (4.9)
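The limiting step described on this slide can be written out as follows. The equations are reconstructed from the description, and only the tags (4.5), (4.6) and (4.9) are taken from the slide's own references.

```latex
\begin{align}
\frac{P_0(t+h) - P_0(t)}{h} &= -\lambda P_0(t) + \frac{o(h)}{h}
  \;\xrightarrow{\,h \to 0\,}\;
  \frac{d}{dt}P_0(t) = -\lambda P_0(t) \tag{4.5}\\
\frac{d}{dt}P_j(t) &= -\lambda P_j(t) + \lambda P_{j-1}(t),
  \qquad j = 1, 2, 3, \ldots \tag{4.6}\\
P_0(t) &= C e^{-\lambda t}, \qquad P_0(0) = 1 \;\Rightarrow\; C = 1\\
P_0(t) &= e^{-\lambda t} \tag{4.9}
\end{align}
```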

The Poisson Process (cont) • Using (4.9) and mathematical induction we can prove that the solution to (4.6) is: • Which is the Poisson distribution with parameter λt as we set out to prove
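The solution referred to here is the Poisson probability with parameter λt. The inductive step can be checked by differentiating the candidate solution and comparing with the differential equation for P_j(t). The formulas below are a reconstruction of what the slide omits.

```latex
\begin{align}
P_j(t) &= \frac{e^{-\lambda t}(\lambda t)^j}{j!}, \qquad j = 0, 1, 2, \ldots\\
\frac{d}{dt}P_j(t)
  &= -\lambda\,\frac{e^{-\lambda t}(\lambda t)^j}{j!}
     + \lambda\,\frac{e^{-\lambda t}(\lambda t)^{j-1}}{(j-1)!}
   = -\lambda P_j(t) + \lambda P_{j-1}(t)
\end{align}
```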

The Poisson and Binomial Distributions • Under special circumstances the binomial distribution approaches the Poisson distribution • If n → +∞ and p → 0 while np = λ is held fixed, then for any fixed y the binomial probability of y approaches the Poisson probability of y • To prove this, first we write the binomial probability equation as:
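The rewritten binomial probability is missing from the slide text. One way to write it so that every factor has a finite limit, consistent with the terms of the form (n-i)p mentioned on the next slide, is shown below. The tag (4.12) is inferred from the later reference and the exact layout of the original equation is an assumption.

```latex
\begin{equation}
\binom{n}{y} p^y (1-p)^{n-y}
  = \frac{(np)\,\bigl((n-1)p\bigr)\cdots\bigl((n-y+1)p\bigr)}{y!}\,(1-p)^{n-y}
\tag{4.12}
\end{equation}
```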

The Poisson and Binomial Distributions (cont) • If we then fix y and λ, and write p as λ/n, then as n approaches infinity each term in (4.12) has a finite limit • Terms of the form (n-i)p = (n-i)λ/n approach λ, and (1-λ/n)^(n-y) approaches e^(-λ) • With these limits (4.12) approaches e^(-λ) λ^y / y! • Which is the Poisson probability
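The limit can also be checked numerically. The sketch below holds np = λ fixed while n grows; the values λ = 3 and y = 2 and the function names are arbitrary choices for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(y, n, p):
    """Probability of y successes in n independent trials with success probability p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)

def poisson_pmf(y, lam):
    """Poisson probability of y events when the parameter is lam."""
    return exp(-lam) * lam**y / factorial(y)

lam, y = 3.0, 2
for n in (10, 100, 1000, 10000):
    b = binomial_pmf(y, n, lam / n)  # p = lam / n keeps np fixed at lam
    print(f"n={n:>5}  binomial={b:.6f}  poisson={poisson_pmf(y, lam):.6f}")
```

The gap between the two columns shrinks roughly in proportion to 1/n.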

The Poisson and Gamma Distributions • Equation (4.9) hints at a connection between the Poisson distribution and the exponential distribution • The random time until the first event occurs in a Poisson process with parameter λ has the exponential distribution with parameter λ • To prove this we can let F(t) be the probability that the first event occurs before time t, which means that the density function for the time until the first occurrence is the derivative of F(t) • From (4.9), the probability of no event before time t is P_0(t), so F(t) = 1 - P_0(t) • Differentiating F(t) gives the density function • Which is the exponential distribution
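Written out, the argument on this slide is short. The first event occurs before time t exactly when it is not true that N = 0 at time t, so, as a reconstruction of the omitted equations:

```latex
\begin{align}
F(t) &= 1 - P_0(t) = 1 - e^{-\lambda t}, \qquad t \ge 0\\
f(t) &= \frac{d}{dt}F(t) = \lambda e^{-\lambda t}
\end{align}
```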

The Poisson and Gamma Distributions (cont) • Additionally, the distribution of the time between successive events is also given by the exponential distribution • This means that the random time until the kth event occurs is the sum of k independent exponentially distributed times, which has the gamma distribution • To prove this, let t0 be some fixed value of t • Then, if the time until the kth event occurs exceeds t0, the number of events occurring before time t0 is less than k • So the probability that k-1 or fewer events occur before time t0 is equal to the probability that the time until the kth event occurs exceeds t0 • This leads to an equality between the Poisson probability of at most k-1 events by time t0 and the integral of the density of the kth event time from t0 to infinity • The RHS of this equality is the upper tail probability of the gamma distribution with parameters k and λ
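The equality can be illustrated by simulation. The sketch below sums k exponential waiting times, estimates how often the total exceeds t0, and compares the result with the Poisson probability of at most k-1 events. The function names and the values k = 3, λ = 2, t0 = 1 are illustrative assumptions.

```python
import random
from math import exp, factorial

def poisson_cdf(max_events, mean):
    """Probability of at most max_events events for a Poisson variable with this mean."""
    return sum(exp(-mean) * mean**j / factorial(j) for j in range(max_events + 1))

def prob_kth_event_after(k, lam, t0, trials=20000, seed=1):
    """Monte Carlo estimate of P(time until the kth event > t0)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # The kth event time is a sum of k independent exponential gaps.
        waiting_time = sum(rng.expovariate(lam) for _ in range(k))
        if waiting_time > t0:
            hits += 1
    return hits / trials

k, lam, t0 = 3, 2.0, 1.0
print("Poisson side:", round(poisson_cdf(k - 1, lam * t0), 4))
print("gamma side (simulated):", round(prob_kth_event_after(k, lam, t0), 4))
```

The two printed values should agree to within Monte Carlo error.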

The Pure-Birth Process • When deriving the Poisson distribution we assumed that the probability of an event in a time interval is independent of the number of events that have occurred up to time t • This assumption does not always hold in biological applications • In the pure-birth process it is assumed that, given that the value of a random variable at time t is j, the probability that it increases to j+1 in a small time interval (t,t+h) is λ_j h (up to terms of order o(h)) • The Poisson case arises when λ_j is independent of j and is just written as λ • As with the Poisson process we can arrive at a set of differential equations for the probability P_j(t) that the random variable takes the value j at time t
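Repeating the Poisson derivation with the rate λ replaced by the state-dependent rate λ_j gives the following differential equations. They are a reconstruction of what the slide omits, and treating 0 as the smallest possible state is an assumption.

```latex
\begin{align}
\frac{d}{dt}P_0(t) &= -\lambda_0 P_0(t)\\
\frac{d}{dt}P_j(t) &= -\lambda_j P_j(t) + \lambda_{j-1} P_{j-1}(t),
  \qquad j = 1, 2, 3, \ldots
\end{align}
```

Setting λ_j = λ for every j recovers the Poisson equations.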

The Pure-Birth Process (cont) • One example of an application of the pure-birth process is the Yule process, where it is assumed that λ_j = jλ • The motivation for this process arises from populations in which each individual reproduces independently, so that if the size of the population is j the probability that it increases to size j+1 in a short time interval is proportional to j • For this case, the solution to the pure-birth differential equations is:
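The solution itself is missing from the slide text. Under the assumption that the population starts from a single individual at time 0 (the slide's own initial condition is not preserved), the Yule process has the geometric form below, which can be checked by substituting into the pure-birth equation with λ_j = jλ.

```latex
\begin{equation}
P_j(t) = e^{-\lambda t}\bigl(1 - e^{-\lambda t}\bigr)^{j-1},
\qquad j = 1, 2, 3, \ldots
\end{equation}
```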

The Pure-Birth Process (cont) • Another example of the application of the pure-birth process comes from the polymerase chain reaction (PCR) • In PCR, sequential additions of base pairs to a primer occur to create the product • For this process, λ_j = m-j, which implies that once the length of the product reaches m no further increase in length is possible • With this condition, the solution is: • Neither this example nor the last follows the Poisson distribution, which shows the importance of verifying the assumption that the event rate does not depend on the number of events so far
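The PCR solution is also missing from the slide text. Assuming the process starts in state 0 at time 0 (an assumption, since the slide's initial condition is not preserved), the pure-birth equations with λ_j = m - j are solved by a binomial distribution with index m and success probability 1 - e^{-t}:

```latex
\begin{equation}
P_j(t) = \binom{m}{j}\, e^{-(m-j)t}\,\bigl(1 - e^{-t}\bigr)^{j},
\qquad j = 0, 1, \ldots, m
\end{equation}
```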

Introduction to Finite Markov Chains • A Markov chain process occupies one of a finite number of discrete states at each time point • In a time step from t to t+1 the process either stays in the same state or moves to another state in a probabilistic (as opposed to deterministic) way • A simple Markov chain has two basic properties: • The Markov property: the probability that the process changes from E_j to E_k in one time step depends only on the current state E_j and not on any past states • Temporally homogeneous transition probabilities: if at time t the process is in state E_j, the probability that it changes to E_k in one time step is independent of t • More general processes relax one or both of these properties, but we will assume both hold

Transition Probabilities and the Transition Probability Matrix • If at time t a Markovian random variable is in state E_j, the probability that at time t+1 it is in state E_k is denoted by p_jk, which is the transition probability from E_j to E_k • This notation implicitly assumes both of the properties mentioned before • The transition probability matrix P of a Markov chain contains all of the transition probabilities of that chain, with p_jk in row j and column k, so each row sums to 1

Transition Probabilities and the Transition Probability Matrix (cont) • It is also assumed that there is an initial probability distribution for the states in the process • This means that there is a probability π_i that at the initial time point the Markovian random variable is in state E_i • To find the probability that the Markov chain process is in state E_j two time steps after being in state E_i, you must sum over all the possible intermediate states that the process could occupy after one time step • This can be done for all pairs of states at once by matrix multiplication; the notation P^n denotes the n-step transition probability matrix, which is the nth power of P
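The sum over intermediate states is exactly a matrix product. The sketch below uses a hypothetical two-state matrix (not taken from the slides) and plain Python lists to compute the two-step matrix and the state distribution after two steps.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_pow(p, n):
    """n-step transition matrix P^n for n >= 1, by repeated multiplication."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result

# Hypothetical transition matrix: row j holds the probabilities of leaving state E_j.
P = [[0.9, 0.1],
     [0.5, 0.5]]
initial = [[1.0, 0.0]]  # initial distribution as a row vector: start in E_1

two_step = mat_pow(P, 2)
after_two = mat_mul(initial, two_step)
print(two_step)   # entry [0][1] is 0.9*0.1 + 0.1*0.5, up to floating-point rounding
print(after_two)
```

The entry for moving from E_1 to E_2 in two steps sums the two possible paths, E_1 → E_1 → E_2 and E_1 → E_2 → E_2.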

Markov Chains with Absorbing States • A Markov chain with an absorbing state can be recognized by the appearance of a 1 along the main diagonal of its transition probability matrix • Once the process enters an absorbing state it never leaves it, and a finite chain will eventually enter such a state if one can be reached from every other state • Markov chains with absorbing states bring up new questions, which will be addressed later, but for now we will only consider Markov chains without absorbing states

Markov Chains with No Absorbing States • In addition to having no absorbing states, the Markov models that we will consider are also finite, aperiodic, and irreducible • Finite means that there are a finite number of possible states • Aperiodic means that there is no state such that a return to that state is possible only t0, 2t0, 3t0, … transitions later, where t0 > 1 • Irreducible means that any state can eventually be reached from any other state, but not necessarily in one step

Stationary Distributions • Let the probability that at time t a Markov chain process is in state Ej be φj • This means that the probability that at time t+1 the process is in state Ej is given by • If we assume that these two probabilities are equal then we get: • If this is the case, then the process is said to be stationary, that is, from time t onwards, the probability of the process being in state Ej does not change
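The two expressions omitted from this slide follow from conditioning on the state at time t. They are reconstructed here, with the tag (4.25) inferred from the reference on the next slide and s used as an assumed symbol for the number of states.

```latex
\begin{align}
\Pr(\text{state } E_j \text{ at time } t+1) &= \sum_{k=1}^{s} \varphi_k\, p_{kj}\\
\varphi_j &= \sum_{k=1}^{s} \varphi_k\, p_{kj}, \qquad j = 1, \ldots, s \tag{4.25}
\end{align}
```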

Stationary Distributions (cont) • If the row vector φ’ is defined by: • Then we get the following from (4.25) • The row vector must also satisfy • With these equations we can find the stationary distribution when it exists • Note that (4.27) generates one redundant equation that can be omitted
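In matrix form the stationarity condition and the normalization read as follows. This is a reconstruction: the tags (4.27) and (4.28) follow the slide's references, the tag (4.26) and the symbol s for the number of states are assumptions, and the vector 1 denotes a column of ones.

```latex
\begin{align}
\varphi' &= (\varphi_1, \varphi_2, \ldots, \varphi_s) \tag{4.26}\\
\varphi' &= \varphi' P \tag{4.27}\\
\varphi' \mathbf{1} &= 1 \tag{4.28}
\end{align}
```

Because every row of P sums to 1, the s equations in (4.27) are linearly dependent, which is why one of them can be dropped and replaced by (4.28).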

Stationary Distribution Example • We are given a Markov chain with the following transition probability matrix • Using (4.27) and (4.28) we can form a set of equations to solve • The solution to these equations is: • This means that over a long time period a random variable with the given transition matrix should spend about 24.14% of the time in state E1, 38.51% of the time in state E2, etc.
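The transition matrix for this example did not survive in the slide text. The sketch below therefore uses a hypothetical four-state matrix, chosen only because its stationary distribution reproduces the quoted 24.14% and 38.51%; it should not be read as the slide's actual matrix. The solver replaces the redundant equation in φ' = φ'P with the normalization condition and applies Gauss-Jordan elimination.

```python
def stationary_distribution(p):
    """Solve phi' = phi' P with sum(phi) = 1 by Gauss-Jordan elimination."""
    s = len(p)
    # Row j encodes sum_k phi_k * p[k][j] - phi_j = 0.
    a = [[p[k][j] - (1.0 if j == k else 0.0) for k in range(s)] for j in range(s)]
    b = [0.0] * s
    # The last equation is redundant; replace it with the normalization.
    a[-1] = [1.0] * s
    b[-1] = 1.0
    for col in range(s):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, s), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(s):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
                b[r] -= f * b[col]
    return [b[i] / a[i][i] for i in range(s)]

# Hypothetical matrix (an assumption, see the note above).
P = [[0.6, 0.1, 0.2, 0.1],
     [0.1, 0.7, 0.1, 0.1],
     [0.2, 0.2, 0.5, 0.1],
     [0.1, 0.3, 0.1, 0.5]]

phi = stationary_distribution(P)
print([round(x, 4) for x in phi])
```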

Stationary Distribution Example (cont) • With matrix multiplication we can see how quickly the Markov chain process would reach the stationary distribution • From this it appears that the stationary distribution is approximately reached after 16 time steps
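Convergence can be watched by repeatedly squaring the transition matrix. Since the slide's matrix is not preserved, the sketch below assumes a hypothetical four-state matrix whose stationary distribution matches the percentages quoted in the example, and it reports how far the rows of P^n are from that distribution for n = 1, 2, 4, 8, 16.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def max_deviation(m, phi):
    """Largest absolute gap between any row of m and the vector phi."""
    return max(abs(m[i][j] - phi[j]) for i in range(len(m)) for j in range(len(phi)))

# Hypothetical matrix and its stationary distribution (assumptions, not from the slide).
P = [[0.6, 0.1, 0.2, 0.1],
     [0.1, 0.7, 0.1, 0.1],
     [0.2, 0.2, 0.5, 0.1],
     [0.1, 0.3, 0.1, 0.5]]
PHI = [42 / 174, 67 / 174, 36 / 174, 29 / 174]

deviations = {}
power, n = P, 1
while n <= 16:
    deviations[n] = max_deviation(power, PHI)
    power = mat_mul(power, power)  # squaring takes P^n to P^(2n)
    n *= 2

for steps, dev in deviations.items():
    print(f"n={steps:>2}  max |P^n - phi| = {dev:.6f}")
```

When every row of P^n is close to the same vector, the starting state no longer matters, which is the sense in which the stationary distribution has been reached.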

The Graphical Representation of a Markov Chain • It can be convenient to represent a Markov chain by a directed graph, using the states as nodes and the possible transitions as edges labeled with their transition probabilities • Additionally, start and end states can be added as needed • The graph structure without probabilities added is called the topology • These definitions are used later in the book to discuss hidden Markov models

Modeling • While the homogeneous Poisson process has many uses in bioinformatics, the two assumptions made at the beginning of the chapter (homogeneity and independence) do not always hold • Similarly, the assumptions made by Markov chain processes do not always hold • For example, analysis of DNA has shown that the probability that the nucleotide a is followed by g depends to some extent on the location of a gene in a chromosome; the nucleotides preceding a may also have an effect on the probability of g following a • However, the memoryless Markov chain property is often assumed even when its applicability is uncertain

Modeling (cont) • In general, mathematical models often make simplifying assumptions about properties of the events being modeled • There are few cases where we could predict outcomes of events with exact accuracy, but even if we could, modeling may be more desirable due to the complexity of the phenomena being analyzed • It is often necessary to find a middle ground between simple, easily solvable models and cumbersome, complex ones • The key to this is knowing that a model only needs to capture enough of the true complexity of the situation to serve our purposes

Modeling (cont) • Finding the middle ground is not always easy, however, since benchmarks of model performance are often not readily available and subjectivity may come into play • An example of modeling simplification comes from the early version of BLAST, which assumed that nucleotides are independently and identically distributed along a DNA sequence • We now know that this is not true, but the BLAST procedure does work, in that it captures enough of biological reality to be effective • Another aspect of modeling is that we often assume that a model will become more refined as it is used and as we learn more about the reality of the phenomena; models are rarely said to be in their final versions
