
Basic Probability and Statistics


Presentation Transcript


  1. Basic Probability and Statistics • Random variables • Distribution functions • Various probability distributions

  2. Definitions • An experiment is a process whose output is not known with certainty. • The set of all possible outcomes of an experiment is called the sample space (S). • The outcomes are called sample points in S. • A random variable is a function that assigns a real number to each point in S. • A distribution function F(x) of the random variable X is defined for each real number x as follows: F(x) = Pr(X ≤ x), −∞ < x < ∞.

  3. Properties of distribution function • 0 ≤ F(x) ≤ 1 for all x. • F is nondecreasing: if x1 < x2, then F(x1) ≤ F(x2). • F(x) → 0 as x → −∞ and F(x) → 1 as x → ∞.

  4. Random Variables • A random variable (r.v.) X is discrete if it can take on at most a countable number of values x1, x2, x3, … • The probability that the discrete r.v. X takes on a value xi is given by p(xi) = Pr(X = xi). • p(x) is called the probability mass function (pmf).
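
For a concrete feel of a pmf, here is a minimal Python sketch (assuming NumPy is available; the fair-die values are illustrative, not from the slides) that checks the probabilities sum to 1 and builds the distribution function F(x) = Pr(X ≤ x):

```python
# Minimal pmf / cdf check for a fair six-sided die (illustrative values).
import numpy as np

values = np.arange(1, 7)            # sample points x1, ..., x6
pmf = np.full(6, 1 / 6)             # p(xi) = Pr(X = xi)

assert np.isclose(pmf.sum(), 1.0)   # probabilities must sum to 1

cdf = np.cumsum(pmf)                # F(x) = Pr(X <= x) at each sample point
for x, F in zip(values, cdf):
    print(f"F({x}) = {F:.3f}")
```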

  5. Random Variables • A r.v. X is said to be continuous if there exists a nonnegative function f(x) such that for any set of real numbers B, Pr(X ∈ B) = ∫B f(x) dx, and f integrates to 1 over the whole real line. • f(x) is called the probability density function (pdf).

  6. Random Variables • The mean or expected value of a r.v. X is denoted by E[X] or µ, and given by E[X] = Σi xi p(xi) in the discrete case and E[X] = ∫ x f(x) dx in the continuous case. • The variance of a r.v. X is denoted by Var(X) or σ², and given by Var(X) = E[(X − µ)²] = E[X²] − µ².
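
A short sketch (NumPy assumed; the same illustrative die pmf as above) computing E[X] and Var(X) directly from the definitions:

```python
import numpy as np

# E[X] = sum_i xi p(xi);  Var(X) = E[(X - mu)^2] = E[X^2] - mu^2
values = np.arange(1, 7)            # fair die, illustrative
pmf = np.full(6, 1 / 6)

mu = np.sum(values * pmf)                   # expected value
var = np.sum((values - mu) ** 2 * pmf)      # variance
print(mu, var)                              # 3.5 and ~2.9167
```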

  7. Properties of mean • If X is a discrete random variable having pmf p(x), then E[g(X)] = Σx g(x) p(x). • If X is continuous with pdf f(x), then E[g(X)] = ∫ g(x) f(x) dx. • Hence, for constants a and b, E[aX + b] = aE[X] + b.

  8. Property of variance • For constants a and b, Var(aX + b) = a² Var(X).

  9. Joint Distribution • If X and Y are discrete r.v.'s, then p(x, y) = Pr(X = x, Y = y) is called the joint probability mass function of X and Y. • Marginal probability mass functions of X and Y: pX(x) = Σy p(x, y) and pY(y) = Σx p(x, y). • X and Y are independent if p(x, y) = pX(x) pY(y) for all x, y.

  10. Conditional probability • Let A and B be two events. • Pr(A|B) is the conditional probability of event A happening given that B has already occurred: Pr(A|B) = Pr(A ∩ B) / Pr(B). • Bayes' theorem: Pr(A|B) = Pr(B|A) Pr(A) / Pr(B). • If events A and B are independent, then Pr(A|B) = Pr(A). • Hence, from Bayes' theorem, Pr(A ∩ B) = Pr(A) Pr(B).
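
A small numeric sketch of Bayes' theorem; the scenario and probabilities (a rare condition and an imperfect test) are purely illustrative, not from the slides:

```python
# Hypothetical numbers: a rare condition (event A) and a positive test (event B).
p_A = 0.01             # Pr(A): prior probability of the condition
p_B_given_A = 0.99     # Pr(B|A): test positive given the condition
p_B_given_notA = 0.05  # Pr(B|not A): false-positive rate

# Total probability: Pr(B) = Pr(B|A)Pr(A) + Pr(B|not A)Pr(not A)
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

# Bayes' theorem: Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)
p_A_given_B = p_B_given_A * p_A / p_B
print(round(p_A_given_B, 3))   # ~0.167
```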

  11. Dependency • Covariance is a measure of linear dependence and is denoted by Cij or Cov(Xi, Xj): Cij = E[(Xi − µi)(Xj − µj)] = E[Xi Xj] − µi µj. • Another measure of linear dependency is the correlation factor: ρij = Cij / (σi σj). • The correlation factor is dimensionless but covariance is not.

  12. Two random variables in a simulation experiment • Let X and Y be two random variates in a given simulation experiment that are not independent. • Our performance parameter is X + Y: E[X + Y] = E[X] + E[Y] and Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y). • However, if the two r.v.'s are independent: Var(X + Y) = Var(X) + Var(Y).
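
A simulation sketch (NumPy assumed; the linear construction of Y from X is illustrative) checking that Var(X + Y) picks up the covariance term, and that the term vanishes for independent variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Correlated pair: Y depends linearly on X (illustrative construction).
x = rng.normal(0.0, 1.0, n)
y = 0.5 * x + rng.normal(0.0, 1.0, n)

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)   (up to sampling noise)
print(np.var(x + y))
print(np.var(x) + np.var(y) + 2 * np.cov(x, y)[0, 1])

# For an independent pair the covariance term vanishes.
y_ind = rng.normal(0.0, 1.0, n)
print(np.var(x + y_ind), np.var(x) + np.var(y_ind))
```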

  13. Bernoulli trial • An experiment with only two outcomes, "Success" and "Failure", where the probability of each outcome is known a priori. • Denoted by the chance of success "p" (this is the parameter of the distribution). • Example: tossing a "fair" coin (p = 1/2). • Let us define a variable Xi such that Xi = 1 if the i-th trial is a success and Xi = 0 otherwise. • Then E[Xi] = p and Var(Xi) = p(1 − p).

  14. Binomial r.v. • A series of n independent Bernoulli trials, each with success probability p. • If X is the number of successes that occur in the n trials, then X is said to be a Binomial r.v. with parameters (n, p). Its probability mass function is: p(x) = C(n, x) p^x (1 − p)^(n−x), x = 0, 1, …, n, where C(n, x) = n! / (x!(n − x)!).
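
A sketch using SciPy (assumed available; n = 10 and p = 0.3 are illustrative parameters) that evaluates the binomial pmf and its moments:

```python
import numpy as np
from scipy.stats import binom

n, p = 10, 0.3                     # illustrative parameters
x = np.arange(n + 1)

# p(x) = C(n, x) p^x (1 - p)^(n - x)
print(binom.pmf(x, n, p))          # pmf at x = 0, ..., n
print(binom.pmf(x, n, p).sum())    # ~1.0
print(binom.mean(n, p), binom.var(n, p))   # np = 3.0, np(1-p) = 2.1
```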

  15. Binomial r.v. • For a Binomial r.v. X with parameters (n, p): E[X] = np and Var(X) = np(1 − p).

  16. Poisson r.v. • A r.v. X which can take values 0, 1, 2, … is said to have a Poisson distribution with parameter λ (λ > 0) if the pmf is given by: p(x) = e^(−λ) λ^x / x!, x = 0, 1, 2, … • For a Poisson r.v., E[X] = Var(X) = λ. • The probabilities can be found recursively: p(0) = e^(−λ), and p(x + 1) = λ p(x) / (x + 1).
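
A sketch of the recursive evaluation of Poisson probabilities (plain Python; λ = 2.5 is an illustrative rate):

```python
import math

lam = 2.5                          # illustrative rate
p = math.exp(-lam)                 # p(0) = e^(-lam)
probs = [p]
for x in range(20):                # p(x + 1) = lam * p(x) / (x + 1)
    p = lam * p / (x + 1)
    probs.append(p)

print(sum(probs))                                   # ~1 (tail beyond x = 20 is tiny)
print(sum(x * px for x, px in enumerate(probs)))    # ~E[X] = lam = 2.5
```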

  17. Uniform r.v. • A r.v. X is said to be uniformly distributed over the interval (a, b) when its pdf is: f(x) = 1 / (b − a) for a < x < b, and 0 otherwise. • Expected value: E[X] = (a + b) / 2.

  18. Uniform r.v. • Variance: Var(X) = (b − a)² / 12. • The distribution function F(x) for a given x, a < x < b, is F(x) = (x − a) / (b − a).
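
A quick Monte Carlo check (NumPy assumed; a = 2 and b = 5 are illustrative) of the uniform mean, variance, and distribution function:

```python
import numpy as np

a, b = 2.0, 5.0                    # illustrative interval
rng = np.random.default_rng(1)
x = rng.uniform(a, b, 100_000)

print(x.mean(), (a + b) / 2)           # E[X] = (a + b) / 2
print(x.var(), (b - a) ** 2 / 12)      # Var(X) = (b - a)^2 / 12

x0 = 3.0                               # F(x0) = (x0 - a) / (b - a)
print((x <= x0).mean(), (x0 - a) / (b - a))
```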

  19. Normal r.v. • pdf: f(x) = (1 / (σ√(2π))) e^(−(x − µ)² / (2σ²)), −∞ < x < ∞. • The normal density is a bell-shaped curve that is symmetric about µ. • It can be shown that for a normal r.v. X with parameters (µ, σ²), E[X] = µ and Var(X) = σ².

  20. Normal r.v. • If X ~ N(µ, σ²), then Z = (X − µ) / σ is N(0, 1). • The probability distribution function of the "Standard Normal" is given as Φ(z) = Pr(Z ≤ z). • If X ~ N(µ, σ²), then Pr(X ≤ x) = Φ((x − µ) / σ).

  21. Central Limit Theorem • Let X1, X2, X3, …, Xn be a sequence of IID random variables having a finite mean µ and finite variance σ². Then the distribution of (X̄(n) − µ) / (σ / √n) converges to the standard normal N(0, 1) as n → ∞.
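
A sketch illustrating the CLT (NumPy assumed; sample size and parent distribution are illustrative): sample means of a skewed exponential parent, standardized as above, behave approximately like N(0, 1):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 30, 20_000

# Exponential(mean 1) parent: skewed, with mu = 1 and sigma = 1.
samples = rng.exponential(scale=1.0, size=(reps, n))

# Standardized sample means: (Xbar(n) - mu) / (sigma / sqrt(n))
z = (samples.mean(axis=1) - 1.0) / (1.0 / np.sqrt(n))

print(z.mean(), z.std())       # roughly 0 and 1
print(np.mean(z <= 1.96))      # roughly Phi(1.96) ~ 0.975
```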

  22. Exponential r.v. • pdf: f(x) = λ e^(−λx) for x ≥ 0, and 0 otherwise. • cdf: F(x) = 1 − e^(−λx) for x ≥ 0. • E[X] = 1/λ and Var(X) = 1/λ².

  23. Exponential r.v. • When multiplied by a positive constant, it still remains an exponential r.v. (only the rate changes). • Most useful property: memoryless! Pr(X > s + t | X > s) = Pr(X > t) for all s, t ≥ 0. • Analytical simplicity.
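
A simulation sketch (NumPy assumed; λ = 0.5, s = 2, and t = 3 are illustrative) of the memoryless property:

```python
import numpy as np

rng = np.random.default_rng(3)
lam = 0.5                                  # illustrative rate
x = rng.exponential(scale=1 / lam, size=1_000_000)

s, t = 2.0, 3.0
# Memoryless property: Pr(X > s + t | X > s) = Pr(X > t)
print(np.mean(x[x > s] > s + t))           # conditional estimate
print(np.mean(x > t))                      # unconditional estimate
print(np.exp(-lam * t))                    # exact value e^(-lam * t)
```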

  24. Poisson process • A counting process {N(t), t ≥ 0} in which events occur one at a time, the numbers of events in disjoint intervals are independent, and the number of events in any interval of length t is Poisson distributed with mean λt. • The interarrival times are IID exponential r.v.'s with rate λ.

  25. Useful property of Poisson process • Let S11 denote the time of the first event of the first Poisson process (with rate λ1), and S12 denote the time of the first event of the second Poisson process (with rate λ2). Then: Pr(S11 < S12) = λ1 / (λ1 + λ2).
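
A simulation sketch (NumPy assumed; rates λ1 = 1 and λ2 = 3 are illustrative) of this property, using the fact that the first event time of a Poisson process is exponential with the same rate:

```python
import numpy as np

rng = np.random.default_rng(4)
lam1, lam2 = 1.0, 3.0                      # illustrative rates

# First event times of two independent Poisson processes are exponential.
s11 = rng.exponential(scale=1 / lam1, size=500_000)
s12 = rng.exponential(scale=1 / lam2, size=500_000)

# Pr(S11 < S12) = lam1 / (lam1 + lam2)
print(np.mean(s11 < s12), lam1 / (lam1 + lam2))   # both ~0.25
```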

  26. Covariance stationary processes • The covariance between two observations Xi and Xi+j depends only on the lag j and not on i. • Let Cj be the covariance at lag j for this process. • So the correlation factor is given by: ρj = Cj / C0, where C0 is the variance of the process.

  27. Point Estimation • Let X1, X2, X3, …, Xn be a sequence of IID random variables (observations) having a finite population mean µ and finite population variance σ². • We are interested in estimating these population parameters from the sample values. • The sample mean X̄(n) = (1/n) Σ Xi is an unbiased point estimator of µ. • That is to say that: E[X̄(n)] = µ.

  28. Point Estimation • The sample variance S²(n) = Σ (Xi − X̄(n))² / (n − 1) is an unbiased point estimator of σ². • Variance of the mean: Var(X̄(n)) = σ² / n. • We can estimate this variance of the mean by S²(n) / n. • This is true only if X1, X2, X3, …, Xn are IID.
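
A sketch of these point estimators on synthetic IID data (NumPy assumed; the normal parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
x = rng.normal(10.0, 2.0, n)       # IID observations, illustrative parameters

xbar = x.mean()                    # unbiased point estimator of mu
s2 = x.var(ddof=1)                 # sample variance S^2(n), unbiased for sigma^2
var_of_mean = s2 / n               # estimates Var(Xbar(n)) = sigma^2 / n

print(xbar, s2, var_of_mean)
```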

  29. Point Estimation • However, most often in a simulation experiment the data are correlated. • In that case, estimating the variance of the mean by S²(n)/n is dangerous, because with positively correlated data it underestimates the actual variance of the mean.

  30. Interval Estimation • Let X1, X2, X3, …, Xn be a sequence of IID random variables (observations) having a finite population mean µ and finite population variance σ² (> 0). • We want to construct a confidence interval for the mean µ. • Let Zn = (X̄(n) − µ) / √(σ²/n) be a random variable with probability distribution Fn(z).

  31. Interval Estimation • The Central Limit Theorem states that Fn(z) → Φ(z) as n → ∞, where Φ is the distribution function of the standard normal with mean 0 and variance 1. • Often, we don't know the population variance σ². • It can be shown that the CLT still applies if we replace σ² by the sample variance S²(n): the variable tn = (X̄(n) − µ) / √(S²(n)/n) is approximately standard normal as n increases.

  32. Standard Normal distribution • The Standard Normal distribution is N(0, 1). • Its cumulative distribution function (CDF) Φ(z) at any given value z can be found using standard statistical tables. • Conversely, if we know the probability, we can compute the corresponding value of z such that Φ(z) = 1 − α/2. • This value is z1−α/2 and is called the critical point for N(0, 1). • Similarly, the other critical point z2 = −z1−α/2 is such that Φ(z2) = α/2.

  33. Interval Estimation • It follows that for a large n: Pr(−z1−α/2 ≤ tn ≤ z1−α/2) ≈ 1 − α.

  34. Interval Estimation • Therefore, if n is sufficiently large, an approximate 100(1 − α) percent confidence interval for µ is given by: X̄(n) ± z1−α/2 √(S²(n)/n). • If we construct a large number of independent 100(1 − α) percent confidence intervals, each based on n different observations (n sufficiently large), the proportion of these confidence intervals that contain µ should be 1 − α.

  35. Interval Estimation • What if n is not "sufficiently large"? • If the Xi's are normal random variables, the random variable tn has a t-distribution with n − 1 degrees of freedom. • In this case, the 100(1 − α) percent confidence interval for µ is given by: X̄(n) ± t(n−1, 1−α/2) √(S²(n)/n), where t(n−1, 1−α/2) is the upper 1 − α/2 critical point of the t-distribution with n − 1 df.
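
A sketch (SciPy assumed; the sample size and distribution parameters are illustrative) comparing the z- and t-based confidence intervals for the same sample:

```python
import numpy as np
from scipy.stats import norm, t

rng = np.random.default_rng(6)
n, alpha = 15, 0.05
x = rng.normal(10.0, 2.0, n)               # illustrative sample

xbar = x.mean()
se = np.sqrt(x.var(ddof=1) / n)            # sqrt(S^2(n) / n)

z_crit = norm.ppf(1 - alpha / 2)           # critical point of N(0, 1)
t_crit = t.ppf(1 - alpha / 2, df=n - 1)    # critical point of t with n-1 df

print("z interval:", xbar - z_crit * se, xbar + z_crit * se)
print("t interval:", xbar - t_crit * se, xbar + t_crit * se)   # slightly wider
```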

  36. Interval Estimation • In practice, the distribution of the Xi's is rarely normal, and the confidence interval (with the t-distribution) will be approximate. • Also, the CI given with "t" is larger than the one with "z". • Hence, it is recommended that we use the CI with "t". Why? Because the wider interval is conservative: its actual coverage is closer to the desired 1 − α. • However, t(n−1, 1−α/2) → z1−α/2 as n → ∞, so the two intervals coincide for large n.

  37. Hypotheses testing • Assume that X1, X2, X3, …, Xn are normally distributed (or approximately normal) and that we would like to test whether µ = µ0, where µ0 is a fixed hypothesized value of µ. • If |X̄(n) − µ0| is large, then our hypothesis is unlikely to be true. • To conduct such a test (whether the hypothesis is true or not), we need a statistical parameter whose distribution is known when the hypothesis is true. • It turns out that if our hypothesis is true (µ = µ0), then the statistic tn = (X̄(n) − µ0) / √(S²(n)/n) has a t-distribution with n − 1 df.

  38. Hypotheses testing • We form our two-tailed test of the hypothesis H0: µ = µ0 as follows: reject H0 if |tn| > t(n−1, 1−α/2), and do not reject H0 otherwise. • The portion of the real line that corresponds to the rejection of H0 is called the critical region for the test. • The probability that the statistic tn falls in the critical region given that H0 is true, which is clearly equal to α, is called the level of the test. • Typically, if tn doesn't fall in the rejection region, we "do not reject" H0 (rather than "accept" it).

  39. Hypotheses testing • Type I error: if one rejects H0 when it is true, this is called a Type I error; its probability is again equal to α. This error is under the experimenter's control. • Type II error: if one accepts H0 when it is false, it is a Type II error. Its probability is denoted by β. • We call δ = 1 − β the power of the test, which is the probability of rejecting H0 when it is false. • For a fixed α, the power of the test can only be increased by increasing n.
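
A sketch of the two-tailed t test from slides 37-38 on synthetic data (SciPy assumed; the data-generating parameters and µ0 are illustrative):

```python
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(7)
n, alpha, mu0 = 25, 0.05, 10.0
x = rng.normal(10.5, 2.0, n)               # illustrative data (true mean 10.5)

# Test statistic: tn = (Xbar(n) - mu0) / sqrt(S^2(n) / n)
tn = (x.mean() - mu0) / np.sqrt(x.var(ddof=1) / n)
t_crit = t.ppf(1 - alpha / 2, df=n - 1)    # critical point t(n-1, 1-alpha/2)

# Two-tailed test: reject H0 if |tn| exceeds the critical point.
print("tn =", tn, " critical value =", t_crit)
print("reject H0" if abs(tn) > t_crit else "do not reject H0")
```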
