STATISTICS Random Variables and Probability Distributions

Professor Ke-Sheng Cheng, Department of Bioenvironmental Systems Engineering, National Taiwan University.

Presentation Transcript


  1. STATISTICS: Random Variables and Probability Distributions. Professor Ke-Sheng Cheng, Department of Bioenvironmental Systems Engineering, National Taiwan University

  2. Definition of random variable (RV) • For a given probability space $(\Omega, \mathcal{A}, P[\cdot])$, a random variable, denoted by $X$ or $X(\cdot)$, is a function with domain $\Omega$ and counterdomain the real line. The function $X(\cdot)$ must be such that the set $A_r$, defined by $A_r = \{\omega : X(\omega) \le r\}$, belongs to $\mathcal{A}$ for every real number $r$. • Unlike the probability, which is defined on the event space, a random variable is defined on the sample space. Laboratory for Remote Sensing Hydrology and Spatial Modeling, Dept of Bioenvironmental Systems Engineering, National Taiwan Univ.

  3. [Diagram: random experiment, sample space, event space, probability space.] ... is defined whereas ... is not defined.

  4. Cumulative distribution function (CDF) • The cumulative distribution function of a random variable $X$, denoted by $F_X(\cdot)$, is defined to be $F_X(x) = P[X \le x] = P[\{\omega : X(\omega) \le x\}]$ for every real number $x$.

  5. Consider the experiment of tossing two fair coins. Let the random variable $X$ denote the number of heads. The CDF of $X$ is $F_X(x) = \begin{cases} 0, & x < 0 \\ 1/4, & 0 \le x < 1 \\ 3/4, & 1 \le x < 2 \\ 1, & x \ge 2 \end{cases}$
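
The two-coin CDF can be checked by enumerating the sample space directly. The following is a minimal Python sketch; the names `outcomes`, `X` and `cdf` are illustrative and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space of two fair coin tosses; each of the 4 outcomes has probability 1/4.
outcomes = list(product("HT", repeat=2))
prob = Fraction(1, len(outcomes))

def X(outcome):
    """Random variable: number of heads in one outcome of the sample space."""
    return outcome.count("H")

def cdf(x):
    """F_X(x) = P[X <= x], obtained by summing over the sample space."""
    return sum(prob for w in outcomes if X(w) <= x)

# Evaluate the step function at a few points, including non-integer ones.
for x in (-1, 0, 0.5, 1, 2, 3):
    print(x, cdf(x))
```

The CDF is evaluated at every real number, not only at the values X can take, which is why `cdf(0.5)` equals `cdf(0)`.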

  7. Indicator function or indicator variable • Let $\Omega$ be any space with points $\omega$ and $A$ any subset of $\Omega$. The indicator function of $A$, denoted by $I_A(\cdot)$, is the function with domain $\Omega$ and counterdomain equal to the set consisting of the two real numbers 0 and 1, defined by $I_A(\omega) = \begin{cases} 1, & \omega \in A \\ 0, & \omega \notin A \end{cases}$

  8. Discrete random variables • A random variable $X$ will be defined to be discrete if the range of $X$ is countable. • If $X$ is a discrete random variable with values $x_1, x_2, \ldots, x_n, \ldots$, then the function denoted by $f_X(\cdot)$ and defined by $f_X(x) = \begin{cases} P[X = x_j], & x = x_j,\ j = 1, 2, \ldots \\ 0, & x \ne x_j \end{cases}$ is defined to be the discrete density function of $X$.

  9. Continuous random variables • A random variable $X$ will be defined to be continuous if there exists a function $f_X(\cdot)$ such that $F_X(x) = \int_{-\infty}^{x} f_X(u)\,du$ for every real number $x$. • The function $f_X(\cdot)$ is called the probability density function of $X$.

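  10. Properties of a CDF • $F_X(-\infty) = \lim_{x \to -\infty} F_X(x) = 0$ and $F_X(+\infty) = \lim_{x \to +\infty} F_X(x) = 1$. • $F_X(\cdot)$ is monotone nondecreasing, i.e. $F_X(a) \le F_X(b)$ for $a < b$. • $F_X(\cdot)$ is continuous from the right, i.e. $\lim_{0 < h \to 0} F_X(x + h) = F_X(x)$.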

  11. Properties of a PDF

  12. Example 1 • Determine which of the following are valid distribution functions:

  15. Example 2 • Determine the real constant $a$, for arbitrary real constants $m$ and $b > 0$, such that $f_X(x) = \ldots$ is a valid density function.

  16. The function $f_X(x)$ is symmetric about $m$.

  17. Characterizing random variables • Cumulative distribution function • Probability density function • Expectation (expected value) • Variance • Moments • Quantile • Median • Mode

  18. Expectation of a random variable • The expectation (or mean, expected value) of $X$, denoted by $\mu_X$ or $E(X)$, is defined by: $E(X) = \sum_j x_j f_X(x_j)$ if $X$ is discrete with values $x_1, x_2, \ldots$, and $E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx$ if $X$ is continuous with probability density function $f_X(x)$.
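
For a discrete random variable, the expectation is the probability-weighted sum of its values. A minimal sketch under that standard definition, reusing the two-coin example from earlier in the deck; the dictionary `pmf` and the function name `expectation` are illustrative assumptions.

```python
from fractions import Fraction

# Discrete density of X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def expectation(pmf):
    """E(X) = sum over all values x of x * f_X(x), for a density {value: probability}."""
    return sum(x * p for x, p in pmf.items())

print(expectation(pmf))
```

Exact fractions are used so that the result is not affected by floating-point rounding.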

  19. Rules for expectation • Let $X$ and $X_i$ be random variables and $c$ be any real constant. • $E[c] = c$ • $E[cX] = cE[X]$ • $E\left[\sum_i X_i\right] = \sum_i E[X_i]$

  21. Variance of a random variable • The variance of $X$, denoted by $\sigma_X^2$ or $\mathrm{Var}[X]$, is defined by $\mathrm{Var}[X] = E[(X - \mu_X)^2]$.

  22. $\sigma_X = \sqrt{\mathrm{Var}[X]}$ is called the standard deviation of $X$. • Variance characterizes the dispersion of data with respect to the mean. Thus, shifting a density function does not change its variance.
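
The claim that shifting a density leaves the variance unchanged can be checked numerically. This sketch assumes the usual definition Var[X] = E[(X - mu)^2]; the example density and the shift of 10 are arbitrary illustrative choices.

```python
from fractions import Fraction

def mean(pmf):
    """E(X) for a discrete density given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var[X] = E[(X - mu)^2] for a discrete density given as {value: probability}."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
shifted = {x + 10: p for x, p in pmf.items()}  # same shape, moved right by 10

print(mean(pmf), variance(pmf))
print(mean(shifted), variance(shifted))
```

The mean moves by the amount of the shift while the variance stays the same, because every deviation from the mean is unchanged.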

  23. Rules for variance

  24. Two random variables are said to be independent if knowledge of the value assumed by one gives no clue to the value assumed by the other. • Events $A$ and $B$ are defined to be independent if and only if $P[A \cap B] = P[A]\,P[B]$.

  25. Moments and central moments of a random variable

  26. Properties of moments

  28. Quantile • The $q$th quantile of a random variable $X$, denoted by $\xi_q$, is defined as the smallest number $\xi$ satisfying $F_X(\xi) \ge q$. [Figure: discrete uniform example.]
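
For a discrete random variable, the "smallest number" in the quantile definition is always one of the values the variable can take, so it can be found by accumulating probabilities in increasing order. A minimal sketch for 0 < q <= 1, using a discrete uniform density on 1..6 as an assumed example; the function name `quantile` is illustrative.

```python
from fractions import Fraction

def quantile(pmf, q):
    """Smallest value xi with F_X(xi) >= q, for a discrete density {value: probability}."""
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    cumulative = 0
    for x in sorted(pmf):
        cumulative += pmf[x]  # cumulative is now F_X(x)
        if cumulative >= q:
            return x
    raise ValueError("probabilities do not sum to 1")

# Discrete uniform distribution on {1, ..., 6}.
die = {x: Fraction(1, 6) for x in range(1, 7)}

print(quantile(die, Fraction(1, 2)))     # median under this definition
print(quantile(die, Fraction(51, 100)))  # just above one half
```

Because the CDF of a discrete variable is a step function, a small change in q can move the quantile to the next value, as the two calls show.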

  29. Median and mode • The median of a random variable is the 0.5th quantile, or $\xi_{0.5}$. • The mode of a random variable $X$ is defined as the value $u$ at which $f_X(u)$ is the maximum of $f_X(\cdot)$.

  30. Note: For a unimodal, positively skewed distribution, the mean is typically the largest of the three measures of central tendency and the mode the smallest. For a negatively skewed distribution, the mean is typically the smallest and the mode the largest. In both cases the median usually falls between the mean and the mode. This ordering is a rule of thumb rather than a theorem; some skewed distributions violate it.

  31. Moment generating function

  33. Usage of MGF • The MGF can be used to express moments in terms of PDF parameters, and such expressions can in turn be used to express the mean, variance, coefficient of skewness, etc. in terms of PDF parameters. • Random variables that have the same MGF follow the same probability distribution.

  34. The moment generating function of a sum of independent random variables is the product of the moment generating functions of individual random variables.
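
The product rule for MGFs can be checked numerically on a small case. This sketch assumes the standard definition m_X(t) = E[exp(tX)] and uses two independent Bernoulli variables with an arbitrarily chosen p = 0.3; the helper names are illustrative.

```python
import math

def mgf(pmf, t):
    """m_X(t) = E[exp(t X)] for a discrete density {value: probability}."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

def sum_of_independent(pmf_x, pmf_y):
    """Density of X + Y when X and Y are independent (discrete convolution)."""
    out = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

p = 0.3
bernoulli = {0: 1 - p, 1: p}
total = sum_of_independent(bernoulli, bernoulli)  # X1 + X2

t = 0.7
lhs = mgf(total, t)           # MGF of the sum
rhs = mgf(bernoulli, t) ** 2  # product of the individual MGFs
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding for any choice of t.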

  35. Expected value of a function of a random variable

  36. If $Y = g(X)$

  37. Y Y=g(X) y X x1 x2 x3 Laboratory for Remote Sensing Hydrology and Spatial Modeling, Dept of Bioenvironmental Systems Engineering, National Taiwan Univ.

  38. Theorem

  39. Chebyshev Inequality

  40. The Chebyshev inequality gives a bound, which does not depend on the distribution of X, for the probability of particular events described in terms of a random variable and its mean and variance.
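
A concrete check helps show how loose or tight the bound is. This sketch assumes the commonly stated form of the inequality, P[|X - mu| >= k*sigma] <= 1/k^2 for k > 0, and compares it with exact tail probabilities for a fair six-sided die, chosen here only as an illustration.

```python
from fractions import Fraction
import math

# Fair six-sided die: an assumed example distribution.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())
sigma = math.sqrt(var)

def tail_probability(k):
    """Exact P[|X - mu| >= k * sigma] for the die."""
    return sum(p for x, p in pmf.items() if abs(x - mu) >= k * sigma)

def chebyshev_bound(k):
    """Distribution-free upper bound 1 / k^2 (assumed standard form)."""
    return 1 / k**2

for k in (1, 1.2, 1.5, 2):
    print(k, float(tail_probability(k)), chebyshev_bound(k))
```

The exact tail probability never exceeds the bound, and it is often much smaller, which reflects the fact that the bound uses only the mean and variance.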

  41. Probability density functions of discrete random variables • Discrete uniform distribution • Bernoulli distribution • Binomial distribution • Negative binomial distribution • Geometric distribution • Hypergeometric distribution • Poisson distribution

  42. Discrete uniform distribution • The parameter $N$ ranges over the positive integers.

  43. Bernoulli distribution • $1 - p$ is often denoted by $q$.

  44. Binomial distribution • The binomial distribution gives the probability of having exactly $x$ successes in $n$ independent and identical Bernoulli trials.
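
The description above corresponds to the standard binomial density C(n, x) p^x (1 - p)^(n - x), which is assumed here. A minimal sketch; the function name `binomial_pmf` and the parameter values in the example are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P[exactly x successes in n independent Bernoulli(p) trials]."""
    if not 0 <= x <= n:
        return 0.0
    # comb(n, x) counts the arrangements of x successes among n trials.
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Probabilities of 0..10 successes in 10 trials with success probability 0.3.
probs = [binomial_pmf(x, 10, 0.3) for x in range(11)]
print(probs)
print(sum(probs))
```

Summing over all possible values of x gives 1 up to floating-point rounding, as required of a density.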

  45. Negative binomial distribution • The negative binomial distribution gives the probability that the $r$-th success occurs on the $x$-th of a sequence of independent and identical Bernoulli trials. • Unlike the binomial distribution, for which the number of trials is fixed, here the number of successes is fixed and the number of trials varies from experiment to experiment. The negative binomial random variable represents the number of trials needed to achieve the $r$-th success.

  47. Geometric distribution • The geometric distribution gives the probability that the first success occurs on the $x$-th of a sequence of independent and identical Bernoulli trials.
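
The negative binomial and geometric descriptions can be sketched together, since the geometric case is the negative binomial with r = 1. The code assumes the trial-count parameterization used in the text (the random variable is the number of trials, not the number of failures) and the standard density C(x - 1, r - 1) p^r (1 - p)^(x - r); function names are illustrative.

```python
from math import comb

def negative_binomial_pmf(x, r, p):
    """P[the r-th success occurs on trial x], trials independent Bernoulli(p), r >= 1."""
    if x < r:
        return 0.0
    # The first x - 1 trials contain exactly r - 1 successes; trial x is a success.
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

def geometric_pmf(x, p):
    """P[the first success occurs on trial x]: the r = 1 special case."""
    return negative_binomial_pmf(x, 1, p)

print(geometric_pmf(3, 0.5))             # failure, failure, success
print(negative_binomial_pmf(2, 2, 0.5))  # success, success
```

Some libraries parameterize the negative binomial by the number of failures instead, so values need to be shifted by r when comparing against them.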

  48. Hypergeometric distribution • $f_X(x) = \binom{K}{x}\binom{M-K}{n-x} \Big/ \binom{M}{n}$ for $x = 0, 1, \ldots, n$, where $M$ is a positive integer, $K$ is a nonnegative integer that is at most $M$, and $n$ is a positive integer that is at most $M$.

  49. Let X denote the number of defective products in a sample of size n when sampling without replacement from a box containing M products, K of which are defective.
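
The defective-products setting can be computed by counting samples. This sketch assumes the standard hypergeometric density C(K, x) C(M - K, n - x) / C(M, n); the box of 10 products with 3 defectives and the sample size 4 are made-up illustrative numbers.

```python
from fractions import Fraction
from math import comb

def hypergeometric_pmf(x, M, K, n):
    """P[x defectives in a sample of n drawn without replacement from M items, K defective]."""
    if x < 0 or x > n or x > K or n - x > M - K:
        return Fraction(0)
    # Favourable samples: choose x of the K defectives and n - x of the M - K good items.
    return Fraction(comb(K, x) * comb(M - K, n - x), comb(M, n))

M, K, n = 10, 3, 4
for x in range(n + 1):
    print(x, hypergeometric_pmf(x, M, K, n))
```

With only 3 defectives in the box, the probability of 4 defectives in the sample is zero, which the range check handles explicitly.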

  50. Poisson distribution • The Poisson distribution provides a realistic model for many random phenomena for which the number of occurrences within a given scope (time, length, area, volume) is of interest. For example, the number of fatal traffic accidents per day in Taipei, the number of meteorites that collide with a satellite during a single orbit, the number of defects per unit of some material, the number of flaws per unit length of some wire, etc.