Statistics & Data Analysis • Course Number B01.1305 • Course Section 31 • Meeting Time Wednesday 6-8:50 pm • CLASS #4
Class #4 Outline • Brief review of last class • Questions on homework • Chapter 5 – Special Distributions
Review of Last Class • Probability trees • Probability distribution functions • Expected value • Standard deviation
Chapter 5 Some Special Probability Distributions
Chapter Goals • Introduce some special, often-used distributions • Understand methods for counting the number of sequences • Understand situations consisting of a specified number of distinct success/failure trials • Understand random variables that follow a bell-shaped distribution
Counting Possible Outcomes • In order to calculate probabilities, we often need to count how many different ways there are to do some activity • For example, how many different outcomes are there from tossing a coin three times? • To help us count accurately, we need to learn some counting rules • Multiplication Rule: If there are m ways of doing one thing and n ways of doing another thing, there are m times n ways of doing both
Example • An auto dealer wants to advertise that for $20,000 you can buy either a convertible or a 4-door car with your choice of either wire or solid wheel covers. • How many different arrangements of models and wheel covers can the dealer offer?
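The multiplication rule can be checked by listing every combination. The sketch below is only an illustration in Python's standard library (the variable names are made up for this example); it enumerates the dealer's model and wheel-cover pairs and the outcomes of three coin tosses.

```python
from itertools import product

# Dealer example: m = 2 models, n = 2 wheel covers
models = ["convertible", "4-door"]
wheel_covers = ["wire", "solid"]
arrangements = list(product(models, wheel_covers))
for model, cover in arrangements:
    print(model, "with", cover, "wheel covers")
print(len(arrangements))  # m * n = 2 * 2 = 4

# Three coin tosses: 2 * 2 * 2 ordered outcomes
coin_sequences = list(product("HT", repeat=3))
print(len(coin_sequences))  # 8
```

Listing outcomes this way is only practical for small problems, which is why the counting rules that follow are needed.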
Counting Rules • Recall the classical interpretation of probability: P(event) = (number of outcomes favoring event) / (total number of outcomes) • Need methods for counting possible outcomes without the labor of listing the entire sample space • Counting methods arise as answers to: • How many sequences of k symbols can be formed from a set of r distinct symbols using each symbol no more than once? • How many subsets of k symbols can be formed from a set of r distinct symbols using each symbol no more than once? • Difference between a sequence and a subset is that order matters for a sequence, but not for a subset
Counting Rules (cont) • Create all k=3 letter subsets and sequences of the r=5 letters: A, B, C, D and E • How many sequences are there? • How many subsets are there?
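One way to answer the two questions on this slide is to generate the sequences and subsets directly and compare the counts with the standard permutation and combination formulas. This is a sketch using Python's standard library (an assumption; the course itself lists R functions), with illustrative variable names.

```python
from itertools import combinations, permutations
from math import comb, perm

letters = "ABCDE"  # r = 5 distinct symbols
k = 3

# Sequences: order matters, so ABC and CBA are counted separately
sequences = list(permutations(letters, k))
# Subsets: order does not matter, so ABC and CBA are the same subset
subsets = list(combinations(letters, k))

print(len(sequences), perm(5, 3))  # 60 60
print(len(subsets), comb(5, 3))    # 10 10
```

Each subset of 3 letters can be ordered in 3! = 6 ways, which is why there are 6 times as many sequences as subsets.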
Review: Sequence and Subset • For a sequence, order matters: the same objects arranged in a different order count as a different outcome • For a subset, the order of the objects is not important
Example • A group of three electronic parts is to be assembled into a plug-in unit for a TV set • The parts can be assembled in any order • How many different ways can they be assembled? • There are eight machines but only three spaces on the machine shop floor. • How many different ways can eight machines be arranged in the three available spaces? • The paint department needs to assign color codes for 42 different parts. Three colors are to be used for each part. How many colors, taken three at a time would be adequate to color-code the 42 parts?
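The three questions above can be worked with the counting rules. The sketch below assumes that the part-assembly and machine questions are about ordered arrangements, and that a color code is an unordered set of three distinct colors; the slide does not state this, so treat it as one reading of the problem.

```python
from math import comb, factorial, perm

# Three parts assembled in any order: 3! orderings
part_orders = factorial(3)
print(part_orders)  # 6

# Eight machines, three floor spaces, order matters: 8 * 7 * 6
machine_arrangements = perm(8, 3)
print(machine_arrangements)  # 336

# Smallest number of colors whose 3-color subsets cover 42 parts
# (assumes the order of colors within a code does not matter)
colors = 3
while comb(colors, 3) < 42:
    colors += 1
print(colors, comb(colors, 3))  # 8 56
```

Under that reading, seven colors give only 35 codes, so eight are needed.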
Binomial Distribution • Percentages play a major role in business • When a percentage is determined by counting the number of times something happens out of the total possibilities, the occurrences might follow a binomial distribution • Examples: • Number of defective products out of 10 items • Of 100 people interviewed, number who expressed intention to buy • Number of female employees in a group of 75 people • Of all the stocks traded on the NYSE, the number that went up yesterday
Binomial Distribution (cont) • Each time the random experiment is run, either the event happens or it doesn’t • The random variable X, defined as the number of occurrences of a particular event out of n trials has a binomial distribution if: • For each of the n trials, the event always has the same probability of happening • The trials are independent of one another
Example: Binomial Distribution • You are interested in the next n=3 calls to a catalog order desk and know from experience that 60% of calls will result in an order • What can we say about the number of calls that will result in an order? • Questions: • Create a probability tree • Create a probability distribution table • What is the expected number of calls resulting in an order? • What is the standard deviation?
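The probability tree for this example can be walked in code: each path through the tree is one order/no-order sequence, and paths with the same number of orders are added together. This is an illustrative sketch in standard-library Python with invented variable names, not the course's own solution.

```python
from itertools import product
from math import sqrt

p_order = 0.6  # probability that a call results in an order
n_calls = 3

# Walk every path of the tree and accumulate P(X = number of orders)
distribution = {x: 0.0 for x in range(n_calls + 1)}
for path in product([True, False], repeat=n_calls):
    prob = 1.0
    for ordered in path:
        prob *= p_order if ordered else 1 - p_order
    distribution[sum(path)] += prob

mean = sum(x * p for x, p in distribution.items())
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
sd = sqrt(variance)

for x, p in distribution.items():
    print(x, round(p, 3))  # 0.064, 0.288, 0.432, 0.216
print(round(mean, 2), round(sd, 3))  # 1.8 0.849
```

The mean and standard deviation agree with the binomial shortcuts np = 1.8 and sqrt(np(1 - p)) = sqrt(0.72).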
Example: Binomial Probabilities • How many of your n=6 major customers will call tomorrow? • There is a 25% chance that each will call • Questions: • How many do you expect to call? • What is the standard deviation? • What is the probability that exactly 2 call? • What is the probability that more than 4 call?
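For this example the tree has 64 paths, so the binomial formula P(X = x) = C(n, x) p^x (1 - p)^(n - x) is more convenient. The helper `binom_pmf` below is a hypothetical name for this sketch; in R, `dbinom` from the Using R slide plays the same role.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 6, 0.25

expected = n * p               # 1.5 customers
sd = sqrt(n * p * (1 - p))     # about 1.06
p_exactly_2 = binom_pmf(2, n, p)
# "More than 4" means 5 or 6 customers call
p_more_than_4 = sum(binom_pmf(x, n, p) for x in range(5, n + 1))

print(expected, round(sd, 3))
print(round(p_exactly_2, 4), round(p_more_than_4, 4))
```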
Example • It’s been a terrible day for the capital markets with losers beating winners 4 to 1 • You are evaluating a mutual fund comprised of 15 randomly selected stocks and will assume a binomial distribution for the number of securities that lost value • Questions: • What assumptions are being made? • What is the random variable? • How many securities do you expect to lose value? • What is the standard deviation of the random variable? • Find the probability that 8 securities lose value • What is the probability that 12 or more lose value?
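A sketch of the calculations for this example, assuming that "losers beating winners 4 to 1" means each security lost value with probability 4/5 = 0.8 and that "8 securities lose value" means exactly 8. Both are interpretations rather than statements from the slide, and `binom_pmf` is a hypothetical helper name.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n_stocks = 15
p_loss = 4 / 5  # assumed from the 4-to-1 ratio of losers to winners

expected_losers = n_stocks * p_loss
sd_losers = sqrt(n_stocks * p_loss * (1 - p_loss))
p_exactly_8 = binom_pmf(8, n_stocks, p_loss)
p_12_or_more = sum(binom_pmf(x, n_stocks, p_loss)
                   for x in range(12, n_stocks + 1))

print(round(expected_losers, 2), round(sd_losers, 3))
print(round(p_exactly_8, 3), round(p_12_or_more, 3))
```

The binomial assumptions here are that every stock has the same 0.8 chance of losing value and that the stocks move independently; the second is questionable on a day when the whole market falls.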
Normal Distribution • The normal distribution is sometimes called a Gaussian distribution, after C. F. Gauss (1777-1855). • Well-known “bell-shaped” distribution • Mean and standard deviation determine the center and spread of the distribution curve • The mathematical formula for the normal density f(y) is given in HO, p. 157. We won't be needing this formula; just tables of areas under the curve. • The empirical rule holds for all normal distributions • Probability of an event corresponds to area under the distribution curve
Standard Normal Distribution • Normal Distribution with μ=0 and σ=1 • Letter Z is used to denote a random variable that follows a Standard Normal Distribution
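The empirical rule mentioned for normal distributions can be checked against the standard normal curve. This sketch uses `statistics.NormalDist` from Python's standard library (an assumption about tooling; a printed table or R's `pnorm` gives the same areas).

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal

# Area within 1, 2 and 3 standard deviations of the mean
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}
for k, area in within.items():
    print(k, round(area, 4))  # about 0.6827, 0.9545, 0.9973
```

These are the familiar "68%, 95%, 99.7%" figures of the empirical rule.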
Visualization • [Figure: symmetrical bell-shaped curve with a tail on each side; the mean, median and mode coincide at the center]
Characteristics • Bell-shaped with a single peak at the exact center of the distribution • Mean, median and mode are equal and located at the peak • Symmetrical about the mean • Falls off smoothly in both directions, but the curve never actually touches the X-axis
Why It's Important • Many psychological and educational variables are distributed approximately normally • Measures of reading ability, introversion, job satisfaction, and memory are among the many psychological variables approximately normally distributed • Although the distributions are only approximately normal, they are usually quite close. • It is easy for mathematical statisticians to work with • This means that many kinds of statistical tests can be derived for normal distributions • Almost all statistical tests discussed in this text assume normal distributions • These tests work very well even if the distribution is only approximately normal.
More Visualizations • [Figure: normal curves for three plants, labeled 3.1 years (Plant A), 3.9 years (Plant B) and 5 years (Plant C)]
Z-score • Compute probabilities using tables or a computer • Convert to z-score: z = (x - μ) / σ • Look up the cumulative probability P(Z ≤ z) in the table
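The two steps on this slide, standardizing and then looking up a cumulative probability, can be sketched as follows. The numbers (x = 115, mean 100, standard deviation 15) are hypothetical and chosen only to give a round z-score; `z_score` is an illustrative helper name.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical values, for illustration only
x, mean, sd = 115, 100, 15

z = z_score(x, mean, sd)
p_via_table = NormalDist().cdf(z)          # what the lookup table gives
p_direct = NormalDist(mean, sd).cdf(x)     # same area without standardizing

print(z, round(p_via_table, 4), round(p_direct, 4))  # 1.0 0.8413 0.8413
```

Standardizing is what lets a single table serve every normal distribution.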
Lookup Table • [Table: standard normal cumulative probabilities]
Example • Sales forecasts are assumed to follow a normal distribution • Target, or expected value, is $20M with a $3M standard deviation • What is the probability of sales lower than $15M? • What is the probability sales exceed $25M? • What is the probability sales are between $15M and $25M? • What is the value of k such that the probability that sales exceed k is 60%?
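A sketch of the sales-forecast calculations using `statistics.NormalDist` from Python's standard library in place of the lookup table (a tooling assumption; R's `pnorm` and `qnorm` would do the same job).

```python
from statistics import NormalDist

sales = NormalDist(mu=20, sigma=3)  # sales in $M

p_below_15 = sales.cdf(15)
p_above_25 = 1 - sales.cdf(25)
p_between = sales.cdf(25) - sales.cdf(15)
# P(sales > k) = 0.60 means P(sales <= k) = 0.40
k = sales.inv_cdf(1 - 0.60)

print(round(p_below_15, 4), round(p_above_25, 4))  # about 0.048 each
print(round(p_between, 4))                         # about 0.904
print(round(k, 2))                                 # about 19.24
```

The first two answers are equal because $15M and $25M are the same distance from the mean and the curve is symmetrical.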
Example • Benefits compensation costs for employees of a certain financial services firm are approximately normally distributed with a mean of $18,600 and a standard deviation of $2,700. • Find the probability that an employee chosen at random has a benefits package that costs less than $15,000 • Find the probability that an employee chosen at random has a benefits package that costs more than $21,000 • What is the value of k such that the probability that benefits compensation exceeds k is 95%?
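This example can be worked the way the lookup table requires: convert each dollar amount to a z-score first, then read the area from the standard normal curve. The sketch below mirrors that route in standard-library Python; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 18_600, 2_700
Z = NormalDist()  # standard normal, mean 0 and sd 1

z_low = (15_000 - mean) / sd    # about -1.33
z_high = (21_000 - mean) / sd   # about 0.89

p_less_than_15k = Z.cdf(z_low)
p_more_than_21k = 1 - Z.cdf(z_high)

# P(cost > k) = 0.95 means k sits at the 5th percentile
z_5th = Z.inv_cdf(0.05)
k = mean + z_5th * sd

print(round(p_less_than_15k, 4), round(p_more_than_21k, 4))
print(round(k))
```

The last step reverses the z-score formula: x = μ + zσ.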
Example • A telephone-sales firm is considering purchasing a machine that randomly selects and automatically dials telephone numbers • The firm would be using the machine to call residences during the evening; calls to business phones would be wasted. • The manufacturer of the machine claims that its programming reduces the business-phone rate to 15% • As a test, 100 phone numbers are to be selected at random from a very large set of possible numbers • Are the binomial assumptions satisfied? • Find the probability that at least 24 of the numbers belong to business phones • If in fact 24 of the 100 numbers turn out to be business phones, does that cast serious doubt on the manufacturer’s claim? • Find the expected value and standard deviation of the number of business phone numbers in the sample
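A sketch of the calculations for this example, taking the manufacturer's 15% rate as the true probability and treating the 100 numbers as independent trials (reasonable when sampling from a very large set). The exact binomial tail is computed alongside a normal approximation with a continuity correction; the approximation is an addition for comparison, not something the slide asks for.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.15  # p is the manufacturer's claimed business-phone rate

expected = n * p
sd = sqrt(n * p * (1 - p))

# Exact binomial: P(X >= 24) = 1 - P(X <= 23)
p_at_most_23 = sum(comb(n, x) * p**x * (1 - p) ** (n - x)
                   for x in range(24))
p_at_least_24 = 1 - p_at_most_23

# Normal approximation with continuity correction (for comparison)
p_approx = 1 - NormalDist(expected, sd).cdf(23.5)

print(round(expected, 2), round(sd, 3))
print(round(p_at_least_24, 4), round(p_approx, 4))
```

Both figures come out small, so observing 24 business phones would be unusual if the 15% claim were true.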
Example • Assume the stock market closed at 8,000 yesterday. • Today you expect the market to rise a mean of 1 point, with a standard deviation of 34 points. Assume a normal distribution. • What is the probability the market goes down tomorrow? • What is the probability the market goes up more than 10 points tomorrow? • What is the probability the market goes up more than 40 points tomorrow? • What is the probability the market goes up more than 60 points tomorrow? • Find the probability that the market changes by more than 20 points in either direction. • What is the value of k such that the probability that the market close exceeds k is 75%?
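A sketch of this example that models the one-day change in points as normal with mean 1 and standard deviation 34, then adds the change to the previous close of 8,000 for the last question. Modeling the change rather than the closing level is an assumption about how the slide intends the problem to be set up.

```python
from statistics import NormalDist

previous_close = 8000
change = NormalDist(mu=1, sigma=34)  # one-day change in points

p_down = change.cdf(0)
p_up_more_than = {pts: 1 - change.cdf(pts) for pts in (10, 40, 60)}
# More than 20 points in either direction: both tails
p_move_over_20 = change.cdf(-20) + (1 - change.cdf(20))
# P(close > k) = 0.75 means the change is at its 25th percentile
k_close = previous_close + change.inv_cdf(0.25)

print(round(p_down, 4))
for pts, prob in p_up_more_than.items():
    print(pts, round(prob, 4))
print(round(p_move_over_20, 4), round(k_close, 1))
```

Because the mean is 1 rather than 0, the two tails in the 20-point question are not equal and have to be computed separately.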
Using R • factorial(n) – n! • dbinom(x, n, p) – binomial probability P(X = x) • pbinom(x, n, p) – binomial cumulative probability P(X ≤ x) • pnorm(q, mean, sd) – normal cumulative probability P(X ≤ q) • qnorm(p, mean, sd) – normal inverse CDF: the value with cumulative probability p
Homework #4 • Hildebrand/Ott: • 5.2, page 141 • 5.3, page 141 • 5.9, page 150 • 5.14, page 150 • 5.32, page 163 • 5.33, page 163 • 5.34, page 163 • Reading: Chapter 6 (all) and 7 (all); Verzani 6.5