Statistics 303

Statistics 303 Chapter 4 and 1.3 Probability

Probability • The probability of an outcome is the proportion of times the outcome would occur if we repeated the procedure many times. • Examples • Coin: What is the probability of obtaining heads when flipping a coin? • A single die: What is the probability I will roll a four? • Two dice: What is the probability I will roll a four? • A jar of 30 red and 40 green jelly beans: What is the probability I will randomly select a red jelly bean? • Computer: In the past 20 times I used my computer, it crashed 4 times and didn’t crash 16 times. What is the probability my computer will crash next time I use it?

Probability • Independence: Two events are independent if the outcome of one does not affect or give an indication of the outcome of the other. • Independent: flipping a coin twice. • Dependent: temperature on consecutive days. • Dependent: 3 jelly beans (red, green, orange); eat one, then eat another.

Probability • Independence: more examples • Independent: randomly polling two individuals. • Dependent: comparing fertilizer yield for two adjacent field plots. • Independent: rolling two dice.

Probability • Definition: A sample space is a set of all the possible outcomes of a process. • Example: Coin • What is the sample space for flipping a coin 3 times? HHH TTT HHT TTH HTT THH HTH THT

Probability • Definition: An event is an outcome or set of outcomes of a process. • Example: Coin • What is one of the possible events for flipping a coin 3 times? HHT
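The sample space and an event for three coin flips can be listed mechanically. The following is a minimal sketch; the names `sample_space` and `two_heads` are illustrative and not part of the slides, and the coin is assumed to be fair so that all outcomes are equally likely.

```python
from itertools import product

# Each flip is H or T, so three flips give 2**3 = 8 outcomes.
sample_space = ["".join(flips) for flips in product("HT", repeat=3)]
print(sample_space)

# An event is a set of outcomes, e.g. "exactly two heads".
two_heads = [outcome for outcome in sample_space if outcome.count("H") == 2]

# With equally likely outcomes, Pr(event) = (# outcomes in event) / (# outcomes).
print(two_heads, len(two_heads) / len(sample_space))
```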

Probability Rules • Rule 1: The probability of any event is between 0 and 1 inclusive. • Pr(HTH) = 1/8 which is between 0 and 1. • Rule 2: The probability of the whole sample space is 1. • Pr(rolling a 1 or 2 or 3 or 4 or 5 or 6) = 1 • Rule 3: The probability of an event not occurring is 1 minus the probability of the event. This is known as the complement rule. • Pr(not rolling a 5) = 1 – 1/6 = 5/6

Probability Rules • Rule 4: If two events A and B have no outcomes in common (they are disjoint), then Pr(A or B) = Pr(A) + Pr(B) • Pr(rolling a 1 or a 6) = Pr(rolling a 1) + Pr(rolling a 6) = 1/6 + 1/6 = 2/6 or 1/3 • Rule 5: If two events A and B are independent, then Pr(A and B) = Pr(A)Pr(B) • Pr(rolling a 1 and then a 6) = Pr(rolling a 1) * Pr(rolling a 6) = (1/6)(1/6) = 1/36

Rules of Probability • Rule 1: 0 ≤ P(A) ≤ 1 • Rule 2: P(S) = 1 • Rule 3: Complement Rule: For any event A, P(A^c) = 1 – P(A) • Rule 4: Addition Rule: If A and B are disjoint events, then P(A or B) = P(A) + P(B) • Rule 5: Multiplication Rule: If A and B are independent events, then P(A and B) = P(A)P(B)
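The die examples used for Rules 2 through 5 can be checked with exact fractions. This is a small sketch assuming a fair six-sided die; the helper `pr` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction

# A fair six-sided die: every face has probability 1/6 (an assumption).
die = {face: Fraction(1, 6) for face in range(1, 7)}

def pr(event):
    """Probability of an event given as a set of faces."""
    return sum(die[face] for face in event)

print(pr({1, 2, 3, 4, 5, 6}))          # Rule 2: the whole sample space
print(1 - pr({5}))                     # Rule 3: not rolling a 5
print(pr({1}) + pr({6}), pr({1, 6}))   # Rule 4: disjoint events add
print(pr({1}) * pr({6}))               # Rule 5: a 1 and then a 6 on independent rolls
```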

Random Variables • A random variable is a variable whose value is a numerical outcome of a random phenomenon. • A discrete random variable, X, has a finite number of possible values. The probability distribution of X lists the values and their probabilities.

Discrete Probability Example
X:      1    2    3    4    5
Pr(X):  0.1  0.1  0.2  0.3  0.3
• The table above represents the distribution of a discrete random variable, X. How likely are you to get an X greater than 2? 0.2 + 0.3 + 0.3 = 0.8 • How likely are you to get TWO 3's if you take a random sample of 2 from this population? The two draws are independent, so 0.2 * 0.2 = 0.04

Discrete Probability Example • To find the mean of a discrete distribution, multiply each possible value by its probability, then add all the products. (1*0.1) + (2*0.1) + (3*0.2) + (4*0.3) + (5*0.3) = 3.6 • To find the variance of a discrete distribution, subtract the mean from each value of X, square the difference, multiply it by the corresponding probability, then add up all the products. (1-3.6)^2*0.1 + (2-3.6)^2*0.1 + (3-3.6)^2*0.2 + (4-3.6)^2*0.3 + (5-3.6)^2*0.3 = 0.676 + 0.256 + 0.072 + 0.048 + 0.588 = 1.64
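The mean and variance recipe above can be written as a short computation. This is a sketch using exact fractions for the values 1 through 5 and the probabilities used in the calculation; the variable names are illustrative.

```python
from fractions import Fraction as F

# Values of X and their probabilities, taken from the worked example.
dist = {1: F(1, 10), 2: F(1, 10), 3: F(2, 10), 4: F(3, 10), 5: F(3, 10)}

# Mean: multiply each value by its probability, then add the products.
mean = sum(x * p for x, p in dist.items())

# Variance: squared distance from the mean, weighted by probability.
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

# Pr(X > 2) and the chance of two 3's in two independent draws.
p_more_than_2 = sum(p for x, p in dist.items() if x > 2)
p_two_threes = dist[3] * dist[3]

print(float(mean), float(variance), float(p_more_than_2), float(p_two_threes))
```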

Random Variables • A continuous random variable, X, takes all values in an interval of numbers. The probability distribution is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event. • Examples are uniform, normal, left-skewed distributions, and right-skewed distributions.

Uniform Distribution Example • This is a Uniform Distribution from 0 to 1. Since the total area under the curve is 1 and the width is 1, the height is also 1. To find the probability for a given interval, find the area under the curve over that interval (width times height).
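For a uniform density the area over an interval is just a rectangle. The sketch below assumes a uniform distribution on [low, high], defaulting to [0, 1]; `uniform_prob` is a hypothetical helper name.

```python
def uniform_prob(a, b, low=0.0, high=1.0):
    """Pr(a < X < b) for X uniform on [low, high]: area = width * height."""
    height = 1.0 / (high - low)               # total area must equal 1
    left, right = max(a, low), min(b, high)   # clip the interval to the support
    return max(right - left, 0.0) * height

print(uniform_prob(0.3, 0.7))   # area of a rectangle of width 0.4 and height 1
print(uniform_prob(0.5, 2.0))   # only the part inside [0, 1] counts
```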

The Normal Distribution The mean for a normal distribution is called μ (pronounced ‘myu’), the Greek letter for m (for mean). The standard deviation for a normal distribution is called σ (pronounced ‘sigma’), the Greek letter for s (for std. dev.).

Are Canadians Taller? • A study claimed Canadians are taller than Americans. • The researchers take a sample of 30 people from Canada and find that the sample average height is 0.01 cm taller than the known national average of the USA. Do you believe the study's claim? • Why or why not? • What if they took a sample of 10,000 people? • A census? • How can we quantify how often the results we saw would happen due to sampling variability (chance)?

The Normal Distribution We often write that a variable (call it X) has a normal distribution with mean μ and variance σ^2 in the following way: X ~ N(μ, σ^2). Note that the std. dev. is still σ.

The Normal Distribution The normal distribution we have already seen is the standard normal distribution, which has mean = 0 and standard deviation = 1 (the variance = 1^2 = 1). This is also called the Z-distribution.

The Normal Distribution The distribution of any variable which is normally distributed can be converted to a standard normal (Z) distribution in the following way: Z = (X - μ)/σ. This is known as a Z-score.

The Normal Distribution • For the Standard Normal Distribution (or Z-Distribution) we can find probabilities associated with different values of Z using Z-tables.

The Normal Distribution • First we look at some general characteristics of the Z-distribution. • The area under the entire curve is 1. • The area under the curve to the left of 0 is 0.5. • We say, “The probability that Z is to the left of 0 is 0.5.” • This can be written as Pr(Z < 0) = 0.5.

The Normal Distribution • We can find the probability that Z is to the left of any number using the Z-table. • Z-tables can be found at http://stat.tamu.edu/stat30x/zttables.html • Z-tables can also be found on the inside front cover of the book. • Notice first that if we go in the table to the value z = 0.00, we see the probability is 0.5.

The Normal Distribution • We can find the probability that Z is to the left of any number using the Z-table. See the explanation called "How To Use Z Tables" at http://stat.tamu.edu/stat30x/notes.html • Let’s look at some examples: Pr(Z < 0.50) = ? Answer: 0.6915 • Pr(Z < 1.25) = ? Answer: 0.8944

The Normal Distribution • More examples of probabilities to the left of (less than) a number: Pr(Z < -3.75) = ? Answer: less than 0.0003 (about 0.0001) • Pr(Z < -2.01) = ? Answer: 0.0222

The Normal Distribution • The Z-table only gives probabilities to the left of the value. If we want probabilities to the right, we use 1 – Pr(Z < z). Pr(Z > 0.50) = ? Answer: 1 – 0.6915 = 0.3085 • Pr(Z > 1.25) = ? Answer: 1 – 0.8944 = 0.1056

The Normal Distribution • More examples of finding probabilities to the right of a number using 1 – Pr(Z < z): Pr(Z > -3.75) = ? Answer: greater than 1 – 0.0001 = 0.9999 • Pr(Z > -2.01) = ? Answer: 1 – 0.0222 = 0.9778

The Normal Distribution • To find the probability between two numbers, find the less-than (or to-the-left) probability for each number and then subtract the smaller from the larger. Pr(-2.01 < Z < 2.01) = ? Answer: 0.9778 – 0.0222 = 0.9556
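The three kinds of Z-table lookup (left of a value, right of a value, between two values) can also be done in software. The sketch below uses `statistics.NormalDist` from the Python standard library as a stand-in for the printed table; results may differ from table values in the fourth decimal place because the table is rounded.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)   # the standard normal (Z) distribution

# Left of a value: this is what the Z-table reports directly.
left = Z.cdf(1.25)

# Right of a value: 1 minus the left-hand probability.
right = 1 - Z.cdf(0.50)

# Between two values: subtract the two left-hand probabilities.
between = Z.cdf(2.01) - Z.cdf(-2.01)

print(round(left, 4), round(right, 4), round(between, 4))
```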

The Normal Distribution • Now suppose we know X ~ N(μ, σ^2) and we want to know the probability that X is less than some value. We must first convert the X to a Z and then use the probabilities from the Z-table. Recall that if X ~ N(μ, σ^2), then Z = (X - μ)/σ ~ N(0, 1). So Pr(X < x) = Pr(Z < (x – μ)/σ)

The Normal Distribution • Here’s an example. Suppose X ~ N(3, 2^2). Find the probability that X is less than 4. Pr(X < 4) = ? Z = (4 - 3)/2 = 0.5, so Pr(X < 4) = Pr(Z < 0.5) = 0.6915
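The conversion from X to Z in this example can be sketched as follows, with mean 3 and standard deviation 2 as in the example. `statistics.NormalDist` stands in for the Z-table, and the direct calculation on X is included only as a cross-check.

```python
from statistics import NormalDist

mu, sigma = 3, 2      # mean 3, standard deviation 2 (variance 2**2)
x = 4

# Step 1: convert x to a Z-score.
z = (x - mu) / sigma

# Step 2: look up the left-hand probability for that Z-score.
p_via_z = NormalDist().cdf(z)

# Cross-check: the same probability computed directly on X.
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(p_via_z, 4), round(p_direct, 4))
```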

The Normal Distribution • We will look at some more difficult examples by hand: • Suppose X ~ N(2, 3^2). • Given a value z, find the corresponding x that it came from. • How many standard deviations is x from μ? • Find Pr(X > 5). • Find Pr(X < –4 or X > 8). • Find Pr(–4 < X < 8). • Find the x* such that Pr(X < x*) = 0.8485, where 0.8485 is some probability.
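The exercises above are meant to be done by hand with the Z-table; the sketch below is one way to check the answers afterwards. It assumes mean 2 and standard deviation 3 as stated, and the helper names `z_score` and `x_from_z` are hypothetical.

```python
from statistics import NormalDist

X = NormalDist(mu=2, sigma=3)

def z_score(x):
    """How many standard deviations x lies from the mean."""
    return (x - X.mean) / X.stdev

def x_from_z(z):
    """Undo the Z-score: the x that a given z came from."""
    return X.mean + z * X.stdev

p_above_5 = 1 - X.cdf(5)            # Pr(X > 5) = Pr(Z > 1)
p_between = X.cdf(8) - X.cdf(-4)    # Pr(-4 < X < 8) = Pr(-2 < Z < 2)
p_outside = 1 - p_between           # Pr(X < -4 or X > 8), by the complement rule
x_star = X.inv_cdf(0.8485)          # the x* with Pr(X < x*) = 0.8485

print(z_score(5), x_from_z(-2))
print(round(p_above_5, 4), round(p_between, 4), round(p_outside, 4), round(x_star, 2))
```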

Normal Distribution Example • Suppose the sample proportion of 100 students who think that there is insufficient parking is normally distributed with a mean of 0.8 and a standard deviation of 0.04, i.e. p ~ N(0.8, 0.04^2). How often would we get a sample proportion of 0.75 or less? Z = (0.75 - 0.8)/0.04 = -1.25 • Pr(Z < -1.25) = 0.1056 • We would get a sample proportion of 0.75 or less 10.56% of the time.

P-value • The probability of observing values as extreme as, or more extreme than, the ones we actually observed, if the null hypothesis is true • For now, think of the null hypothesis as the statement that there is no effect or no change.

Test example • Suppose scores on a test are known to have a Normal distribution with a mean of 700 points and a variance of 25. • A new class is offered that claims that its students' test scores improve. • 16 students take the test. • They score on average 704 points on the test. • Do you believe the class's claim that test scores are improved?

Test example • How to solve the problem? • Question 1: How is the statistic x-bar distributed if the null hypothesis is true? • Why are we interested in this quantity? • N(700, 25) • N(700, 25/16) • Why?

Test Example • The test statistic x-bar is distributed N(700, 25/16) • Because we are in the sampling distribution of x-bar. We are interested in how the test scores are distributed ON AVERAGE for a class of 16 people, not for a single individual. • Step II: find how often the results will happen by chance alone. • How?

Test Example • Convert the sample mean to a Z-score: Z = (704 - 700)/sqrt(25/16) = 4/1.25 = 3.2 • Look up 3.2 on the Z-table

Test example • P(Z<3.2)=.9993 • Is this the quantity we are interested in? • Draw a picture

Test example • 1-.9993=.0007 • What does this mean? • Look up definition of p-value and answer

Test example • We will observe results as extreme as 704 or more extreme than 704 for a classroom of 16 a proportion 0.0007 of the time, or 0.07% of the time • Do you believe the class's claim? • Is this a ‘weird’ result if the null hypothesis is true?

Test example • Practical significance: Does 4 points really matter? • Would you take the class?
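The test example can be reproduced numerically. This is a sketch under the slides' assumptions (scores are Normal with mean 700 and variance 25, a class of 16, and a one-sided "improvement" claim); the variable names are illustrative, and `statistics.NormalDist` replaces the Z-table lookup.

```python
from math import sqrt
from statistics import NormalDist

mu0, variance, n, xbar = 700, 25, 16, 704

# x-bar has variance 25/16 under the null hypothesis, so its std. dev. is sqrt(25/16).
std_error = sqrt(variance / n)

# Z-score of the observed class average.
z = (xbar - mu0) / std_error

# One-sided p-value: the chance of an average this high or higher by chance alone.
p_value = 1 - NormalDist().cdf(z)

print(std_error, z, round(p_value, 4))
```

A small p-value says the result would be unusual if the class had no effect; it does not say whether a 4-point gain is large enough to matter in practice.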

Sample Question • The number of cars on a freeway each day is normally distributed with a mean of 100 and a variance of 10. • Another highway in the area is completed. The cars on the original highway are counted every day for 7 days, and the average number of cars per day is 97.8. Is there evidence that there are fewer cars on the original highway?
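One possible way to set up the sample question follows the same steps as the test example. This sketch assumes the 7 daily counts are independent, that the variance of 10 still applies, and that a one-sided (lower-tail) p-value is wanted; none of these choices is stated in the question itself.

```python
from math import sqrt
from statistics import NormalDist

mu0, variance, n, xbar = 100, 10, 7, 97.8

# Under the null hypothesis, x-bar ~ N(100, 10/7).
std_error = sqrt(variance / n)
z = (xbar - mu0) / std_error

# Lower-tail p-value: an average this low or lower if nothing changed.
p_value = NormalDist().cdf(z)

print(round(z, 2), round(p_value, 4))
```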

Law of Large Numbers • Draw independent observations at random from any population with finite mean μ. Decide how accurately you would like to estimate μ. As the number of observations drawn increases, the sample mean (x-bar) of the observed values eventually approaches the mean μ of the population as closely as you specified and then stays that close.
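A simulation can illustrate the Law of Large Numbers, though it does not prove it. The sketch below rolls a simulated fair die, whose population mean is 3.5, and prints the running sample mean at a few sample sizes; the seed 303 is arbitrary and chosen only so the run is repeatable.

```python
import random
from statistics import fmean

rng = random.Random(303)    # fixed seed, arbitrary choice, for repeatability
rolls = [rng.randint(1, 6) for _ in range(100_000)]

# Population mean of a fair die: (1 + 2 + ... + 6) / 6 = 3.5
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(fmean(rolls[:n]), 3))
```

The early sample means can wander noticeably; the later ones settle near 3.5.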

Rules for Means • Rule 1: If X is a random variable and a and b are fixed numbers, then μ_(a+bX) = a + b μ_X • Rule 2: If X and Y are random variables, then μ_(X+Y) = μ_X + μ_Y

Rules for Variances • Rule 1: If X is a random variable and a and b are fixed numbers, then σ^2_(a+bX) = b^2 σ^2_X • Rule 2: If X and Y are independent random variables, then σ^2_(X+Y) = σ^2_X + σ^2_Y and σ^2_(X-Y) = σ^2_X + σ^2_Y • Note: variances, NOT std. devs., always add. If X and Y are not independent, then a correlation term is added to or subtracted from the sum.
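The standard forms of these rules are: mean of a + bX equals a + b times the mean of X; variance of a + bX equals b squared times the variance of X; and for independent X and Y, the variances of both X + Y and X - Y equal the sum of the two variances. The sketch below checks them exactly on two independent fair dice; all helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

die = {face: Fraction(1, 6) for face in range(1, 7)}   # a fair die (assumption)

def mean(dist):
    return sum(x * p for x, p in dist.items())

def var(dist):
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())

def transform(dist, a, b):
    """Distribution of a + bX."""
    out = {}
    for x, p in dist.items():
        out[a + b * x] = out.get(a + b * x, 0) + p
    return out

def combine(dx, dy, sign=1):
    """Distribution of X + sign*Y for independent X and Y."""
    out = {}
    for (x, px), (y, py) in product(dx.items(), dy.items()):
        out[x + sign * y] = out.get(x + sign * y, 0) + px * py
    return out

shifted = transform(die, 2, 3)          # 2 + 3X
total = combine(die, die, sign=1)       # X + Y
diff = combine(die, die, sign=-1)       # X - Y

print(mean(shifted), 2 + 3 * mean(die))      # rule 1 for means
print(var(shifted), 3 ** 2 * var(die))       # rule 1 for variances
print(mean(total), var(total), var(diff))    # means add; variances add for sum and difference
```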

General Addition Rules • The union of any collection of events is the event that at least one of the collection occurs. • Addition Rule for Disjoint Events – If events A, B, and C are disjoint in the sense that no two have any outcomes in common, then P(one or more of A, B, C) = P(A)+P(B)+P(C) • General Addition Rule for the Unions of Two Events – For any two events A and B, P(A or B) = P(A) + P(B) – P(A and B)
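The general addition rule can be checked on two overlapping die events. This is a sketch assuming a fair six-sided die; the events A (even roll) and B (roll greater than 3) are illustrative choices, not from the slides.

```python
from fractions import Fraction

outcomes = set(range(1, 7))   # faces of a fair die (assumption)

def pr(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is greater than 3

lhs = pr(A | B)                      # P(A or B), counted directly
rhs = pr(A) + pr(B) - pr(A & B)      # general addition rule

print(lhs, rhs)
```

Subtracting P(A and B) removes the outcomes 4 and 6, which would otherwise be counted twice.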
