
Statistical Techniques I EXST7005


fionan

Presentation Transcript


  1. You are still here Statistical Techniques I EXST7005 The Z distribution

  2. Objective • The first statistical distribution we will use and develop for hypothesis testing is the Z distribution. • This will require an understanding of the distribution, • of how to work with the distribution, • and of the Z transformation. • Fortunately, other statistical distributions will be similar. Once these techniques are learned, they apply readily to other statistical tests and applications.

  3. THE Z - TRANSFORMATION • Purpose: transforms values from any normal population to the corresponding values from the Standard Normal Distribution. • This distribution is N(μ = 0, σ = 1). • Zi = (Yi - μ) / σ

  4. THE Z - TRANSFORMATION (continued) • where: • μ = the mean of the original population • σ = the standard deviation of the original population • Yi = the value of an observation from the original population • Zi = the corresponding value from a Standard Normal Distribution
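A minimal sketch of the transformation Zi = (Yi - μ) / σ in Python. The function name `z_transform` and the population values (mean 100, standard deviation 15) are made up for illustration and do not come from the slides.

```python
def z_transform(y, mu, sigma):
    """Return Z = (Y - mu) / sigma for a single observation Y."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (y - mu) / sigma


# Hypothetical population: mean 100, standard deviation 15.
z = z_transform(130, mu=100, sigma=15)
print(z)  # 2.0, i.e. the observation is two standard deviations above the mean
```

An observation equal to the population mean always transforms to Z = 0, whatever the values of μ and σ.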

  5. THE Z - TRANSFORMATION (continued) • The purpose of this transformation is to deal with the infinite number of possible normal curves with different values of μ and σ by standardizing any normal curve so we can work with a single distribution. • We will then work with this standardized distribution from a TABLE of Z values. • This will tie together much of what we have discussed (frequency and probability concepts, transformations, use of means, variances and standard deviations, and their calculations).

  6. PROBABILITY statements • "Typical" PROBABILITY statements are of the form • P[Z ≤ Z0] = r.c.f. at Z0 • where Z0 is a hypothesized value. • [Figure: two sketches of the Z curve marking 0 and Z0, one for Z < 0 and one for Z > 0]

  7. PROBABILITY statements (continued) • We will need tables to work with the Z distribution. You should have those tables available. Your book has Z-tables and I have a copy of mine on the internet. • The Z distribution is exactly symmetric. As a result, the negative half (below zero) is a mirror image of the positive half. • So our tables will give only half of the distribution.

  8. PROBABILITY statements (continued) • To work with these half tables, it is important to note that • P[Z ≥ 0] = P[Z ≤ 0] = 0.5 since half of the distribution is above 0 and half is below • P[Z ≤ -Z0] = P[Z ≥ +Z0] since the distribution is symmetric • P[Z ≤ Z0] = 1 - P[Z ≥ Z0]
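The three half-table rules can be checked numerically. This is only a sketch: it assumes Python's standard-library `statistics.NormalDist` in place of a printed table, and the helper name `upper_tail` and the test value 1.25 are arbitrary choices for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def upper_tail(z0):
    """P[Z >= z0], the quantity a one-sided Z table reports."""
    return 1.0 - Z.cdf(z0)


z0 = 1.25  # arbitrary illustrative value

# Rule 1: half of the distribution lies on each side of zero.
print(upper_tail(0.0), Z.cdf(0.0))

# Rule 2: symmetry, P[Z <= -z0] = P[Z >= +z0].
print(Z.cdf(-z0), upper_tail(z0))

# Rule 3: complement, P[Z <= z0] = 1 - P[Z >= z0].
print(Z.cdf(z0), 1.0 - upper_tail(z0))
```

Each printed pair should agree up to floating-point rounding.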

  9. TABULATED Z DISTRIBUTION • [Figure: Z curve, axis from -4 to 4. Table gives positive values only.] • My tables are patterned after Steel & Torrie, 1980, page 578 • The table in the text is "ONE-SIDED". Only 1 side is required due to symmetry.

  10. TABULATED Z DISTRIBUTION (continued) • Values along the LEFT SIDE and TOP of the Z table give the value of Z; values in the body of the table are the probabilities of randomly choosing a larger Z value. • For example, take Z = 0.11. What proportion of the distribution lies above this value? Or, what is the probability of picking a Z value at random and it being larger than 0.11?

  11. TABULATED Z DISTRIBUTION (continued) • Only the first 6 rows and columns of the table are shown here, plus the row and column headings. Complete tables are available on the Internet linked to the departmental STATLAB page.

        Z      0.00    0.01    0.02    0.03    0.04    0.05
        0.00   0.5000  0.4960  0.4920  0.4880  0.4840  0.4801
        0.10   0.4602  0.4562  0.4522  0.4483  0.4443  0.4404
        0.20   0.4207  0.4168  0.4129  0.4090  0.4052  0.4013
        0.30   0.3821  0.3783  0.3745  0.3707  0.3669  0.3632
        0.40   0.3446  0.3409  0.3372  0.3336  0.3300  0.3264
        0.50   0.3085  0.3050  0.3015  0.2981  0.2946  0.2912
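The table excerpt above can be regenerated in software. The sketch below assumes Python's standard-library `statistics.NormalDist`; the names `rows`, `cols`, `table` and `upper_tail` are illustrative only. Each entry is the upper-tail area P(Z ≥ row + column), rounded to four decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

rows = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]          # integer part and first decimal of Z
cols = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]    # second decimal of Z


def upper_tail(z0):
    """P(Z >= z0)."""
    return 1.0 - Z.cdf(z0)


# Body of the table: upper-tail probability for Z = row + column.
table = {(r, c): round(upper_tail(r + c), 4) for r in rows for c in cols}

print("Z     " + "  ".join(f"{c:.2f}  " for c in cols))
for r in rows:
    print(f"{r:.2f}  " + "  ".join(f"{table[(r, c)]:.4f}" for c in cols))
```

Looking up Z = 0.11 means reading `table[(0.1, 0.01)]`, the entry in row 0.10 and column 0.01.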

  12. Reading the Z tables • Values at the SIDE and TOP of the Z table give the value of Z, e.g. to find Z = 0.11, • read the integer portion and first decimal place (0.1) along the left side, and • find the second decimal place (0.01) along the top.

  13. Reading the Z tables • The intersection of this row and column gives the probability, in this case P(Z ≥ 0.11) = 0.4562 • Note that the value of Z = 0.00 has a probability of 0.5, so half of the distribution is above this value (and half below).

  14. Working with Z tables • [Figure: Z curve, axis from -4 to 4, with Z = 0.11 marked] • What did we just do? • We found the area under the curve above a value of Z = 0.11

  15. Working with Z tables • [Figure: Z curve, axis from -4 to 4] • The values in the available tables always give the r.c.f. of the upper area of the curve. • What if we want to work with the lower half of the curve?

  16. Working with Z tables • [Figure: Z curve, axis from -4 to 4, with Z = -0.11 marked] • Due to symmetry in the distribution, the probability of a randomly selected value falling in the negative area to the left is the same as that of the corresponding positive area, so • P(Z ≥ 0.11) = P(Z ≤ -0.11)

  17. Working with Z tables • Some things we know from previous discussions. • P(Z ≥ 0) = P(Z ≤ 0) = 0.5 • The probability that a randomly selected Z falls between the limits -1 and +1 is about 68%, and half of the remainder falls in each of the tails (about 16%). Since σ = 1 for the standard normal, we should have about 16% above 1, and 16% below -1. Looking this up in the table we see P(Z ≥ 1) = 0.1587.

  18. Working with Z tables • The probability that a randomly selected Z falls between the limits -1.96 and +1.96 is about 95%, and half of the remainder falls in each of the tails (about 2.5%). Since σ = 1 for the standard normal, we should have about 2.5% above 1.96, and 2.5% below -1.96. Looking this up in the table we see P(Z ≥ 1.96) = 0.0250, and P(Z ≤ -1.96) would be the same. • A memorable value.

  19. Working with Z tables • The probability that a randomly selected Z falls between the limits -2.576 and +2.576 is about 99%, and half of the remainder falls in each of the tails (about 0.5%). Since σ = 1 for the standard normal, we should have about 0.5% above 2.576, and 0.5% below -2.576. Attempting to look this up in the table we see that 2.576 does not occur exactly.

  20. Working with Z tables • 2.576 does not occur exactly, but • P(Z ≥ 2.57) = P(Z ≤ -2.57) = 0.0051 • P(Z ≥ 2.58) = P(Z ≤ -2.58) = 0.0049 • So the true value is somewhere between 2.57 and 2.58; to four decimal places it turns out that • P(Z ≥ 2.576) = P(Z ≤ -2.576) = 0.0050 • "In between" values would normally be determined by interpolation. Exact values can be obtained from various software packages, including SAS and EXCEL.
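As a sketch of the interpolation step, the two bracketing table entries for 2.57 and 2.58 can be interpolated linearly and compared with a software value. Python's standard-library `statistics.NormalDist` is used here only as one example of such software (the slides mention SAS and EXCEL); the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Upper-tail table entries on either side of the target probability.
z_lo, p_lo = 2.57, 0.0051
z_hi, p_hi = 2.58, 0.0049
target = 0.0050

# Linear interpolation between the two tabulated points.
z_interp = z_lo + (p_lo - target) / (p_lo - p_hi) * (z_hi - z_lo)

# Software value: the Z0 for which P(Z >= Z0) = 0.005.
z_software = Z.inv_cdf(1.0 - target)

print(round(z_interp, 3))    # 2.575
print(round(z_software, 3))  # 2.576
```

The interpolated value differs slightly from the software value because the table entries are themselves rounded to four decimals.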

  21. Working with Z tables • Note: On an exam, if a value does not occur exactly, I will accept either of the two limits on either side of the correct value, or anything in between.

  22. Working with Z tables • A few more examples. • Find P(Z ≥ 1.35). This is an area in the upper half of the distribution (since Z is positive) so we can read it directly from the Z tables. P(Z ≥ 1.35) = 0.0885 • Find P(Z ≤ -2.22). This is an area in the lower half of the distribution, but due to symmetry P(Z ≤ -2.22) = P(Z ≥ 2.22), so we can use the upper half of the table that we have available. P(Z ≤ -2.22) = 0.0132

  23. More Working with Z tables • [Figure: Z curve, axis from -4 to 4] • What about problems that do not ask for the area in the upper or lower tail? For example, P(Z ≤ 1.30). This value is in the upper half of the table, but the probability requested is for randomly chosen Z values LESS THAN OR EQUAL to it, so the area extends into the lower half of the distribution!

  24. More Working with Z tables • [Figure: Z curve, axis from -4 to 4, with areas 0.9032 below 1.30 and 0.0968 above 1.30] • To solve this problem you must recall that the total area under the curve adds to 1. To find P(Z ≤ 1.30), we first find P(Z ≥ 1.30) and subtract it from 1. • P(Z ≤ 1.30) = 1 - P(Z ≥ 1.30) = 1 - 0.0968 = 0.9032.

  25. Even trickier Z distribution problems • [Figure: Z curve, axis from -4 to 4] • Find P(Z ≥ -0.65). • Now we are looking for the probability of a value greater than or equal to a value on the negative side of the distribution.

  26. Even trickier Z distribution problems • [Figure: Z curve, axis from -4 to 4] • Now we combine 2 rules: the area defined by P(Z ≥ -0.65) will be equal to 1 - P(Z ≤ -0.65), and P(Z ≤ -0.65) from the lower part of the distribution will be equal to P(Z ≥ 0.65) from the upper half of the distribution. This is the area our tables give.

  27. Even trickier Z distribution problems • [Figure: Z curve, axis from -4 to 4] • From our tables we first find P(Z ≥ 0.65) = 0.2578 = P(Z ≤ -0.65) due to symmetry, and • P(Z ≥ -0.65) = 1 - P(Z ≤ -0.65) = 1 - 0.2578 = 0.7422

  28. Even trickier Z distribution problems • [Figure: Z curve, axis from -4 to 4] • It is strongly advisable to always sketch the problem, and to see if the answer makes sense. In this case we can see from the sketch that the desired area is over half of the total area, so the answer should be greater than 0.5, and of course it was (P(Z ≥ -0.65) = 0.7422).
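The rules used in these examples (read directly, use symmetry, or take the complement) can be collected into two small helpers. This is a sketch under stated assumptions: `table_lookup` stands in for the printed half table and is computed with Python's standard-library `statistics.NormalDist`; the names `table_lookup`, `prob_ge` and `prob_le` are hypothetical.

```python
from statistics import NormalDist

_Z = NormalDist()  # standard normal


def table_lookup(z0):
    """Stand-in for the printed half table: P(Z >= z0), for z0 >= 0 only."""
    if z0 < 0:
        raise ValueError("the half table only covers z0 >= 0")
    return 1.0 - _Z.cdf(z0)


def prob_ge(z0):
    """P(Z >= z0) for any z0, using only the half table."""
    if z0 >= 0:
        return table_lookup(z0)        # read directly from the table
    # P(Z >= -a) = 1 - P(Z <= -a) = 1 - P(Z >= a)
    return 1.0 - table_lookup(-z0)


def prob_le(z0):
    """P(Z <= z0) for any z0, using only the half table."""
    if z0 <= 0:
        return table_lookup(-z0)       # symmetry: P(Z <= -a) = P(Z >= a)
    return 1.0 - table_lookup(z0)      # complement: total area is 1


print(round(prob_le(1.30), 4))   # 0.9032
print(round(prob_ge(-0.65), 4))  # 0.7422
print(round(prob_le(-2.22), 4))  # 0.0132
```

As with the sketching advice, a quick sanity check is that any area covering more than half of the curve must come out above 0.5.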

  29. A few Extra Examples • 1) P(Z ≥ 3.50) = ? • 2) P(Z ≤ -2.00) = ? • 3) P(Z ≥ 0.00) = ? • 4) P(Z ≤ 1.64) = P(Z ≥ -1.64) = ? • 5) P(Z ≤ 1.96) = P(Z ≥ -1.96) = ?

  30. A few Extra Examples • 1) P(Z ≥ 3.50) = ? Read from the table • 2) P(Z ≤ -2.00) = ? Read from the table, but for the upper (positive) end • 3) P(Z ≥ 0.00) = ? Read from the table • 4) P(Z ≤ 1.64) = P(Z ≥ -1.64) = ? This is not in the table. Use 1 - P(Z ≥ 1.64) • 5) P(Z ≤ 1.96) = P(Z ≥ -1.96) = ? This is not in the table. Use 1 - P(Z ≥ 1.96)

  31. Two-tailed problems and "area in the middle" problems • A common type of problem is to determine the area between two limits, or to determine the area in the tails outside some specific limits.

  32. Two-tailed problems and "area in the middle" problems • Probability expressions for these problems will take the form P(Z1 ≤ Z ≤ Z2) = ?

  33. Two-tailed problems and "area in the middle" problems • If the problem is symmetric, then -Z1 = Z2; call the limit Z0 and we can write this as P(|Z| ≤ Z0).

  34. Two-tailed problems and "area in the middle" problems • Probability expressions for areas in the tails will take the form P(|Z| ≥ Z0) if the problem is symmetric. If not, we write two expressions, P(Z ≤ Z1) or P(Z ≥ Z2).

  35. Two-tailed problems and "area in the middle" problems (continued) • [Figure: Z curve, axis from -4 to 4] • Problems involving sections may not be symmetric, and may occur entirely in the positive tail or the negative tail.

  36. Some Examples • [Figure: Z curve, axis from -4 to 4] • P(|Z| ≥ Z0), where Z0 = 1.96. Note that since we are taking the absolute value of a randomly chosen Z value, that value may be either positive or negative, provided its absolute value is greater than or equal to 1.96.

  37. Some Examples • [Figure: Z curve, axis from -4 to 4] • P(|Z| ≥ Z0), where Z0 = 1.96. • P(|Z| ≥ Z0) = P(Z ≤ -Z0) + P(Z ≥ Z0), and since it is symmetric • P(|Z| ≥ Z0) = 2*P(Z ≥ Z0) = 2(0.0250) = 0.050

  38. Some Examples • [Figure: Z curve, axis from -4 to 4] • P(|Z| ≤ Z0), where Z0 = 2.576. This problem is similar to the previous one, but it describes the area in the middle, between two limits. • P(|Z| ≤ Z0) = 1 - P(|Z| ≥ Z0) = 1 - 2*P(Z ≥ Z0) = 1 - 2(0.0050) = 0.99

  39. An asymmetric case • [Figure: Z curve, axis from -4 to 4] • P(-1.96 ≤ Z ≤ 2.576) = ? This is the area in the middle, the total minus the two tails. We already know these tails. • P(-1.96 ≤ Z ≤ 2.576) = 1 - P(Z ≤ -1.96) - P(Z ≥ 2.576) = 1 - 0.0250 - 0.0050 = 0.97
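The two-tailed and "area in the middle" examples can be reproduced with two short functions. This is a sketch assuming Python's standard-library `statistics.NormalDist`; the function names `two_tail_area` and `middle_area` are illustrative and not part of the course material.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal


def two_tail_area(z0):
    """P(|Z| >= z0) for z0 >= 0: twice the upper tail, by symmetry."""
    return 2.0 * (1.0 - Z.cdf(z0))


def middle_area(lower, upper):
    """P(lower <= Z <= upper): the total area minus the two tails."""
    lower_tail = Z.cdf(lower)         # P(Z <= lower)
    upper_tail = 1.0 - Z.cdf(upper)   # P(Z >= upper)
    return 1.0 - lower_tail - upper_tail


print(round(two_tail_area(1.96), 3))         # 0.05
print(round(middle_area(-2.576, 2.576), 2))  # 0.99
print(round(middle_area(-1.96, 2.576), 2))   # 0.97
```

The asymmetric case needs no special handling: each tail is simply computed from its own limit.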

  40. Working the tables backward & forward • Finding a value of Z0 when a probability is known. • Basically, we find the value of the probability in the body of our Z table, and determine the corresponding value of Z0.

  41. Working the tables backward & forward (continued) • [Figure: Z curve, axis from -4 to 4] • P(Z ≤ Z0) = 0.1587, find the value of Z0 • This probability is a value less than 0.5, so it is a tail and can be solved directly from our tables. We only need to find 0.1587 in the table and determine the corresponding value of Z0.

  42. Working the tables backward & forward (continued) • P(Z ≤ Z0) = 0.1587, find the value of Z0 • The value in the table occurs in the row corresponding to "1.0" and the column corresponding to "0.00". • Finally, note that the randomly chosen Z was to be LESS THAN or equal to Z0, so we are in the lower tail. Z0 = -1.00

  43. Working the tables backward & forward (continued) • [Figure: Z curve, axis from -4 to 4] • P(Z ≤ Z0) = 0.8413, find the value of Z0 • This probability is a value greater than 0.5. To read it from our tables we must determine the corresponding tail.

  44. Working the tables backward & forward (continued) • P(Z ≤ Z0) = 0.8413, find the value of Z0 • The corresponding tail would be given by 1 - 0.8413 = 0.1587. So this is the same as the value we just looked up; it occurs in the row corresponding to "1.0" and the column corresponding to "0.00". • Finally, note that the randomly chosen Z was to be LESS THAN or equal to Z0, but since the probability was larger than 0.5, Z0 must lie on the positive side of the distribution. Z0 = +1.00
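Working the table backward corresponds to evaluating the inverse of the cumulative distribution. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` (its `inv_cdf` method returns the Z0 with P(Z ≤ Z0) equal to the given probability); the helper name `z0_for_lower_area` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal


def z0_for_lower_area(p):
    """Return Z0 such that P(Z <= Z0) = p, for 0 < p < 1."""
    return Z.inv_cdf(p)


# A probability below 0.5 puts Z0 in the lower (negative) half.
print(round(z0_for_lower_area(0.1587), 2))  # -1.0

# A probability above 0.5 puts Z0 in the upper (positive) half.
print(round(z0_for_lower_area(0.8413), 2))  # 1.0
```

The sign of the answer follows the same reasoning as the table method: below 0.5 gives a negative Z0, above 0.5 a positive one.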

  45. Working the tables backward & forward (continued) • [Figure: Z curve, axis from -4 to 4, with Z0 marked somewhere above 1] • P(1 ≤ Z ≤ Z0) = 0.1, find the value of Z0 • First note that the lower limit is 1.00, so we are working with the upper tail of the Z distribution.

  46. Working the tables backward & forward (continued) • [Figure: Z curve, axis from -4 to 4, with areas 0.8413, 0.1 and 0.0587 marked] • P(1 ≤ Z ≤ Z0) = 0.1, find the value of Z0 • We can find the probability that Z is ≤ 1 as 1 - P(Z ≥ 1) = 1 - 0.1587 = 0.8413 • The area of interest is stated to have an area of 0.1, so the upper tail would be what is left, 1 - 0.8413 - 0.1 = 0.0587

  47. Working the tables backward & forward (continued) • [Figure: Z curve, axis from -4 to 4, with areas 0.8413, 0.1 and 0.0587 marked] • P(1 ≤ Z ≤ Z0) = 0.1, find the value of Z0 • We find 0.0587 in our Z tables and note that it is between 1.56 and 1.57. I would accept either answer. The actual value, determined by the EXCEL "NORMDIST" function, is Z0 = 1.565781531.
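The same three steps (area below the lower limit, leftover upper tail, reverse lookup) can be sketched in code. This assumes Python's standard-library `statistics.NormalDist` rather than EXCEL, and the variable names are illustrative. Two lookups are shown: one carrying full precision throughout, and one starting from the four-decimal table value 0.0587, so the last digits of the two results can differ slightly.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

lower_limit = 1.0
middle_area = 0.1  # P(1 <= Z <= Z0)

# Step 1: area below the lower limit, P(Z <= 1).
area_below = Z.cdf(lower_limit)

# Step 2: what is left over in the upper tail, P(Z >= Z0).
upper_tail = 1.0 - area_below - middle_area

# Step 3: reverse lookup, the Z0 with that much area above it.
z0 = Z.inv_cdf(1.0 - upper_tail)

# Same reverse lookup starting from the rounded table value.
z0_from_table_value = Z.inv_cdf(1.0 - 0.0587)

print(round(upper_tail, 4))  # 0.0587
print(z0, z0_from_table_value)
```

Both results fall between the tabulated limits 1.56 and 1.57.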

  48. Working the tables backward & forward (continued) • P(|Z| ≥ Z0) = 0.05, find the value of Z0 • Note that the absolute value of the random Z is greater than or equal to Z0, so we are examining an area in the tails, where the random Z may be either positive or negative and its absolute value will equal or exceed Z0.

  49. Working the tables backward & forward (continued) • P(|Z| ≥ Z0) = 0.05, find the value of Z0 • Recall that our tables give only one tail, and this particular case has 0.05 in TWO TAILS, so we want to determine the value of Z0 for only one of those tails. • Each tail would have half of the 0.05 since it is a symmetric problem. So the area in the tail is 0.05/2 = 0.025.

  50. Working the tables backward & forward (continued) • [Figure: Z curve with -1.96 and 1.96 marked] • P(|Z| ≥ Z0) = 0.05, find the value of Z0 • We can examine the tables and determine that for a probability of 0.025 the Z value is 1.96, so Z0 = 1.96. Some people like to write Z0 = ±1.96, but this isn't really necessary.
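The two-tailed reverse lookup splits the total tail area in half and then finds the Z0 for one tail. A minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the function name `two_tailed_z0` is made up for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal


def two_tailed_z0(total_tail_area):
    """Return Z0 >= 0 such that P(|Z| >= Z0) = total_tail_area."""
    one_tail = total_tail_area / 2.0   # symmetric problem: half in each tail
    return Z.inv_cdf(1.0 - one_tail)   # Z0 with `one_tail` area above it


print(round(two_tailed_z0(0.05), 2))  # 1.96
print(round(two_tailed_z0(0.01), 3))  # 2.576
```

The function returns only the positive limit; the negative limit is its mirror image, which is why writing ±1.96 is optional.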
