
Statistical Techniques I EXST7005


pepper

Presentation Transcript


  1. Statistical Techniques I EXST7005 Other topics - Linear models and Transformations

  2. Course Progression • Objective - Hypothesis testing Background • Transformation - • Many applications in statistics require modifying an existing distribution to a recognized statistical distribution • Particularly, tests of hypotheses require taking an observed distribution and transforming to a recognized statistical distribution.

  3. LINEAR MODELS • The simplest form of the linear additive model • $Y_i = \mu + \varepsilon_i$ for $i = 1, 2, 3, \ldots, N$ • This is a population version of the model, so the term $\mu$ is a constant, the population mean • The sample version would use $\bar{Y}$, which is a variable.

  4. LINEAR MODELS (continued) • $Y_i = \mu + \varepsilon_i$ for $i = 1, 2, 3, \ldots, N$ • $\varepsilon_i$ represents the deviations of the observations from the mean. It has a mean of zero since deviations sum to zero. • $e_i$ would be used to represent sample deviations, and of course $N$ would be changed to $n$ for a sample. • $Y_i = \bar{Y} + e_i$ for $i = 1, 2, 3, \ldots, n$
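The claim that deviations from the mean sum to zero can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the data values 2, 4, 6, 8 are borrowed from the worked example used later in the presentation, and the variable names are illustrative assumptions.

```python
from statistics import fmean

# Illustrative sample (values borrowed from the later worked example)
y = [2, 4, 6, 8]

y_bar = fmean(y)                  # sample mean, Y-bar
e = [yi - y_bar for yi in y]      # sample deviations e_i = Y_i - Y-bar
total = sum(e)                    # deviations sum to zero

# Each observation is recovered as Y_i = Y-bar + e_i
rebuilt = [y_bar + ei for ei in e]

print(y_bar, e, total)
```

Because the deviations sum to zero, their mean is also zero, which is the property the model relies on.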

  5. LINEAR MODELS (continued) • This is a mathematical representation of a population or sample. All of the analyses discussed in the Statistical Methods courses have a linear model. The models get more complex as the analysis gets more advanced. • NOTE THAT THE ERROR TERM IN THIS MODEL IS ADDITIVE. • Multiplicative models and multiplicative errors exist, but are not covered in basic statistical methods.

  6. LINEAR MODELS (continued) • Other models we will discuss this semester include • $Y_i = \mu_i + \varepsilon_i$ for t-tests • $Y_i = \mu + \tau_i + \varepsilon_i$ for ANOVA, or another form of the t-test • $Y_i = \alpha + \beta X_i + \varepsilon_i$ for Simple Linear Regression

  7. CODING and TRANSFORMATIONS • THEOREMS • If a constant "a" is added to each observation, then the mean of the data set will increase by "a" units, and the variance and standard deviation will remain unchanged • EXAMPLE: Population of size N = 4 • $Y_i$ = 2, 4, 6, 8 • $\mu = \sum Y_i / N = 20/4 = 5$ • $\sigma^2 = [\sum Y_i^2 - (\sum Y_i)^2/N] / N = (120 - 100)/4 = 5$ • $\sigma = \sqrt{5} = 2.24$

  8. CODING and TRANSFORMATIONS (continued) • now add 10 to each observation • EXAMPLE: Population size still N = 4 • $Y_i$ = 12, 14, 16, 18 • $\mu = \sum Y_i / N = 60/4 = 15$ • $\sigma^2 = [\sum Y_i^2 - (\sum Y_i)^2/N] / N = (920 - 900)/4 = 5$ • $\sigma = \sqrt{5} = 2.24$ • Notice that the mean increased by 10 and the variance and standard deviation did not change.

  9. CODING and TRANSFORMATIONS (continued) • NOTE that "a" may be either negative or positive, so we can add or subtract a constant from all values of $Y_i$. • If we took the values $Y_i$ = 12, 14, 16, 18 and subtracted 10 from each value, we would reverse the previous example. • The mean would then be ten less and the variance and standard deviation would be unchanged. • In general, when subtracting, the mean is REDUCED by the value subtracted and the variance and standard deviation remain unchanged.
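The addition theorem can be verified with the slide data. This is a small Python sketch using the standard library's population formulas (`pvariance`, `pstdev`); the variable names are illustrative and not taken from the slides.

```python
from statistics import fmean, pvariance, pstdev

y = [2, 4, 6, 8]                    # population from the example, N = 4
a = 10                              # constant added to every observation

shifted = [yi + a for yi in y]      # 12, 14, 16, 18

mean_before, mean_after = fmean(y), fmean(shifted)
var_before, var_after = pvariance(y), pvariance(shifted)
sd_before, sd_after = pstdev(y), pstdev(shifted)

# The mean moves by a; the variance and standard deviation do not move
print(mean_before, mean_after)      # mean shifts by 10
print(var_before, var_after)        # same variance
print(round(sd_before, 2), round(sd_after, 2))
```

Using a negative value for `a` gives the subtraction case: the mean drops by that amount and the dispersion measures are again unchanged.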

  10. CODING and TRANSFORMATIONS (continued) • Another theorem • If each observation $Y_i$ is multiplied by a constant "a", then • the mean of the data set is "a" times the old mean • the new variance is "$a^2$" times the old variance • the standard deviation is "a" times the old standard deviation (strictly $|a|$ times, since a standard deviation cannot be negative)

  11. CODING and TRANSFORMATIONS (continued) • EXAMPLE: using the same Population as before; N = 4 • $Y_i$ = 2, 4, 6, 8 • $\mu = 5$; $\sigma^2 = 5$; $\sigma = 2.24$

  12. CODING and TRANSFORMATIONS (continued) • let "a" be 10; so we multiply each observation by 10. • $Y_i$ = 20, 40, 60, 80 • $\mu = \sum Y_i / N = 200/4 = 50$ • which is $a\mu = 10(5) = 50$ • $\sigma^2 = [\sum Y_i^2 - (\sum Y_i)^2/N] / N = (12000 - 10000)/4 = 500$ • which is $a^2\sigma^2 = 10^2(5) = 500$ • $\sigma = \sqrt{500} = 22.4$ • which is $a\sigma = 10(2.24) = 22.4$

  13. CODING and TRANSFORMATIONS (continued) • NOTE that "a" may also be an inverse (i.e. 1/a instead of a), so we can multiply or divide all values of $Y_i$ by any nonzero constant • If we took the values $Y_i$ = 20, 40, 60, 80 and divided each $Y_i$ by 10, we would reverse the previous example. • For division by "a", the mean is divided by "a" (multiplied by 1/10 here), the variance is divided by "$a^2$" (multiplied by 1/100), and the standard deviation is divided by "a" (multiplied by 1/10)
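The multiplication theorem and its reversal by division can be checked the same way. A minimal Python sketch with the slide data follows; the variable names are illustrative assumptions.

```python
from statistics import fmean, pvariance, pstdev

y = [2, 4, 6, 8]                     # population from the example, N = 4
a = 10                               # multiplier

scaled = [a * yi for yi in y]        # 20, 40, 60, 80

mean_scaled = fmean(scaled)          # a * mean
var_scaled = pvariance(scaled)       # a**2 * variance
sd_scaled = pstdev(scaled)           # a * standard deviation

# Dividing by a reverses the coding
restored = [v / a for v in scaled]
mean_restored = fmean(restored)
var_restored = pvariance(restored)

print(mean_scaled, var_scaled, round(sd_scaled, 1))
print(mean_restored, var_restored)
```

The variance scales with the square of the constant because it is measured in squared units, while the standard deviation scales with the constant itself.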

  14. CODING and TRANSFORMATIONS (continued) • The TRANSFORMATION operations may be used in combination. • EXAMPLE: Population of size N = 3 • $Y_i$ = 10, 20, 30: $\mu = 20$; $\sigma^2 = 66.67$; $\sigma = 8.16$ • The transformation is "divide by 10 (or multiply by 1/10) and subtract 2" • $Y_i'$ = -1, 0, 1 (much easier to work with) • $\mu' = \sum Y_i' / N = 0/3 = 0$ • $(\sigma')^2 = [\sum (Y_i')^2 - (\sum Y_i')^2/N] / N = (2 - 0)/3 = 0.66667$ • $\sigma' = 0.816$

  15. CODING and TRANSFORMATIONS (continued) • To get back the original values we must reverse the transformation. • NOTE THAT ORDER IS IMPORTANT. • Above we 1) divided and then 2) subtracted. • To reverse this we must 1) add and then 2) multiply. • $\mu = 10(\mu' + 2) = 10(0 + 2) = 20$

  16. CODING and TRANSFORMATIONS (continued) • ADDITION AND SUBTRACTION DO NOT AFFECT MEASURES OF DISPERSION, so we need consider only the division • $\sigma^2 = a^2 (\sigma')^2 = 100(0.66667) = 66.667$ • $\sigma = a\sigma' = 10(0.816) = 8.16$ • Note that there is no addition or subtraction for the measures of dispersion since they were unaffected by the original transformation.
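The combined coding and its reversal can be sketched in Python as below. The data and the "divide by 10, then subtract 2" coding come from the slides; the variable names are illustrative assumptions.

```python
from statistics import fmean, pvariance, pstdev

y = [10, 20, 30]                        # population from the example, N = 3

# Coding: 1) divide by 10, then 2) subtract 2
coded = [yi / 10 - 2 for yi in y]       # -1.0, 0.0, 1.0

mean_coded = fmean(coded)               # 0
var_coded = pvariance(coded)            # 2/3
sd_coded = pstdev(coded)

# Decoding reverses the steps in the opposite order: 1) add 2, then 2) multiply by 10
mean_back = 10 * (mean_coded + 2)

# Dispersion is unaffected by the subtraction, so only the division is undone
var_back = 10**2 * var_coded
sd_back = 10 * sd_coded

print(mean_back, round(var_back, 2), round(sd_back, 2))
```

Decoding the mean in the wrong order, as `10 * mean_coded + 2`, would give 2 instead of 20, which is why the slides stress that order matters.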

  17. OTHER TRANSFORMATIONS • The logarithmic transformation was mentioned previously. • $Y_i' = \log(Y_i)$ • If we calculate statistics such as the mean with the log-transformed values, we then detransform with the antilog. • $\text{antilog}(\sum \log(Y_i)/n) = e^{\sum \log(Y_i)/n} = GM(Y_i)$ (using natural logarithms) • This results in a "geometric mean".
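A short Python sketch of the geometric mean obtained through the log transformation follows. The data values are reused from the earlier slide examples purely for illustration, and natural logarithms are assumed; `statistics.geometric_mean` is used only as a cross-check.

```python
import math
from statistics import fmean, geometric_mean

y = [2, 4, 6, 8]                        # illustrative positive data

logs = [math.log(v) for v in y]         # Y_i' = log(Y_i)
mean_log = fmean(logs)                  # mean on the log scale
gm = math.exp(mean_log)                 # detransform with the antilog

# Cross-check against the standard library's geometric mean
gm_check = geometric_mean(y)

print(round(gm, 4), round(gm_check, 4))
```

The geometric mean is never larger than the arithmetic mean of the same positive data, so here it falls below 5.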

  18. OTHER TRANSFORMATIONS (continued) • HOWEVER, note that we cannot take the logarithm of 0 (zero), so if there are zeros in the data set we must combine two transformations. One common modification is to add 1 to all observations. • USE $Y_i' = \log(Y_i + 1)$ • When detransforming, be careful to subtract 1 after taking the antilog. Order is important.
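The order of operations when detransforming can be illustrated with a few lines of Python. The data set below, including its zero, is a hypothetical example and not from the slides; natural logarithms are assumed.

```python
import math
from statistics import fmean

y = [0, 1, 3, 7]                               # hypothetical data containing a zero

transformed = [math.log(v + 1) for v in y]     # Y_i' = log(Y_i + 1)
mean_log = fmean(transformed)

# Correct order: 1) antilog, then 2) subtract 1
back = math.exp(mean_log) - 1

# Wrong order: subtracting 1 before the antilog gives a different value
wrong = math.exp(mean_log - 1)

print(round(back, 3), round(wrong, 3))
```

The transformation added 1 first and took the log second, so the reversal takes the antilog first and subtracts 1 second.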

  19. OTHER TRANSFORMATIONS (continued) • The same is true for inverses used in calculating the harmonic mean • $Y_i' = 1/Y_i$ • If we calculate the mean of the inverse-transformed values and then detransform with the inverse, we get the harmonic mean.
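The harmonic mean follows the same transform, average, detransform pattern. Here is a minimal Python sketch, again reusing the values 2, 4, 6, 8 for illustration; `statistics.harmonic_mean` serves only as a cross-check.

```python
from statistics import fmean, harmonic_mean

y = [2, 4, 6, 8]                       # illustrative positive data

inverses = [1 / v for v in y]          # Y_i' = 1 / Y_i
mean_inv = fmean(inverses)             # mean of the inverse-transformed values
hm = 1 / mean_inv                      # detransform with the inverse

# Cross-check against the standard library's harmonic mean
hm_check = harmonic_mean(y)

print(round(hm, 2), round(hm_check, 2))
```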

  20. OTHER TRANSFORMATIONS (continued) • The "Z" TRANSFORMATION - we will use this a lot • It employs the previously discussed transformations in combination • $Z_i = (Y_i - \mu) / \sigma$ • or [alternative form of the equation lost in extraction] • This transformation standardizes ANY normal distribution to the standard normal distribution with $\mu = 0$; $\sigma^2 = 1$; $\sigma = 1$

  21. OTHER TRANSFORMATIONS (continued) • This is necessary, because otherwise there are an infinite number of different normal distributions with different means and variances. • EXAMPLE: transform the data for a population of N = 4. • $Y_i$ = 2, 4, 6, 8 • initially, calculate the mean and variance • $\mu = 5$; $\sigma^2 = 5$; $\sigma = 2.24$

  22. OTHER TRANSFORMATIONS (continued) • the transformation • $Z_i$ = (2-5)/2.24, (4-5)/2.24, (6-5)/2.24, (8-5)/2.24 = -1.34, -0.45, 0.45, 1.34 • $\mu = \sum Z_i / N = 0/4 = 0$ • $\sigma^2 = \sum Z_i^2 / N = 4/4 = 1$ • NOTE: addition and subtraction can be omitted when working out the variance, since they do not affect it • $\sigma = \sqrt{1} = 1$ • We will see a lot more of the Z transformation.
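The Z transformation of the example population can be reproduced with a short Python sketch. It uses the population standard deviation (`pstdev`), matching the divisor N in the slides; the variable names are illustrative assumptions.

```python
from statistics import fmean, pvariance, pstdev

y = [2, 4, 6, 8]                         # population from the example, N = 4

mu = fmean(y)                            # 5
sigma = pstdev(y)                        # sqrt(5), about 2.24

# Z_i = (Y_i - mu) / sigma: subtract the mean, then divide by the standard deviation
z = [(yi - mu) / sigma for yi in y]

z_mean = fmean(z)                        # 0, up to floating-point error
z_var = pvariance(z)                     # 1, up to floating-point error

print([round(v, 2) for v in z])
print(round(z_mean, 10), round(z_var, 10))
```

The two observations below the mean receive negative Z values, and the standardized values have mean 0 and variance 1 regardless of the original scale.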
