Multiple Regression


  1. Multiple Regression EPP 245 Statistical Analysis of Laboratory Data

  2. Cystic Fibrosis Data
      Lung function data for cystic fibrosis patients (7-23 years old).
      age     numeric vector. Age in years.
      sex     numeric vector code. 0: male, 1: female.
      height  numeric vector. Height (cm).
      weight  numeric vector. Weight (kg).
      bmp     numeric vector. Body mass (% of normal).
      fev1    numeric vector. Forced expiratory volume.
      rv      numeric vector. Residual volume.
      frc     numeric vector. Functional residual capacity.
      tlc     numeric vector. Total lung capacity.
      pemax   numeric vector. Maximum expiratory pressure.

  3. Some Stata Commands
      . insheet using "cystfibr.csv"
      (11 vars, 25 obs)
      . graph matrix age sex height weight bmp fev1 rv frc tlc pemax
      . graph export cystfibr-scm.wmf
      . regress pemax age sex height weight bmp fev1 rv frc tlc
      . rvfplot
      . graph export cystfibr-rvf.wmf

  4. [Figure not preserved in the transcript]

  5. Regression output, full model
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  9,    15) =    2.93
             Model |  17101.3907     9  1900.15452           Prob > F      =  0.0320
          Residual |  9731.24928    15  648.749952           R-squared     =  0.6373
      -------------+------------------------------           Adj R-squared =  0.4197
             Total |    26832.64    24  1118.02667           Root MSE      =  25.471
      ------------------------------------------------------------------------------
             pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |   -2.54196   4.801699    -0.53   0.604    -12.77654    7.692618
               sex |  -3.736782   15.45982    -0.24   0.812    -36.68861    29.21505
            height |  -.4462549   .9033548    -0.49   0.628     -2.37171      1.4792
            weight |   2.992816   2.007957     1.49   0.157    -1.287044    7.272675
               bmp |  -1.744944   1.155237    -1.51   0.152    -4.207274    .7173865
              fev1 |   1.080697   1.080947     1.00   0.333    -1.223288    3.384682
                rv |    .196972   .1962136     1.00   0.331    -.2212474    .6151915
               frc |  -.3084314   .4923899    -0.63   0.540    -1.357936    .7410729
               tlc |   .1886017   .4997351     0.38   0.711    -.8765585    1.253762
             _cons |   176.0582   225.8911     0.78   0.448    -305.4174    657.5338
      ------------------------------------------------------------------------------

  6. T-test of additional value of variable (same regression output as slide 5)
      Each row's t statistic and P>|t| test the additional value of that variable, given the other variables in the model.
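The per-variable t-test can be checked by hand from the printed output. The sketch below is an illustration in Python rather than Stata, and is not part of the original slides; it uses the `age` row of the full model and assumes only the usual relationships t = coefficient / standard error and interval = coefficient ± critical value × standard error. The variable names are made up for the example.

```python
# Values copied from the "age" row of the full-model output (15 residual df).
coef = -2.54196
std_err = 4.801699
ci_low, ci_high = -12.77654, 7.692618

# t statistic: the estimate divided by its standard error.
t_stat = coef / std_err

# The interval is coef +/- t_crit * std_err, so the printed interval
# lets us back out the critical value that was used.
t_crit = (ci_high - ci_low) / (2 * std_err)

print(f"t = {t_stat:.2f}")
print(f"implied critical value = {t_crit:.3f}")
```

The implied critical value is the two-sided 5% point of a t distribution with 15 degrees of freedom, which is why every interval in that table is about 2.13 standard errors wide on each side.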

  7. Test of whole model (same regression output as slide 5)
      F(9, 15) = 2.93, Prob > F = 0.0320.
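The summary statistics in the upper part of the output all follow from the sums of squares and degrees of freedom. The following is a small Python sketch (an illustration added here, not taken from the slides) that recomputes them for the full model; the variable names are hypothetical, and the p-value is left out because it needs an F distribution function that the standard library does not provide.

```python
import math

# Sums of squares and degrees of freedom from the full-model output.
n = 25
ss_model, df_model = 17101.3907, 9
ss_resid, df_resid = 9731.24928, 15
ss_total = ss_model + ss_resid

ms_model = ss_model / df_model      # mean square for the model
ms_resid = ss_resid / df_resid      # mean square error

f_stat = ms_model / ms_resid        # whole-model F statistic
r2 = ss_model / ss_total            # R-squared
adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid  # adjusted R-squared
root_mse = math.sqrt(ms_resid)      # Root MSE

print(f"F({df_model}, {df_resid}) = {f_stat:.2f}")
print(f"R-squared = {r2:.4f}, Adj R-squared = {adj_r2:.4f}")
print(f"Root MSE = {root_mse:.3f}")
```

The same arithmetic applies to each reduced model: only the sums of squares and the degrees of freedom change.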

  8. [Figure not preserved in the transcript]

  9. Least significant variable (same regression output as slide 5)
      sex, with P>|t| = 0.812.

  10. . regress pemax age height weight bmp fev1 rv frc tlc
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  8,    16) =    3.49
             Model |  17063.4886     8  2132.93607           Prob > F      =  0.0159
          Residual |  9769.15144    16  610.571965           R-squared     =  0.6359
      -------------+------------------------------           Adj R-squared =  0.4539
             Total |    26832.64    24  1118.02667           Root MSE      =   24.71
      ------------------------------------------------------------------------------
             pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |  -2.114515   4.330841    -0.49   0.632    -11.29549    7.066459
            height |   -.394836    .851725    -0.46   0.649    -2.200412     1.41074
            weight |   2.834909   1.841995     1.54   0.143    -1.069947    6.739765
               bmp |  -1.741637   1.120651    -1.55   0.140    -4.117312     .634038
              fev1 |    1.26509   .7429407     1.70   0.108    -.3098737    2.840054
                rv |   .1779046   .1742911     1.02   0.323    -.1915759    .5473852
               frc |  -.2483218   .4122804    -0.60   0.555    -1.122317    .6256736
               tlc |   .2084044   .4782484     0.44   0.669    -.8054369    1.222246
             _cons |   153.0385   198.7149     0.77   0.452    -268.2183    574.2953
      ------------------------------------------------------------------------------
      Least significant variable: tlc (P>|t| = 0.669)

  11. . regress pemax age height weight bmp fev1 rv frc
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  7,    17) =    4.16
             Model |  16947.5458     7  2421.07798           Prob > F      =  0.0077
          Residual |  9885.09416    17  581.476127           R-squared     =  0.6316
      -------------+------------------------------           Adj R-squared =  0.4799
             Total |    26832.64    24  1118.02667           Root MSE      =  24.114
      ------------------------------------------------------------------------------
             pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |  -2.663193   4.043832    -0.66   0.519    -11.19493    5.868546
            height |  -.4895733   .8036502    -0.61   0.550    -2.185127    1.205981
            weight |   3.155659   1.647815     1.92   0.072    -.3209274    6.632245
               bmp |  -1.962543   .9753332    -2.01   0.060    -4.020316    .0952305
              fev1 |   1.247861   .7239953     1.72   0.103    -.2796361    2.775357
                rv |   .1595988   .1650733     0.97   0.347    -.1886753    .5078729
               frc |  -.1764595    .368749    -0.48   0.638    -.9544518    .6015328
             _cons |   198.2942   165.3311     1.20   0.247    -150.5238    547.1123
      ------------------------------------------------------------------------------
      Least significant variable: frc (P>|t| = 0.638)

  12. . regress pemax age height weight bmp fev1 rv
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  6,    18) =    5.04
             Model |  16814.3899     6  2802.39832           Prob > F      =  0.0034
          Residual |  10018.2501    18  556.569447           R-squared     =  0.6266
      -------------+------------------------------           Adj R-squared =  0.5022
             Total |    26832.64    24  1118.02667           Root MSE      =  23.592
      ------------------------------------------------------------------------------
             pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |  -1.819342   3.560301    -0.51   0.616    -9.299258    5.660573
            height |  -.4101508   .7693006    -0.53   0.600    -2.026391     1.20609
            weight |   2.874434   1.506126     1.91   0.072    -.2898203    6.038688
               bmp |  -1.949083   .9538193    -2.04   0.056    -3.952983    .0548169
              fev1 |   1.411959   .6238279     2.26   0.036     .1013452    2.722573
                rv |   .0955779   .0946057     1.01   0.326    -.1031813    .2943371
             _cons |   166.9049   148.4762     1.12   0.276    -145.0321    478.8418
      ------------------------------------------------------------------------------
      Least significant variable: age (P>|t| = 0.616)

  13. . regress pemax height weight bmp fev1 rv
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  5,    19) =    6.23
             Model |  16669.0534     5  3333.81068           Prob > F      =  0.0014
          Residual |  10163.5866    19   534.92561           R-squared     =  0.6212
      -------------+------------------------------           Adj R-squared =  0.5215
             Total |    26832.64    24  1118.02667           Root MSE      =  23.128
      ------------------------------------------------------------------------------
             pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            height |  -.4485274   .7505918    -0.60   0.557    -2.019534    1.122479
            weight |   2.338692   1.060094     2.21   0.040     .1198889    4.557495
               bmp |  -1.641001   .7246036    -2.26   0.035    -3.157614   -.1243885
              fev1 |   1.471767   .6007182     2.45   0.024     .2144491    2.729084
                rv |    .110117   .0884543     1.24   0.228      -.07502     .295254
             _cons |   137.0958   133.8559     1.02   0.319    -143.0677    417.2594
      ------------------------------------------------------------------------------
      Least significant variable: height (P>|t| = 0.557)

  14. . regress pemax weight bmp fev1 rv
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  4,    20) =    7.96
             Model |  16478.0401     4  4119.51002           Prob > F      =  0.0005
          Residual |  10354.5999    20  517.729996           R-squared     =  0.6141
      -------------+------------------------------           Adj R-squared =  0.5369
             Total |    26832.64    24  1118.02667           Root MSE      =  22.754
      ------------------------------------------------------------------------------
             pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            weight |   1.748914   .3806332     4.59   0.000     .9549274    2.542901
               bmp |  -1.377243   .5653421    -2.44   0.024    -2.556526   -.1979604
              fev1 |   1.547698   .5776112     2.68   0.014     .3428223    2.752574
                rv |   .1257152   .0831456     1.51   0.146    -.0477234    .2991538
             _cons |    63.9467   53.27673     1.20   0.244    -47.18661      175.08
      ------------------------------------------------------------------------------
      Least significant variable: rv (P>|t| = 0.146)

  15. . regress pemax weight bmp fev1
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  3,    21) =    9.28
             Model |  15294.4519     3  5098.15064           Prob > F      =  0.0004
          Residual |  11538.1881    21  549.437528           R-squared     =  0.5700
      -------------+------------------------------           Adj R-squared =  0.5086
             Total |    26832.64    24  1118.02667           Root MSE      =   23.44
      ------------------------------------------------------------------------------
             pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            weight |   1.536475   .3644235     4.22   0.000     .7786149    2.294335
               bmp |  -1.465406   .5792906    -2.53   0.019    -2.670106    -.260705
              fev1 |   1.108629   .5143694     2.16   0.043     .0389396    2.178319
             _cons |   126.3336   34.71986     3.64   0.002     54.12965    198.5375
      ------------------------------------------------------------------------------

  16. . stepwise, pr(.05): regress pemax age sex height weight bmp fev1 rv frc tlc
                            begin with full model
      p = 0.8123 >= 0.0500  removing sex
      p = 0.6688 >= 0.0500  removing tlc
      p = 0.6384 >= 0.0500  removing frc
      p = 0.6156 >= 0.0500  removing age
      p = 0.5572 >= 0.0500  removing height
      p = 0.1462 >= 0.0500  removing rv
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  3,    21) =    9.28
             Model |  15294.4519     3  5098.15064           Prob > F      =  0.0004
          Residual |  11538.1881    21  549.437528           R-squared     =  0.5700
      -------------+------------------------------           Adj R-squared =  0.5086
             Total |    26832.64    24  1118.02667           Root MSE      =   23.44
      ------------------------------------------------------------------------------
             pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              fev1 |   1.108629   .5143694     2.16   0.043     .0389396    2.178319
            weight |   1.536475   .3644235     4.22   0.000     .7786149    2.294335
               bmp |  -1.465406   .5792906    -2.53   0.019    -2.670106    -.260705
             _cons |   126.3336   34.71986     3.64   0.002     54.12965    198.5375
      ------------------------------------------------------------------------------

  17. . stepwise, pr(.1) pe(.05): regress pemax age sex height weight bmp fev1 rv frc tlc
                            begin with full model
      p = 0.8123 >= 0.1000  removing sex
      p = 0.6688 >= 0.1000  removing tlc
      p = 0.6384 >= 0.1000  removing frc
      p = 0.6156 >= 0.1000  removing age
      p = 0.5572 >= 0.1000  removing height
      p = 0.1462 >= 0.1000  removing rv
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  3,    21) =    9.28
             Model |  15294.4519     3  5098.15064           Prob > F      =  0.0004
          Residual |  11538.1881    21  549.437528           R-squared     =  0.5700
      -------------+------------------------------           Adj R-squared =  0.5086
             Total |    26832.64    24  1118.02667           Root MSE      =   23.44
      ------------------------------------------------------------------------------
             pemax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              fev1 |   1.108629   .5143694     2.16   0.043     .0389396    2.178319
            weight |   1.536475   .3644235     4.22   0.000     .7786149    2.294335
               bmp |  -1.465406   .5792906    -2.53   0.019    -2.670106    -.260705
             _cons |   126.3336   34.71986     3.64   0.002     54.12965    198.5375
      ------------------------------------------------------------------------------

  18. Cautionary Notes
      • The significance levels are not necessarily believable after variable selection.
      • The original full-model F-statistic is significant, indicating that there is some significant relationship: F(9,15) = 2.93, p = 0.0320.
      • After variable selection, F(3,21) = 9.28, p = 0.0004. This p-value is biased, because the three predictors were chosen using the same data that are used to test them.

  19. set obs 25
      generate x1 = invnormal(uniform())
      generate x2 = invnormal(uniform())
      generate x3 = invnormal(uniform())
      generate x4 = invnormal(uniform())
      generate x5 = invnormal(uniform())
      generate x6 = invnormal(uniform())
      generate x7 = invnormal(uniform())
      generate x8 = invnormal(uniform())
      generate x9 = invnormal(uniform())
      generate y = invnormal(uniform())
      regress y x1 x2 x3 x4 x5 x6 x7 x8 x9
      stepwise, pr(.1): regress y x1 x2 x3 x4 x5 x6 x7 x8 x9

  20. . regress y x1 x2 x3 x4 x5 x6 x7 x8 x9
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  9,    15) =    0.91
             Model |  12.3235639     9  1.36928488           Prob > F      =  0.5397
          Residual |  22.5105993    15  1.50070662           R-squared     =  0.3538
      -------------+------------------------------           Adj R-squared = -0.0340
             Total |  34.8341632    24  1.45142347           Root MSE      =   1.225
      ------------------------------------------------------------------------------
                 y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                x1 |  -.0441858   .2998066    -0.15   0.885    -.6832085     .594837
                x2 |  -.9078136   .4347798    -2.09   0.054    -1.834525    .0188976
                x3 |   .2076754   .3789522     0.55   0.592    -.6000421    1.015393
                x4 |  -.0056383   .3319125    -0.02   0.987    -.7130931    .7018166
                x5 |   -.330546   .3854497    -0.86   0.405    -1.152113    .4910207
                x6 |   .0202964   .3470704     0.06   0.954    -.7194666    .7600594
                x7 |   -.073401   .3135234    -0.23   0.818    -.7416603    .5948583
                x8 |  -.0552909   .3026913    -0.18   0.858    -.7004621    .5898803
                x9 |  -.3190092   .3137931    -1.02   0.325    -.9878434     .349825
             _cons |  -.2490392   .3078424    -0.81   0.431    -.9051898    .4071113
      ------------------------------------------------------------------------------

  21. . stepwise, pr(.1): regress y x1 x2 x3 x4 x5 x6 x7 x8 x9
                            begin with full model
      p = 0.9867 >= 0.1000  removing x4
      p = 0.9545 >= 0.1000  removing x6
      p = 0.8456 >= 0.1000  removing x1
      p = 0.8165 >= 0.1000  removing x7
      p = 0.7506 >= 0.1000  removing x8
      p = 0.5023 >= 0.1000  removing x3
      p = 0.2866 >= 0.1000  removing x5
      p = 0.2081 >= 0.1000  removing x9
            Source |       SS       df       MS              Number of obs =      25
      -------------+------------------------------           F(  1,    23) =    7.23
             Model |  8.33379862     1  8.33379862           Prob > F      =  0.0131
          Residual |  26.5003646    23  1.15218977           R-squared     =  0.2392
      -------------+------------------------------           Adj R-squared =  0.2062
             Total |  34.8341632    24  1.45142347           Root MSE      =  1.0734
      ------------------------------------------------------------------------------
                 y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                x2 |  -.6644002   .2470417    -2.69   0.013    -1.175445   -.1533555
             _cons |  -.1523124    .214703    -0.71   0.485    -.5964594    .2918346
      ------------------------------------------------------------------------------
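The single simulated run above shows one pure-noise predictor ending up "significant" after selection. To see how often that happens, the experiment can be repeated many times. The sketch below is a simplified Python stand-in, not the Stata `stepwise` procedure from the slides: as an assumption made to keep it short, it picks the single noise predictor with the largest |t| in a simple regression of y on that predictor, and counts how often that winner passes the nominal two-sided 5% test on 23 degrees of freedom. The function names, the seed, and the number of repetitions are all hypothetical choices.

```python
import math
import random


def corr(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_abs_t(rng, n_obs=25, n_pred=9):
    """Largest |t| among simple regressions of noise y on each noise x."""
    y = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_pred):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = corr(x, y)
        # Slope t statistic in simple regression, written in terms of r.
        t = r * math.sqrt((n_obs - 2) / (1 - r * r))
        best = max(best, abs(t))
    return best


T_CRIT = 2.069  # two-sided 5% critical value of t with 23 df
rng = random.Random(245)  # fixed seed so the run is repeatable
n_sims = 2000

hits = sum(best_abs_t(rng) > T_CRIT for _ in range(n_sims))
false_positive_rate = hits / n_sims
print(f"selected noise predictor 'significant' in {false_positive_rate:.1%} of runs")
```

If the nine tests were independent, the chance that at least one passes at the 5% level would be about 1 - 0.95^9, or roughly 0.37, so a rate far above the nominal 5% is expected. This is the sense in which p-values reported after variable selection are biased.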
