
Vector Projections






Presentation Transcript


  1. Vector Projections 90˚

  2. Matrix Operation: Inverse Matrix. Important for solving a set of linear equations is the matrix operation that defines the inverse of a matrix. X⁻¹: the inverse matrix of X, with X⁻¹X = I, where I is the identity matrix: all entries on the diagonal are 1, all others 0 (shown here for a 3 x 3 matrix).

  3. Matrix Operation: Inverse Matrix (continued). Important for solving a set of linear equations is the matrix operation that defines the inverse of a matrix: X⁻¹, with X⁻¹X = I, where I is the identity matrix. Not all matrices have an inverse, and there is no simple rule for calculating the entries of an inverse matrix! We skip the formal mathematical aspects and note here only the important fact: for symmetric square matrices such as covariance matrices or correlation matrices, the inverse exists.
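
  A minimal sketch in R (the example matrix is hypothetical, chosen only to be symmetric and invertible): solve() returns the inverse, and multiplying it with the original matrix reproduces the identity.

    # Sketch: computing an inverse matrix in R for a hypothetical symmetric 3 x 3 matrix
    X <- matrix(c(4, 2, 1,
                  2, 3, 0,
                  1, 0, 2), nrow = 3, byrow = TRUE)
    Xinv <- solve(X)        # solve(X) returns the inverse X^-1
    round(Xinv %*% X, 10)   # reproduces the 3 x 3 identity matrix I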

  4. Summary: Simple Linear Regression and Principal Component Analysis

  5. Summary for a 2-dimensional sample space. Simple Linear Regression: minimizes the summed squared errors (measured in the vertical direction between the fitted regression line and the observed data points). Principal Component Analysis: finds the direction of the vector that maximizes the variance of the data projected onto this vector.
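
  As a reminder of the projection idea behind PCA (a sketch in standard notation, not taken from the slides), the orthogonal projection of a data vector onto a direction vector v is

  \[
  \mathrm{proj}_{\vec{v}}(\vec{x}) \;=\; \frac{\vec{x}\cdot\vec{v}}{\vec{v}\cdot\vec{v}}\,\vec{v}
  \]

  and the residual \(\vec{x} - \mathrm{proj}_{\vec{v}}(\vec{x})\) is perpendicular (90˚) to \(\vec{v}\), which is the right angle referred to on the title slide.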

  6. Regression analysis in R. Simple linear regression in R: the call res <- lm(y ~ x) calculates the linear regression line. It also returns a number of useful statistical measures of the quality of the regression line.
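
  A minimal sketch with synthetic data (the vectors x and y here are hypothetical, not from the lecture):

    set.seed(1)
    x <- 1:50
    y <- 2 + 0.5 * x + rnorm(50, sd = 3)   # synthetic data with a known linear relationship
    res <- lm(y ~ x)                        # fit the regression line
    summary(res)                            # intercept, slope and further test statistics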

  7. Regression line using res$fitted
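
  A sketch of the corresponding plot, continuing the synthetic example above; res$fitted holds the fitted y-values:

    plot(x, y, main = "Simple linear regression")
    lines(x, res$fitted, col = "red", lwd = 2)   # fitted regression line over the data points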

  8. Residuals (errors): res$residuals. Remember: we assumed that the errors are uncorrelated with the ‘predictor’ variable x. It is recommended to check that the errors themselves do NOT show an organized structure when plotted over x.
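
  A sketch of this check for the synthetic example:

    plot(x, res$residuals, ylab = "residuals")   # residuals should scatter randomly around zero
    abline(h = 0, lty = 2)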

  9. Histogram of residuals (errors): hist(res$residuals). Remember: we assumed that the errors are uncorrelated with the ‘predictor’ variable x. It is also recommended to check whether the errors follow a Gaussian (bell-shaped) distribution. Note: the function fgauss() is defined in myfunctions.R [call source("scripts/myfunctions.R")].
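
  fgauss() comes with the course's myfunctions.R; a rough base-R substitute (an assumption, not the course function) overlays a Gaussian density on the histogram:

    hist(res$residuals, freq = FALSE, main = "Histogram of residuals")
    curve(dnorm(x, mean = mean(res$residuals), sd = sd(res$residuals)),
          add = TRUE, col = "blue")   # Gaussian with the residuals' mean and standard deviation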

  10. Linear regression statistics. When applying linear regression, a number of test statistics are calculated by R's lm() function: the regression parameter (the slope of the regression line); its statistical significance (the smaller the p-value, the higher the significance of the linear relationship, slope > 0); and the correlation coefficient between the fitted y-values and the observed y-values.
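
  A sketch of how to pull these statistics out of the fitted object (continuing the synthetic example):

    s <- summary(res)
    s$coefficients         # estimates, standard errors, t-values and p-values (slope in row "x")
    s$r.squared            # squared correlation between fitted and observed y-values
    cor(res$fitted, y)^2   # the same quantity computed directly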

  11. Linear Regression: use linear regression with caution! Outliers can have a large effect and suggest a linear relationship where there is none! The influence of single outlier observations can be tested. The sample space is important! If you only observed x and y in a limited range or a subdomain of the sample space, …

  12. Linear Regression: the danger of using linear regression! Outliers can have a large effect and suggest a linear relationship where there is none! The influence of single outlier observations can be tested. The sample space is important! If you only observed x and y in a limited range or a subdomain of the sample space, extrapolation can give misleading results.
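
  A small sketch of the outlier warning, using made-up data:

    set.seed(2)
    x <- rnorm(30)
    y <- rnorm(30)              # x and y are unrelated
    coef(lm(y ~ x))             # slope close to 0
    x[31] <- 10; y[31] <- 10    # add one extreme outlier
    coef(lm(y ~ x))             # slope is now strongly positive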

  13. Multiple Linear Regression. The model relates a predictand (e.g. Albany Airport temperature anomalies) to a set of predictors plus a random error (noise). Predictors can be, for example: temperatures from nearby stations; indices of large-scale climate modes like the El Niño Southern Oscillation or the North Atlantic Oscillation; or prescribed time-dependent functions like a linear trend, periodic oscillations, or polynomials. Source: http://reliawiki.org/index.php/Multiple_Linear_Regression_Analysis (figures retrieved April 2014)
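
  The equation the slide labels refer to was shown as a figure; in standard form (a reconstruction, not copied from the slide) it reads

  \[
  y_i \;=\; \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + \epsilon_i , \qquad i = 1, \dots, n
  \]

  where \(y_i\) is the predictand, \(x_{i1}, \dots, x_{ik}\) are the k predictor values for observation i, the \(\beta\)'s are the regression coefficients, and \(\epsilon_i\) is the random error. A constant (intercept) term can be included by using a predictor that is identically 1.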

  14. Multiple Linear Regression. Write one such linear equation for each observation in the sample (e.g. for each year of temperature observations). Source: http://reliawiki.org/index.php/Multiple_Linear_Regression_Analysis (figures retrieved April 2014)

  15. Multiple Linear Regression. Or, in short matrix notation (see the sketch below). Source: http://reliawiki.org/index.php/Multiple_Linear_Regression_Analysis (figures retrieved April 2014)
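
  The matrix form was shown as a figure; reconstructed in standard notation (consistent with the dimensions listed on the next slide):

  \[
  \vec{y} \;=\; X\,\vec{\beta} + \vec{\epsilon}
  \]

  with \(\vec{y}\) the n x 1 vector of predictand observations, X the n x k matrix of predictor values, \(\vec{\beta}\) the k x 1 vector of regression coefficients, and \(\vec{\epsilon}\) the n x 1 error vector.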

  16. Multiple Linear Regression. Sizes of the vectors / matrices: y is n x 1, X is n x k, β is k x 1, and the error vector is n x 1. The mathematical problem we need to solve is: given all observations of the predictand (stored in the vector y) and the predictor variables (stored in the matrix X), we want to find simultaneously a proper scaling factor for each predictor variable, such that the fitted (estimated) values minimize the sum of the squared errors. Source: http://reliawiki.org/index.php/Multiple_Linear_Regression_Analysis (figures retrieved April 2014)

  17. Multiple Linear Regression. In the least-squares solution we find the covariance matrix (scaled by n) of the predictor variables; the ‘-1’ indicates another fundamentally important matrix operation: the inverse of a matrix. It also contains the covariance (scaled by n) of all predictors with the predictand. The sizes of the vectors / matrices are given in the reconstruction below. Source: http://reliawiki.org/index.php/Multiple_Linear_Regression_Analysis (figures retrieved April 2014)
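
  The solution formula itself was shown as a figure; the standard least-squares (normal-equation) solution is

  \[
  \hat{\vec{\beta}} \;=\; \bigl(X^{\mathsf T} X\bigr)^{-1}\, X^{\mathsf T}\,\vec{y}
  \]

  with the dimension bookkeeping from the slide: \(X^{\mathsf T} X\) is (k x n)(n x k) = k x k, so its inverse is k x k; \(X^{\mathsf T}\vec{y}\) is (k x n)(n x 1) = k x 1; hence \(\hat{\vec{\beta}}\) is k x 1. Here \(X^{\mathsf T} X\) is (up to the factor n) the covariance matrix of the predictors and \(X^{\mathsf T}\vec{y}\) the covariance of the predictors with the predictand, so inverting \(X^{\mathsf T} X\) is exactly the inverse-matrix operation introduced on slides 2-3.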

  18. Multiple Linear Regression. The resulting k x 1 matrix (i.e. vector) contains a proper scaling factor for each predictor. In other words: multiple linear regression is a weighted sum of the predictors (after conversion into units of the predictand y). Source: http://reliawiki.org/index.php/Multiple_Linear_Regression_Analysis (figures retrieved April 2014)

  19. Example: Multiple Linear Regression with 2 predictors. The scatter cloud shows a linear dependence of the values in y along the two predictor dimensions x1 and x2.
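
  A sketch of such a two-predictor regression on synthetic data, also checking that lm() agrees with the normal-equation solution from slide 17 (all variable names here are hypothetical):

    set.seed(5)
    n  <- 100
    x1 <- rnorm(n); x2 <- rnorm(n)
    y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)   # known coefficients plus noise
    res2 <- lm(y ~ x1 + x2)                  # multiple linear regression in R
    coef(res2)

    X <- cbind(1, x1, x2)                    # design matrix with a column of ones for the intercept
    solve(t(X) %*% X, t(X) %*% y)            # normal-equation solution: same coefficients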

  20. Tips for multiple linear regression (MLR). General rule: work with as few predictors as possible (every time you add a new predictor you increase the risk of over-fitting the model). Observe how well the fitted values and the observed values match (correlation). Choose predictors that provide independent information about the predictand. The problem of collinearity: if the predictors are all highly correlated among each other, then the MLR can become very ambiguous (because it gets harder to calculate accurately the inverse of the covariance matrix); see the sketch below. Last but not least: the regression coefficients from the MLR are not ‘unique’. If you add or remove one predictor, all regression coefficients can change.
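
  A small sketch of the collinearity problem with two nearly identical, made-up predictors; the condition number of XᵀX explodes, so its inverse becomes numerically unstable:

    set.seed(3)
    x1 <- rnorm(100)
    x2 <- x1 + rnorm(100, sd = 0.01)   # x2 is almost a copy of x1
    X  <- cbind(1, x1, x2)
    cor(x1, x2)                        # correlation close to 1
    kappa(crossprod(X))                # very large condition number of X^T X (nearly singular)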

  21. Global Sea Surface Temperatures: Principal Component Analysis. From voluntary ship observations; colors show the percentage of months with at least one observation in a 2 by 2 degree grid box. From a paper in Annual Review of Marine Science (2010).

  22. Global Sea Surface Temperatures: Principal Component Analysis. Climatology 1982-2008; red areas mark the regions with the highest SST variability.

  23. Global Sea Surface Temperatures: Principal Component Analysis (PCA), also known as Empirical Orthogonal Function (EOF) analysis. The first (leading) eigenvector: the eigenvectors now form a geographic pattern. Grid boxes with large positive values and large negative values covary out of phase (negative correlation). Green regions show small variations in this eigenvector #1. The principal component is a time series showing the temporal evolution of the SST variations. This mode is associated with the El Niño - Southern Oscillation.
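
  A sketch of how such a PCA/EOF analysis could be run in R on a (time x grid point) anomaly matrix; the data here are a random stand-in, not the SST data from the slides:

    set.seed(4)
    data <- matrix(rnorm(120 * 50), nrow = 120, ncol = 50)   # 120 time steps, 50 grid points
    pca  <- prcomp(data, center = TRUE, scale. = FALSE)
    eof1 <- pca$rotation[, 1]          # first eigenvector (EOF): the spatial pattern
    pc1  <- pca$x[, 1]                 # first principal component: the time series
    summary(pca)$importance[2, 1:3]    # fraction of variance explained by the leading modes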
