
Vector geometry: A visual tool for statistics

Sylvain Chartier, Laboratory for Computational Neurodynamics and Cognition, Centre for Neural Dynamics.


Presentation Transcript


  1. Vector geometry: A visual tool for statistics. Sylvain Chartier, Laboratory for Computational Neurodynamics and Cognition, Centre for Neural Dynamics

  2. Vector geometry • How, using one vector (arrow), we can represent the concepts of mean, variance (standard deviation), normalization and standardization. • How, using two vectors, we can represent the concepts of correlation and regression.

  3. A datum (0) (16)

  4. Two data (8) (0) (16) Principle of independence of observations: perfectly opposed direction

  5. Two data (8) (16,8) (0) (0, 0) (16)

  6. Two data (16,8) (0, 0)

  7. Starting point: Zero Finish point (16,8) Starting point (0,0)

  8. Starting point: Mean Finish point x = (x1, x2) Starting point

  9. Starting point: Mean Starting point (12, 12) Finish point x = (16, 8)

  10. One group

  11. Many groups

  12. Degrees of freedom

  13. We remove the effect of the mean: we center the data. Starting point (mean) (12, 12); finish point x = (16, 8). Moving the starting point to (0, 0) gives x = (4, -4).
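The centering step on this slide can be checked with a few lines of Python. This is a minimal sketch using only the two data from the slide; the variable names are illustrative and not part of the presentation.

```python
# Centering: subtract the mean from each datum, which moves the
# starting point of the arrow from the mean (12, 12) to the origin (0, 0).
x = [16, 8]                          # the two data from the slide
mean = sum(x) / len(x)               # mean of the two data
centered = [xi - mean for xi in x]   # deviations from the mean

print("mean:", mean)
print("centered:", centered)
```

The centered vector carries the same information about variability as the original arrow, only its starting point has changed.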

  14. We remove the effect of the mean (many groups)

  15. We remove the effect of the mean (many groups)

  16. We remove the effect of the mean (many groups) • What is the real dimensionality?

  17. We remove the effect of the mean • If we have two data, we will get one dimension. • If we have three data, we will get two dimensions . . . • If we have n data, we will get n-1 dimensions. • In other words, degrees of freedom represent the true dimensionality of the data.
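One way to see why n data leave only n-1 dimensions: once the data are centered, the deviations sum to zero, so the last deviation is fully determined by the others. The sketch below illustrates this with three hypothetical data (the values are an assumption chosen for illustration, not taken from the slides).

```python
def center(data):
    """Return the deviations of each datum from the mean."""
    mean = sum(data) / len(data)
    return [d - mean for d in data]

data = [16, 8, 9]            # hypothetical data, chosen for illustration
deviations = center(data)

# Centered deviations always sum to zero, so knowing the first n-1
# of them is enough to recover the last one.
last_from_others = -sum(deviations[:-1])

print("deviations:", deviations)
print("last deviation recovered from the others:", last_from_others)
```

With three data, only two deviations are free to vary, which is the geometric meaning of two degrees of freedom.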

  18. Variance

  19. What is the difference between these three groups (composed of two data each)? • Length (distance) • The higher the variability, the longer the length will be. (2.5, -2.5) (1.5, -1.5) (-0.5, 0.5)
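The three centered groups on this slide can be compared by computing the length of each arrow. This is a small sketch; the group labels `g1`, `g2`, `g3` are hypothetical names added for illustration.

```python
import math

# The three centered groups from the slide (two data each).
groups = {
    "g1": (2.5, -2.5),
    "g2": (1.5, -1.5),
    "g3": (-0.5, 0.5),
}

# Length of each arrow from the origin (Euclidean distance).
lengths = {name: math.hypot(a, b) for name, (a, b) in groups.items()}

for name, length in lengths.items():
    print(name, round(length, 3))
```

The group whose data lie farthest from the mean produces the longest arrow.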

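  20. What is the difference between these three groups? • How do we measure the length (distance)? • Pythagoras • Hypotenuse of a triangle • $? = \sqrt{4^2 + 3^2} = \sqrt{25} = 5$ [Figure: right triangle with sides 4 and 3; the hypotenuse, marked ?, runs to the point (4,3) and has length 5]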

  21. What is the difference between these three groups? Therefore, the point (4,3) is at a distance of 5 from its starting point. • Squared length = sum of squares = variance × (n-1)
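The link between length, sum of squares and variance can be verified numerically. The sketch below first checks the (4,3) triangle, then reuses the data (16, 8) from the earlier slides; `statistics.variance` is the standard-library sample variance, which divides by n-1.

```python
import math
import statistics

# Pythagoras: the point (4, 3) is at distance 5 from the origin.
print("length of (4, 3):", math.hypot(4, 3))

# Squared length of the centered vector = sum of squares.
x = [16, 8]
n = len(x)
mean = sum(x) / n
centered = [xi - mean for xi in x]
sum_of_squares = sum(c * c for c in centered)

# Sample variance (divides by n-1), so variance * (n-1) gives back
# the sum of squares.
variance = statistics.variance(x)

print("sum of squares:", sum_of_squares)
print("variance * (n-1):", variance * (n - 1))
```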

  22. What is the difference between these three groups? What is the length of these three lines? • A) (1): length 1 • B) (1, 1): length $\sqrt{2}$ • C) (1, 1, 1): length $\sqrt{3}$ • The dimensionality inflates the variability. • To have a measure that takes the dimensionality into account, what do we need to do?

  23. What is the difference between these three groups? • We divide the squared length of the data set by its true dimensionality: variance = sum of squares / (n-1) = (quadratic) distance (from the mean) corrected by the (true) dimensionality of the data.
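The two ideas above (length grows with the number of dimensions, and dividing by the true dimensionality corrects for it) can be sketched as follows. The data set and the function names `squared_length` and `variance_by_geometry` are assumptions introduced for illustration.

```python
import math
import statistics

def squared_length(v):
    """Sum of squared components of a vector."""
    return sum(c * c for c in v)

# A step of 1 along each axis: the length grows with the number
# of dimensions even though every component is the same.
unit_lengths = {dims: math.sqrt(squared_length([1] * dims)) for dims in (1, 2, 3)}

def variance_by_geometry(data):
    """Squared length of the centered data divided by n-1."""
    mean = sum(data) / len(data)
    centered = [d - mean for d in data]
    return squared_length(centered) / (len(data) - 1)

data = [2, 4, 4, 6]   # hypothetical data, chosen for illustration

print(unit_lengths)
print(variance_by_geometry(data), statistics.variance(data))
```

The geometric computation and the standard-library sample variance should agree up to rounding.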

  24. Normalization and standardization

  25. Normalization vs Standardization • To normalize is to bring a given centered vector x (arrow; mean = 0) to a length of 1. • Normalization: $z = x / \lVert x \rVert$ (x divided by its length) • $z^T z = 1$ • Standardization: $z_x = x / SD$ • $z_x^T z_x = n-1$ • $\Rightarrow z_x = z\sqrt{n-1}$
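The relations between normalization and standardization can be checked numerically. This is a minimal sketch with hypothetical data; it assumes SD is the sample standard deviation (dividing by n-1), which is what `statistics.stdev` computes.

```python
import math
import statistics

def dot(a, b):
    """Inner product of two vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

x = [16, 8, 9]                         # hypothetical data
n = len(x)
mean = sum(x) / n
centered = [xi - mean for xi in x]     # mean = 0

# Normalization: divide the centered vector by its length.
length = math.sqrt(dot(centered, centered))
z = [c / length for c in centered]

# Standardization: divide the centered vector by the sample SD.
sd = statistics.stdev(x)
zx = [c / sd for c in centered]

print("z'z   =", dot(z, z))            # close to 1
print("zx'zx =", dot(zx, zx))          # close to n-1
print("zx    =", zx)
print("z*sqrt(n-1) =", [zi * math.sqrt(n - 1) for zi in z])
```

The two rescalings differ only by the constant factor $\sqrt{n-1}$, so they point in the same direction.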

  26. Two groups

  27. One group of three participants

  28. Two groups of three participants

  29. Two groups of three participants • They can be represented by a plane

  30. Two groups of three participants • They can be represented by a plane

  31. Two groups of three participants • They can be represented by a plane

  32. Two groups of three participants • They can be represented by a plane • This is true whatever the number of participants

  33. Correlation and regression

  34. Relation between two vectors • If two groups (u and v) have the same data, then the two vectors are superposed on each other. • As the two vectors diverge from each other, the angle between them increases.

  35. Relation between two vectors • If the angle reaches 90 degrees, then they share nothing in common.

  36. Relation between two vectors • The cosine of the angle is the coefficient of correlation
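The statement that the cosine of the angle is the correlation holds for centered vectors, and can be sketched directly. The data and helper names below are assumptions introduced for illustration.

```python
import math

def center(data):
    """Return the deviations of each datum from the mean."""
    mean = sum(data) / len(data)
    return [d - mean for d in data]

def dot(a, b):
    """Inner product of two vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

u = [1, 2, 3, 4, 5]   # hypothetical group u
v = [2, 1, 4, 3, 5]   # hypothetical group v

# Correlation = cosine of the angle between the centered vectors.
r = cosine(center(u), center(v))
angle = math.degrees(math.acos(r))

print("correlation (cosine):", r)
print("angle in degrees:", angle)

# Identical data: the vectors are superposed (cosine close to 1).
print("same data:", cosine(center(u), center(u)))
# Perpendicular centered vectors: nothing in common (cosine 0).
print("perpendicular:", cosine([1, -1, 0], [1, 1, -2]))
```

For these hypothetical data the cosine is approximately 0.8, an angle of roughly 37 degrees.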

  37. Relation between two vectors • Regression: b • The shortest distance is the one that meets the vector u at 90° (labelled e in the figure)

  38. Relation between two vectors • Regression: The formula to obtain the regression coefficients can be obtained directly from the geometry • By substitution, we can isolate the b1 coefficient. • If we generalize to any situation (multiple, multivariate)
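The 90° condition can be turned into a small numerical check. The sketch below assumes that b is the coefficient that scales u and that e is the leftover vector v - b·u; this reading of the figure labels, the data and the helper names are assumptions, not taken from the slides.

```python
import math

def center(data):
    """Return the deviations of each datum from the mean."""
    mean = sum(data) / len(data)
    return [d - mean for d in data]

def dot(a, b):
    """Inner product of two vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

u = center([1, 2, 3, 4, 5])   # hypothetical predictor, centered
v = center([2, 1, 4, 3, 5])   # hypothetical outcome, centered

# Project v onto u: the coefficient that makes the leftover
# vector perpendicular to u.
b = dot(u, v) / dot(u, u)
e = [vi - b * ui for ui, vi in zip(u, v)]

def residual_length(coef):
    """Length of v - coef*u for an arbitrary coefficient."""
    return math.sqrt(sum((vi - coef * ui) ** 2 for ui, vi in zip(u, v)))

print("b:", b)
print("u . e (close to 0):", dot(u, e))
print("length at b:", residual_length(b))
print("length at b + 0.1:", residual_length(b + 0.1))
print("length at b - 0.1:", residual_length(b - 0.1))
```

Any coefficient other than b gives a longer leftover vector, which is why the perpendicular drop is the shortest distance.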
