
Some Basics of Matrix Calculus
Dr.-Ing. Stefan Hinz, Remote Sensing Technology, Technische Universität München



1. Some Basics of Matrix Calculus
   1 Matrices in general
   2 Eigen-decomposition, principal component analysis
   3 Singular value decomposition
   Dr.-Ing. Stefan Hinz, Remote Sensing Technology, Technische Universität München

2. 1 Matrices in general
• Definition
• Types
• Rank
• Determinant
• Trace
• Matrix operations
• Geometric interpretation

3. Matrices – Definition
• Matrix: system of elements with m rows and n columns, A = (a_ij), i = 1, …, m; j = 1, …, n
• Vector: matrix with m = 1 or n = 1
• Notation: matrix A; column vector x; row vector x'
• Equality of two matrices (or vectors) of equal dimensions: A = B if a_ij = b_ij for all i, j

4. Matrices – Types (I)
• Rectangular (general) matrix: arbitrary m-by-n
• Square matrix: m = n; the elements a_ii form the main diagonal
• Symmetric matrix: A' = A, i.e. a_ij = a_ji
• Skew-symmetric matrix: A' = -A, i.e. a_ij = -a_ji

5. Matrices – Types (II)
• Upper triangular matrix: a_ij = 0 for i > j (analogue: lower triangular matrix, a_ij = 0 for i < j)
• Diagonal matrix: a_ij = 0 for i ≠ j
• Unity matrix / identity matrix I: a_ij = 1 for i = j, a_ij = 0 for i ≠ j
• Zero matrix: all elements equal 0
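
As a small illustration (my addition, not part of the original slides), the special matrix types above can be constructed directly with NumPy:

```python
import numpy as np

A = np.arange(1.0, 10.0).reshape(3, 3)   # general 3-by-3 matrix

U = np.triu(A)            # upper triangular part (a_ij = 0 for i > j)
L = np.tril(A)            # lower triangular part (a_ij = 0 for i < j)
D = np.diag(np.diag(A))   # diagonal matrix built from the main diagonal
I = np.eye(3)             # identity matrix
Z = np.zeros((3, 3))      # zero matrix

S = 0.5 * (A + A.T)       # symmetric part:      S' =  S
K = 0.5 * (A - A.T)       # skew-symmetric part: K' = -K
```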

6. Rank
• Rank r = rg A: number of linearly independent rows or columns, with r ≤ min(m, n)
• Example: a matrix in which every row is a multiple of one row has only 1 linearly independent row and 1 linearly independent column, i.e. r = 1
• NB: A square matrix with linearly dependent rows (and columns) is called singular; otherwise non-singular.
• A matrix always has the same maximum number of linearly independent rows and linearly independent columns (w/o proof).
• The difference d between the number of rows n and the rank r is called the rank defect: d = n - r
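
A brief NumPy sketch (my addition, with a made-up matrix) illustrating rank and rank defect of a singular matrix:

```python
import numpy as np

# Every row is a multiple of (1, 2, 3) -> only one independent row/column.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0]])

r = np.linalg.matrix_rank(A)   # numerical rank (computed via SVD)
d = A.shape[0] - r             # rank defect d = n - r for a square matrix

print(r, d)                    # 1 2
```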

7. Determinant
• Determinant d = det A; d is a scalar; det A is defined for square matrices only
• 2-by-2 matrix: det A = a11 a22 - a12 a21
• 3-by-3 matrix (rule of Sarrus): det A = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 - a13 a22 a31 - a11 a23 a32 - a12 a21 a33
• n-by-n matrix:
  - Generalization of the 2-by-2 or 3-by-3 case is NOT possible!
  - Expansion into sub-determinants involves many calculations
  => Standard procedure:
  - Transform A into an upper triangular matrix B with a determinant-preserving algorithm (e.g. Gauss elimination), i.e. det B = det A
  - Calculate det B as the product of the diagonal elements: det B = ∏ b_ii
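
A small sketch (my addition) of the procedure described above, using SciPy's LU factorization to obtain a triangular factor; the permutation matrix accounts for the sign changes caused by row exchanges:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[4.0, 2.0, 1.0],
              [2.0, 5.0, 3.0],
              [1.0, 3.0, 6.0]])

P, L, U = lu(A)                                   # A = P @ L @ U, L unit lower triangular
det_A = np.linalg.det(P) * np.prod(np.diag(U))    # det P = +/-1 from the row swaps

print(det_A, np.linalg.det(A))                    # both approx. 67.0
```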

8. Trace
• Trace of a square matrix: sum of the main diagonal elements, tr A = Σ_i a_ii

9. Matrix operations (I)
• Transpose of matrix A => A'
  - Exchange rows and columns of the matrix: (A')_ij = a_ji
  - For symmetric matrices: A' = A
  - For complex-valued matrices the analogue is A^H, the transpose of A containing the complex-conjugate elements of A; a matrix with A^H = A is called "Hermitian" (Hermitesch)
• Sum and difference of matrices: C = A ± B, c_ij = a_ij ± b_ij
  Rules:
  A + B = B + A (commutative)
  A + (B + C) = (A + B) + C (associative)
  (A + B)' = A' + B'
  tr(A + B) = tr A + tr B

10. Matrix operations (II)
• Product of matrices
  - Matrix and scalar: C = λA, c_ij = λ a_ij
  - Matrix and matrix (or vector): C = AB, c_ik = Σ_j a_ij b_jk (only defined if the number of columns of A and the number of rows of B are identical)
  Rules:
  A(BC) = (AB)C (associative)
  A(B + C) = AB + AC (distributive)
  (A + B)C = AC + BC
  det(AB) = det(BA) = det A det B
  tr(AB) = tr(BA)
  rg(AB) ≤ min(rg A, rg B)
  BUT: not commutative, i.e. in general AB ≠ BA
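
A quick numerical check of some of these rules (my own sketch with random matrices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# Non-commutativity: AB != BA in general
print(np.allclose(A @ B, B @ A))                        # False

# det(AB) = det A * det B and tr(AB) = tr(BA)
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))  # True
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))     # True

# rg(AB) <= min(rg A, rg B)
print(np.linalg.matrix_rank(A @ B)
      <= min(np.linalg.matrix_rank(A), np.linalg.matrix_rank(B)))  # True
```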

11. Matrix operations (III)
• Examples of matrix / vector products
  - Linear equation system: A x = b
  - Inner product (scalar product of vectors x and y): x'y = Σ_i x_i y_i (a scalar)
  - Outer product (dyadic product of vectors x and y): x y' (a matrix with elements x_i y_j)
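
A minimal NumPy illustration of these three products (my addition, with made-up values):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, 2.0])
y = np.array([4.0, 5.0])

b = A @ x            # linear equation system right-hand side: b = A x
s = x @ y            # inner product x'y -> scalar (here 14.0)
O = np.outer(x, y)   # outer (dyadic) product x y' -> 2-by-2 matrix

print(b, s)
print(O)
```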

12. Matrix operations (IV)
• Inverse of matrix A => A^-1 with A A^-1 = A^-1 A = I; only exists if det A ≠ 0 (rg A = n)
  Rules: (A^-1)^-1 = A; (AB)^-1 = B^-1 A^-1; (A')^-1 = (A^-1)'
• Calculation of A^-1: solve the equation system A X = I for X = A^-1, column by column (e.g. using Cholesky decomposition for symmetric positive-definite A)
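
A short sketch (my addition) of computing an inverse by solving A X = I, here with a Cholesky factorization for a symmetric positive-definite example matrix:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])         # symmetric positive definite

c = cho_factor(A)                  # Cholesky factorization of A
A_inv = cho_solve(c, np.eye(2))    # solve A X = I column by column

print(np.allclose(A @ A_inv, np.eye(2)))  # True
```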

13. Matrix operations (V)
• Inversion of special matrices:
  - 2-by-2 matrix: A^-1 = 1/det A · [ a22  -a12 ; -a21  a11 ]
  - Diagonal matrix: D^-1 = diag(1/d_11, …, 1/d_nn)
  - Symmetric matrix: (A^-1)' = (A')^-1 = A^-1, i.e. A^-1 is also symmetric
  - Orthogonal matrix: A^-1 = A'
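
A small numerical check of these special cases (my addition, with made-up matrices):

```python
import numpy as np

# 2-by-2 inverse via the closed-form formula
A = np.array([[3.0, 1.0],
              [2.0, 4.0]])
A_inv = np.array([[ A[1, 1], -A[0, 1]],
                  [-A[1, 0],  A[0, 0]]]) / np.linalg.det(A)
print(np.allclose(A_inv, np.linalg.inv(A)))                       # True

# Diagonal matrix: invert by taking reciprocals of the diagonal
D = np.diag([2.0, 4.0, 5.0])
print(np.allclose(np.linalg.inv(D), np.diag(1.0 / np.diag(D))))   # True

# Orthogonal matrix (here a 2D rotation): inverse equals transpose
t = 0.3
Q = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
print(np.allclose(np.linalg.inv(Q), Q.T))                         # True
```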

14. Matrix operations (VI)
• Derivatives with respect to a vector x:
  - Inner product c = y'x => ∂c/∂x = y (explicitly: ∂c/∂x_i = y_i for each component)
  - Quadratic form q = x'Ax with A symmetric (i.e. A' = A): ∂q/∂x = 2 A x
  - Linear system with vector y being a function of vector x (i.e. y(x)): z = A y(x) => ∂z/∂x = A · ∂y/∂x (chain rule)
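
A numerical sanity check of the first two derivative rules using central finite differences (my own sketch, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
x = rng.standard_normal(n)
y = rng.standard_normal(n)
A = rng.standard_normal((n, n))
A = 0.5 * (A + A.T)                      # make A symmetric

def grad_fd(f, x, h=1e-6):
    """Central finite-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# d(y'x)/dx = y
print(np.allclose(grad_fd(lambda v: y @ v, x), y, atol=1e-4))              # True
# d(x'Ax)/dx = 2 A x   (A symmetric)
print(np.allclose(grad_fd(lambda v: v @ A @ v, x), 2 * A @ x, atol=1e-4))  # True
```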

15. Geometric interpretation of matrices (I)
• m-by-n matrix X:
  - the m row vectors define an n-dimensional space (m ≥ n)
  - the n column vectors also define an (at most) n-dimensional space (m ≥ n)
  - the vectors generating the space must be linearly independent

16. Geometric interpretation of matrices (II)
That means:
• The m-by-n matrix X establishes a linear mapping from n-space into m-space (w ↦ X w)
• A diagonal matrix applies only a scaling
• Products of matrices are combinations of linear mappings
• Solution of the linear system X w = b:
  A solution exists if b can be generated as a linear combination of the column vectors of X, i.e. b lies in the (hyper-)plane defined by the column vectors of X
• Least-squares solution of X w = b:
  If b does not lie in the column space of X, only an approximate solution exists: determine w such that the Euclidean norm of the error e = X w - b is minimized
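
A least-squares sketch in NumPy (my addition, with made-up data): b does not lie in the column space of X, so lstsq returns the w that minimizes the Euclidean norm of X w - b:

```python
import numpy as np

# Overdetermined system: 4 equations, 2 unknowns
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])

w, residuals, rank, sv = np.linalg.lstsq(X, b, rcond=None)
e = X @ w - b                      # error vector; w minimizes ||e||

print(w)                           # approximate intercept and slope
print(np.linalg.norm(e))
```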

17. 2 Eigen-decomposition, principal component analysis
• Motivation
• Eigen-decomposition
• Principal component analysis (PCA) – application to multispectral images

18. Eigen-decomposition – Motivation
[Figure: measurement set-up with base vector B and the vectors R_Tx, R_Rx in the plane, together with the covariance matrices Σ_x,y and Σ_u,v]
• Example: a set-up of a typical geodetic measurement
• Observations (measurements):
  B: basis vector
  R_Tx: 2D vector in the plane
  R_Rx: 2D vector in the plane
• Unknowns: X, Y: 2D point coordinates in the plane
• The least-squares solution delivers the unknowns X, Y and the 2-by-2 variance-covariance matrix of the unknowns:
  Σ_x,y = [ σ²_x  σ_xy ; σ_xy  σ²_y ]
• Σ_x,y defines the error ellipse w.r.t. the coordinate axes x, y
• In addition, the orientation of the ellipse (i.e. its main axis) is of greatest interest
• A coordinate transformation (x, y) => (u, v) is necessary such that
  Σ_u,v = [ σ²_u  0 ; 0  σ²_v ]
=> Principal component analysis / eigen-decomposition accomplishes the diagonalization of Σ_x,y
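
A sketch (my addition, with made-up numbers) of how the error-ellipse axes follow from the eigen-decomposition of a 2-by-2 covariance matrix:

```python
import numpy as np

# Hypothetical 2-by-2 variance-covariance matrix of the unknowns (x, y)
Sxy = np.array([[4.0, 1.5],
                [1.5, 2.0]])

lam, Y = np.linalg.eigh(Sxy)      # symmetric matrix -> eigh; eigenvalues ascending

semi_axes = np.sqrt(lam[::-1])                      # semi-axes of the error ellipse
angle = np.degrees(np.arctan2(Y[1, 1], Y[0, 1]))    # orientation of the major axis

Suv = Y.T @ Sxy @ Y               # covariance in the rotated (u, v) system
print(semi_axes, round(angle, 1))
print(np.round(Suv, 10))          # off-diagonal elements are (numerically) zero
```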

19. Eigen-decomposition – Definition (I)
• Eigenvalues of an n-by-n (square) matrix A: λ_i
• Eigenvectors of an n-by-n (square) matrix A: y_i
• Def.: A y = y λ => the linear mapping A applied to an eigenvector y only changes the length of y (by the factor λ)
• Solving for λ and y:
  A y = y λ  =>  (λI - A) y = 0
  For y ≠ 0: Φ(λ) = det(λI - A) = 0 => λ
  Φ(λ): characteristic polynomial
  Then insert λ and solve (λI - A) y = 0 for y (by a non-trivial solution!)
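
A small NumPy check of the definition A y = y λ (my addition, with a made-up matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

lam, Y = np.linalg.eig(A)      # eigenvalues lam[i], eigenvectors Y[:, i]

for i in range(len(lam)):
    # A y_i equals y_i scaled by lam[i]
    print(np.allclose(A @ Y[:, i], lam[i] * Y[:, i]))    # True, True

# Characteristic polynomial: det(lam*I - A) vanishes at each eigenvalue
print(np.allclose([np.linalg.det(l * np.eye(2) - A) for l in lam], 0.0))
```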

20. Eigen-decomposition – Definition (II)
• The above definition A y = y λ can be generalized to all eigenvalues λ_i and eigenvectors y_i of matrix A: A Y = Y Λ, with Λ = diag(λ_1, …, λ_n) and the eigenvectors y_i as columns of Y
• Requirements:
  - λ_i ≠ λ_j for all i ≠ j
  - A must be diagonalizable
• Equivalently: Λ = Y^-1 A Y => Y diagonalizes A (as required in the above example)
• Finally, sort the λ_i in descending order (through a corresponding exchange of rows or columns)

21. Eigen-decomposition – Definition (III)
• Solving the characteristic polynomial Φ(λ) = 0 for λ may be inefficient => exploit special cases (if possible)
• Eigen-decomposition of an upper triangular matrix R: the eigenvalues are the diagonal elements, λ_i = r_ii
• Eigen-decomposition of a symmetric matrix S:
  Matrix S symmetric: S = S'
  Eigen-decomposition yields: S = Y Λ Y^-1 = S' = (Y^-1)' Λ' Y'
  Since Λ is a diagonal matrix: Λ = Λ' and Y^-1 = Y' (i.e. Y orthogonal)
  Consequently, for symmetric matrices S: S = Y Λ Y'
  Equivalently: Λ = Y' S Y
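
A short check (my addition, with a made-up matrix) that a symmetric matrix has an orthogonal eigenvector matrix and satisfies S = Y Λ Y':

```python
import numpy as np

S = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])                       # symmetric

lam, Y = np.linalg.eigh(S)                            # eigh for symmetric matrices

print(np.allclose(Y.T @ Y, np.eye(3)))                # Y orthogonal: Y' Y = I
print(np.allclose(S, Y @ np.diag(lam) @ Y.T))         # S = Y Lambda Y'
print(np.allclose(np.diag(lam), Y.T @ S @ Y))         # Lambda = Y' S Y
```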

22. Application to multi-spectral images (I)
• Given: multi-spectral image with n channels (e.g. n = 3 for red, green, blue); usually the gray values of these channels are correlated
• Task: data representation without correlation => data compression (potentially)
• Procedure:
  - Calculation of the n-by-1 mean vector m and the n-by-n variance-covariance matrix Σ_x:
    m = (1/K) Σ_k x_k,  Σ_x = (1/K) Σ_k (x_k - m)(x_k - m)'
    K: number of pixels per channel; Σ_x is symmetric
  - Diagonalization of Σ_x via eigen-decomposition:
    Solve for the eigenvalues λ: det(λI - Σ_x) = 0
    Eigenvectors g_i: (λ_i I - Σ_x) g_i = 0, with the g_i being the column vectors of G
  - Diagonalization of Σ_x yields Σ_d: Σ_d = G' Σ_x G
    Σ_d is a diagonal matrix, i.e. the correlations between the channels have been removed
    G is an orthogonal matrix (its column vectors form an orthogonal basis)
  - Transform the channels using G: x_d = G' x, m_d = G' m
=> channels with small eigenvalues contain less information and may be omitted
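
A compact PCA sketch for an n-channel image (my own illustration with randomly generated, correlated data, not the lecture's example image):

```python
import numpy as np

# Hypothetical 3-channel image (100 x 100 pixels) with correlated channels
rng = np.random.default_rng(2)
base = rng.standard_normal((100, 100))
img = np.stack([base,
                0.8 * base + 0.2 * rng.standard_normal((100, 100)),
                0.6 * base + 0.4 * rng.standard_normal((100, 100))], axis=-1)

X = img.reshape(-1, 3)             # K x n matrix: one row of gray values per pixel
m = X.mean(axis=0)                 # n-by-1 mean vector
Sx = np.cov(X, rowvar=False)       # n-by-n variance-covariance matrix (symmetric)

lam, G = np.linalg.eigh(Sx)        # eigen-decomposition of Sx
order = np.argsort(lam)[::-1]      # sort eigenvalues in descending order
lam, G = lam[order], G[:, order]

Xd = (X - m) @ G                   # transformed (decorrelated) channels
print(np.round(np.cov(Xd, rowvar=False), 6))   # approximately diagonal
```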

23. Application to multi-spectral images (II)
• Example: original true-color image (RGB)
[Figure: red, green, and blue channels of the original image and the 1st, 2nd, and 3rd channels after PCA]

24. 3 Singular value decomposition
• Definition
• Relation to eigenvalues and eigenvectors

25. Singular value decomposition (I)
• Singular value decomposition (SVD) can be understood as a generalization of the eigen-decomposition for non-square matrices
• Diagonalization of the m-by-n matrix X:
  X = U S V'
  with S containing the singular values s_i on its diagonal (diagonal block of dimension n-by-n)
  U: m-by-m matrix
  V: n-by-n matrix
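
A NumPy sketch (my addition, with a random matrix) of the decomposition X = U S V' for a non-square matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((5, 3))          # m-by-n matrix, m > n

U, s, Vt = np.linalg.svd(X)              # full SVD: U is 5x5, Vt = V' is 3x3

S = np.zeros((5, 3))                     # embed the singular values in an m-by-n matrix
S[:3, :3] = np.diag(s)

print(np.allclose(X, U @ S @ Vt))        # True: X = U S V'
print(np.allclose(U.T @ U, np.eye(5)),
      np.allclose(Vt @ Vt.T, np.eye(3))) # U and V are orthogonal
```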

26. Singular value decomposition (II)
• As for principal component analysis, an application of SVD is data compression or dimension reduction:
  - Sort the singular values in descending order
  - Omit insignificant values, i.e. reduce the dimension from n to k < n
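
A sketch (my addition) of this truncation: keep only the k largest singular values and the corresponding columns of U and V:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 20))

U, s, Vt = np.linalg.svd(X, full_matrices=False)   # s is already sorted descending

k = 5
Xk = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]         # rank-k approximation of X

rel_err = np.linalg.norm(X - Xk) / np.linalg.norm(X)
print(Xk.shape, round(rel_err, 3))
```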

27. Relation to eigen-decomposition
• Singular value decomposition of matrix A: A = U S V'
• Forming A'A and inserting the decomposition yields: A'A = V S' U' U S V'
• Finally, with U'U = I and V' = V^-1 (U and V orthogonal): A'A = V S² V^-1
• Eigen-decomposition of A'A yields: A'A = Y Λ Y^-1
=> The squared singular values of A equal the eigenvalues of A'A
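
A final numerical check (my addition, with a random matrix) that the squared singular values of A equal the eigenvalues of A'A:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 3))

s = np.linalg.svd(A, compute_uv=False)     # singular values of A (descending)
lam = np.linalg.eigvalsh(A.T @ A)[::-1]    # eigenvalues of A'A, sorted descending

print(np.allclose(s**2, lam))              # True
```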
