Least Square Method for Parameter Estimation

Presentation Transcript

1. Least Square Method for Parameter Estimation Docent Xiao-Zhi Gao Department of Automation and Systems Technology

2. Fitting Experimental Data • A scatterplot shows patterns in experimental data • When we see a straight-line pattern, we want to model the data with a linear equation • Fitting experimental data with models allows us to understand the data and make predictions

3. Fitting Experimental Data

4. Fitting Experimental Data Does the curve fit the given group of data samples in the best way?

5. Gauss and Legendre • The method of least squares was first published by Legendre in 1805 and by Gauss in 1809 • Both mathematicians applied it to determining the orbits of bodies about the Sun • Gauss went on to publish further developments of the method in 1821

6. Carl Gauss (1777 – 1855)

7. Fitting Experimental Data • Given: data points and a functional form • Goal: find the constants in the function • Example: given points (xi, yi), find the line through them; i.e., find a and b in y = ax + b

8. Fitting Experimental Data • Example: measure the positions of a falling object over time and fit a parabola, p = −(1/2)gt² • Estimate g from the fitted position curve
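The falling-object example can be sketched as follows; the sample times and the noise-free positions are made up for illustration, and the model is the single-parameter one from the slide, p = −(1/2)gt².

```python
import numpy as np

# Recover g by least squares on the model p = c * t^2 with c = -g/2.
# Minimizing sum (p_i - c t_i^2)^2 over c gives the closed form below.
g_true = 9.81                       # assumed "true" value for the synthetic data
t = np.linspace(0.1, 2.0, 20)       # sample times in seconds (made up)
p = -0.5 * g_true * t**2            # noise-free positions in meters

c = np.sum(p * t**2) / np.sum(t**4) # least squares estimate of c
g_est = -2.0 * c                    # recover g from c = -g/2
```

With noise-free data the fit recovers g exactly; with real measurements the estimate would scatter around the true value.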

9. Least Squares Line • How can we find the best line to fit the data? • We want to minimize the total squared distance of the points from the line • This distance is measured vertically from each point to the line

10. An Example of a Least Squares Line Consider four data samples (1, 2.1), (2, 2.9), (5, 6.1), and (7, 8.3), with the fitting line f(x) = 0.9x + 1.4 The squared errors are:
x1 = 1, f(1) = 2.3, y1 = 2.1, e1 = (2.3 − 2.1)² = 0.04
x2 = 2, f(2) = 3.2, y2 = 2.9, e2 = (3.2 − 2.9)² = 0.09
x3 = 5, f(5) = 5.9, y3 = 6.1, e3 = (5.9 − 6.1)² = 0.04
x4 = 7, f(7) = 7.7, y4 = 8.3, e4 = (7.7 − 8.3)² = 0.36
The total squared error is 0.04 + 0.09 + 0.04 + 0.36 = 0.53 By choosing better coefficients for the fitting line, we can make this error smaller But how do we find the coefficients that minimize it?
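The error computation on this slide can be reproduced in a few lines:

```python
# Squared errors of the candidate line f(x) = 0.9x + 1.4 on the four samples.
xs = [1, 2, 5, 7]
ys = [2.1, 2.9, 6.1, 8.3]
f = lambda x: 0.9 * x + 1.4

errors = [(f(x) - y) ** 2 for x, y in zip(xs, ys)]  # 0.04, 0.09, 0.04, 0.36
total = sum(errors)                                  # total squared error 0.53
```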

11. An Example of Least Squares Line

12. Minimizing Vertical Distances between Points and Line • E = d1² + d2² + d3² + … + dn² for n data points • E = [f(x1) − y1]² + [f(x2) − y2]² + … + [f(xn) − yn]² • E = [mx1 + b − y1]² + [mx2 + b − y2]² + … + [mxn + b − yn]² • E = ∑(mxi + b − yi)²

13. Minimization of E • How do we deal with this? E = ∑(mxi + b − yi)² • Treat the xi and yi as constants, since we are solving for the optimal m and b • Set the partial derivatives of E with respect to m and b to zero: ∂E/∂m = 0 and ∂E/∂b = 0

14. Minimization of E • ∂E/∂b = ∑2(mxi + b − yi) = 0 → m∑xi + nb = ∑yi → mSx + bn = Sy • ∂E/∂m = ∑2xi(mxi + b − yi) = 2∑(mxi² + bxi − xiyi) = 0 → m∑xi² + b∑xi = ∑xiyi → mSxx + bSx = Sxy

15. Solutions of m and b Solving the two normal equations gives m = (n·Sxy − Sx·Sy) / (n·Sxx − Sx²) and b = (Sy − m·Sx) / n

16. A Simple Example • Find the linear least squares fit to the data: (1,1), (2,4), (3,8) Sx = 1 + 2 + 3 = 6 Sxx = 1² + 2² + 3² = 14 Sy = 1 + 4 + 8 = 13 Sxy = 1(1) + 2(4) + 3(8) = 33 n = number of points = 3 m = (3·33 − 6·13)/(3·14 − 6²) = 21/6 = 3.5 b = (13 − 3.5·6)/3 = −2.667 The line of best fit is y = 3.5x − 2.667
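The normal-equation sums from the preceding slides translate directly into a small line-fitting function, checked here against this worked example:

```python
# Least squares line via the sums Sx, Sy, Sxx, Sxy from the slides.
def fit_line(xs, ys):
    n = len(xs)
    Sx = sum(xs)
    Sy = sum(ys)
    Sxx = sum(x * x for x in xs)
    Sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * Sxy - Sx * Sy) / (n * Sxx - Sx * Sx)  # slope
    b = (Sy - m * Sx) / n                           # intercept
    return m, b

m, b = fit_line([1, 2, 3], [1, 4, 8])  # y = 3.5x - 2.667
```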

17. Line of Best Fit: y = 3.5x – 2.667

18. Linear Least Squares • General pattern: approximate y as a linear combination of basis functions, y = a·f(x) + b·g(x) + c·h(x) + … • Note that the dependence on the unknowns (a, b, c, …) must be linear; the basis functions themselves may be nonlinear in x

19. Solving Linear Least Squares • Take partial derivatives:

20. Solving Linear Least Squares • For convenience, rewrite in matrix form: • Factors:

21. Solving Linear Least Squares

22. Solving Linear Least Squares • For each (xi, yi): compute f(xi), g(xi), etc. and store them in row i of A; store yi in b • Compute (AᵀA)⁻¹Aᵀb • (AᵀA)⁻¹Aᵀ is known as the “pseudoinverse” of A
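A minimal sketch of this recipe, using the basis f(x) = x, g(x) = 1 (an assumption; any linear-in-parameters basis is handled the same way) and the data from the simple example:

```python
import numpy as np

xs = np.array([1.0, 2.0, 3.0])
ys = np.array([1.0, 4.0, 8.0])

# Row i of A holds f(xi), g(xi); ys plays the role of b.
A = np.column_stack([xs, np.ones_like(xs)])

# Pseudoinverse solution (A^T A)^{-1} A^T b, computed stably by NumPy.
coeffs = np.linalg.pinv(A) @ ys
m, b = coeffs                       # recovers y = 3.5x - 2.667
```

In practice `np.linalg.lstsq(A, ys)` is preferred over forming the pseudoinverse explicitly, but `pinv` mirrors the slide's formula.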

23. A Special Case: Zero Order • Fitting the data with a constant (zero order) • In this case f(xi) = 1 for every i, and the normal equations give the constant as the mean of the yi • The mean is thus the least squares estimate of the best constant fit (the same criterion underlies k-means clustering)
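The zero-order case can be verified numerically: with a single all-ones column in A, the pseudoinverse solution reduces to the sample mean (data reused from the earlier four-point example).

```python
import numpy as np

ys = np.array([2.1, 2.9, 6.1, 8.3])

# Zero-order model: f(xi) = 1 for all i, so A is a single column of ones.
A = np.ones((len(ys), 1))
c = (np.linalg.pinv(A) @ ys)[0]   # best constant fit

# c coincides with the mean of the data.
```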

24. Zero Order vs. First Order

25. Weighted Least Squares • Common case: the samples (xi, yi) may have different uncertainties associated with them • We want to give more weight to the measurements we are more certain of • Weighted least squares minimizes ∑ wi [f(xi) − yi]²

26. Weighted Least Squares • Define the weight matrix W as the diagonal matrix with the weights wi on its diagonal • Solve the weighted least squares problem via x = (AᵀWA)⁻¹AᵀWb
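A sketch of the weighted solve on the four-point example, with made-up weights that trust the third measurement most:

```python
import numpy as np

xs = np.array([1.0, 2.0, 5.0, 7.0])
ys = np.array([2.1, 2.9, 6.1, 8.3])
w = np.array([1.0, 1.0, 4.0, 1.0])   # illustrative weights (an assumption)

A = np.column_stack([xs, np.ones_like(xs)])  # line model: slope and intercept
W = np.diag(w)                               # diagonal weight matrix

# x = (A^T W A)^{-1} A^T W b, solved without forming the inverse explicitly.
coeffs = np.linalg.solve(A.T @ W @ A, A.T @ W @ ys)
```

Note that NumPy's `np.polyfit` takes weights that multiply the residuals directly, so passing `np.sqrt(w)` there minimizes the same ∑ wi (residual)² criterion.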

27. Conclusions • The least squares method is the most commonly used parameter estimation method • It minimizes a measure of the differences between the approximating function and the given data points • A least squares line fits poorly when the data are not linear, although the method itself extends to any model that is linear in its parameters

28. Computer Exercise • Least squares method for parameter estimation in Hooke’s law • Hooke’s law models a spring’s length l as a linear function of the applied force f: l = k0 + k1·f • Our task is to estimate k0 and k1 from the measurement data of l and f

29. Data for Spring Example

30. Computer Exercise • Estimate the parameters k0 and k1 • Extend the first-order polynomial to higher-order polynomials: l = k0 + k1·f + k2·f² + … • Estimate k0, k1, k2, …, up to the fourth-order polynomial
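One possible shape for this exercise, assuming the model l = k0 + k1·f; the force/length measurements below are made up, since the course data from the data slide is not reproduced in this transcript.

```python
import numpy as np

f = np.array([0.0, 2.0, 4.0, 6.0, 8.0])    # applied force (made-up values)
l = np.array([5.1, 6.9, 9.2, 11.0, 12.8])  # measured spring length

# First order: l ≈ k0 + k1 f (np.polyfit returns highest power first).
k1, k0 = np.polyfit(f, l, 1)

# Higher orders, up to the fourth-order polynomial of the exercise.
fits = {order: np.polyfit(f, l, order) for order in range(1, 5)}
```

Comparing the residuals across orders shows the usual pattern: higher-order polynomials never fit the training data worse, which is why the exercise asks you to judge whether the extra terms are physically meaningful.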

31. Fitting Data Using Polynomials