Least SquaresWritten by Paul BourkeJune 1997 One often has a number of data points and the need to describe the behaviour of the data with a function of a known form. In general the data points are noisy samples and the functions parameters can't be chosen so as to exactly describe all the samples. The desire then is to find the parameters of the function that "best" describe the samples. The method of least squared error is one common choice of the "best" estimate. The squared error is the sum of the squared differences between each sample and the expected value as predicted by the function. The parameter values that minimise this error are considered the best estimates. The first example will illustrate the key points behind the technique by applying it to a problem that can readily be worked out manually. Example 1Consider finding the equation of a straight line given n samples (x_{i},y_{i}). We will fit the function f(x) = ax + c, the normal definition of a linear function where a is the slope, and c is where the line intercepts the y axis. The key features and conventions are illustrated below.
The general expression for the least square error E is defined as
In this particular case it is
This function will have a minimum where the first derivatives with respect to a and c are 0.
DE/dc = 0 = (ax_{0} + c  y_{0}) + (ax_{1} + c  y_{1}) + ... + (ax_{n1} + c  y_{n1})
a (x_{0} + x_{1} + ... + x_{n1}) + c (1 + 1 + ... + 1) = y_{0} + y_{1} + ... + y_{n1}
y = (y_{0}, y_{1}, y_{2}, ... y_{n1}) u = (1,1,1,... 1) then we can write the problem as two simultaneous equations with two unknowns, a and c as follows
a u.x + c u.u = u.y
Solving the above two equations for a and c is straightforward. In general all least squares estimates irrespective of the type of function or how many parameters can be formulated into an set of simultaneous equations. There are the same number of equations as there are unknowns or parameters to be fitted.
Example 2To fit a plane f(x,y) = ax + by + c given samplesy = (y_{0}, y_{1}, y_{2}, ... y_{n1}) z = (z_{0}, z_{1}, z_{2}, ... z_{n1}) and applying the same derivation as in the case of the line we get 3 simultaneous equations and 3 unknowns
a y.x + b y.y + c y.u = y.z a u.x + b u.y + c u.u = u.z
