Thursday, March 23, 2023

Least-Squares Regression: Curve fitting with MATLAB

Least-Squares Regression
Curve fitting, also known as regression analysis, is the process of finding a mathematical function that best represents a set of data points. This function, often a polynomial or another mathematical equation, is chosen to approximate the relationship between the input (independent) variables and the output (dependent) variables in the data set.

Curve fitting is commonly used in scientific and engineering applications to model the behavior of physical systems, analyze experimental data, and make predictions based on past observations. It is also used in finance, economics, and other fields to forecast future trends or estimate the parameters of complex models.

There are various methods of curve fitting, including linear regression, nonlinear regression, and polynomial regression. These techniques involve different mathematical approaches for determining the best-fit curve, depending on the complexity of the data set and the desired level of accuracy.


Least-squares regression is a statistical method used to estimate the relationship between a dependent variable and one or more independent variables. The method finds the line or curve that best fits the data points by minimizing the sum of the squared differences between the actual values and the predicted values.

In simple linear regression, the method involves fitting a straight line to the data points, where the line is represented by the equation:

y = a + bx

where y is the dependent variable, x is the independent variable, a is the y-intercept, and b is the slope of the line. The goal of least-squares regression is to find the values of a and b that minimize the sum of the squared differences between the actual values of y and the predicted values of y.

To accomplish this, the method involves calculating the residuals, which are the differences between the actual values of y and the predicted values of y for each data point. The sum of the squared residuals is then minimized by finding the values of a and b that result in the smallest value of this sum.
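Setting the partial derivatives of that sum with respect to a and b to zero yields the familiar closed-form formulas for the slope and intercept. Here is a minimal MATLAB sketch of the idea, reusing the data from the worked example later in this post:

% Simple linear least-squares fit of y = a + b*x (illustrative sketch)
x = [0 1 2 3 4 5];                  % independent variable
y = [2.1 7.7 13.6 27.2 40.9 61.1];  % dependent variable
n = length(x);
b = (n*sum(x.*y) - sum(x)*sum(y)) / (n*sum(x.^2) - sum(x)^2); % slope
a = mean(y) - b*mean(x);            % intercept: the line passes through the means
fprintf('y = %f + %f*x\n', a, b);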

The method can be extended to multiple linear regression, where there are multiple independent variables, and the goal is to find the equation of a hyperplane that best fits the data points. In this case, the method involves minimizing the sum of the squared differences between the actual values of y and the predicted values of y, where the predicted values are calculated using a linear combination of the independent variables.
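In MATLAB this generalizes neatly: the backslash operator solves the least-squares problem for any design matrix. A minimal sketch, using made-up data for two independent variables:

% Multiple linear regression: y = a0 + a1*x1 + a2*x2 (hypothetical data)
x1 = [1 2 3 4 5]';            % first independent variable
x2 = [2 1 4 3 5]';            % second independent variable
y  = [6 8 15 14 21]';         % dependent variable
X  = [ones(5,1) x1 x2];       % design matrix; the column of ones carries a0
coeffs = X \ y;               % least-squares solution
fprintf('a0 = %f, a1 = %f, a2 = %f\n', coeffs(1), coeffs(2), coeffs(3));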

Least-squares regression is widely used in various fields, including finance, economics, engineering, and physics, to model and predict relationships between variables.


In short, least-squares regression is a statistical procedure that finds the best fit for a set of data points by minimizing the sum of the squares of the offsets, or residuals (the differences between the observed values and the fitted values). When we square each of those errors and add them all up, the total is as small as possible; that is why the method is called "least squares".
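Once the fitted values are known, this sum is a one-line computation in MATLAB (the fitted values below are hypothetical placeholders, not from a real fit):

% Sum of squared residuals (the quantity least squares minimizes)
y     = [2.1 7.7 13.6 27.2 40.9 61.1];  % observed values
y_fit = [2.5 7.1 14.0 26.5 41.5 60.5];  % fitted values (hypothetical)
SSE   = sum((y - y_fit).^2);            % squared offsets, summed
fprintf('SSE = %f\n', SSE);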

Fig. Error (residual) between the actual data and the fitted curve


Now let us work through an example in MATLAB: fitting a second-order polynomial, y = a0 + a1*x + a2*x^2, to six data points. Minimizing the sum of squared residuals with respect to a0, a1, and a2 yields the normal equations, a 3x3 linear system built from sums of powers of x and products with y:

[ n     Σx    Σx^2 ] [a0]   [ Σy    ]
[ Σx    Σx^2  Σx^3 ] [a1] = [ Σxy   ]
[ Σx^2  Σx^3  Σx^4 ] [a2]   [ Σx^2y ]

The script below assembles this system and solves it with Cramer's rule.


=====================================================================
% Least-squares regression with a second-order polynomial
disp('General formula of 2nd-order least-squares regression: y = a0 + a1*x + a2*x^2')
A=[0 1 2 3 4 5]; % x-data (independent variable)
B=[2.1 7.7 13.6 27.2 40.9 61.1]; % y-data (dependent variable)
n=length(A); % number of data points
A2=A.^2; % element-by-element powers of the x-data
A3=A.^3;
A4=A.^4;
AB=A.*B; % element-by-element products for the right-hand side
A2B=A2.*B;
% Compute the summations for the normal equations (no semicolons, so they print)
sum_A=sum(A)
sum_A2=sum(A2)
sum_A3=sum(A3)
sum_A4=sum(A4)
sum_B=sum(B)
sum_AB=sum(AB)
sum_A2B=sum(A2B)
% Assemble the normal-equations system C*a = D
C=[n sum_A sum_A2; sum_A sum_A2 sum_A3; sum_A2 sum_A3 sum_A4]
D=[sum_B; sum_AB; sum_A2B]
% Now apply Cramer's rule to solve C*a = D
detC = det(C);
detC = det(C);
% Compute the determinant of C with column 1 replaced by D
detC1 = det([D C(:,2:3)]);
detC2 = det([C(:,1) D C(:,3)]);
detC3 = det([C(:,1:2) D]);
% Compute each coefficient a(i) as a ratio of determinants (Cramer's rule)
a0 = detC1 / detC;
a1 = detC2 / detC;
a2 = detC3 / detC;
% Print the coefficients to the console
fprintf('a0 = %f\n', a0);
fprintf('a1 = %f\n', a1);
fprintf('a2 = %f\n', a2);
fprintf('y=%f+%f*x+%f*x^2\n',a0,a1,a2)

Running this script produces the following output:
======================================================


least_sq_Second_order
General formula of 2nd-order least-squares regression: y = a0 + a1*x + a2*x^2

sum_A =
    15
sum_A2 =
    55
sum_A3 =
   225
sum_A4 =
   979
sum_B =
  152.6000
sum_AB =
  585.6000
sum_A2B =
   2.4888e+03
C =
     6    15    55
    15    55   225
    55   225   979
D =
   1.0e+03 *
    0.1526
    0.5856
    2.4888
a0 = 2.478571
a1 = 2.359286
a2 = 1.860714
y=2.478571+2.359286*x+1.860714*x^2
 

=======================================================================
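As a quick cross-check (not part of the original script), MATLAB's built-in polyfit performs the same polynomial least-squares fit; note that it returns the coefficients from the highest power down, so the expected result is [a2 a1 a0]:

% Verify the hand-assembled solution against polyfit
A = [0 1 2 3 4 5];
B = [2.1 7.7 13.6 27.2 40.9 61.1];
p = polyfit(A, B, 2) % expected: p = [1.860714  2.359286  2.478571]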
The manually calculated result is the same: you can verify it by tabulating x, x^2, x^3, x^4, y, x*y, and x^2*y for each data point, summing the columns, and solving the resulting 3x3 system by hand.