Hello everybody,

Recently I decided to implement an ML library in C#. My ultimate goal for that library was to make it use as many features of the TPL (Task Parallel Library) as possible.

I'm aware of other ML libraries, but as is often said, if you wish to understand how an ML algorithm works, program it yourself. I decided to start with linear regression.

Here you can see how my library utilizes the processor:

As you can see, it uses 85% of my 4-core CPU, which is impossible with only one thread running (a single thread can keep at most one core busy, roughly 25% of the total).

The first step for me was to understand how the algorithm works. To do that, I took the Machine Learning Specialization course from the University of Washington.

I'll omit the formulas for implementing linear regression, but I want to describe the following case.

To test the correctness of my linear regression model, I decided to feed it a system of linear equations:

x1 + x2 + x3 = 35
x1 + x2 - x3 = -13
x1 - x2 + x3 = 19

The roots of this system are x1 = 3, x2 = 8, x3 = 24.
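
As a quick sanity check, the stated roots can be substituted back into the three equations (the same ones listed in the comment inside the test code):

```latex
\begin{aligned}
3 + 8 + 24 &= 35 \\
3 + 8 - 24 &= -13 \\
3 - 8 + 24 &= 19
\end{aligned}
```

All three equalities hold, so these are the values the regression weights are expected to approximate.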

Below is the code that feeds this system of equations into the linear regression and runs the training:

public void TestLinearRegression()
{
    /*
    System of equations:
    x1 + x2 + x3 = 35
    x1 + x2 - x3 = -13
    x1 - x2 + x3 = 19
    x1 should be 3, x2 should be 8, x3 should be 24
    */

    // Each row holds the coefficients of one equation.
    double[][] inputs =
    {
        new double[] { 1, 1, 1 },
        new double[] { 1, 1, -1 },
        new double[] { 1, -1, 1 },
    };

    // Right-hand sides of the three equations.
    double[] outputs = { 35, -13, 19 };

    // Train the model.
    var normalizer = new MultiThreadedNormalization();
    var regressor = new LinearRegressor(normalizer, epsilon: 0.0000001);
    var matrix = new Matrix(inputs);
    var weights = regressor.Fit(matrix, outputs);

    // Predict for the coefficient row (2, 2, 2).
    // With the exact roots the result would be 2*3 + 2*8 + 2*24 = 70.
    double[][] inputs2 = { new double[] { 2, 2, 2 } };
    var inputMatrix = new Matrix(inputs2);
    var result = regressor.Predict(inputMatrix);
}

Running this code produced four weights.

You may wonder: why four weights and not three, if we have just three roots x1, x2, x3? To understand this, keep in mind that linear regression adds one more weight. So the result contains one additional important value, the free term (also called the intercept), which is not multiplied by any input.
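
Written out, the model being fitted looks roughly like this. It is the standard linear regression form, not something taken from the library's source. The notation is an assumption made for illustration: a_1, a_2, a_3 are the coefficients of one equation (one input row), w_1, w_2, w_3 are the weights that should approximate the roots, and w_4 is assumed to be the free term.

```latex
\hat{y} = w_1 a_1 + w_2 a_2 + w_3 a_3 + w_4
```

For a perfect fit of the system above one would expect w_1 ≈ 3, w_2 ≈ 8, w_3 ≈ 24 and w_4 ≈ 0.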

Now let's take a closer look at those weights. As you can see, we have four values: -1.435, 7.991, 23.991, 6.864. Comparing them with x1, x2, x3, we see that weight 2 is very close to x2 and weight 3 is close to x3. Weight 1 is pretty far from 3, and even weight 1 plus weight 4 doesn't give a good approximation of x1. But if we feed the input (2, 2, 2) to the trained model, the prediction is 67.96, which is relatively close to the true value of 70.
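
The arithmetic behind that last comparison can be worked out by hand. This assumes that the input is the row (2, 2, 2) from the test code, that the weights are applied to the raw inputs, and that the fourth weight is the free term. None of this is confirmed by the library code shown here.

```latex
\begin{aligned}
\text{exact roots:} \quad & 2 \cdot 3 + 2 \cdot 8 + 2 \cdot 24 = 70 \\
\text{fitted weights:} \quad & 2 \cdot (-1.435) + 2 \cdot 7.991 + 2 \cdot 23.991 + 6.864 = 67.958 \approx 67.96
\end{aligned}
```

The match with the reported 67.96 supports those assumptions. Note also that the first coefficient equals 1 in all three training equations, so on the training data it looks exactly like the free term. This may be one reason why weight 1 on its own is not recovered.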