Results

This is a summary of the results. The data below are the yield of a run of some machine learning to the datasets Wine Quality and Housing Price.

Note

This is a toy project to test the building of a reproducible research project. Therefore, the results are not the primary focus

That said, the results are not bad. Different models were tested on the three datasets and different preprocessing methods were applied, as well as evaluation metrics.

For each dataset, three different preprocessing methods were tested: min-max scaling, z-normalisation and polynomial features.

Then the following models were tested: Linear Regression and Decision Tree Classifier.

The evaluation metrics used were: explained variance, r2, mean absolute error, mean squared error and root mean squared error.

The results are summarized in the following tables and all the details are available when running the code (see _quick_start-label):

Red_wine Dataset model for Random_State_3

Expl Var

r2

MAE

MSE

RMSE

min-max_scaling Linear_Regression Decision_Tree_Cla

0.360000 -0.470000

0.360000 -0.490000

0.520000 0.780000

0.460000 1.060000

0.680000 1.030000

z-normalisation Linear_Regression Decision_Tree_Cla

0.360000 -0.280000

0.360000 -0.280000

0.530000 0.660000

0.460000 0.910000

0.680000 0.950000

polynomial_featur Linear_Regression Decision_Tree_Cla

0.290000 -0.210000

0.290000 -0.210000

0.550000 0.640000

0.500000 0.860000

0.710000 0.930000

White_wine Dataset model for Random_State_3

Expl Var

r2

MAE

MSE

RMSE

min-max_scaling Linear_Regression Decision_Tree_Cla

0.200000 -0.820000

-0.570000 -0.890000

0.900000 0.860000

1.200000 1.440000

1.100000 1.200000

z-normalisation Linear_Regression Decision_Tree_Cla

0.290000 -0.290000

0.290000 -0.290000

0.570000 0.670000

0.540000 0.980000

0.740000 0.990000

polynomial_featur Linear_Regression Decision_Tree_Cla

0.250000 -0.320000

0.250000 -0.330000

0.560000 0.690000

0.570000 1.010000

0.760000 1.010000

House Dataset model for Random_State_3

Expl Var

r2

MAE

MSE

RMSE

min-max_scaling Linear_Regression Decision_Tree_Cla

0.600000 0.670000

0.600000 0.660000

3.970000 3.730000

33.430000 28.020000

5.780000 5.290000

z-normalisation Linear_Regression Decision_Tree_Cla

0.630000 0.460000

0.620000 0.450000

3.750000 4.720000

31.150000 45.290000

5.580000 6.730000

polynomial_featur Linear_Regression Decision_Tree_Cla

0.720000 0.660000

0.720000 0.650000

2.910000 3.700000

23.090000 29.180000

4.800000 5.400000