The prediction models for Otsu's, Sauvola's and Shijian's binarization algorithms were generated with the methodology described in section~\ref{subsection-prediction}. The coefficients associated with the most significant selected features, their p-values, and the intercept of the linear predictive function are detailed in Tables~\ref{otsuPredictionModel}, \ref{sauvolaPredictionModel} and \ref{shijianPredictionModel}. If a feature is absent from a table, it was not selected by the stepwise algorithm. As mentioned in the previous section, the cross-validation of each model yields the pair $(\bar{\alpha}, \bar{R^{2}})$.
\paragraph{Otsu's binarization method}
The most significant selected features for Otsu's prediction model are $\mIInk$, $v_{I}$, $v_{B}$, $\mu_{B}$, $\mu$ and $v$ (see Table~\ref{otsuPredictionModel} for the coefficients of the predictive function). The feature selection can be explained by the fact that Otsu's binarization method is based on global thresholding: this is why global features such as $\mIInk$, $\mu$ and $v$ are significant and have such low p-values. The model's $R^{2}$ equals $0.93$, which is considered very good~\cite{cohen}.
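For reference, Otsu's criterion selects the global threshold that maximizes the between-class variance of the grey-level histogram, which is why global statistics dominate its prediction model. A minimal sketch, assuming 8-bit grey levels (an illustration, not the implementation evaluated in the paper):

```python
import numpy as np

def otsu_threshold(gray):
    """Global Otsu threshold: maximise the between-class variance
    over all candidate thresholds (grey levels assumed in 0..255)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()                   # grey-level probabilities
    omega = np.cumsum(p)                    # class-0 probability up to t
    mu = np.cumsum(p * np.arange(256))      # cumulative mean up to t
    mu_total = mu[-1]                       # global mean
    # between-class variance for every threshold t
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1 - omega))
    sigma_b = np.nan_to_num(sigma_b)
    return int(np.argmax(sigma_b))
```

On a bimodal image the maximizer falls between the two modes, separating ink from background with a single global cut.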
The cross-validation gives an $\bar{\alpha}$ coefficient of $0.989$ and a $\bar{R^{2}}$ of $0.987$. These results indicate that our model does not depend on the chosen training data.
Among the 18 features, most models embed about 7 features. Overall, the selected features are consistent with the binarization algorithms: the stepwise selection process tends to keep global (resp. local) features for global (resp. local) binarization algorithms. We also note that $\mS$ is never selected by any prediction model. Indeed, the binarization accuracy is measured at the pixel level (f-score). With this accuracy measure, the feature $\mSG$ becomes more significant than $\mS$, which may not have been the case with another evaluation measure.
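The stepwise selection described above can be sketched as a greedy forward procedure. The following is a hypothetical illustration using an $R^{2}$-gain entry criterion; the paper's actual procedure (and its stopping rule) may differ:

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_res = np.sum((y - A @ coef) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def forward_stepwise(X, y, min_gain=0.01):
    """Greedy forward selection: repeatedly add the feature that most
    improves R^2, stopping when no candidate gains at least `min_gain`."""
    selected, remaining = [], list(range(X.shape[1]))
    best_r2 = 0.0
    while remaining:
        r2, j = max((r_squared(X[:, selected + [j]], y), j)
                    for j in remaining)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2
    return selected, best_r2
```

With 18 candidate features per model, such a procedure typically retains a small subset (about 7 here), discarding features that add no explanatory power.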
The $R^{2}$ values show the quality of each prediction model. The prediction models of Sahoo's and Niblack's binarization methods were not kept for the statistical validation step since their $R^{2}$ values were below $0.7$. We can point out that the $R^{2}$ value drops when few dedicated features are selected in the model (e.g. 1 for Sahoo's and 0 for Niblack's). Therefore, new features should be designed in these cases in order to obtain more accurate prediction models.
The two values $\bar{R^2}$ and $mpe$ show the accuracy of each prediction model on the validation step. An $\bar{R^{2}}$ value higher than $0.7$ indicates that it is possible to predict the results of a binarization method~\cite{cohen}. As a result, $12$ binarization methods can be well predicted. The mean percentage error ($mpe$) is the average difference between predicted f-scores and real f-scores. This value is around $5\%$.
Riddler & $ \mIInk$; $v$; $v_{D}$; $v_{I} $ & 0.75 & 0.98 & 5\% \\
Sahoo & $ \mIInk$; $\mu$; $s_{B}$; $v_{I}$; $\mu_{D}$; $\mu_{I} $ & 0.68 & - & - \\
Shanbag & $\mIInk$; $s$; $v$; $s_{D}$; $s_{I}$; $v_{D}$; $v_{I}$ & 0.73 & 0.98 & 6\% \\
White & $ \mIInk$; $\mSG$; $s$; $v$; $\mu_{D}$; $\mu_{I}$; $v_{D}$ & 0.92 & 0.99 & 7\% \\
Niblack & $ \mu$; $v$; $s_{G}$; $v_{B}$; $\mu_{B}$ & 0.59 & - & - \\
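The mean percentage error reported above can be sketched as follows, assuming it is the mean absolute difference between predicted and real f-scores expressed as a percentage (the paper's exact formulation may differ):

```python
import numpy as np

def mean_percentage_error(predicted, real):
    """Mean absolute difference between predicted and real f-scores,
    scaled to a percentage (assumed formulation of mpe)."""
    predicted = np.asarray(predicted, dtype=float)
    real = np.asarray(real, dtype=float)
    return 100.0 * np.mean(np.abs(predicted - real))
```

For instance, predicted f-scores of $(0.90, 0.80, 0.70)$ against real f-scores of $(0.85, 0.82, 0.75)$ give an $mpe$ of $4\%$, in line with the values reported in the table.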