0dd2fcc
Merge
committed

+17 -20 IJDAR/prediction.tex
IJDAR/prediction.tex
\caption{Otsu prediction model : all selected features are significant (pvalue $<0.1$), and the model is likely to correctly predict future unknown images given that the $R^{2}$ value is higher than $0.9$. $\hat{mpe}$ denotes the mean percentage error.}
+\caption{Otsu prediction model: all selected features are significant ($p$-value $<0.1$), and the $R^{2}$ value above $0.9$ suggests that the model is likely to predict unseen images correctly. The mean percentage error is denoted by $mpe$.}
The same experiment was conducted on the other binarization methods (see Table~\ref{otherPredictionModel}). All prediction models have an $\bar{R^{2}}$ value higher than $0.7$, indicating that it is possible to predict the results of $12$ binarization methods.
+The same experiment was conducted on the other binarization methods. Table~\ref{otherPredictionModel} summarizes the selected features and the statistics used to validate or reject each binarization prediction model.
+Of the 18 features, most models retain about 7. Overall, the selected features are consistent with the binarization algorithm: the stepwise selection process tends to keep global (resp. local) features for global (resp. local) binarization algorithms. We also note that $\mS$ is never selected by any prediction model. Indeed, the binarization accuracy is measured at the pixel level (fscore). With this accuracy measure, the feature $\mSG$ becomes more significant than $\mS$, which might not be the case with another evaluation measure.
We also note that $\mS$ is never selected by any prediction model. Indeed, the binarization accuracy is measured at the pixel level (fscore). With this accuracy measure, the feature $\mSG$ becomes more significant than $\mS$, which may not have been the case with another evaluation measure.
+An $\bar{R^{2}}$ value higher than $0.7$ indicates that it is possible to predict the results of a binarization method~\cite{cohen}. As a result, $12$ binarization methods can be well predicted. The mean percentage error ($mpe$) is the average difference between predicted and real fscores; it is around $5\%$.
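As a sketch of this definition, and assuming the difference is taken in absolute value (the text does not say so explicitly), the mean percentage error over $n$ test images can be written as
\[
mpe = \frac{100}{n} \sum_{i=1}^{n} \left| \hat{f}_i - f_i \right| ,
\]
where $f_i$ is the real fscore of image $i$ and $\hat{f}_i$ is the fscore predicted by the model, both in $[0,1]$. The symbols $n$, $f_i$ and $\hat{f}_i$ are notation introduced here for illustration only. Under this reading, an $mpe$ of $5\%$ corresponds to an average gap of $0.05$ between predicted and real fscores.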
\caption{Accuracy of the prediction model for the other eight binarization methods. The selected features are different from one method to another. The accuracy and robustness of the prediction models are good (cross validation $\bar{R^{2}} > 0.7$). $\hat{mpe}$ denotes the mean percentage error of each model.}
+\caption{Accuracy of the prediction model for the other eight binarization methods. The selected features differ from one method to another. The accuracy and robustness of the prediction models are good (cross-validation $\bar{R^{2}} > 0.7$). The mean percentage error of each model is denoted by $mpe$.}
Table \ref{selectionRes} presents some fscore statistics obtained from binarizing the DIBCO dataset. The first line corresponds to the best theoretical fscores (having the ground truth, we know for each image the binarization method that will provide the best fscore). The second line corresponds to the fscores obtained using only Shijian's method. The last line corresponds to the fscores obtained using our automatic binarization selection.
+More generally, Table~\ref{selectionRes} presents some fscore statistics obtained from binarizing the DIBCO dataset. The first line corresponds to the best theoretical fscores (since the ground truth is available, we know for each image which binarization method provides the best fscore). The second line corresponds to the fscores obtained using only Shijian's method. The last line corresponds to the fscores obtained using our automatic binarization selection.
We analyse the accuracy of our binarization method selection algorithms in several ways. First, the method has a slightly better (2\%) mean accuracy than using only Shijian's method. Importantly, note that our algorithm has a higher global accuracy (the standard deviation equals $0.04$). Last, the worst binarization result of our method is much higher than Shijian's (56\%).
Second, we compared our method with the optimal selection that we can compute from the ground truth. The results are very similar, indicating that the prediction models are accurate enough to select the best binarization method for each image (70\% perfect match). The mean error of our method is $0.009$ (standard deviation equals $0.02$), and, the worst error equals $0.06$.
+We analyse the accuracy of our binarization method selection algorithms in several ways. As expected, the mean accuracy of our method is only slightly better (2\%) than that of Shijian's method alone. More significantly, the standard deviation drops from $0.12$ to $0.04$: our method is more consistent across images, and its worst binarization result is much higher than Shijian's (56\%).
+We also compared our method with the optimal selection that we can compute from the ground truth. The results are very similar, indicating that the prediction models are accurate enough to select the best binarization method for each image (70\% perfect match). The mean error of our method is $0.009$ (standard deviation $0.02$), and the worst error equals $0.06$.
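One possible formalization of this comparison, using notation that is introduced here and is not taken from the original text: let $f_m(x)$ be the real fscore of binarization method $m$ on image $x$, and $\hat{f}_m(x)$ the fscore predicted by the corresponding model. Assuming that the selection simply keeps the method with the highest predicted fscore, the selected method and its error with respect to the optimal selection would be
\[
m^{*}(x) = \arg\max_{m} \hat{f}_m(x), \qquad e(x) = \max_{m} f_m(x) - f_{m^{*}(x)}(x) \geq 0 .
\]
Under this reading, $e(x) = 0$ whenever the selected method is also the optimal one, and the mean and worst errors reported above are the mean and the maximum of $e(x)$ over the dataset.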