# Commits

committed 0dd2fcc Merge

Do we validate all the methods?

• Parent commits e450ca1, 6cf1dc5

# File IJDAR/prediction.tex

 \begin{figure*}[!htbp]
 \begin{center}
 \includegraphics[width=500px]{imgs/shema.png}
-\caption{Overall process to create a prediction model for a specific binarisation algorithm.}
+\caption{Overall process to create a prediction model for a specific binarization algorithm.}
 \label{shema}
 \end{center}
 \end{figure*}

 \begin{center}
 \begin{table}[ht]
-\caption{Otsu prediction model : all selected features are significant (p-value $<0.1$), and the model is likely to correctly predict  future unknown images given that the $R^{2}$ value is higher than $0.9$. $\hat{mpe}$ denotes the mean percentage error.}
+\caption{Otsu prediction model: all selected features are significant (p-value $<0.1$), and the model is likely to correctly predict future unknown images given that the $R^{2}$ value is higher than $0.9$. The mean percentage error is denoted by $mpe$.}
 \label{otsuPredictionModel}
 {\small
 \hfill{}
 $\mu$       	& 	$2.44e-02$ 	& 	$<10^{-4}$ \\
 $v$        	& 	$3.26e-04$ 	&       $<10^{-4}$ \\
 \hline
-\multicolumn{3}{|c|}{$R^2 = 0.93, \hat{mpe} = 5\%$}\\
+\multicolumn{3}{|c|}{$R^2 = 0.93, mpe = 5\%$}\\
 \hline
 \end{tabular}}
 \hfill{}

 \begin{center}
 \begin{table}[ht]
-\caption{Sauvola prediction models. $\hat{mpe}$ denotes the mean percentage error}
+\caption{Sauvola prediction models.}
 \label{sauvolaPredictionModel}
 {\small
 \hfill{}
 $s_{I}$   		&	$1.34e-01$	&  $<10^{-4}$  \\
 $v_{I}$     		&     $4.41e-04$    	&  $<10^{-4}$  \\
 \hline
-\multicolumn{3}{|c|}{$R^2 = 0.83, \hat{mpe} = 10\%$ }\\
+\multicolumn{3}{|c|}{$R^2 = 0.83, mpe = 10\%$ }\\
 \hline
 %\hline
 \multicolumn{3}{|c|}{ Sauvola (manually chosen parameters) prediction model }\\
 $s_{I}$   		&	$1.43e-01$	&  $<10^{-4}$  \\
 $v_{I}$     		&       $4.26e-04$    	&  $<10^{-4}$  \\
 \hline
-\multicolumn{3}{|c|}{$R^2 = 0.84, \hat{mpe} = 7\%$ }\\
+\multicolumn{3}{|c|}{$R^2 = 0.84, mpe = 7\%$ }\\
 \hline

 \end{tabular}}

 \begin{center}
 \begin{table}[ht]
-\caption{Shijian prediction model. $\hat{mpe}$ denotes the mean percentage error.}
+\caption{Shijian prediction model. The mean percentage error is denoted by $mpe$.}
 \label{shijianPredictionModel}
 {\small
 \hfill{}
 $s_{D}$  	        & 	$1.33e-01$  		&  $<10^{-3}$  \\
 $\mu_{I}$      		&	$-4.00e-04$		&  $< 0.5$ \\
 \hline
-\multicolumn{3}{|c|}{$R^2 = 0.86, \hat{mpe} = 5\%$ }\\
+\multicolumn{3}{|c|}{$R^2 = 0.86, mpe = 5\%$ }\\
 \hline
 \end{tabular}}
 \hfill{}
 \subsection{Accuracy of other prediction models}
 \label{subsection-other-prediction}

-The same experiment was conducted on the other binarization methods (see Table~\ref{otherPredictionModel}). All prediction models have an $\bar{R^{2}}$ value higher than $0.7$, indicating that it is possible to predict the results of $12$ binarization methods.
+The same experiment was conducted on the other binarization methods. Table~\ref{otherPredictionModel} summarizes, for each method, the selected features and the statistics used to validate or reject its prediction model.

-<------- modify from here on for next time --->
-explain the table a bit more
-add mean error
+Of the 18 features, most models retain about 7. Overall, the selected features are consistent with the binarization algorithm: the stepwise selection process tends to keep global (resp. local) features for global (resp. local) binarization algorithms. We also note that $\mS$ is never selected by any prediction model. Indeed, the binarization accuracy is measured at the pixel level (f-score). With this accuracy measure, the feature $\mSG$ becomes more significant than $\mS$, which may not have been the case with another evaluation measure.

-We also note that $\mS$ is never selected by any prediction model. Indeed, the binarization accuracy is measured at the pixel level (f-score). With this accuracy measure, the feature $\mSG$ becomes more significant than $\mS$, which may not have been the case with another evaluation measure.
-
+The two values $\bar{R^{2}}$ and $mpe$ indicate the quality of each prediction model.
+A $\bar{R^{2}}$ value higher than $0.7$ indicates that the results of a binarization method can be predicted~\cite{cohen}. By this criterion, $12$ binarization methods can be predicted well. The mean percentage error ($mpe$) is the average difference between predicted and real f-scores; it is around $5\%$.

 \begin{center}
 \begin{table}[ht]
-\caption{Accuracy of the prediction model for the other eight binarization methods. The selected features are different from one method to another. The accuracy and robustness of the prediction models are good (cross validation $\bar{R^{2}} > 0.7$). $\hat{mpe}$ denotes the mean percentage error of each model.}
-\label{otherPredictionModel}
+\caption{Accuracy of the prediction model for the other eight binarization methods. The selected features are different from one method to another. The accuracy and robustness of the prediction models are good (cross-validation $\bar{R^{2}} > 0.7$). The mean percentage error of each model is denoted by $mpe$.}
+\label{otherPredictionModel}
 \hfill{}
 \begin{tabular}{|c|p{3cm}|c|c|c|}

 \hline
-Method &  Selected Features & $R^{2}$ & $\bar{R^{2}}$ & $\hat{mpe}$ \\
+Method &  Selected Features & $R^{2}$ & $\bar{R^{2}}$ & $mpe$ \\
 \hline
 Bernsen & $\mIInk$; $\mA$; $\mSG$; $v$; $v_{D}$; $v_{I}$ & 0.83 & 0.96 & 6\% \\
 \hline
 \end{figure}


-Table \ref{selectionRes} presents some f-score statistics obtained from binarizing the DIBCO dataset. The first line corresponds to the best theoretical f-scores (having the ground truth, we know for each image the binarization method that will provide the best f-score). The second line corresponds to the f-scores obtained using only Shijian's method. The last line corresponds to the f-scores obtained using our automatic binarization selection.
+More generally, Table~\ref{selectionRes} presents some f-score statistics obtained from binarizing the DIBCO dataset. The first line corresponds to the best theoretical f-scores (having the ground truth, we know for each image the binarization method that will provide the best f-score). The second line corresponds to the f-scores obtained using only Shijian's method. The last line corresponds to the f-scores obtained using our automatic binarization selection.

-We analyse the accuracy of our binarization method selection algorithms in several ways. First, the method has a slightly better (2\%) mean accuracy than using only Shijian's method. Importantly, note that our algorithm has a higher global accuracy (the standard deviation equals $0.04$). Last, the worst binarization result of our method is much higher than Shijian's (56\%).
-Second, we compared our method with the optimal selection that we can compute from the ground truth. The results are very similar, indicating that the prediction models are accurate enough to select the best binarization method for each image (70\% perfect match). The mean error of our method is $0.009$ (standard deviation equals $0.02$), and, the worst error equals $0.06$.
+We analyze the accuracy of our binarization method selection algorithm in several ways. As expected, our method has only a slightly better mean accuracy (2\%) than Shijian's method alone. More significantly, the standard deviation drops from $0.12$ to $0.04$; accordingly, the worst binarization result of our method is much higher than Shijian's (56\%).
+We also compared our method with the optimal selection that we can compute from the ground truth. The results are very similar, indicating that the prediction models are accurate enough to select the best binarization method for each image (70\% perfect match). The mean error of our method is $0.009$ (standard deviation equals $0.02$), and the worst error equals $0.06$.
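
The selection step described in the last hunk can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each prediction model is a linear model (an intercept plus one coefficient per selected feature), and the names `predict_fscore`, `select_method`, and `selection_report`, as well as the dictionary layout, are hypothetical. The "error" here is taken to be the gap between the best achievable f-score and the f-score of the selected method, which is one plausible reading of the text.

```python
from statistics import mean


def predict_fscore(model, features):
    """Linear prediction: intercept plus weighted sum of the selected features.

    `model` is a hypothetical dict: {"intercept": float, "coefs": {name: coef}}.
    """
    return model["intercept"] + sum(
        coef * features[name] for name, coef in model["coefs"].items()
    )


def select_method(models, features):
    """Return the name of the method whose model predicts the highest f-score."""
    return max(models, key=lambda name: predict_fscore(models[name], features))


def selection_report(models, images):
    """Compare automatic selection with the ground-truth optimum.

    `images` is a list of (features, true_fscores) pairs, where `true_fscores`
    maps each method name to its measured f-score on that image.
    """
    errors = []
    matches = 0
    for features, true_fscores in images:
        chosen = select_method(models, features)
        best = max(true_fscores, key=true_fscores.get)
        # Assumed error definition: f-score lost by not picking the optimum.
        errors.append(true_fscores[best] - true_fscores[chosen])
        matches += chosen == best
    return {
        "mean_error": mean(errors),
        "worst_error": max(errors),
        "perfect_match_rate": matches / len(images),
    }


# Toy data with made-up coefficients, for illustration only.
toy_models = {
    "A": {"intercept": 0.5, "coefs": {"mu": 1.0}},
    "B": {"intercept": 0.8, "coefs": {"mu": -1.0}},
}
toy_images = [
    ({"mu": 0.25}, {"A": 0.9, "B": 0.7}),
    ({"mu": 0.0}, {"A": 0.85, "B": 0.8}),
]
report = selection_report(toy_models, toy_images)
```

On real data, `models` would hold one fitted model per binarization method and `images` the feature vectors and measured f-scores of the evaluation set; the three reported values correspond to the mean error, worst error, and perfect-match rate discussed above.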