Reliability estimation (Orange.evaluation.reliability)

Reliability assessment statistically estimates the reliability of individual predictions. Most of the implemented algorithms for regression are taken from "Comparison of approaches for estimating reliability of individual regression predictions" (Bosnić and Kononenko, 2008). The implementations for classification follow the descriptions in "Evaluating Reliability of Single Classifications of Neural Networks" (Pevec et al., 2011).

The following example shows basic usage of reliability estimation methods:

The important points of this example are:

Reliability can also be estimated for a whole data table, not only for a single instance. The next example uses cross-validation for reliability estimation and prints the reliability estimates for the first 10 instances:

Reliability Methods

For regression, all of the measures described below can be used except \(O_{ref}\). Classification domains are supported by the following methods: BAGV, LCV, CNK, DENS and \(O_{ref}\).

Sensitivity Analysis (SAvar and SAbias)

Variance of bagged models (BAGV)
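
A minimal sketch of the idea, following the definition in Bosnić and Kononenko (2008): train \(m\) models on bootstrap samples, predict the instance with each of them, and report the variance of those predictions, \(\frac{1}{m}\sum_{i=1}^{m}(K_i - K)^2\), where \(K\) is their mean. The least-squares line learner and the names fit_line and bagv are illustrative assumptions and not part of this module's API.

```python
import random
from statistics import mean


def fit_line(points):
    """Hypothetical base learner: least-squares line through (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:  # degenerate bootstrap sample: predict the mean label
        return lambda x: y_bar
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def bagv(train, x, m=50, seed=0):
    """Variance of the predictions of m bagged models for instance x."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(m):
        # Bootstrap sample: draw len(train) examples with replacement.
        sample = [rng.choice(train) for _ in train]
        predictions.append(fit_line(sample)(x))
    bagged = mean(predictions)  # K, the bagged prediction
    return sum((k - bagged) ** 2 for k in predictions) / m


noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1]
train = [(float(x), 2.0 * x + e) for x, e in zip(range(10), noise)]
inside = bagv(train, 4.5)    # instance in the middle of the training range
outside = bagv(train, 30.0)  # instance far outside the training range
```

In this toy setup the bagged models disagree more on the extrapolated instance, so its estimate is larger, which is the behaviour the measure is meant to capture.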

Local cross validation reliability estimate (LCV)
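
A rough sketch of the procedure as described in Bosnić and Kononenko (2008): take the \(k\) nearest neighbours of the instance, run leave-one-out inside that neighbourhood, and average the absolute leave-one-out errors weighted by each neighbour's distance to the instance. The mean-label model used here is a stand-in assumption (the actual method would retrain the wrapped learner), and the names mean_model and lcv are hypothetical.

```python
import math


def mean_model(examples):
    """Hypothetical stand-in learner: always predicts the mean label."""
    avg = sum(label for _, label in examples) / len(examples)
    return lambda x: avg


def lcv(train, x, k=3, fit=mean_model):
    """Distance-weighted leave-one-out error over the k nearest neighbours."""
    neighbours = sorted(train, key=lambda ex: math.dist(ex[0], x))[:k]
    weighted_error = 0.0
    total_distance = 0.0
    for i, (xi, ci) in enumerate(neighbours):
        rest = neighbours[:i] + neighbours[i + 1:]  # leave neighbour i out
        error = abs(ci - fit(rest)(xi))
        distance = math.dist(xi, x)
        weighted_error += distance * error
        total_distance += distance
    return weighted_error / total_distance if total_distance else 0.0


smooth = [((0.0,), 1.0), ((1.0,), 1.0), ((2.0,), 1.0), ((3.0,), 1.0)]
rough = [((0.0,), 1.0), ((1.0,), 5.0), ((2.0,), 1.0), ((3.0,), 5.0)]
lcv_smooth = lcv(smooth, (1.2,))
lcv_rough = lcv(rough, (1.2,))
```

With constant labels every leave-one-out error is zero, so the estimate is zero; with labels that vary locally the estimate grows.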

Local modeling of prediction error (CNK)
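
As a small illustration for regression, following Bosnić and Kononenko (2008): the signed estimate is the mean label of the \(k\) nearest neighbours minus the model's prediction, \(\frac{1}{k}\sum_{i=1}^{k} C_i - K\), and the absolute estimate is its absolute value. The function names and the toy data below are assumptions made for this sketch, not this module's API.

```python
import math


def k_nearest(train, x, k):
    """Return the k training examples closest to x (Euclidean distance)."""
    return sorted(train, key=lambda ex: math.dist(ex[0], x))[:k]


def cnk(train, x, prediction, k=3):
    """Signed and absolute CNK estimates for a regression prediction."""
    neighbours = k_nearest(train, x, k)
    mean_label = sum(label for _, label in neighbours) / len(neighbours)
    signed = mean_label - prediction
    return signed, abs(signed)


train = [
    ((0.0, 0.0), 1.0),
    ((1.0, 0.0), 2.0),
    ((0.0, 1.0), 3.0),
    ((5.0, 5.0), 10.0),
]
# The three nearest neighbours of (0.2, 0.2) have labels 1, 2 and 3.
signed, absolute = cnk(train, (0.2, 0.2), prediction=2.5, k=3)
```

A negative signed value suggests that the model predicted higher than the local neighbourhood would, which is why this measure is available in both signed and absolute form.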

Bagging variance c-neighbours (BVCK)

Mahalanobis distance

Mahalanobis to center

Density estimation using Parzen window (DENS)
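
The idea can be sketched as follows, assuming a Gaussian kernel with a fixed bandwidth h (both are choices made for this example). Following Bosnić and Kononenko (2008), the density at the instance is subtracted from the maximum density over the training instances, so that sparsely populated regions receive larger values. The names parzen_density and dens are hypothetical.

```python
import math


def parzen_density(data, x, h=1.0):
    """Parzen-window density estimate at x with a Gaussian kernel."""
    d = len(x)
    norm = (2 * math.pi) ** (d / 2) * h ** d
    total = sum(math.exp(-math.dist(p, x) ** 2 / (2 * h * h)) for p in data)
    return total / (len(data) * norm)


def dens(data, x, h=1.0):
    """Maximum training density minus the density at x."""
    max_density = max(parzen_density(data, p, h) for p in data)
    return max_density - parzen_density(data, x, h)


data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
near = dens(data, (0.05, 0.05))  # inside the dense cluster
far = dens(data, (10.0, 10.0))   # far from all training instances
```

Note that an instance lying in a denser spot than any training instance yields a negative value under this definition.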

Internal cross validation (ICV)

Stacked generalization (Stacking)

Reference Estimate for Classification (\(O_{ref}\))

Reliability estimation wrappers

Reliability estimation results

The dictionary :obj:`METHOD_NAME` maps reliability estimation method IDs (integers) to method names (strings).

In this module, there are also two constants for distinguishing signed and absolute reliability estimation measures:


Reliability estimation scoring


This script prints Pearson's correlation coefficient r between the reliability estimates and the actual prediction errors, together with the corresponding p-value, for each of the reliability estimation measures used by default.

Estimate               r       p
SAvar absolute        -0.077   0.454
SAbias signed         -0.165   0.105
SAbias absolute        0.095   0.352
LCV absolute           0.069   0.504
BVCK absolute          0.060   0.562
BAGV absolute          0.078   0.448
CNK signed             0.233   0.021
CNK absolute           0.058   0.574
Mahalanobis absolute   0.091   0.375
Mahalanobis to center  0.096   0.349
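
To make the r column concrete, here is a self-contained sketch of how such a score could be computed. It assumes that signed estimates are correlated with signed errors (actual minus predicted) and absolute estimates with absolute errors; that convention, the function names, and the toy numbers are assumptions of this example. The p-value is omitted because it needs a t-distribution; `scipy.stats.pearsonr` returns both values if SciPy is available.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def score_estimate(estimates, actual, predicted, signed):
    """Correlate reliability estimates with signed or absolute errors."""
    errors = [a - p for a, p in zip(actual, predicted)]
    if not signed:
        errors = [abs(e) for e in errors]
    return pearson_r(estimates, errors)


actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.0, 9.1]
# Toy absolute estimates that happen to match the absolute errors.
estimates = [0.5, 0.5, 1.0, 0.1]
r_absolute = score_estimate(estimates, actual, predicted, signed=False)
r_signed = score_estimate(estimates, actual, predicted, signed=True)
```

A value of r close to 1 means that the estimate tracks the prediction error well; values near 0, as in most rows of the table above, indicate little linear relationship.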


Bosnić, Z., Kononenko, I. (2007) Estimation of individual prediction reliability using local sensitivity analysis. Applied Intelligence 29(3), pp. 187-203.

Bosnić, Z., Kononenko, I. (2008) Comparison of approaches for estimating reliability of individual regression predictions. Data & Knowledge Engineering 67(3), pp. 504-516.

Bosnić, Z., Kononenko, I. (2010) Automatic selection of reliability estimates for individual regression predictions. The Knowledge Engineering Review 25(1), pp. 27-47.

Pevec, D., Štrumbelj, E., Kononenko, I. (2011) Evaluating Reliability of Single Classifications of Neural Networks. Adaptive and Natural Computing Algorithms, 2011, pp. 22-30.