Commits

Anonymous committed 75bf746

Updates to the documentation.

  • Parent commits d2bc01f

Files changed (3)

File docs/rst/Orange.evaluation.reliability.rst

 ********************************************************
 
 Reliability assessment aims to predict reliabilities of individual
-predictions. 
-
-Most of implemented algorithms for regression described in
-"Comparison of approaches for estimating reliability of individual
-regression predictions, Zoran Bosnić, 2008" for regression and in
-in "Evaluating Reliability of Single
-Classifications of Neural Networks, Darko Pevec, 2011" for classification.
+predictions. Most of the implemented algorithms are described in
+[Bosnic2008]_ for regression and in [Pevec2011]_ for classification.
 
 We can use reliability estimation with any Orange learners. The following example:
 
  * Constructs reliability estimators (implemented in this module),
- * Combines a regular learner.
-   (:class:`~Orange.classification.knn.kNNLearner` in this case) with
-   reliability estimators.
+ * Combines a regular learner, here a :obj:`~Orange.classification.knn.kNNLearner`, with reliability estimators through the :obj:`Learner` wrapper,
  * Obtains prediction probabilities from the constructed classifier
    (:obj:`Orange.classification.Classifier.GetBoth` option). The resulting
-   probabilities have and additional attribute, :obj:`reliability_estimate`
-   attribute, :class:`Orange.evaluation.reliability.Estimate`.
+   probabilities have an additional attribute, :obj:`reliability_estimate`,
+   that contains a list of :class:`Orange.evaluation.reliability.Estimate`.
 
 .. literalinclude:: code/reliability-basic.py
     :lines: 7-
 .. literalinclude:: code/reliability-run.py
     :lines: 7-
 
+Reliability estimation wrappers
+===============================
+
+.. autoclass:: Learner
+   :members: __call__
+
+.. autoclass:: Classifier
+   :members: __call__
+
+
 Reliability Methods
 ===================
 
-For regression, you can use all the described measures except :math:`O_{ref}`. Classification is
+All measures except :math:`O_{ref}` work with regression. Classification is
 supported by BAGV, LCV, CNK, DENS and :math:`O_{ref}`.
 
 Sensitivity Analysis (SAvar and SAbias)
 
 
 Stacked generalization (Stacking)
--------------------------------
+---------------------------------
 
 .. autoclass:: Stacking
 
 
 .. autoclass:: ReferenceExpectedError
 
-Reliability estimation wrappers
-===============================
-
-.. autoclass:: Learner(box_learner, name="Reliability estimation", estimators=[SensitivityAnalysis(), LocalCrossValidation(), BaggingVarianceCNeighbours(), Mahalanobis(), MahalanobisToCenter()], **kwds)
-    :members:
-
-.. autoclass:: Classifier
-    :members:
-
 Reliability estimation results
 ==============================
 
 References
 ==========
 
-Bosnić, Z., Kononenko, I. (2007) `Estimation of individual prediction
-reliability using local sensitivity analysis. <http://www.springerlink
-.com/content/e27p2584387532g8/>`_ *Applied Intelligence* 29(3), pp. 187-203.
+.. [Bosnic2007]  Bosnić, Z., Kononenko, I. (2007) `Estimation of individual prediction reliability using local sensitivity analysis. <http://www.springerlink.com/content/e27p2584387532g8/>`_ *Applied Intelligence* 29(3), pp. 187-203.
 
-Bosnić, Z., Kononenko, I. (2008) `Comparison of approaches for estimating
-reliability of individual regression predictions. <http://www.sciencedirect
-.com/science/article/pii/S0169023X08001080>`_ *Data & Knowledge Engineering*
-67(3), pp. 504-516.
+.. [Bosnic2008] Bosnić, Z., Kononenko, I. (2008) `Comparison of approaches for estimating reliability of individual regression predictions. <http://www.sciencedirect.com/science/article/pii/S0169023X08001080>`_ *Data & Knowledge Engineering* 67(3), pp. 504-516.
 
-Bosnić, Z., Kononenko, I. (2010) `Automatic selection of reliability estimates
-for individual regression predictions. <http://journals.cambridge
-.org/abstract_S0269888909990154>`_ *The Knowledge Engineering Review* 25(1),
-pp. 27-47.
+.. [Bosnic2010] Bosnić, Z., Kononenko, I. (2010) `Automatic selection of reliability estimates for individual regression predictions. <http://journals.cambridge.org/abstract_S0269888909990154>`_ *The Knowledge Engineering Review* 25(1), pp. 27-47.
 
-Pevec, D., Štrumbelj, E., Kononenko, I. (2011) `Evaluating Reliability of
-Single Classifications of Neural Networks. <http://www.springerlink.com
-/content/48u881761h127r33/export-citation/>`_ *Adaptive and Natural Computing
-Algorithms*, 2011, pp. 22-30.
+.. [Pevec2011] Pevec, D., Štrumbelj, E., Kononenko, I. (2011) `Evaluating Reliability of Single Classifications of Neural Networks. <http://www.springerlink.com/content/48u881761h127r33/export-citation/>`_ *Adaptive and Natural Computing Algorithms*, 2011, pp. 22-30.

File docs/rst/conf.py

 
 # Example configuration for intersphinx: refer to the Python standard library.
 intersphinx_mapping = {
-    'python': ('http://python.readthedocs.org/en/latest/', None),
+    'python': ('http://python.readthedocs.org/en/latest/', None)
 }
 

File orangecontrib/reliability/__init__.py

 
     :rtype: :class:`Orange.evaluation.reliability.ReferenceExpectedErrorClassifier`
 
-    Reference reliability estimation method for classification as used in Evaluating Reliability of Single
-    Classifications of Neural Networks, Darko Pevec, 2011.
+    Reference reliability estimation method for classification [Pevec2011]_:
 
-    :math:`O_{ref} = 2 (\hat y - \hat y ^2) = 2 \hat y (1-\hat y)`
+    :math:`O_{ref} = 2 (\hat y - \hat y ^2) = 2 \hat y (1-\hat y)`,
 
     where :math:`\hat y` is the estimated probability of the predicted class.
 
-    Note that for this method, in contrast with all others, a greater estimate means lower reliability (greater
-    expected error).
+    Note that for this method, in contrast with all others, a greater estimate means lower reliability (greater expected error).
 
     """
     def __init__(self, name="reference"):
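
The :math:`O_{ref}` formula in the docstring above can be checked with a few lines of plain Python. This is a minimal sketch, not the module's implementation; the function name `reference_expected_error` and the argument `p_hat` are hypothetical.

```python
def reference_expected_error(p_hat):
    """Return O_ref = 2 * p_hat * (1 - p_hat).

    p_hat is the estimated probability of the predicted class.
    (Hypothetical helper for illustration, not part of Orange.)
    """
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be a probability in [0, 1]")
    return 2.0 * p_hat * (1.0 - p_hat)


# A confident prediction gets a smaller estimate than an uncertain one,
# which matches the note that a greater estimate means lower reliability.
confident = reference_expected_error(0.9)
uncertain = reference_expected_error(0.5)
```

The estimate peaks at `p_hat = 0.5` and drops to zero when the predicted class has probability 0 or 1.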
     :rtype: :class:`Orange.evaluation.reliability.MahalanobisClassifier`
     
     Mahalanobis distance reliability estimate is defined as
-    `mahalanobis distance <http://en.wikipedia.org/wiki/Mahalanobis_distance>`_
+    `Mahalanobis distance <http://en.wikipedia.org/wiki/Mahalanobis_distance>`_
     to the evaluated instance's :math:`k` nearest neighbours.
 
     
     :rtype: :class:`Orange.evaluation.reliability.MahalanobisToCenterClassifier`
     
     Mahalanobis distance to center reliability estimate is defined as a
-    `mahalanobis distance <http://en.wikipedia.org/wiki/Mahalanobis_distance>`_
+    `Mahalanobis distance <http://en.wikipedia.org/wiki/Mahalanobis_distance>`_
     between the predicted instance and the centroid of the data.
 
     
     
     :rtype: :class:`Orange.evaluation.reliability.BaggingVarianceCNeighboursClassifier`
     
-    BVCK is a combination (average) of Bagging variance and local modeling of
+    BVCK is an average of Bagging variance and local modeling of
     prediction error.
     
     """
 
         return [Estimate(DENS, ABSOLUTE, DENS_ABSOLUTE)]
 
+
+def _normalize(data):
+    dc = Orange.core.DomainContinuizer()
+    dc.classTreatment = Orange.core.DomainContinuizer.Ignore
+    dc.continuousTreatment = Orange.core.DomainContinuizer.NormalizeByVariance
+    domain = dc(data)
+    data = data.translate(domain)
+    return data
+
+class _NormalizedLearner(Orange.classification.Learner):
+    """
+    Wrapper for normalization.
+    """
+    def __init__(self, learner):
+        self.learner = learner
+
+    def __call__(self, data, *args, **kwargs):
+        return self.learner(_normalize(data), *args, **kwargs)
+
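
A minimal sketch of what the `_normalize` helper and the `_NormalizedLearner` wrapper above appear to do, written in plain Python on lists of feature rows. It assumes that `NormalizeByVariance` standardizes each feature to zero mean and unit deviation, and that the population deviation is used; both are assumptions. The names `normalize_columns` and `NormalizedLearner` are hypothetical and not Orange API. The class value is not part of the rows, mirroring the `Ignore` class treatment.

```python
from statistics import mean, pstdev


def normalize_columns(rows):
    """Standardize each feature column to zero mean and unit deviation.

    Constant columns are mapped to 0.0 to avoid dividing by zero.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    devs = [pstdev(col) for col in columns]
    return [
        [(value - m) / d if d > 0 else 0.0
         for value, m, d in zip(row, means, devs)]
        for row in rows
    ]


class NormalizedLearner:
    """Hypothetical wrapper: normalize the data, then call the wrapped learner."""

    def __init__(self, learner):
        self.learner = learner

    def __call__(self, rows, *args, **kwargs):
        # Extra arguments are passed through to the wrapped learner unchanged.
        return self.learner(normalize_columns(rows), *args, **kwargs)
```

Wrapping the learner rather than normalizing inside it keeps the wrapped learner unchanged, so any callable that accepts the rows can be used.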
 class Stacking:
+    """
 
-    def __init__(self, stack_learner, estimators=None, folds=10, save_data=False):
+    This method develops a model that integrates reliability estimates
+    from all available reliability scoring techniques. To develop such a
+    model, it needs to perform internal cross-validation, similarly to :class:`ICV`.
+
+    :param stack_learner: a data modelling method. Default (if None): unregularized linear regression with prior normalization.
+    :type stack_learner: :obj:`Orange.classification.Learner` 
+
+    :param estimators: Reliability estimation methods to choose from. Default (if None): :class:`SensitivityAnalysis`, :class:`LocalCrossValidation`, :class:`BaggingVarianceCNeighbours`, :class:`Mahalanobis`, :class:`MahalanobisToCenter`.
+    :type estimators: :obj:`list` of reliability estimators
+ 
+    :param folds: The number of folds for cross-validation (default 10).
+    :type folds: :obj:`int`
+
+    :param save_data: If True, save the data used for training the
+        integration model in the resulting classifier's .data attribute (default False).
+    :type save_data: :obj:`bool`
+ 
+    """
+ 
+    def __init__(self, 
+        stack_learner=None, 
+        estimators=None, 
+        folds=10, 
+        save_data=False):
         self.stack_learner = stack_learner
         self.estimators = estimators
         self.folds = folds
         self.save_data = save_data
+        if self.stack_learner is None:
+            self.stack_learner = _NormalizedLearner(
+                Orange.regression.linear.LinearRegressionLearner(ridge_lambda=0.0))
         if self.estimators is None:
              self.estimators = [SensitivityAnalysis(),
                            LocalCrossValidation(),
                 error = ex[-1].value - pred[0].value
                 data_cv.append(estimates + [ abs(error) ])
 
-            print "DCV", len(data_cv)
-
         lf = None
 
         #induce the classifier on cross-validated reliability estimates
         #induce reliability estimates on the whole data set
         lf = Learner(learner, estimators=self.estimators)(data)
 
-        if self.save_data:
-            self.classifier_data = classifier_data
-
-        return StackingClassifier(stack_classifier, lf, newdomain)
+        return StackingClassifier(stack_classifier, lf, newdomain, data=classifier_data if self.save_data else None)
 
 
 class StackingClassifier:
 
-    def __init__(self, stacking_classifier, reliability_classifier, domain):
+    def __init__(self, stacking_classifier, reliability_classifier, domain, data=None):
         self.stacking_classifier = stacking_classifier
-        print self.stacking_classifier
         self.domain = domain
         self.reliability_classifier = reliability_classifier
+        self.data = data
 
     def convert(self, instance):
         """ Return example in the space of reliability estimates. """
         return [ Estimate(r, ABSOLUTE, STACKING) ]
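
The training data for the stack learner pairs each instance's reliability estimates with the absolute prediction error, as the `data_cv.append(estimates + [ abs(error) ])` line in the hunk above suggests. A minimal sketch of that construction in plain Python follows; the function name `build_stacking_rows` and the shape of `records` are hypothetical and not taken from the module.

```python
def build_stacking_rows(records):
    """Turn (estimates, actual, predicted) triples into training rows.

    Each row holds the reliability estimates followed by the absolute
    prediction error, which is the target the stack learner models.
    (Hypothetical helper for illustration, not part of Orange.)
    """
    rows = []
    for estimates, actual, predicted in records:
        error = actual - predicted
        rows.append(list(estimates) + [abs(error)])
    return rows


# Two cross-validated predictions, each with two reliability estimates.
records = [
    ([0.1, 0.5], 3.0, 2.5),
    ([0.2, 0.4], 1.0, 2.0),
]
rows = build_stacking_rows(records)
```

A learner trained on such rows maps a vector of reliability estimates to an expected absolute error.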
 
 class ICV:
-    """ Perform internal cross validation (as in Automatic selection of
-    reliability estimates for individual regression predictions,
-    Zoran Bosnic, 2010) and return id of the method
-    that scored best on this data.
+    """ Selects the best reliability estimator for
+    the given data with internal cross validation [Bosnic2010]_.
+
+    :param estimators: Reliability estimation methods to choose from. Default (if None): :class:`SensitivityAnalysis`, :class:`LocalCrossValidation`, :class:`BaggingVarianceCNeighbours`, :class:`Mahalanobis`, :class:`MahalanobisToCenter`.
+    :type estimators: :obj:`list` of reliability estimators
+ 
+    :param folds: The number of folds for cross-validation (default 10).
+    :type folds: :obj:`int`
+ 
     """
   
     def __init__(self, estimators=None, folds=10):
 
 class Learner:
     """
-    Reliability estimation wrapper around a learner we want to test.
-    Different reliability estimation algorithms can be used on the
-    chosen learner. This learner works as any other and can be used as one,
-    but it returns the classifier, wrapped into an instance of
+    Adds reliability estimation to any learner: multiple reliability estimation 
+    algorithms can be used simultaneously.
+    This learner can be used as any other learner,
+    but returns the classifier wrapped into an instance of
     :class:`Orange.evaluation.reliability.Classifier`.
     
-    :param box_learner: Learner we want to wrap into a reliability estimation
+    :param box_learner: Learner to wrap into a reliability estimation
         classifier.
     :type box_learner: :obj:`~Orange.classification.Learner`
     
-    :param estimators: List of different reliability estimation methods we
-                       want to use on the chosen learner.
+    :param estimators: List of reliability estimation methods. Default (if None): :class:`SensitivityAnalysis`, :class:`LocalCrossValidation`, :class:`BaggingVarianceCNeighbours`, :class:`Mahalanobis`, :class:`MahalanobisToCenter`.
     :type estimators: :obj:`list` of reliability estimators
     
-    :param name: Name of this reliability learner
+    :param name: Name of this reliability learner.
     :type name: string
     
     :rtype: :class:`Orange.evaluation.reliability.Learner`
     def __call__(self, instances, weight=None, **kwds):
         """Learn from the given table of data instances.
         
-        :param instances: Data instances to learn from.
+        :param instances: Data to learn from.
         :type instances: Orange.data.Table
         :param weight: Id of meta attribute with weights of instances
         :type weight: int
+
         :rtype: :class:`Orange.evaluation.reliability.Classifier`
         """
 
  
 class Classifier:
     """
-    A reliability estimation wrapper for classifiers.
-
-    What distinguishes this classifier is that the returned probabilities (if
-    :obj:`Orange.classification.Classifier.GetProbabilities` or
-    :obj:`Orange.classification.Classifier.GetBoth` is passed) contain an
-    additional attribute :obj:`reliability_estimate`, which is an instance of
-    :class:`~Orange.evaluation.reliability.Estimate`.
-
+    A reliability estimation wrapper for classifiers. 
+    The returned probabilities contain an
+    additional attribute :obj:`reliability_estimate`, which is a list of
+    :class:`~Orange.evaluation.reliability.Estimate` (see :obj:`~Classifier.__call__`).
     """
 
     def __init__(self, instances, box_learner, estimators, blending, blending_domain, rf_classifier, **kwds):
         When :obj:`result_type` is set to
         :obj:`Orange.classification.Classifier.GetBoth` or
         :obj:`Orange.classification.Classifier.GetProbabilities`,
-        an additional attribute :obj:`reliability_estimate`,
-        which is an instance of
-        :class:`~Orange.evaluation.reliability.Estimate`,
+        an additional attribute :obj:`reliability_estimate`
+        (a list of :class:`~Orange.evaluation.reliability.Estimate`)
         is added to the distribution object.
         
         :param instance: instance to be classified.