.. image:: ../../orangecontrib/multitarget/widgets/icons/ClusteringRandomForest.png
   :alt: Widget icon

- Data
    Data to be used for learning.

- Learner or Classifier


.. image:: images/crf1.*
   :alt: Usage example

A clustering random forest is a random forest consisting of clustering trees. Usage is straightforward, and the settings are described below.

* Number of trees in forest

    Number of clustering trees in the forest.

* Stop splitting nodes at depth

    Maximum depth of each tree.

* Minimal majority class proportion

    Minimal proportion of the majority class value each of the class variables has to reach
    to stop induction (only used for classification). 

* Min mean squared error

    Minimal mean squared error each of the class variables has to reach
    to stop induction (only used for regression). 

* Min. instances in leaves

    Minimal number of instances in leaves. Instance count is weighted.

* Feature scorer

        * Inter dist (default) - Euclidean distance between centroids of clusters
        * Intra dist - average Euclidean distance of each member of a cluster to the centroid of that cluster
        * Silhouette - silhouette measure calculated with Euclidean distances between clusters instead of between elements of a cluster
        * Gini-index - calculates the Gini-gain index; should be used when the class variables have nominal values
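
The distance-based scorers and the Gini gain can be illustrated with a small, self-contained sketch. This is not the widget's implementation: the function names are hypothetical, the split is assumed to be binary, and the Gini gain follows the textbook definition for a single nominal class variable.

```python
import math
from collections import Counter


def centroid(rows):
    """Component-wise mean of a list of equal-length target vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inter_dist(left, right):
    """Inter dist: distance between the centroids of the two clusters."""
    return euclidean(centroid(left), centroid(right))


def intra_dist(cluster):
    """Intra dist: average distance of each member to its cluster centroid."""
    c = centroid(cluster)
    return sum(euclidean(row, c) for row in cluster) / len(cluster)


def gini(labels):
    """Gini impurity of a list of nominal values."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())


def gini_gain(parent, left, right):
    """Textbook Gini gain: parent impurity minus weighted child impurity."""
    n = len(parent)
    children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - children


# Target vectors (two class variables) on each side of a candidate split.
left = [[0.0, 0.0], [0.0, 2.0]]
right = [[3.0, 5.0], [3.0, 5.0]]

print(inter_dist(left, right))  # distance between centroids [0, 1] and [3, 5]
print(intra_dist(left))         # average distance to centroid [0, 1]
print(gini_gain(["a", "a", "b", "b"], ["a", "a"], ["b", "b"]))
```

In this sketch a larger inter dist indicates better separated clusters, while a smaller intra dist indicates more compact ones.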
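
The two stopping settings (minimal majority class proportion and min mean squared error) both require that each of the class variables meets the threshold. A minimal sketch of that check follows, assuming inclusive comparisons and a mean squared error measured around the mean of the node; the function names are hypothetical and the actual implementation may differ in these details.

```python
from collections import Counter


def majority_proportion(values):
    """Proportion of the most frequent value of one class variable."""
    return max(Counter(values).values()) / len(values)


def mean_squared_error(values):
    """Mean squared deviation of one class variable from its mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def stop_classification(columns, min_majority):
    """Stop when every class variable reaches the majority proportion."""
    return all(majority_proportion(col) >= min_majority for col in columns)


def stop_regression(columns, min_mse):
    """Stop when the MSE of every class variable is at most min_mse."""
    return all(mean_squared_error(col) <= min_mse for col in columns)


# One list per class variable, holding the values of the instances in a node.
nominal = [["a", "a", "a", "b"], ["x", "x", "x", "x"]]
numeric = [[1.0, 1.0, 3.0, 3.0], [2.0, 2.0, 2.0, 2.0]]

print(stop_classification(nominal, 0.7))  # proportions are 0.75 and 1.0
print(stop_classification(nominal, 0.8))  # first variable falls short
print(stop_regression(numeric, 1.0))      # MSEs are 1.0 and 0.0
print(stop_regression(numeric, 0.5))      # first variable is too noisy
```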