Commits

Andrej T. committed 55d9c65

Reverted changes in the tutorial.

Comments (0)

Files changed (15)

docs/tutorial/rst/basic-exploration.rst

 examples. If you are curious how we do this, here is the code
 (:download:`sample_adult.py <code/sample_adult.py>`)::
 
-   import Orange
-   data = Orange.core.ExampleTable("adult")
-   selection = Orange.core.MakeRandomIndices2(data, 0.03)
+   import orange
+   data = orange.ExampleTable("adult")
+   selection = orange.MakeRandomIndices2(data, 0.03)
    sample = data.select(selection, 0)
    sample.save("adult_sample.tab")
 
 values, and class distribution. Below is the script that does all
 this (:download:`data_characteristics.py <code/data_characteristics.py>`, :download:`adult_sample.tab <code/adult_sample.tab>`)::
 
-   import Orange
-   data = Orange.core.ExampleTable("adult_sample")
+   import orange
+   data = orange.ExampleTable("adult_sample")
    
    # report on number of classes and attributes
    print "Classes:", len(data.domain.classVar.values)
    # count number of continuous and discrete attributes
    ncont=0; ndisc=0
    for a in data.domain.attributes:
-       if a.varType == Orange.core.VarTypes.Discrete:
+       if a.varType == orange.VarTypes.Discrete:
            ndisc = ndisc + 1
        else:
            ncont = ncont + 1
 through the attributes (variable ``a`` is an iteration variable that is, in
 each loop iteration, associated with a single attribute).  The field ``varType``
 contains the type of the attribute; for discrete attributes, ``varType``
-is equal to ``Orange.core.VarTypes.Discrete``, and for continuous ``varType`` is
-equal to ``Orange.core.VarTypes.Continuous``.
+is equal to ``orange.VarTypes.Discrete``, and for continuous ``varType`` is
+equal to ``orange.VarTypes.Continuous``.
 
 To obtain the number of instances for each class, we first
 initialized a vector ``c`` that would be of length equal to the number of
 
    print "Continuous attributes:"
    for a in range(len(data.domain.attributes)):
-       if data.domain.attributes[a].varType == Orange.core.VarTypes.Continuous:
+       if data.domain.attributes[a].varType == orange.VarTypes.Continuous:
            d = 0.; n = 0
            for e in data:
                if not e[a].isSpecial():
 out in a readable form (part of :download:`data_characteristics3.py <code/data_characteristics3.py>`)::
 
    print "\nNominal attributes (contingency matrix for classes:", data.domain.classVar.values, ")"
-   cont = Orange.core.DomainContingency(data)
+   cont = orange.DomainContingency(data)
    for a in data.domain.attributes:
-       if a.varType == Orange.core.VarTypes.Discrete:
+       if a.varType == orange.VarTypes.Discrete:
            print "  %s:" % a.name
            for v in range(len(a.values)):
                sum = 0
 defined. Let us use this function to compute the proportion of missing
 values per each attribute (:download:`report_missing.py <code/report_missing.py>`)::
 
-   import Orange
-   data = Orange.core.ExampleTable("adult_sample")
+   import orange
+   data = orange.ExampleTable("adult_sample")
    
    natt = len(data.domain.attributes)
    missing = [0.] * natt
 -------------------------------
 
 For some of the tasks above, Orange can provide a shortcut by means of the
-``Orange.core.DomainDistributions`` function, which returns an object that
+``orange.DomainDistributions`` function, which returns an object that
 holds averages and mean square errors for continuous attributes, value
 frequencies for discrete attributes, and, for both, the number of instances
 where a specific attribute has a missing value.  The use of this object
 is exemplified in the following script (:download:`data_characteristics4.py <code/data_characteristics4.py>`)::
 
-   import Orange
-   data = Orange.core.ExampleTable("adult_sample")
-   dist = Orange.core.DomainDistributions(data)
+   import orange
+   data = orange.ExampleTable("adult_sample")
+   dist = orange.DomainDistributions(data)
    
    print "Average values and mean square errors:"
    for i in range(len(data.domain.attributes)):
-       if data.domain.attributes[i].varType == Orange.core.VarTypes.Continuous:
+       if data.domain.attributes[i].varType == orange.VarTypes.Continuous:
            print "%s, mean=%5.2f +- %5.2f" % \
                (data.domain.attributes[i].name, dist[i].average(), dist[i].error())
    
    print "\nFrequencies for values of discrete attributes:"
    for i in range(len(data.domain.attributes)):
        a = data.domain.attributes[i]
-       if a.varType == Orange.core.VarTypes.Discrete:
+       if a.varType == orange.VarTypes.Discrete:
            print "%s:" % a.name
            for j in range(len(a.values)):
                print "  %s: %d" % (a.values[j], int(dist[i][j]))

docs/tutorial/rst/classification.rst

 from the data that incorporates class-labeled instances, like
 :download:`voting.tab <code/voting.tab>`::
 
-   >>> data = Orange.core.ExampleTable("voting.tab")
+   >>> data = orange.ExampleTable("voting.tab")
    >>> data[0]
    ['n', 'y', 'n', 'y', 'y', 'y', 'n', 'n', 'n', 'y', '?', 'y', 'y', 'y', 'n', 'y', 'republican']
    >>> data[0].getclass()
    <orange.Value 'party'='republican'>
-   
 
 Supervised data mining attempts to develop predictive models from such
 data that, given the set of feature values, predict a corresponding
 
 There are two types of objects important for classification: learners
 and classifiers. Orange has a number of built-in learners. For
-instance, ``BayesLearner`` is a naive Bayesian learner. When
-data is passed to a learner (e.g., ``BayesLearner(data))``, it
+instance, ``orange.BayesLearner`` is a naive Bayesian learner. When
+data is passed to a learner (e.g., ``orange.BayesLearner(data)``), it
 returns a classifier. When a data instance is presented to a classifier,
 it returns a class, vector of class probabilities, or both.
 
 will use it to classify the first five instances from this data set
 (:download:`classifier.py <code/classifier.py>`)::
 
-   import Orange
-   from Orange.classification.bayes import _BayesClassifier as BayesClassifier
-
-   data = Orange.core.ExampleTable("voting")
-   classifier = BayesLearner(data)
+   import orange
+   data = orange.ExampleTable("voting")
+   classifier = orange.BayesLearner(data)
    for i in range(5):
        c = classifier(data[i])
        print "original", data[i].getclass(), "classified as", c
 democrats have a class index 1. We find this out by printing
 ``data.domain.classVar.values`` (:download:`classifier2.py <code/classifier2.py>`)::
 
-   import Orange
-   from Orange.classification.bayes import _BayesLearner as BayesLearner
-
-   data = Orange.core.ExampleTable("voting")
-   classifier = BayesLearner(data)
+   import orange
+   data = orange.ExampleTable("voting")
+   classifier = orange.BayesLearner(data)
    print "Possible classes:", data.domain.classVar.values
    print "Probabilities for democrats:"
    for i in range(5):
-       p = classifier(data[i], orange.core.GetProbabilities)
+       p = classifier(data[i], orange.GetProbabilities)
        print "%d: %5.3f (originally %s)" % (i+1, p[1], data[i].getclass())
 
 The output of this script is::
 the use of classification trees and to assemble the learner with
 some usual (default) components. Here is such a script (:download:`tree.py <code/tree.py>`)::
 
-   import Orange
-   from Orange.orng import orngTree
-
-   data = Orange.core.ExampleTable("voting")
+   import orange, orngTree
+   data = orange.ExampleTable("voting")
    
-   tree = orngTree.TreeLearner(data, same_majority_pruning=1, m_pruning=2)
+   tree = orngTree.TreeLearner(data, sameMajorityPruning=1, mForPruning=2)
    print "Possible classes:", data.domain.classVar.values
    print "Probabilities for democrats:"
    for i in range(5):
-       p = tree(data[i], Orange.core.GetProbabilities)
+       p = tree(data[i], orange.GetProbabilities)
        print "%d: %5.3f (originally %s)" % (i+1, p[1], data[i].getclass())
    
    orngTree.printTxt(tree)
 
    Possible classes: <republican, democrat>
    Probabilities for democrats:
-   1: 0.002 (originally republican)
-   2: 0.001 (originally republican)
-   3: 0.995 (originally democrat)
-   4: 0.998 (originally democrat)
-   5: 0.998 (originally democrat)
+   1: 0.051 (originally republican)
+   2: 0.027 (originally republican)
+   3: 0.989 (originally democrat)
+   4: 0.985 (originally democrat)
+   5: 0.985 (originally democrat)
 
 Notice that all of the instances are classified correctly. The last
 line of the script prints out the tree that was used for
 classification::
 
-   physician-fee-freeze=n
-   |    adoption-of-the-budget-resolution=n
-   |    |    education-spending=n
-   |    |    |    synfuels-corporation-cutback=n
-   |    |    |    |    mx-missile=n
-   |    |    |    |    |    handicapped-infants=n: republican (100.00%)
+   physician-fee-freeze=n: democrat (98.52%)
+   physician-fee-freeze=y
+   |    synfuels-corporation-cutback=n: republican (97.25%)
+   |    synfuels-corporation-cutback=y
+   |    |    mx-missile=n
+   |    |    |    el-salvador-aid=y
+   |    |    |    |    adoption-of-the-budget-resolution=n: republican (85.33%)
+   |    |    |    |    adoption-of-the-budget-resolution=y
+   |    |    |    |    |    anti-satellite-test-ban=n: democrat (99.54%)
+   |    |    |    |    |    anti-satellite-test-ban=y: republican (100.00%)
+   |    |    |    el-salvador-aid=n
+   |    |    |    |    handicapped-infants=n: republican (100.00%)
+   |    |    |    |    handicapped-infants=y: democrat (99.77%)
+   |    |    mx-missile=y
+   |    |    |    religious-groups-in-schools=y: democrat (99.54%)
+   |    |    |    religious-groups-in-schools=n
+   |    |    |    |    immigration=y: republican (98.63%)
+   |    |    |    |    immigration=n
+   |    |    |    |    |    handicapped-infants=n: republican (98.63%)
    |    |    |    |    |    handicapped-infants=y: democrat (99.77%)
-   |    |    |    |    mx-missile=y
-   |    |    |    |    |    religious-groups-in-schools=n
-   |    |    |    |    |    |    crime=n: democrat (99.77%)
-   |    |    |    |    |    |    crime=y: republican (99.74%)
-   |    |    |    |    |    religious-groups-in-schools=y
-   |    |    |    |    |    |    superfund-right-to-sue=y: democrat (99.54%)
-   |    |    |    |    |    |    superfund-right-to-sue=n
-   |    |    |    |    |    |    |    crime=n: democrat (99.77%)
-   |    |    |    |    |    |    |    crime=y
-   |    |    |    |    |    |    |    |    aid-to-nicaraguan-contras=n
-   |    |    |    |    |    |    |    |    |    handicapped-infants=n: republican (99.74%)
-   |    |    |    |    |    |    |    |    |    handicapped-infants=y: democrat (99.77%)
 
 The printout includes the feature on which the tree branches in the
 internal nodes. For leaves, it shows the class label to which a
 (new ones) and prints predictions for the first 10 instances of the voting
 data set (:download:`handful.py <code/handful.py>`)::
 
-   import Orange
-   from Orange.orng import orngTree
-
-   data = Orange.core.ExampleTable("voting")
-
+   import orange, orngTree
+   data = orange.ExampleTable("voting")
+   
    # setting up the classifiers
-   majority = Orange.classification.majority.MajorityLearner(data)
-   bayes = Orange.classification.bayes._BayesLearner(data)
-   tree = orngTree.TreeLearner(data, same_majority_pruning=1, m_pruning=2)
-   knn = Orange.classification.knn.kNNLearner(data, k=21)
-
+   majority = orange.MajorityLearner(data)
+   bayes = orange.BayesLearner(data)
+   tree = orngTree.TreeLearner(data, sameMajorityPruning=1, mForPruning=2)
+   knn = orange.kNNLearner(data, k=21)
+   
    majority.name="Majority"; bayes.name="Naive Bayes";
    tree.name="Tree"; knn.name="kNN"
-
+   
    classifiers = [majority, bayes, tree, knn]
-
+   
    # print the head
    print "Possible classes:", data.domain.classVar.values
    print "Probability for republican:"
    for l in classifiers:
        print "%-13s" % (l.name),
    print
-
+   
    # classify first 10 instances and print probabilities
    for example in data[:10]:
        print "(%-10s)  " % (example.getclass()),
-       for classifier in classifiers:
-           p = classifier(example, Orange.core.GetProbabilities)
+       for c in classifiers:
+           p = apply(c, [example, orange.GetProbabilities])
            print "%5.3f        " % (p[0]),
        print
 
 and for the rest of the code we would not worry about it any
 more). The script then prints the header with the names of the
 classifiers, and finally uses the classifiers to compute the
-probabilities of classes. The output of our script is::
+probabilities of classes. Note the special function ``apply``, which
+we have not met yet: it simply calls the function given as its first
+argument and passes it the arguments given in the list. In our case,
+``apply`` invokes each classifier with a data instance and a request
+to compute probabilities (a small stand-alone example of ``apply``
+follows the output below). The output of our script is::
 
    Possible classes: <republican, democrat>
    Probability for republican:
-   Original Class Majority      Naive Bayes   Tree          kNN          
-   (republican)   0.386         1.000         0.949         1.000        
-   (republican)   0.386         1.000         0.973         1.000        
-   (democrat  )   0.386         0.995         0.011         0.048        
-   (democrat  )   0.386         0.002         0.015         0.000        
-   (democrat  )   0.386         0.043         0.015         0.018        
-   (democrat  )   0.386         0.228         0.015         0.192        
-   (democrat  )   0.386         1.000         0.973         0.665        
-   (republican)   0.386         1.000         0.973         0.861        
-   (republican)   0.386         1.000         0.973         1.000        
-   (democrat  )   0.386         0.000         0.015         0.000  
+   Original Class Majority      Naive Bayes   Tree          kNN
+   (republican)   0.386         1.000         0.949         1.000
+   (republican)   0.386         1.000         0.973         1.000
+   (democrat  )   0.386         0.995         0.011         0.138
+   (democrat  )   0.386         0.002         0.015         0.468
+   (democrat  )   0.386         0.043         0.015         0.035
+   (democrat  )   0.386         0.228         0.015         0.442
+   (democrat  )   0.386         1.000         0.973         0.977
+   (republican)   0.386         1.000         0.973         1.000
+   (republican)   0.386         1.000         0.973         1.000
+   (democrat  )   0.386         0.000         0.015         0.000
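
As promised above, here is a tiny stand-alone illustration of Python's built-in ``apply`` (plain Python 2, nothing Orange-specific)::

   >>> def ratio(a, b):
   ...     return float(a) / b
   ...
   >>> apply(ratio, [3, 4])
   0.75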
 
 .. note::
    The prediction of the majority class classifier does not depend on the

docs/tutorial/rst/code/classifier.py

 # Uses:        voting.tab
 # Referenced:  c_basics.htm
 
-import Orange
-from Orange.classification.bayes import _BayesLearner as BayesLearner
-
-data = Orange.core.ExampleTable("voting")
-classifier = BayesLearner(data)
+import orange
+data = orange.ExampleTable("voting")
+classifier = orange.BayesLearner(data)
 for i in range(5):
     c = classifier(data[i])
     print "%d: %s (originally %s)" % (i+1, c, data[i].getclass())

docs/tutorial/rst/code/classifier2.py

 # Uses:        voting.tab
 # Referenced:  c_basics.htm
 
-import Orange
-from Orange.classification.bayes import _BayesLearner as BayesLearner
-
-data = Orange.core.ExampleTable("voting")
-classifier = BayesLearner(data)
+import orange
+data = orange.ExampleTable("voting")
+classifier = orange.BayesLearner(data)
 print "Possible classes:", data.domain.classVar.values
 print "Probabilities for democrats:"
 for i in range(5):
-    p = classifier(data[i], Orange.core.GetProbabilities)
+    p = classifier(data[i], orange.GetProbabilities)
     print "%d: %5.3f (originally %s)" % (i+1, p[1], data[i].getclass())

docs/tutorial/rst/code/data_characteristics.py

 # Uses:        adult_sample.tab
 # Referenced:  basic_exploration.htm
 
-import Orange
-data = Orange.core.ExampleTable("adult_sample.tab")
+import orange
+data = orange.ExampleTable("adult_sample.tab")
 print "Classes:", len(data.domain.classVar.values)
 print "Attributes:", len(data.domain.attributes), ",",
 
 # count number of continuous and discrete attributes
 ncont = 0; ndisc = 0
 for a in data.domain.attributes:
-    if a.varType == Orange.core.VarTypes.Discrete:
+    if a.varType == orange.VarTypes.Discrete:
         ndisc = ndisc + 1
     else:
         ncont = ncont + 1

docs/tutorial/rst/code/data_characteristics2.py

 # Uses:        adult_sample.tab
 # Referenced:  basic_exploration.htm
 
-import Orange
-data = Orange.core.ExampleTable("adult_sample.tab")
+import orange
+data = orange.ExampleTable("adult_sample.tab")
 print "Classes:", len(data.domain.classVar.values)
 print "Attributes:", len(data.domain.attributes), ",",
 
 # count number of continuous and discrete attributes
 ncont = 0; ndisc = 0
 for a in data.domain.attributes:
-    if a.varType == Orange.core.VarTypes.Discrete:
+    if a.varType == orange.VarTypes.Discrete:
         ndisc = ndisc + 1
     else:
         ncont = ncont + 1

docs/tutorial/rst/code/data_characteristics3.py

 # Classes:     DomainContingency
 # Referenced:  basic_exploration.htm
 
-import Orange
-data = Orange.core.ExampleTable("adult_sample.tab")
+import orange
+data = orange.ExampleTable("adult_sample.tab")
 
 print "Continuous attributes:"
 for a in range(len(data.domain.attributes)):
-    if data.domain.attributes[a].varType == Orange.core.VarTypes.Continuous:
+    if data.domain.attributes[a].varType == orange.VarTypes.Continuous:
         d = 0.; n = 0
         for e in data:
             if not e[a].isSpecial():
         print "  %s, mean=%3.2f" % (data.domain.attributes[a].name, d / n)
 
 print "\nNominal attributes (contingency matrix for classes:", data.domain.classVar.values, ")"
-cont = Orange.core.DomainContingency(data)
+cont = orange.DomainContingency(data)
 for a in data.domain.attributes:
-    if a.varType == Orange.core.VarTypes.Discrete:
+    if a.varType == orange.VarTypes.Discrete:
         print "  %s:" % a.name
         for v in range(len(a.values)):
             sum = 0

docs/tutorial/rst/code/data_characteristics4.py

 # Uses:        adult_sample.tab
 # Referenced:  basic_exploration.htm
 
-import Orange
-data = Orange.core.ExampleTable("adult_sample.tab")
-dist = Orange.core.DomainDistributions(data)
+import orange
+data = orange.ExampleTable("adult_sample.tab")
+dist = orange.DomainDistributions(data)
 
 print "Average values and mean square errors:"
 for i in range(len(data.domain.attributes)):
-    if data.domain.attributes[i].varType == Orange.core.VarTypes.Continuous:
+    if data.domain.attributes[i].varType == orange.VarTypes.Continuous:
         print "%s, mean=%5.2f +- %5.2f" % \
           (data.domain.attributes[i].name, dist[i].average(), dist[i].error())
 
 print "\nFrequencies for values of discrete attributes:"
 for i in range(len(data.domain.attributes)):
     a = data.domain.attributes[i]
-    if a.varType == Orange.core.VarTypes.Discrete:
+    if a.varType == orange.VarTypes.Discrete:
         print "%s:" % a.name
         for j in range(len(a.values)):
             print "  %s: %d" % (a.values[j], int(dist[i][j]))

docs/tutorial/rst/code/handful.py

 # Classes:     MajorityLearner, BayesLearner, orngTree.TreeLearner, kNNLearner
 # Referenced:  c_otherclass.htm
 
-import Orange
-from Orange.orng import orngTree
-
-data = Orange.core.ExampleTable("voting")
+import orange, orngTree
+data = orange.ExampleTable("voting")
 
 # setting up the classifiers
-majority = Orange.classification.majority.MajorityLearner(data)
-bayes = Orange.classification.bayes._BayesLearner(data)
-tree = orngTree.TreeLearner(data, same_majority_pruning=1, m_pruning=2)
-knn = Orange.classification.knn.kNNLearner(data, k=21)
+majority = orange.MajorityLearner(data)
+bayes = orange.BayesLearner(data)
+tree = orngTree.TreeLearner(data, sameMajorityPruning=1, mForPruning=2)
+knn = orange.kNNLearner(data, k=21)
 
 majority.name="Majority"; bayes.name="Naive Bayes";
 tree.name="Tree"; knn.name="kNN"
 # classify first 10 instances and print probabilities
 for example in data[:10]:
     print "(%-10s)  " % (example.getclass()),
-    for classifier in classifiers:
-        p = classifier(example, Orange.core.GetProbabilities)
+    for c in classifiers:
+        p = apply(c, [example, orange.GetProbabilities])
         print "%5.3f        " % (p[0]),
     print

docs/tutorial/rst/code/lenses.py

 # Classes:     ExampleTable
 # Referenced:  load_data.htm
 
-import Orange
-data = Orange.core.ExampleTable("lenses")
+import orange
+data = orange.ExampleTable("lenses")
 print "Attributes:",
 for i in data.domain.attributes:
     print i.name,

docs/tutorial/rst/code/report_missing.py

 # Uses:        adult_sample.tab
 # Referenced:  basic_exploration.htm
 
-import Orange
-data = Orange.core.ExampleTable("adult_sample.tab")
+import orange
+data = orange.ExampleTable("adult_sample.tab")
 
 natt = len(data.domain.attributes)
 missing = [0.] * natt

docs/tutorial/rst/code/sample_adult.py

 # Classes:     ExampleTable, MakeRandomIndices2
 # Referenced:  basic_exploration.htm
 
-import Orange
-data = Orange.core.ExampleTable("adult_sample.tab")
-selection = Orange.core.MakeRandomIndices2(data, 0.03)
+import orange
+data = orange.ExampleTable("adult_sample.tab")
+selection = orange.MakeRandomIndices2(data, 0.03)
 sample = data.select(selection, 0)
 sample.save("adult_sample_sampled.tab")

docs/tutorial/rst/code/tree.py

 # Classes:     orngTree.TreeLearner
 # Referenced:  c_otherclass.htm
 
-import Orange
-from Orange.orng import orngTree
+import orange, orngTree
+data = orange.ExampleTable("voting")
 
-data = Orange.core.ExampleTable("voting")
-
-tree = orngTree.TreeLearner(data, same_majority_pruning=1, m_prunning=2)
+tree = orngTree.TreeLearner(data, sameMajorityPruning=1, mForPruning=2)
 print "Possible classes:", data.domain.classVar.values
 print "Probabilities for democrats:"
 for i in range(5):
-    p = tree(data[i], Orange.core.GetProbabilities)
+    p = tree(data[i], orange.GetProbabilities)
     print "%d: %5.3f (originally %s)" % (i+1, p[1], data[i].getclass())
 
 print

docs/tutorial/rst/load-data.rst

 Python. In the interactive Python shell, import Orange and the data
 file:
 
->>> import Orange
->>> data = Orange.core.ExampleTable("lenses")
+>>> import orange
+>>> data = orange.ExampleTable("lenses")
 >>>
 
 This creates an object called ``data`` that holds your data set and
 reads the lenses data, prints out the names of the attributes and the class, and
 lists the first 5 data instances (:download:`lenses.py <code/lenses.py>`)::
 
-   import Orange
-   data = Orange.core.ExampleTable("lenses")
+   import orange
+   data = orange.ExampleTable("lenses")
    print "Attributes:",
    for i in data.domain.attributes:
        print i.name,
 :download:`car.names <code/car.names>` and run the following code::
 
    > python
-   >>> car_data = Orange.core.ExampleTable("car")
+   >>> car_data = orange.ExampleTable("car")
    >>> print car_data.domain.attributes
    <buying, maint, doors, persons, lugboot, safety>
    >>>
 spreadsheet, you may now store your C4.5 data file in Orange's native
 (.tab) format:
 
->>> Orange.core.saveTabDelimited ("car.tab", car_data)
+>>> orange.saveTabDelimited ("car.tab", car_data)
 >>>
 
 Similarly, saving to C4.5 format is possible through ``orange.saveC45``.
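
By analogy with ``orange.saveTabDelimited`` above, the call presumably takes a file name and the data table; treat the exact signature and the files it writes as an assumption rather than documented behaviour:

>>> orange.saveC45("car", car_data)   # assumption: writes the C4.5 .names/.data pair
>>>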
 located. You may either need to specify the absolute path of your data
 files, as in (type your commands in the Interactive Window):
 
->>> car_data = Orange.core.ExampleTable("c:/orange/car")
+>>> car_data = orange.ExampleTable("c:/orange/car")
 >>>
 
 or set a working directory through Python's os library:
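
A minimal sketch of that second option, assuming the data files live in ``c:/orange`` as in the example above (``os.chdir`` is standard Python; only the path is an assumption):

>>> import os
>>> os.chdir("c:/orange")
>>>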

docs/tutorial/rst/start.rst

 type import Orange; the brackets in the following denote the shell's
 prompt):
 
->>> import Orange
+>>> import orange
 >>> 
 
 If this produces no error or warning, Orange and Python are properly