Commits

Marko Toplak committed 3f1f69f

Updates to GSEA documentation.

Comments (0)

Files changed (9)

docs/reference-html/obiGsea.htm

-<html>
-
-<head>
-<title>obiGsea: Gene Set Enrichment Analysis</title>
-<link rel=stylesheet href="style.css" type="text/css">
-<link rel=stylesheet href="style-print.css" type="text/css" media=print>
-</head>
-
-<body>
-<h1>obiGsea: Gene Set Enrichment Analysis</h1>
-<index name="modules/gene set enrichment analysis">
-
-<p>Gene Set Enrichment Analysis (GSEA) is a method which tries to identify groups of genes that are
-regulated together. It is implemented in module obiGsea, which is included in Orange for Functional Genomics package.
-To use obiGsea you need to install Orange for Functional Genomics.</p>
-
-<p>GSEA takes gene
-expression data for multiple samples with their phenotypes and computes
-gene set enrichment for given gene sets. To use it run
-<code>runGSEA</code> method with the following arguments:</p>
-
-<h2>runGSEA</h2>
-<index name="gsea">
-<index name="GSEA">
-<index name="gene+set">
-
-<p class=section>Arguments</p>
-<dl class=arguments>
-
-  <dt>data</dt>
-  <dd>An <A href="ExampleTable.htm"><CODE>ExampleTable</CODE></A> with gene expression data.</dd>
-
-  <dt>matcher</dt>
-  <dd>An initialized gene matcher.</dd> 
-
-  <dt>classValues</dt>
-  <dd>A pair of class values describing phenotypes that are chosen as two distinct phenotypes on which gene correlations
-  are computed. Only examples with one of chosen class values are considered for analysis. If not specified, all in <code>classVar</code> attribute descriptor are used.</dd>
-
-  <dt>geneSets</dt>
-  <dd>A python dictionary of gene sets, where key is a gene set name which points to a list of gene aliases for genes
-  in the gene set. Default: gene sets in your collection directory.</dd>
-
-  <dt>n</dt>
-  <dd>GSEA computes gene set significance by permutation tests. This parameter specifies the number
-  of permutations. Default: 100.</dd>
-
-  <dt>permutation</dt>
-  <dd>Type of permutation. If <code>"class"</code>, class values (phenotypes) are permuted. This is the default. 
-  However, if number of samples is small (less than 10), it is advisable to use <code>"gene"</code> permutations even
-  though they ignore gene-gene interactions.</dd>
-
-  <dt>minSize, maxSize</dt>
-  <dd>Minimum and maximum number of genes from gene set also present in the data set for that gene set to be analyzed.
-  Defaults: 3 and 1000.</dd>
-
-  <dt>minPart</dt>
-  <dd>Minimum fraction of genes from the gene set also present in the data set for that gene set to be analyzed. Default: 0.1.</dd> 
-
-  <dt>atLeast</dt>
-  <dd>Number of valid attribute values for each class value needed for a certain attribute (gene) to be considered in analysis. Attributes failing to meet this criterium are ignored. Default: 3.</dd>
-
-  <dt>phenVar</dt>
-  <dd>Explains marking of different phenotypes. By default the <CODE>data.domain.classVar</CODE> is used to distinguish them, if it exists. If the argument is set to <code>False</code>, GSEA presumes that the genes are already ranked. If the value of <code>phenVar</code> is a string, then an entry from <code>attributes</code> dictionary of individual attributes specifies the phenotype. In latter case, each attribute represents one sample.</dd>
-
-  <dt>geneVar</dt>
-  <dd>Specifies locations of gene names. If <code>True</code>, gene names are attribute names. If it is a string, then they are taken from the corresponding entry from <code>attributes</code> dictionary of individual attributes. If each attribute specifies a sample, then the user should pass the meta variable containing the gene names. Defaults to attribute names if each example specifies one sample.</dd>
-
-</dl>
-
-<!-- es, nes, pval, fdr, os, ts, genes -->
-
-Method <code>runGSEA</code> returns a dictionary where key is a gene set label and its value a dictionary
-of:
-<ul>
-<li> enrichment score (key <code>es</code>),
-<li> normalized enrichment score (key <code>nes</code>),
-<li> P-value (key <code>p</code>),
-<li> FDR (key <code>fdr</code>),
-<li> whole gene set size (key <code>size</code>),
-<li> number of matched genes from the gene set (key <code>matched_size</code>),
-<li> gene names from the data set for matched genes from the gene set (key <code>genes</code>).
-</ul>
-
-<p>A note on gene name matching. Gene name matching is performed with the help of KEGG database. 
-A gene from a gene set is tried to be matched with a gene from the data set. If an alias for a gene from the
-gene set is the same as an alias for a gene in the data set, then those aliases are matched. If not,
-it is checked if gene alias from the gene set and gene alias from the data set are both gene
-aliases of the same gene according to KEGG database for a given organism. If they are, we have a match.</p>
-
-<h3>Example 1</h3>
-
-<p>We present a simple usage example. Data used here are not gene expression
-data. For the method to work we had to specify our own sets of attributes that seem to "belong together".</p>
-
-<p class="header"><a href="gsea1.py">gsea1.py</a> (uses <a href=
-"http://www.ailab.si/orange/doc/datasets/iris.tab">iris.tab</a>)</p>
-
-<xmp class=code>import orange, obiGsea, obiGene
-
-data = orange.ExampleTable("iris")
-
-gen1 = dict([
-    ("sepal",["sepal length", "sepal width"]),
-    ("petal",["petal length", "petal width", "petal color"])
-    ])
-
-res = obiGsea.runGSEA(data, matcher=obiGene.matcher([]), minSize=2, geneSets=gen1)
-print "%5s  %6s %6s %s" % ("LABEL", "NES", "P-VAL", "GENES")
-for name,resu in res.items():
-    print "%5s  %6.3f %6.3f %s" % (name, resu["nes"], resu["p"], str(resu["genes"]))
-</xmp>
-
-<p>Corresponding output:</p>
-
-<xmp class=code>LABEL     NES  P-VAL GENES
-petal  -1.117  0.771 ['petal length', 'petal width']
-sepal   1.087  0.630 ['sepal length', 'sepal width']
-</xmp>
-
-<p>We can see that a "gene" labeled "petal color" was not used, because it couldn't be matched to any attribute in the data set.</p>
-
-<h3>Example 2: using correlation data directly</h3>
-
-<p>
-GSEA can also directly use correlation data between individual genes and a phenotype. Two kinds of input data trigger this functionality: (1) input data with only one example (attribute names are gene names) or (2) there is only one continuous attribute in the given data set's domain. In latter case gene correlations are read from the continuous attribute and the gene names are taken from the first <code>StringVariable</code> in the domain. We have choosen the latter representation for this example, since this is the formatting returned by <code>obiDicty</code> module. Only results for ten pathways are listed.
-</p>
-
-<p class="header"><a href="gsea2.py">gsea2.py</a></p>
-
-<xmp class=code>import obiDicty
-import obiGeneSets
-import obiGsea
-import orange
-import obiGene
-
-dbc = obiDicty.DatabaseConnection()
-data = dbc.getData(sample='pkaC-', time="8")[0] #get first chip
-
-print "First 10 examples"
-for ex in data[:10]:
-    print ex
-
-matcher=obiGene.matcher([[obiGene.GMKEGG("ddi"),obiGene.GMDicty()]])
-
-genesets =  obiGeneSets.collections([":kegg:ddi"])
-res = obiGsea.runGSEA(data, matcher=matcher, minPart=0.05, geneSets=genesets,
-    permutation="gene")
-
-print "GSEA results"
-print "%-40s %6s %6s %6s %7s" % ("LABEL", "NES", "P-VAL", "SIZE", "MATCHED")
-for name,resu in res.items()[:10]:
-    print "%-40s %6.3f %6.3f %6d %7d" % (name[:30], resu["nes"], resu["p"],
-        resu["size"], resu["matched_size"])
-</xmp>
-
-<p>Corresponding output:</p>
-
-<xmp class=code>First 10 examples
-[-0.055], {"DDB":'#*'}
-[0.003], {"DDB":'#17S_ribosomal_gene'}
-[-0.168], {"DDB":'#B-elongation_factor'}
-[-0.134], {"DDB":'DDB_G0272032'}
-[0.330], {"DDB":'DDB_G0291982'}
-[0.229], {"DDB":'DDB_G0268264'}
-[0.108], {"DDB":'DDB_G0293286'}
-[-0.130], {"DDB":'DDB_G0284295'}
-[0.189], {"DDB":'DDB_G0285435'}
-[0.082], {"DDB":'DDB_G0287595'}
-GSEA results
-LABEL                                       NES  P-VAL   SIZE MATCHED
-[KEGG] Alanine and aspartate m            1.367  0.146     21      12
-[KEGG] Cysteine metabolism                1.328  0.098      8       3
-[KEGG] Folate biosynthesis                0.703  0.862     12       4
-[KEGG] 3-Chloroacrylic acid de           -0.734  0.750      4       3
-[KEGG] Porphyrin and chlorophy           -0.990  0.517     14       5
-[KEGG] Drug metabolism - other            0.951  0.545     15      12
-[KEGG] Sulfur metabolism                 -0.538  0.980      6       3
-[KEGG] N-Glycan biosynthesis              1.384  0.111     22       6
-[KEGG] Terpenoid biosynthesis            -1.316  0.125      7       3
-[KEGG] Aminoacyl-tRNA biosynth           -1.105  0.292     42      12
-</xmp>
-
-<h3>Example 3: attributes as samples (from Gene Expression Omnibus)</h3>
-
-<p>
-Data sets where attributes represents samples can also be used, if <code>phenVar</code> and <code>geneVar</code> parameters are specified. 
-Only results for top ten pathways are listed. To obtain valid results, we would have to increase the number of data set permutations (parameter <code>n</code>).
-</p>
-
-<p class="header"><a href="gsea3.py">gsea3.py</a></p>
-
-<xmp class=code>import obiGeneSets
-import obiGsea
-import orange
-import obiGene
-import obiGEO
-
-import obiGEO
-gds = obiGEO.GDS("GDS10")
-data = gds.getdata() 
-
-print "Possible phenotype descriptors:"
-print map(lambda x: x[0], obiGsea.allgroups(data).items())
-
-matcher=obiGene.matcher([obiGene.GMKEGG("9606")])
-
-phenVar = "tissue"
-geneVar = "gene" #use gene meta variable for gene names
-
-genesets =  obiGeneSets.collections([":kegg:hsa"])
-res = obiGsea.runGSEA(data, matcher=matcher, minPart=0.05, geneSets=genesets, 
-    permutation="class", n=10, phenVar=phenVar, geneVar=geneVar)
-
-print
-print "GSEA results (choosen descriptor: tissue)"
-print "%-40s %6s %6s %6s %7s" % ("LABEL", "NES", "FDR", "SIZE", "MATCHED") 
-for name,resu in sorted(res.items(), key=lambda x: x[1]["fdr"])[:10]: 
-    print "%-40s %6.3f %6.3f %6d %7d" % (name[:30], resu["nes"], resu["fdr"], 
-        resu["size"], resu["matched_size"]) 
-</xmp>
-
-<p>Corresponding output:</p>
-
-<xmp class=code>Possible phenotype descriptors:
-['disease state', 'strain', 'tissue']
-
-GSEA results (choosen descriptor: tissue)
-LABEL                                       NES    FDR   SIZE MATCHED
-[KEGG] DNA replication                    1.783  0.000     36      35
-[KEGG] Cell cycle                         1.832  0.000    119      96
-[KEGG] T cell receptor signali            2.006  0.000     94      70
-[KEGG] Non-homologous end-join            1.795  0.000     14      13
-[KEGG] Glycine, serine and thr           -1.828  0.022     42      33
-[KEGG] Porphyrin and chlorophy           -1.829  0.028     41      24
-[KEGG] Thyroid cancer                     1.666  0.036     29      24
-[KEGG] Adipocytokine signaling           -1.880  0.037     67      53
-[KEGG] Ubiquitin mediated prot            1.597  0.051    137     112
-[KEGG] Glycosaminoglycan degra           -1.899  0.055     18      16
-</xmp>
-
-
-
-<HR>
-<H2>References</H2>
-
-<P>Subramanian, Aravind   and Tamayo, Pablo   and Mootha, Vamsi  K.  and Mukherjee, Sayan   and Ebert, Benjamin  L.  and Gillette, Michael  A.  and Paulovich, Amanda   and Pomeroy, Scott  L.  and Golub, Todd  R.  and Lander, Eric  S.  and Mesirov, Jill  P. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS, 2005.</P>
-
-</body>
-</html>
-

docs/rst/reference/code/gsea1.py

     ("petal", ["petal length", "petal width", "petal color"])
     ]))
 
-res = Orange.bio.gsea.runGSEA(data, matcher=Orange.bio.gene.matcher([]), minSize=2, geneSets=gen1)
+res = Orange.bio.gsea.run(data, gene_sets=gen1, matcher=Orange.bio.gene.matcher([]), min_size=2)
 print "%5s  %6s %6s %s" % ("LABEL", "NES", "P-VAL", "GENES")
 for gs,resu in res.items():
     print "%5s  %6.3f %6.3f %s" % (gs.id, resu["nes"], resu["p"], str(resu["genes"]))

docs/rst/reference/code/gsea2.py

 matcher = gene.matcher([[gene.GMKEGG("dicty"), gene.GMDicty()]])
 genesets =  geneset.collections((("KEGG",), "dicty"))
 
-res = gsea.runGSEA(data, matcher=matcher, minPart=0.05, 
-    geneSets=genesets, permutation="gene")
+res = gsea.direct(data, matcher=matcher, min_part=0.05, 
+    gene_sets=genesets)
 
 print "%-40s %6s %6s %6s %7s" % ("LABEL", "NES", "P-VAL", "SIZE", "MATCHED")
 for name,resu in sorted(res.items()[:10], key=lambda x: x[1]["p"]): 

docs/rst/reference/code/gsea3.py

-import Orange
-from Orange.bio import dicty, geneset, gsea, gene, geo
-
-gds = geo.GDS("GDS10")
-data = gds.getdata() 
-
-matcher = gene.matcher([gene.GMKEGG("Homo sapiens")])
-genesets = geneset.collections((("KEGG",), "Homo sapiens"))
-
-#the number of permutations (n) should be much higher
-res = gsea.runGSEA(data, matcher=matcher, minPart=0.05, 
-    geneSets=genesets, permutation="class", n=10, 
-    phenVar="tissue", geneVar="gene") 
-
-print
-print "GSEA results (descriptor: tissue)"
-print "%-40s %6s %6s %6s %7s" % ("LABEL", "NES", "FDR", "SIZE", "MATCHED") 
-for gs, resu in sorted(res.items(), key=lambda x: x[1]["fdr"])[:10]: 
-    print "%-40s %6.3f %6.3f %6d %7d" % (gs.name[:30],
-        resu["nes"], resu["fdr"], resu["size"], resu["matched_size"]) 

docs/rst/reference/code/gsea4.py

-import Orange
-from Orange.bio import dicty, geneset, gsea, gene, geo
-
-gds = geo.GDS("GDS10")
-data = gds.getdata(transpose=True) 
-
-matcher = gene.matcher([gene.GMKEGG("Homo sapiens")])
-genesets = geneset.collections((("KEGG",), "Homo sapiens"))
-
-#the number of permutations (n) should be much higher
-res = gsea.runGSEA(data, matcher=matcher, minPart=0.05, 
-    geneSets=genesets, permutation="class", n=10, 
-    phenVar=data.domain["tissue"], geneVar=True) 
-
-print
-print "GSEA results (descriptor: tissue)"
-print "%-40s %6s %6s %6s %7s" % ("LABEL", "NES", "FDR", "SIZE", "MATCHED") 
-for gs, resu in sorted(res.items(), key=lambda x: x[1]["fdr"])[:10]: 
-    print "%-40s %6.3f %6.3f %6d %7d" % (gs.name[:30],
-        resu["nes"], resu["fdr"], resu["size"], resu["matched_size"]) 

docs/rst/reference/code/gsea_genes.py

+import Orange
+from Orange.bio import dicty, geneset, gsea, gene, geo
+
+gds = geo.GDS("GDS10")
+data = gds.getdata() 
+
+matcher = gene.matcher([gene.GMKEGG("Homo sapiens")])
+genesets = geneset.collections((("KEGG",), "Homo sapiens"))
+
+#the number of permutations (n) should be much higher
+res = gsea.run(data, gene_sets=genesets, matcher=matcher, 
+    min_part=0.05, permutation="phenotype", n=10, 
+    phen_desc="tissue", gene_desc="gene") 
+
+print
+print "GSEA results (descriptor: tissue)"
+print "%-40s %6s %6s %6s %7s" % ("LABEL", "NES", "FDR", "SIZE", "MATCHED") 
+for gs, resu in sorted(res.items(), key=lambda x: x[1]["fdr"])[:10]: 
+    print "%-40s %6.3f %6.3f %6d %7d" % (gs.name[:30],
+        resu["nes"], resu["fdr"], resu["size"], resu["matched_size"]) 

docs/rst/reference/code/gsea_instances.py

+import Orange
+from Orange.bio import dicty, geneset, gsea, gene, geo
+
+gds = geo.GDS("GDS10")
+data = gds.getdata(transpose=True) 
+
+matcher = gene.matcher([gene.GMKEGG("Homo sapiens")])
+genesets = geneset.collections((("KEGG",), "Homo sapiens"))
+
+#the number of permutations (n) should be much higher
+res = gsea.run(data, gene_sets=genesets, matcher=matcher, 
+    min_part=0.05, permutation="phenotype", n=10, 
+    phen_desc=data.domain["tissue"], gene_desc=True) 
+
+print
+print "GSEA results (descriptor: tissue)"
+print "%-40s %6s %6s %6s %7s" % ("LABEL", "NES", "FDR", "SIZE", "MATCHED") 
+for gs, resu in sorted(res.items(), key=lambda x: x[1]["fdr"])[:10]: 
+    print "%-40s %6.3f %6.3f %6d %7d" % (gs.name[:30],
+        resu["nes"], resu["fdr"], resu["size"], resu["matched_size"]) 

docs/rst/reference/gsea.rst

 .. py:currentmodule:: Orange.bio.gsea
 
 
-Gene Set Enrichment Analysis (GSEA) aims to identify enriched gene sets
+Gene Set Enrichment Analysis (GSEA) [GSEA]_ aims to identify enriched gene sets
 given gene expression data for multiple samples with their phenotypes.
 
-.. autofunction:: Orange.bio.gsea.runGSEA
+.. autofunction:: Orange.bio.gsea.run
+
+.. autofunction:: Orange.bio.gsea.direct
 
 Examples: gene expression data
 ------------------------------
 
 With samples as instances (in rows):
 
-.. literalinclude:: code/gsea4.py
+.. literalinclude:: code/gsea_instances.py
 
 With samples as features (in columns):
 
-.. literalinclude:: code/gsea3.py
+.. literalinclude:: code/gsea_genes.py
 
 Both scripts output::
 
     RNA polymerase                            0.550  0.943     24       7
     Glycosylphosphatidylinositol(GPI)-a      -0.540  0.946     19       4
 
-
+.. [GSEA] Subramanian, Aravind et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS, 2005.

orangecontrib/bio/gsea.py

     return es,l
 
 def gseaE(data, subsets, rankingf=None, \
-        n=100, permutation="class", **kwargs):
+        n=100, permutation="class", callback=None):
     """
     Run GSEA algorithm on an example table.
 
         es = enrichmentScoreRanked(subset, lcor, ordered, rev2=rev2)[0]
         enrichmentScores.append(es)
 
-    runOptCallbacks(kwargs)
+    runOptCallbacks(callback)
 
     #print "PERMUTATION", permutation
 
         if permutation == "class":
             d2 = shuffleClass(data, 2000+i) #fixed permutation
             r2 = rankingf(d2)
-
         else:
             r2 = shuffleList(lcor, random.Random(2000+i))
 
             esn = enrichmentScoreRanked(subset, r2, ordered2, rev2=rev22)[0]
             enrichmentNulls[si].append(esn)
 
-        runOptCallbacks(kwargs)
+        runOptCallbacks(callback)
 
     return gseaSignificance(enrichmentScores, enrichmentNulls)
 
 
-def runOptCallbacks(rargs):
-    if "callback" in rargs:
+def runOptCallbacks(callback):
+    if callback is not None:
         try:
-            [ a() for a in rargs["callback"] ]
+            [ a() for a in callback ]
         except:
-            rargs["callback"]()
-            
+            callback()            
 
-def gseaR(rankings, subsets, n=100, **kwargs):
+def gseaR(rankings, subsets, n, callback=None):
     """
     """
-
-    if "permutation" in kwargs:
-        if kwargs["permutation"] == "class":
-            raise Exception("Only gene permutation possible")
-
     enrichmentScores = []
- 
     ordered = orderedPointersCorr(rankings)
     
     def rev(l):
         es = enrichmentScoreRanked(subset, rankings, ordered, rev2=rev2)[0]
         enrichmentScores.append(es)
     
-    runOptCallbacks(kwargs)
+    runOptCallbacks(callback)
 
     enrichmentNulls = [ [] for a in range(len(subsets)) ]
 
             esn = enrichmentScoreRanked(subset, r2, ordered2, rev2=rev22)[0]
             enrichmentNulls[si].append(esn)
 
-        runOptCallbacks(kwargs)
+        runOptCallbacks(callback)
 
     return gseaSignificance(enrichmentScores, enrichmentNulls)
 
                 self.namesToIndices[at.name].append(i)
         return reduce(lambda x,y:x+y, [ self.namesToIndices[gname] for gname in genes ], [])
 
-    def compute_gene_weights(self, gsweights, gsetsnum, nattributes):
-        """
-        Computes gene set weights for all specified weights.
-        Expects gene sets in form { name: [ num_attributes ] }
-        GSWeights are 
-        """
-        pass
-
     def to_gsetsnum(self, gsets):
         """
         Returns a dictionary of given  gene sets in gsetnums format.
         """
-        return dict( (gs,self.genesIndices(nth(self.genesets[gs],1))) for gs in gsets)
+        return dict( (gs, self.genesIndices(nth(self.genesets[gs],1))) for gs in gsets)
 
-    def compute(self, minSize=3, maxSize=1000, minPart=0.1, n=100, **kwargs):
+    def compute(self, minSize=3, maxSize=1000, minPart=0.1, n=100, callback=None, rankingf=None, permutation="class"):
 
         subsetsok = self.selectGenesets(minSize=minSize, maxSize=maxSize, minPart=minPart)
 
-        geneweights = None
-
         gsetsnum = self.to_gsetsnum(subsetsok.keys())
         gsetsnumit = gsetsnum.items() #to fix order
 
         if len(gsetsnum) == 0:
             return {} # quick return if no genesets
 
-        if len(self.gsweights) > 0:
-            #set geneset
-            geneweights = [1]*len(data.domain.attributes)
-
         if len(itOrFirst(self.data)) > 1:
-            gseal = gseaE(self.data, nth(gsetsnumit,1), n=n, geneweights=geneweights, **kwargs)
+            gseal = gseaE(self.data, nth(gsetsnumit,1), n=n, callback=callback, permutation=permutation, rankingf=rankingf)
         else:
             rankings = [ self.data[0][at].native() for at in self.data.domain.attributes ]
-            gseal = gseaR(rankings, nth(gsetsnumit,1), n=n, **kwargs)
+            gseal = gseaR(rankings, nth(gsetsnumit,1), n, callback=None)
 
         res = {}
 
 
         return res
 
+def direct(data, gene_sets, matcher, min_size=3, max_size=1000, min_part=0.1,
+    gene_desc=None, n=100, callback=None):
+    """ Gene Set Enrichment analysis for pre-computed correlations
+    between genes and phenotypes. 
+    
+    :param Orange.data.Table: Precomputed correlations as a data set 
+        with a single continuous variable or a single 
+        :obj:`~Orange.data.Instance`.
+
+    See :obj:`run` for other parameters.
+
+    :return: See :obj:`run`. 
+    """
+
+    assert len(data.domain.attributes) == 1 or len(data) == 1
+    return runGSEA(data, geneSets=gene_sets, matcher=matcher, minSize=min_size, 
+        maxSize=max_size, minPart=min_part, n=n, geneVar=gene_desc, callback=callback)
+
+def run(data, gene_sets, matcher, min_size=3, max_size=1000, min_part=0.1,
+    at_least=3, phenotypes=None, gene_desc=None, phen_desc=None, n=100, 
+    permutation="phenotype", callback=None, rankingf=None):
+    """ Run Gene Set Enrichment Analysis.
+
+    :param Orange.data.Table data: Gene expression data.  
+    :param Orange.bio.geneset.GeneSets gene_sets: Gene sets.  
+    :param Orange.bio.gene.Matcher matcher: Initialized gene matcher.
+    :param tuple phenotypes: A pair describing two distinct phenotypes.
+        Each element can also be a list of values. Only examples with
+        a chosen phenotypes are analysed. Default: values phenotypes
+        of ``phen_desc``.
+    :param n: Number of permutations for significance computation. Default: 100.
+    :param str permutation: Permutation type, "phenotype" (default) for 
+        phenotypes, "gene" for genes.
+    :param int min_size:
+    :param int max_size: Minimum and maximum allowed number of genes from
+        gene set also the data set. Defaults: 3 and 1000.
+    :param float min_part: Minimum fraction of genes from the gene set
+        also in the data set. Default: 0.1.
+    :param int at_least: Minimum number of valid gene values for each 
+        phenotype (the rest are ignored). Default: 3.
+    :param phen_desc: Location of data on phenotypes. By default the
+        ``data.domain.class_var`` is used if it exists. If string, the
+        corresponding entry from ``attributes`` dictionary of individual
+        features specifies the phenotype. In latter case, each attribute
+        represents one sample.
+    :param gene_desc: Locations of gene names. If True, gene names
+        are attribute names. If a string, that entry of the individual
+        features'``attributes`` dictionary is used. If each attribute
+        specifies a sample, then the user should pass the meta variable
+        containing the gene names. Defaults to attribute names if each
+        example specifies one sample.
+
+    :return: | a dictionary where key is a gene set and values are:
+        | { es: enrichment score, 
+        | nes: normalized enrichment score, 
+        | p: P-value, 
+        | fdr: FDR, 
+        | size: gene set size,
+        | matched_size: genes matched to the data, 
+        | genes: gene names from the data set }
+
+    """
+    assert len(data.domain.attributes) > 1 or len(data) > 1
+    assert permutation in ["phenotype", "gene"]
+    if permutation == "phenotype":
+        permutation = "class"
+    return runGSEA(data, geneSets=gene_sets, matcher=matcher, minSize=min_size, 
+        maxSize=max_size, minPart=min_part, n=n, permutation=permutation, 
+        geneVar=gene_desc, callback=callback, phenVar=phen_desc, 
+        classValues=phenotypes)
+
 def runGSEA(data, organism=None, classValues=None, geneSets=None, n=100, 
         permutation="class", minSize=3, maxSize=1000, minPart=0.1, atLeast=3, 
         matcher=None, geneVar=None, phenVar=None, caseSensitive=False, 
-        **kwargs):
-    """
-    Run Gene Set Enrichment Analysis.
-
-    :param str organism: The organism taxonomy id (or name). Needed 
-       to build default gene sets and gene matcher.
-    :param Orange.data.Table data: Gene expression data.
-    :param Orange.bio.gene.Matcher matcher: Initialized gene matcher. If not given, a gene matcher is built with the KEGG database.
-    :param tuple classValues: A pair of values describing two distinct phenotypes. Each element can also be a list of values.
-       are computed. Only examples with a chosen phenotypes are analysised. Default: use all phenotypes in the data.
-    :param Orange.bio.geneset.GeneSets geneSets: Gene sets. Default: ???.
-    :param n: Number of permutations for significance computation. Default: 100.
-    :param str permutation: Permutation type, "class" (default) for phenotypes, "gene" for genes. We recommend "gene" permutations for data sets with less than 10 samples even though they ignore gene-gene interactions.
-    :param int minSize:
-    :param int maxSize: Minimum and maximum allowed number of genes from gene set also the data set. Defaults: 3 and 1000.
-    :param float minPart: Minimum fraction of genes from the gene set also in the data set. Default: 0.1.
-    :param int atLeast: Minimum number of valid gene values for each phenotype (the rest are ignored). Default: 3.
-    :param phenVar: Location of data on phenotypes. By default the `data.domain.class_var` is used if it exists. If :obj:`phenVar` is False, the genes are already taken as ranked. If string, the corresponding entry from ``attributes`` dictionary of individual features specifies the phenotype. In latter case, each attribute represents one sample.
-    :param geneVar: Locations of gene names. If True, gene names are attribute names. If a string, that entry of the individual features'``attributes`` dictionary is used. If each attribute specifies a sample, then the user should pass the meta variable containing the gene names. Defaults to attribute names if each example specifies one sample.
-    
-    :return: a dictionary where key is a gene set label and 
-        its value a dictionary of 
-        (1) enrichment score (es), 
-        (2) normalized enrichment score (nes), 
-        (3) P-value (p), 
-        (4) FDR (fdr), 
-        (5) whole gene set size (size), 
-        (6) number of matched genes from the gene set (matched_size), 
-        (7) gene names of matched genes in the data set (genes).
-
-    """
+        rankingf=None, callback=None):
     gso = GSEA(data, organism=organism, matcher=matcher, 
         classValues=classValues, atLeast=atLeast, caseSensitive=caseSensitive,
         geneVar=geneVar, phenVar=phenVar)
     gso.addGenesets(geneSets)
     res1 = gso.compute(n=n, permutation=permutation, minSize=minSize,
-        maxSize=maxSize, minPart=minPart, **kwargs)
+        maxSize=maxSize, minPart=minPart, rankingf=rankingf,
+        callback=callback)
     return res1
 
 def etForAttribute(datal,a):