<html>

<head>
<title>obiGO: Gene Ontology Handling Library</title>
<link rel=stylesheet href="style.css" type="text/css">
<link rel=stylesheet href="style-print.css" type="text/css" media=print>
</head>

<body>
<h1>obiGO: Gene Ontology Handling Library</h1>
<index name="modules/gene ontology GO">
<p>obiGO is a library for handling gene ontology (GO) databases. <a href="http://www.geneontology.org/GO.doc.shtml">(More about GO).</a></p>

<p class=section>Attributes</p>
<dl class=attributes>
	<dt>evidenceTypes</dt>
	<dd>A dictionary with all evidence codes as keys and their short descriptions as values.</dd>
</dl>
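<p>As a rough illustration of how such a dictionary can be used, the sketch below looks up descriptions for a list of evidence codes. The dictionary is a small hand-written excerpt and <code>describe_codes</code> is a hypothetical helper; neither is taken from obiGO itself.</p>

```python
# Hand-written excerpt of GO evidence codes (an assumption for illustration;
# obiGO.evidenceTypes holds the library's own mapping).
evidence_types = {
    "IDA": "Inferred from Direct Assay",
    "IMP": "Inferred from Mutant Phenotype",
    "IEA": "Inferred from Electronic Annotation",
    "TAS": "Traceable Author Statement",
}


def describe_codes(codes, table=evidence_types):
    """Return (code, description) pairs for the codes found in the table."""
    return [(code, table[code]) for code in codes if code in table]


# Unknown codes such as "XYZ" are silently skipped.
known = describe_codes(["IDA", "IEA", "XYZ"])
```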

<h2>Ontology</h2>
<p>Ontology is the main class representing a gene ontology.</p>
<p class=section>Attributes</p>
<dl class=attributes>
	<dt>terms</dt>
	<dd>A dictionary mapping term ids to instances of Term </dd>
</dl>

<p class=section>Methods</p>
<dl class=attributes>
	<dt>__init__(file=None, progressCallback=None)</dt>
	<dd>Initialize the ontology from <code>file</code>. If <code>file</code> is not given, try to load the ontology from default_database_path. The optional <code>progressCallback</code> will be called with a single argument to report on the progress.</dd>
	<dt>Load(progressCallback=None)</dt>
	<dd>A class method that tries to load the ontology file from default_database_path. It looks for a filename starting with 'gene_ontology'. The optional <code>progressCallback</code> will be called with a single argument to report on the progress.</dd>
	<dt>ParseFile(file, progressCallback=None)</dt>
	<dd>Parse the <code>file</code>. <code>file</code> can be a file name string or an open filelike object. The optional <code>progressCallback</code> will be called with a single argument to report on the progress.</dd>
	<dt>ExtractSuperGraph(terms)</dt>
	<dd>Return ids of all super terms of <code>terms</code> up to the most general one.</dd>
	<dt>ExtractSubGraph(terms)</dt>
	<dd>Return all sub terms of <code>terms</code>.</dd>
	<dt>GetTermDepth(term)</dt>
	<dd>Return the minimum depth of a term (length of the shortest path to this term from the top level term).</dd>
	<dt>GetDefinedSlimsSubsets()</dt>
	<dd>Return a list of defined subsets.</dd>
	<dt>SetSlimsSubset(subset)</dt>
	<dd>Set the slim terms subset to <code>subset</code>. If <code>subset</code> is a string, it must equal one of the defined subsets (subsetdef entries).</dd>
	<dt>GetSlimTerms(termId)</dt>
	<dd>Return a list of slim terms for <code>termId</code>.</dd>
	<dt>DownloadOntology(file, progressCallback=None)</dt>
	<dd>A static method that downloads the ontology from the GO website and saves it in <code>file</code>.</dd>
	<dt>__getitem__(termId)</dt>
	<dd>Return a Term object with <code>termId</code> as id or alt_id.</dd>
	<dt>__len__()</dt>
	<dd>Return the number of terms in the ontology.</dd>
	<dt>__iter__()</dt>
	<dd>Iterate over all term ids.</dd>
	<dt>__contains__(id)</dt>
	<dd>Check if a term with <code>id</code> is in the ontology (also checks alt_ids).</dd>
</dl>
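<p>The graph methods above can be pictured as traversals over the <code>is_a</code> links between terms. The following is a minimal sketch on a toy parent map with made-up ids, not obiGO's implementation. The function names are hypothetical, and including the input terms in the result is an assumption.</p>

```python
from collections import deque

# Toy ontology: each term id maps to the ids of its direct parents (is_a).
# The ids are invented for illustration.
IS_A = {
    "T:root": [],
    "T:a": ["T:root"],
    "T:b": ["T:root"],
    "T:c": ["T:a", "T:b"],
    "T:d": ["T:c"],
}


def extract_super_graph(terms, is_a=IS_A):
    """Return the given terms together with all of their ancestors."""
    visited = set()
    queue = deque(terms)
    while queue:
        term = queue.popleft()
        if term not in visited:
            visited.add(term)
            queue.extend(is_a[term])
    return visited


def term_depth(term, is_a=IS_A):
    """Length of the shortest is_a path from term up to a parentless term."""
    queue = deque([(term, 0)])
    seen = set([term])
    while queue:
        current, depth = queue.popleft()
        if not is_a[current]:
            return depth
        for parent in is_a[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return None
```

<p>Breadth-first search is used for the depth so that the first parentless term reached is guaranteed to lie on a shortest path.</p>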

<h2>Term</h2>
<p>Term is a class that represents a single term in the ontology.</p>
<p class=section>Attributes</p>
<dl class=attributes>
	<dt>id</dt>
	<dd>The term id</dd>
	<dt>name</dt>
	<dd>The term name</dd>
	<dt>namespace</dt>
	<dd>The namespace of the term</dd>
	<dt>def_</dt>
	<dd>The term def entry (note the trailing underscore, used to avoid a conflict with the Python keyword <code>def</code>).</dd>
	<dt>is_a</dt>
	<dd>List of term ids this term is a subterm of.</dd>
	<dt>related</dt>
	<dd>List of (relType, termId) tuples with relType specifying the relationship type with termId</dd>
</dl>
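<p>To show how these attributes relate to an ontology file, here is a minimal sketch that fills the same attribute names from one OBO-style <code>[Term]</code> stanza. <code>SketchTerm</code> is a hypothetical stand-in, the stanza and its ids are made up, and the parsing is deliberately simplified; obiGO's own parser may differ.</p>

```python
class SketchTerm(object):
    """Hypothetical stand-in for Term, built from one OBO [Term] stanza."""

    def __init__(self, stanza):
        self.id = self.name = self.namespace = self.def_ = None
        self.is_a = []      # ids of direct parent terms
        self.related = []   # (relType, termId) tuples
        for line in stanza.splitlines():
            line = line.strip()
            if ": " not in line:
                continue  # skips blank lines and the "[Term]" header
            tag, value = line.split(": ", 1)
            if tag in ("is_a", "relationship"):
                # Drop the trailing "! human readable name" comment.
                value = value.split(" ! ", 1)[0].strip()
            if tag == "is_a":
                self.is_a.append(value)
            elif tag == "relationship":
                rel_type, term_id = value.split(None, 1)
                self.related.append((rel_type, term_id))
            elif tag == "def":
                self.def_ = value
            elif tag in ("id", "name", "namespace"):
                setattr(self, tag, value)


# A made-up stanza in OBO layout; the ids and texts are not real GO entries.
STANZA = """[Term]
id: EX:0000002
name: example process
namespace: biological_process
def: "An invented definition." [EX:ref]
is_a: EX:0000001 ! example parent
relationship: part_of EX:0000003 ! example whole
"""

term = SketchTerm(STANZA)
```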
<h2>Annotations</h2>
<p>An Annotations object holds the gene annotations read from an association file.</p>
<p class=section>Attributes</p>
<dl class=attributes>
	<dt>geneAnnotations</dt>
	<dd>A dictionary mapping a gene name (DB_Object_Symbol) to a set of all annotations of that gene</dd>
	<dt>termAnnotations</dt>
	<dd>A dictionary mapping a GO term id to a set of all annotations to that term</dd>
	<dt>geneNames</dt>
	<dd>A set of all gene names (all entries from DB_Object_Symbol)</dd>
	<dt>geneNamesDict</dt>
	<dd>A dictionary mapping each unique identifier from DB_Object_ID, DB_Object_Symbol and DB_Object_Synonym to a list of all equivalent names</dd>
	<dt>aliasMapper</dt>
	<dd>A dictionary mapping each unique identifier from DB_Object_ID, DB_Object_Symbol and DB_Object_Synonym to a DB_Object_Symbol equivalent</dd>
	<dt>annotations</dt>
	<dd>A list of all AnnotationRecord instances</dd>
</dl>
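<p>The relationship between these dictionaries can be sketched as follows. This is an illustration under assumptions, not obiGO's code: <code>Record</code> is a hypothetical record type whose field names follow the annotation file columns named above, <code>build_indexes</code> is a hypothetical helper, and the sample records are invented.</p>

```python
from collections import defaultdict, namedtuple

# Hypothetical record type; only a few annotation file columns are kept.
Record = namedtuple(
    "Record",
    "DB_Object_ID DB_Object_Symbol DB_Object_Synonym GO_ID Evidence_code")


def build_indexes(records):
    """Build gene -> annotations, term -> annotations and alias -> symbol."""
    gene_annotations = defaultdict(set)
    term_annotations = defaultdict(set)
    alias_mapper = {}
    for rec in records:
        gene_annotations[rec.DB_Object_Symbol].add(rec)
        term_annotations[rec.GO_ID].add(rec)
        # Synonyms are assumed to be separated by "|" in a single field.
        names = [rec.DB_Object_ID, rec.DB_Object_Symbol]
        names.extend(s for s in rec.DB_Object_Synonym.split("|") if s)
        for name in names:
            alias_mapper[name] = rec.DB_Object_Symbol
    return gene_annotations, term_annotations, alias_mapper


# Invented sample data.
records = [
    Record("DB0001", "GENE1", "ALIAS1|ALIAS1B", "T:1", "IDA"),
    Record("DB0001", "GENE1", "ALIAS1|ALIAS1B", "T:2", "IEA"),
    Record("DB0002", "GENE2", "", "T:1", "IMP"),
]
gene_annotations, term_annotations, alias_mapper = build_indexes(records)
```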
<p class=section>Methods</p>
<dl class=attributes>
	<dt>__init__(file=None, ontology=None, genematcher=None, progressCallback=None)</dt>
	<dd>Initialize the annotations from <code>file</code> by calling <code>ParseFile</code> on it. If <code>file</code> does not exist, assume it is the name of the organism to be loaded from default_database_path. The <code>ontology</code> argument, if present, must be an instance of the Ontology class. <code>genematcher</code> should be an instance of obiGene.Matcher and defaults to obiGene.GMGO. The optional <code>progressCallback</code> will be called with a single argument to report on the progress.</dd>
	<dt>Load(org, ontology=None, progressCallback=None)</dt>
	<dd>A class method that tries to load the association file for the given organism from default_database_path. It tries to match <code>org</code> with GO organism codes and, if that fails, searches for <code>org</code> in NCBI Taxonomy using the obiTaxonomy module.</dd>
	<dt>ParseFile(file, progressCallback=None)</dt>
	<dd>Parse the <code>file</code>. <code>file</code> can be a file name string or an open filelike object. The optional <code>progressCallback</code> will be called with a single argument to report on the progress.</dd>
	<dt>GetAllAnnotations(term)</dt>
	<dd>Return all annotations made to the term whose id equals <code>term</code>.</dd>
	<dt>GetAllGenes(id, evidenceCodes=None)</dt>
	<dd>Return a list of genes annotated with the specified evidence codes to the term <code>id</code> and all its subterms.</dd>
	<dt>GetEnrichedTerms(genes, reference=None, evidenceCodes=None, slimsOnly=False, aspect="P", prob=obiProb.Binomial(), progressCallback=None)</dt>
	<dd>Return a dictionary of enriched terms, mapping term ids to tuples of (list_of_genes, p_value, reference_count). P-values are FDR adjusted if <code>useFDR</code> is True (default).</dd>
	<dt>GetAnnotatedTerms(genes, directAnnotationOnly=False, evidenceCodes=None, progressCallback=None)</dt>
	<dd>Return all terms that are annotated by genes with evidenceCodes.</dd>
	<dt>DownloadAnnotations(org, file, progressCallback=None)</dt>
	<dd>A static method that downloads the annotation file for organism <code>org</code> to <code>file</code></dd>
	<dt>__contains__(annotation)</dt>
	<dd>Check if <code>annotation</code> is in this Annotations object.</dd>
	<dt>__iter__()</dt>
	<dd>Iterate over all annotations in this object.</dd>
	<dt>__len__()</dt>
	<dd>Return the number of annotations in this object.</dd>
</dl>
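<p>To make the shape of the <code>GetEnrichedTerms</code> result more concrete, here is a self-contained sketch of one common way to compute term enrichment: a binomial tail probability per term, followed by a Benjamini-Hochberg FDR adjustment. All function names and the toy data are hypothetical, and the exact statistics used by obiProb.Binomial and by obiGO's FDR step may differ.</p>

```python
from math import factorial


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    def comb(total, i):
        return factorial(total) // (factorial(i) * factorial(total - i))
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


def fdr_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping the values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enriched_terms(genes, reference, term_genes):
    """Map term id -> (list_of_genes, adjusted p_value, reference_count)."""
    genes, reference = set(genes), set(reference)
    rows = []
    for term in sorted(term_genes):
        hits = sorted(genes & term_genes[term])
        ref_count = len(reference & term_genes[term])
        if hits:
            # Chance of a reference gene being annotated to this term.
            rate = ref_count / float(len(reference))
            rows.append((term, hits, binomial_tail(len(hits), len(genes), rate), ref_count))
    adjusted = fdr_adjust([row[2] for row in rows])
    return dict((term, (hits, p, ref_count))
                for (term, hits, _, ref_count), p in zip(rows, adjusted))


# Invented toy data: eight reference genes and two terms.
reference = ["g%d" % i for i in range(1, 9)]
term_genes = {"T:1": set(["g1", "g2"]), "T:2": set(["g1", "g3", "g4", "g5"])}
result = enriched_terms(["g1", "g2"], reference, term_genes)
```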

<h2>Examples</h2>
<p>Searching the annotations (part of <a href="obiGO-gene-annotations.py">obiGO-gene-annotations.py</a>)</p>
<xmp class=code>import obiGO
ontology = obiGO.Ontology.Load()
# Print names and definitions of all terms with "apoptosis" in the name
for term in ontology.terms.values():
    if "apoptosis" in term.name.lower():
        print term.name, term.id
        print term.def_

annotations = obiGO.Annotations.Load("sgd", ontology=ontology)
annotations.GetEnrichedTerms(["YGR270W", "YIL075C", "YDL007W"])

# Map the identifier to the gene name used as a key in geneAnnotations
gene = annotations.aliasMapper["YIL075C"]
print gene, "(YIL075C) directly annotated to the following terms:"
for a in annotations.geneAnnotations[gene]:
    print ontology[a.GO_ID].name, "with evidence code", a.Evidence_code

# Get all genes annotated to the same terms as YIL075C
ids = set([a.GO_ID for a in annotations.geneAnnotations[gene]])
for GOID in ids:
    ants = annotations.GetAllAnnotations(GOID)
    genes = set([a.geneName for a in ants])
    # Look the term up by GOID, not by a leftover loop variable
    print ", ".join(genes), "annotated to", GOID, ontology[GOID].name
</xmp>

<p>Term enrichment (part of <a href="obiGO-enrichment.py">obiGO-enrichment.py</a>)</p>
<xmp class=code>res = annotations.GetEnrichedTerms(["YGR270W", "YIL075C", "YDL007W"])
print "Enriched terms:"
for GOId, (genes, p_value, ref) in res.items():
    if p_value < 0.05:
        print ontology[GOId].name, "with p-value: %.4f" %p_value, ", ".join(genes)

# And again for slims
ontology.SetSlimsSubset("goslim_yeast")

res = annotations.GetEnrichedTerms(["YGR270W", "YIL075C", "YDL007W"], slimsOnly=True)
print "Enriched slim terms:"
for GOId, (genes, p_value, _) in res.items():
    if p_value < 0.05:
        print ontology[GOId].name, "with p-value: %.4f" %p_value, ", ".join(genes)
</xmp>

<p>Mapping to slim terms (part of <a href="obiGO-slim-mapping.py">obiGO-slim-mapping.py</a>)</p>
<xmp class=code>ontology.SetSlimsSubset("goslim_yeast")
terms = annotations.GetAnnotatedTerms(["YGR270W", "YIL075C", "YDL007W"], directAnnotationOnly=True)
slims = set()
for term in terms:
    print term
    slims.update(ontology.GetSlimTerms(term))

print "Genes: YGR270W, YIL075C and YDL007W map to the following slim terms:"
for term in slims:
    print term, ontology[term].name
</xmp>
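<p>One plausible way to picture the slim mapping used above is a walk up the <code>is_a</code> links that stops at the first slim term on each branch. The sketch below uses a toy parent map with made-up ids. <code>slim_terms</code> is a hypothetical function, and the stopping rule is an assumption rather than a description of <code>GetSlimTerms</code>.</p>

```python
def slim_terms(term, is_a, slims):
    """Return the nearest slim ancestors of term (or term itself if slim)."""
    found = set()
    visited = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        if current in slims:
            found.add(current)  # stop climbing on this branch
        else:
            stack.extend(is_a.get(current, []))
    return found


# Toy data: parent links and a slim subset, both invented.
is_a = {
    "T:root": [],
    "T:a": ["T:root"],
    "T:b": ["T:root"],
    "T:c": ["T:a", "T:b"],
    "T:d": ["T:c"],
}
slims = set(["T:a", "T:root"])
mapped = slim_terms("T:d", is_a, slims)
```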