Source

orange-bioinformatics / docs / reference / obiChem.htm

<html>
<head>
<link rel=stylesheet href="style.css" type="text/css">
</head>
<body>

<h1>obiChem: A library for searching frequent molecular fragments</h1>
<index name="modules/molecular fragments">
<p>obi implements the following classes
<ul>
    <li>FragmentMiner   : The main class that does the search</li>
    <li>Fragment        : Representation of the fragment</li>
    <li>Fragmenter      : A class that is used to fragment an ExampleTable</li>
    <li>FragmentBasedLearner    : A learner wrapper class that first runs the molecular fragmentation on the data</li>
</ul>
</p>

<h2>FragmentMiner</h2>
<p>A class for finding frequent molecular fragments</p>
<p class=section>Attributes</p>
<dl class=attributes>
    <dt>active</dt>
    <dd>list of smiles codes of active molecules</dd>
    <dt>inactive</dt>
    <dd>list of smiles codes of inactive molecules</dd>
    <dt>minSupport</dt>
    <dd>minimum frequency in the active set of the fragments to search for</dd>
    <dt>maxSupport</dt>
    <dd>maximum frequency in the inactive set of the fragments to search for</dd>
    <dt>addWholeRings</dt>
    <dd>if True rings will be added as a whole rather then atom by atom</dd>
    <dt>canonicalPruning</dt>
    <dd>if True a cache of all cannonical codes of all fragments will be kept to avoid redundant search</dd>
    <dt>findClosed</dt>
    <dd>finds only fragments that are not sub-structures of any other fragment with the same support (default: True)</dd>
</dl>
<p class=section>Methods</p>
<dl class=methods>
    <dt>Search()</dt>
    <dd>Runs the fragment search algorithm and returns a list of found fragments</dd>
</dl>
<h3>Example</h3>
<XMP class=code>miner = FragmentMiner(active = ["NC(C)C(=O)O", "NC(CS)C(=O)O", "NC(CO)C(=O)O"], inactive = [], minSupport = 0.6)
for fragment in miner.Search():
    print fragment.ToSmiles() , "Support: %.3f" %fragment.Support()</XMP>

<h2>Fragment</h2>
<p>A class representing a molecular fragment</p>
<p class=section>Methods</p>
<dl class=methods>
    <dt>ToOBMol()</dt>
    <dd>Returns an openbabel.OBMol object representation</dd>
    <dt>ToSmiles()</dt>
    <dd>Returns a SMILES code representation</dd>
    <dt>ToCanonicalSmiles()</dt>
    <dd>Returns a canonical SMILES code representation</dd>
    <dt>Support()</dt>
    <dd>Returns the support of the fragment in the active set</dd>
    <dt>OcurrencesIn(smiles)</dt>
    <dd>Returns the number of times a fragment is containd in the molecule represented by the <code>smiles</code> code argument</dd>
    <dt>ContainedIn(smiles)</dt>
    <dd>Returns True if the fragment is present in the molecule represented by the <code>smiles</code> code argument</dd>
</dl>

<h2>Fragmenter</h2>
<p>An object that is used to fragment an ExampleTable</p>
<p class=section>Attributes</p>
<dl class=attributes>
    <dt>minSupport</dt>
    <dd>minimum frequency in the active set of the fragments to search for (default: 0.2)</dd>
    <dt>maxSupport</dt>
    <dd>maximum frequency in the inactive set of the fragments to search for (default: 0.2)</dd>
    <dt>findClosed</dt>
    <dd>finds only fragments that are not sub-structures of any other fragment with the same support (default: True)</dd>
</dl>
<p class=section>Methods</p>
<dl class=methods>
    <dt>__call__(data, smilesAttr, activeFunc)</dt>
    <dd>Takes a data-set, and runs the FragmentMiner on it. Returns a new data-set and the fragments.
        The new data-set contains new attributes that represent the presence of a fragment that was found.
        <p class>Arguments</p>
        <dl class=arguments>
            <dt>data</dt>
            <dd>the dataset</dd>
            <dt>smilesAttr</dt>
            <dd>the attribute in the data that contains the SMILES codes (if none is provided it will try to make a smart guess)</dd>
            <dt>activeFunc</dt>
            <dd>a function that takes an example from the data-set and returns True if the example should be
                    considered as active (if none is provided all examples are considered active)</dd>
       </dl>
    </dd>
</dl>
<h3>Example</h3>
<XMP class=code>fragmenter=Fragmenter(minSupport=0.1, maxSupport=0.05)
data, fragments=fragmenter(data, "SMILES")
</XMP>

<h2>FragmentBasedLearner</h2>
<p>A learner wrapper class that first runs the molecular fragmentation on the data.</p>
<p class=section>Attributes</p>
<dl class=attributes>
    <dt>smilesAttr</dt>
    <dd>Attribute in the data that contains the smiles codes (if none is provided it will try to make a smart guess)</dd>
    <dt>learner</dt>
    <dd>learner that will be used to actualy learn on the fragmented data (default: orngSVM.SVMLearner)</dd>
    <dt>minSupport</dt>
    <dd>minimum frequency in the active set of the fragments to search for</dd>
    <dt>maxSupport</dt>
    <dd>maximum frequency in the inactive set of the fragments to search for</dd>
    <dt>activeFunc</dt>
    <dd>a function that takes an example from the learning data-set and returns True if the example should be
                    considered as active (if none is provided all examples are considered active)</dd>
    <dt>findClosed</dt>
    <dd>finds only fragments that are not sub-structures of any other fragment with the same support (default: True)</dd>
    
</dl>