brentp / biostuff (http://hackmap.blogspot.com/)

miscellaneous bioinformatics modules.

Clone this repository (size: 311.8 KB):
$ hg clone http://bitbucket.org/brentp/biostuff/
commit 53: 7020a83f89ce
parent 52: 9175ef48efe9
child 54: 16d5f73b12da
update version and changelog before uploading to pypi. (branch: default, tag: 0.3.5)
brentp
3 months ago

biostuff /

filename     size    last modified  message
nwalign
pyfasta
simpletable
skidmarks
.hgignore    55 B    5 months ago   fix errors in setup.py and dont use sudo in upload.sh
.hgtags      47 B    3 months ago   Added tag 0.3.4 for changeset e09b67610acc
README.rst   2.2 KB  5 months ago   update readme

README

Skidmarks

find runs (non-randomness) in sequences

>>> from skidmarks import gap_test, wald_wolfowitz, auto_correlation, serial_test
>>> serial_test('110000000000000111111111111')
{'chi': 18.615384615384617, 'p': 0.00032831021826061683}
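
To make the idea behind the serial test concrete, here is a minimal pure-Python sketch. It is not the skidmarks source: the function name `serial_test_sketch`, the use of overlapping adjacent pairs, and the choice of 3 degrees of freedom are assumptions, chosen because they reproduce the values shown above for this input.

```python
import math
from collections import Counter
from itertools import product


def serial_test_sketch(seq):
    """Chi-square test on overlapping pairs of adjacent symbols in a 0/1 string.

    Hypothetical helper for illustration; not part of skidmarks.
    """
    # Count every overlapping pair: seq[0:2], seq[1:3], ...
    observed = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_pairs = len(seq) - 1
    # Assumption: under randomness the four pairs 00, 01, 10, 11 are equally likely.
    expected = n_pairs / 4.0
    chi = sum(
        (observed["".join(pair)] - expected) ** 2 / expected
        for pair in product("01", repeat=2)
    )
    # Closed-form survival function of the chi-square distribution
    # with 3 degrees of freedom (four categories minus one).
    p = (math.erfc(math.sqrt(chi / 2))
         + math.sqrt(2 * chi / math.pi) * math.exp(-chi / 2))
    return {"chi": chi, "p": p}


result = serial_test_sketch("110000000000000111111111111")
print(result)
```

Long runs of identical symbols inflate the `00` and `11` counts relative to `01` and `10`, which is what drives the statistic up and the p-value down.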

nwalign

fast Needleman-Wunsch global alignment in Cython, usable from the command line and from Python.

>>> import nwalign as nw
>>> nw.global_align("CEELECANTH", "PELICAN", matrix='PAM250')
('CEELECANTH', '-PELICA--N')
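
For readers unfamiliar with the algorithm, the following is a minimal pure-Python sketch of Needleman-Wunsch global alignment. It is not nwalign's Cython implementation: the function name `needleman_wunsch` and the flat match/mismatch/gap scores are assumptions made for illustration (nwalign scores with a substitution matrix such as PAM250), so the alignment it returns may differ from the one shown above.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; returns the two gapped strings.

    Hypothetical helper with a simple scoring scheme, for illustration only.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


aligned_a, aligned_b = needleman_wunsch("CEELECANTH", "PELICAN")
print(aligned_a)
print(aligned_b)
```

Both returned strings always have the same length, and removing the `-` characters gives back the original inputs.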

pyfasta

pythonic access to fasta sequence files

>>> from pyfasta import Fasta
>>> f = Fasta('some.fasta')
>>> f.keys()
['chr1', 'chr2', 'chr3']

>>> f['chr1'][10:20]
'actgatcgga'
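
The dictionary-style access shown above can be approximated with a few lines of plain Python. This is only a sketch of the access pattern, not how pyfasta works internally: the helper name `parse_fasta` and the in-memory dict are assumptions, and the example records are made up.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {header: sequence}.

    Hypothetical helper for illustration; it loads everything into memory.
    """
    records = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


# Made-up example data; a real file could be passed as open('some.fasta').
example = [">chr1", "acgtacgtac", "tgatcggatt", ">chr2", "ttttcccc"]
seqs = parse_fasta(example)
print(sorted(seqs))
print(seqs["chr1"][10:20])
```

Because each sequence is an ordinary string, slicing such as `seqs['chr1'][10:20]` works the same way as in the session above.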

simpletable

pytables wrapper for easy access to structured data.

>>> import tables
>>> from simpletable import SimpleTable
>>> class ATable(SimpleTable):
...     x = tables.Float32Col()
...     y = tables.Float32Col()
...     name = tables.StringCol(16)


>>> tbl = ATable('test_docs.h5', 'atable1')

# insert as with pytables.
>>> row = tbl.row
>>> for i in range(50):
...    row['x'], row['y'] = i, i * 10
...    row['name'] = "name_%i" % i
...    row.append()
>>> tbl.flush()

#access the entire array via the numpy array interface
>>> import numpy as np
>>> np.asarray(tbl)



# query the data (query() is an alias of tables' readWhere())
>>> tbl.query('(x > 4) & (y < 70)') #doctest: +NORMALIZE_WHITESPACE
array([('name_5', 5.0, 50.0), ('name_6', 6.0, 60.0)],
    dtype=[('name', '|S16'), ('x', '<f4'), ('y', '<f4')])