brentp / biostuff (http://hackmap.blogspot.com/)
miscellaneous bioinformatics modules.
$ hg clone http://bitbucket.org/brentp/biostuff/
| commit 62: | 5e09d0e13822 |
| parent 61: | eb9ab1710e6e |
| child 63: | 6d084649fe4e |
bump version. update changelog for 0.3.7
| filename | size | last modified | commit message |
|---|---|---|---|
| nwalign | | | |
| pyfasta | | | |
| simpletable | | | |
| skidmarks | | | |
| .hgignore | 55 B | 5 months ago | fix errors in setup.py and dont use sudo in upload.sh |
| .hgtags | 141 B | 3 months ago | Added tag 0.3.6 for changeset 52238ee2288b |
| README.rst | 2.2 KB | 5 months ago | update readme |
README
Skidmarks
find runs (non-randomness) in sequences
>>> from skidmarks import gap_test, wald_wolfowitz, auto_correlation, serial_test
>>> serial_test('110000000000000111111111111')
{'chi': 18.615384615384617, 'p': 0.00032831021826061683}
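To illustrate what "finding runs" means, here is a minimal pure-Python sketch of a Wald-Wolfowitz runs test using the normal approximation. The function name `runs_test_sketch` and its return keys are hypothetical; this is not skidmarks' implementation and its numbers need not match the library's output.

```python
import math


def runs_test_sketch(seq):
    """Wald-Wolfowitz runs test on a two-symbol sequence (illustrative only)."""
    symbols = sorted(set(seq))
    if len(symbols) != 2:
        raise ValueError("expected exactly two distinct symbols")
    n1 = seq.count(symbols[0])
    n2 = len(seq) - n1
    n = n1 + n2
    # A new run starts wherever a symbol differs from its predecessor.
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    # Expected number of runs and its variance under randomness.
    mean = 2.0 * n1 * n2 / n + 1
    var = 2.0 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if var == 0:
        raise ValueError("sequence too short for the normal approximation")
    z = (runs - mean) / math.sqrt(var)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return {"runs": runs, "z": z, "p": p}


result = runs_test_sketch('110000000000000111111111111')
```

A strongly negative `z` means far fewer runs than chance would produce, i.e. the symbols are clustered; a strongly positive `z` means the symbols alternate more often than expected.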
nwalign
fast Needleman-Wunsch global alignment in Cython, usable from the command line and from Python
>>> import nwalign as nw
>>> nw.global_align("CEELECANTH", "PELICAN", matrix='PAM250')
('CEELECANTH', '-PELICA--N')
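For readers unfamiliar with the algorithm, the following is a small pure-Python sketch of Needleman-Wunsch with a flat match/mismatch/gap score. It assumes a linear gap penalty and integer scores rather than a substitution matrix such as PAM250, so its alignments can differ from nwalign's; `needleman_wunsch_sketch` and its parameters are hypothetical names, not part of the nwalign API.

```python
def needleman_wunsch_sketch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns (aligned_a, aligned_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: best of substitution, gap in b, gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append('-')
            i -= 1
        else:
            out_a.append('-')
            out_b.append(b[j - 1])
            j -= 1
    return ''.join(reversed(out_a)), ''.join(reversed(out_b)), score[-1][-1]


aligned_a, aligned_b, total = needleman_wunsch_sketch("CEELECANTH", "PELICAN")
```

The table fill is O(len(a) * len(b)) in both time and memory, which is the main reason a compiled implementation is attractive for long sequences.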
pyfasta
pythonic access to fasta sequence files
>>> from pyfasta import Fasta
>>> f = Fasta('some.fasta')
>>> f.keys()
['chr1', 'chr2', 'chr3']
>>> f['chr1'][10:20]
'actgatcgga'
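The dictionary-like access shown above can be approximated with a few lines of plain Python. This sketch reads every record into memory, which is only reasonable for small files; `read_fasta_sketch` is a hypothetical helper and says nothing about how pyfasta stores or indexes sequences.

```python
def read_fasta_sketch(lines):
    """Parse FASTA-formatted lines into a {name: sequence} dict."""
    records = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            # A header closes the previous record, if there was one.
            if name is not None:
                records[name] = ''.join(chunks)
            parts = line[1:].split()
            # Use the first whitespace-delimited token as the key.
            name = parts[0] if parts else ''
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = ''.join(chunks)
    return records


# Made-up demo data standing in for the contents of a FASTA file.
demo = [">chr1 example", "ACTGACTGAC", "actgatcgga", ">chr2", "TTTT"]
records = read_fasta_sketch(demo)
window = records['chr1'][10:20]
```

Because an open file iterates over its lines, the same function can be called as `read_fasta_sketch(open(path))` for a real file.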
simpletable
pytables wrapper for easy access to structured data.
>>> import tables
>>> from simpletable import SimpleTable
>>> class ATable(SimpleTable):
...     x = tables.Float32Col()
...     y = tables.Float32Col()
...     name = tables.StringCol(16)
>>> tbl = ATable('test_docs.h5', 'atable1')
# insert as with pytables.
>>> row = tbl.row
>>> for i in range(50):
...     row['x'], row['y'] = i, i * 10
...     row['name'] = "name_%i" % i
...     row.append()
>>> tbl.flush()
# access the entire table via the numpy array interface
>>> import numpy as np
>>> np.asarray(tbl)
# query the data (query() is an alias of tables' readWhere())
>>> tbl.query('(x > 4) & (y < 70)') #doctest: +NORMALIZE_WHITESPACE
array([('name_5', 5.0, 50.0), ('name_6', 6.0, 60.0)],
dtype=[('name', '|S16'), ('x', '<f4'), ('y', '<f4')])
