improvements to Dataset.select()
Issue #19
new
we are interested in making Dataset.select() more powerful
- it should allow selection as a DataFrame, or as a numpy array
- it should allow selection including or excluding NaNs
- it should allow selection of multiple columns at once, e.g. the call
obs, dist = d.select(['obs', 'distance'], rep='v65_rep1', region='Sox2')
should do something like
obs, dist = d.df.loc[d.df['region'] == 'Sox2', [('obs', 'v65_rep1'), ('distance', '')]].as_matrix().T
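A rough sketch of how the three requests above could fit together, written as a standalone function rather than a method. The name `select`, its `dropna` and `as_frame` parameters, and the toy table are assumptions made for illustration and are not part of the lib5c API. It also assumes the two-level column layout implied by the call above, with per-replicate columns keyed as `(name, rep)` and shared columns keyed as `(name, '')`. Note that `DataFrame.as_matrix()` was removed in pandas 1.0, so the sketch uses `to_numpy()` instead.

```python
import numpy as np
import pandas as pd


def select(df, columns, rep=None, region=None, dropna=False, as_frame=False):
    """Hypothetical helper (not the lib5c API): pick columns for one replicate."""
    # Per-replicate columns are keyed (name, rep); shared ones are (name, '').
    keys = [
        (c, rep) if rep is not None and (c, rep) in df.columns else (c, '')
        for c in columns
    ]
    # Start with every row selected, then narrow down by region if requested.
    rows = np.ones(len(df), dtype=bool)
    if region is not None:
        rows &= (df[('region', '')] == region).to_numpy()
    sub = df.loc[rows, keys]
    if dropna:
        # Drop rows in which any selected column is NaN.
        sub = sub.dropna()
    if as_frame:
        return sub
    # One 1-D numpy array per requested column, in the requested order.
    return tuple(sub.to_numpy().T)


# Toy table with the assumed column layout (values are made up).
cols = pd.MultiIndex.from_tuples(
    [('region', ''), ('distance', ''), ('obs', 'v65_rep1'), ('obs', 'v65_rep2')]
)
df = pd.DataFrame(
    [['Sox2', 1.0, 10.0, 11.0],
     ['Sox2', 2.0, np.nan, 21.0],
     ['Klf4', 3.0, 30.0, 31.0]],
    columns=cols,
)

obs, dist = select(df, ['obs', 'distance'], rep='v65_rep1', region='Sox2')
obs_clean, dist_clean = select(
    df, ['obs', 'distance'], rep='v65_rep1', region='Sox2', dropna=True
)
frame = select(df, ['obs', 'distance'], rep='v65_rep1', region='Sox2', as_frame=True)
```

Returning a tuple of arrays keeps the `obs, dist = ...` unpacking from the example call working, while `as_frame=True` covers the DataFrame request without a second method.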
Comments (3)
- reporter - This is unlikely to get a fix because lib5c.structures.dataset.Dataset is likely to be superseded by a new data structure based on the hic3defdr data layout. This issue is still helpful for the discussion of the features we would like to see in that new data structure.
- reporter - marked as trivial
- reporter - removed responsible