improvements to Dataset.select()

Issue #19 new
Thomas Gilgenast created an issue

we are interested in making Dataset.select() more powerful

  • it should allow selection as a DataFrame, or as a numpy array
  • it should allow selection including or excluding NaNs
  • it should allow selection of multiple columns at once, e.g. the call

    obs, dist = d.select(['obs', 'distance'], rep='v65_rep1', region='Sox2')
    

    should do something like

    obs, dist = d.df.loc[d.df['region'] == 'Sox2', [('obs', 'v65_rep1'), ('distance', '')]].to_numpy().T
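
a rough sketch of how the three requests above could fit together, written as a free function over a DataFrame with two-level columns (name, replicate) like the one in the example. The function name `select`, the `dropna` and `as_frame` parameters, and the toy data are all hypothetical and are not part of the existing Dataset API; this is only meant to make the discussion concrete.

```python
import numpy as np
import pandas as pd


def select(df, names, rep=None, dropna=False, as_frame=False, **filters):
    """Hypothetical multi-column select; not the existing Dataset.select()."""
    # Build a boolean row mask from keyword filters such as region='Sox2'.
    # Assumption: filterable columns have an empty second column level.
    mask = pd.Series(True, index=df.index)
    for key, value in filters.items():
        mask = mask & (df[(key, '')] == value)

    # Resolve each requested name to a full (name, replicate) column key,
    # falling back to (name, '') for columns that are not per-replicate.
    cols = []
    for name in names:
        if rep is not None and (name, rep) in df.columns:
            cols.append((name, rep))
        else:
            cols.append((name, ''))

    result = df.loc[mask, cols]
    if dropna:
        # Drop every row that has a NaN in any selected column.
        result = result.dropna()
    if as_frame:
        return result
    # One row per requested column, so the result unpacks as `obs, dist = ...`.
    return result.to_numpy().T


# Toy data, made up for illustration only.
df = pd.DataFrame({
    ('region', ''): ['Sox2', 'Sox2', 'Klf4'],
    ('distance', ''): [0.0, 1.0, 2.0],
    ('obs', 'v65_rep1'): [1.0, np.nan, 4.0],
    ('obs', 'v65_rep2'): [2.0, 3.0, 5.0],
})

obs, dist = select(df, ['obs', 'distance'], rep='v65_rep1', region='Sox2')
```

passing `dropna=True` would drop the row whose `obs` value is NaN, and `as_frame=True` would return the filtered DataFrame instead of arrays.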
    

Comments (3)

  1. Thomas Gilgenast reporter

    this is unlikely to get a fix because lib5c.structures.dataset.Dataset is likely to be superseded by a new data structure based on the hic3defdr data layout

    this issue is still helpful for the discussion of the features we would like to see in that new data structure
