some Dataset.apply_*() functions don't work inplace
Issue #20
new
the apply_*() functions are quite powerful, and they are the preferred way to write new data into the Dataset as a function of the existing data
however, some of the apply_*() functions (I think it's the ones that loop over regions) give incorrect output when the input column and the output column are the same (presumably because the column is overwritten with zeros or a suitable base value before the loop begins)
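A minimal sketch of the suspected failure mode, assuming the guess above is right. `apply_per_region_naive` is a hypothetical stand-in written for this illustration, not an actual lib5c method, and the `region` and `value` column names are made up:

```python
import pandas as pd


def apply_per_region_naive(df, func, in_col, out_col):
    """Hypothetical stand-in for a region-looping apply_*() method."""
    # Assumed behavior: the output column is reset to a base value first.
    df[out_col] = 0.0
    for _, idx in df.groupby("region").groups.items():
        # If out_col == in_col, the input was already zeroed out above,
        # so func only ever sees the base value.
        df.loc[idx, out_col] = func(df.loc[idx, in_col])
    return df


df = pd.DataFrame({"region": ["a", "a", "b"], "value": [1.0, 2.0, 3.0]})

# Distinct input and output columns: the doubled values come out as expected.
doubled = apply_per_region_naive(df.copy(), lambda s: s * 2, "value", "doubled")

# Same input and output column: the original values are lost before the loop.
inplace = apply_per_region_naive(df.copy(), lambda s: s * 2, "value", "value")
```

With distinct columns the result is the doubled input; with the same column every value ends up at the base value, which matches the symptom described here.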
to solve this, these particular functions should check each output column name to make sure it is not also an input - if it is, the output should be written to a temporary column named '<column_name>_temp', then the function should be applied in the loop, and finally we should copy the result back and drop the temporary column with:
self.df['<column_name>'] = self.df['<column_name>_temp']
del self.df['<column_name>_temp']
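The proposed fix could look roughly like the sketch below. `apply_per_region_safe` is a hypothetical free function operating on a plain pandas DataFrame (standing in for `self.df`); the `region` column and the zero base value are assumptions for illustration only:

```python
import pandas as pd


def apply_per_region_safe(df, func, in_col, out_col):
    """Hypothetical region-looping apply that tolerates out_col == in_col."""
    # Write to a temporary column when the output would clobber the input.
    target = out_col + "_temp" if out_col == in_col else out_col
    df[target] = 0.0  # assumed base value
    for _, idx in df.groupby("region").groups.items():
        # The input column is still intact here, even for in-place calls.
        df.loc[idx, target] = func(df.loc[idx, in_col])
    if target != out_col:
        # Copy the result back and drop the temporary column.
        df[out_col] = df[target]
        del df[target]
    return df


df = pd.DataFrame({"region": ["a", "a", "b"], "value": [1.0, 2.0, 3.0]})
result = apply_per_region_safe(df, lambda s: s * 2, "value", "value")
```

One caveat with this approach: if the frame already has a column called '<column_name>_temp', it would be overwritten and then deleted, so a real implementation might want to check for that first.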
Comments (3)
reporter - marked as trivial
reporter - removed responsible
this is unlikely to get a fix because lib5c.structures.dataset.Dataset is likely to be superseded by a new data structure based on the hic3defdr data layout
this issue is still helpful for the discussion of the features we would like to see in that new data structure