some Dataset.apply_*() functions don't work in place

the apply_*() functions are quite powerful and they are the preferred way to write new data into the Dataset as a function of the existing data

however, some of the apply_*() functions (I think it's the ones that loop over regions) give incorrect output when the input column and the output column are the same. presumably this is because the output column is overwritten with zeros or a suitable base value before the loop begins, so the loop then reads those overwritten values instead of the original data
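a minimal sketch of the suspected failure mode, assuming the cause guessed above is right. `apply_by_region_naive` and the `region` column are hypothetical stand-ins written for illustration, not the actual Dataset code:

```python
import pandas as pd


def apply_by_region_naive(df, in_col, out_col, func):
    # hypothetical stand-in for a region-looping apply_*() method
    df[out_col] = 0.0  # base value written before the loop begins
    for region in df["region"].unique():
        mask = df["region"] == region
        # if out_col == in_col, the input read here has already been zeroed
        df.loc[mask, out_col] = func(df.loc[mask, in_col])
    return df


df = pd.DataFrame({"region": ["a", "a", "b"], "x": [1.0, 2.0, 3.0]})

# writing to a different column gives the doubled values
separate = apply_by_region_naive(df.copy(), "x", "y", lambda s: s * 2)["y"]

# writing back to the input column doubles the zeros instead
in_place = apply_by_region_naive(df.copy(), "x", "x", lambda s: s * 2)["x"]
```

here `separate` holds the doubled values while `in_place` comes out as all zeros, which matches the kind of incorrect output described above.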

to solve this, these particular functions should check each output column name to make sure it is not also an input - if it is, the output should be written to a temporary column '<column_name>_temp' instead, then the function should be applied in the loop, and finally the temporary column should be copied back over the original and removed with

self.df['<column_name>'] = self.df['<column_name>_temp']
del self.df['<column_name>_temp']
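the proposal could be sketched roughly as follows. `apply_by_region_safe`, the `region` column and the `base_value` parameter are hypothetical names used only for illustration, and the sketch works on a plain DataFrame rather than on self.df:

```python
import pandas as pd


def apply_by_region_safe(df, in_cols, out_col, func, base_value=0.0):
    # hypothetical helper: write to '<out_col>_temp' if the output is also an input
    target = out_col + "_temp" if out_col in in_cols else out_col
    df[target] = base_value  # the input column is left untouched
    for region in df["region"].unique():
        mask = df["region"] == region
        df.loc[mask, target] = func(df.loc[mask, in_cols])
    if target != out_col:
        # resolve the differences once the loop has finished
        df[out_col] = df[target]
        del df[target]
    return df


df = pd.DataFrame({"region": ["a", "a", "b"], "x": [1.0, 2.0, 3.0]})
result = apply_by_region_safe(df, ["x"], "x", lambda part: part["x"] * 2)
```

one thing the sketch does not handle is a column called '<column_name>_temp' that already exists in the data, so the real fix may want to check for that or pick a name that cannot collide.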
