Multiple instances of same data

Issue #4 resolved
andrew_peterson repo owner created an issue

In regression/neuralnetwork.py, the same data is saved in multiple places under different names. That is, there is a "self.variables" item that contains the same information as self.weights and self.scalings. It is good to be able to get the weights/scalings out as a vector (as in the self.update_variables and self.ravel_variables functions), but it is a bad idea to store both forms. Not only is this memory inefficient, but it can lead users to think they have updated the variables, only to find that the update did not take effect in all places. This is what caused the issue with the integration with bunquant.

We should get rid of the self.variables item.
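
The stale-copy problem described above can be reproduced in a few lines. The class below is a hypothetical stand-in, not the actual NeuralNetwork code; it only assumes that the flat list and the structured weights are stored as separate objects.

```python
class DuplicatedState:
    """Hypothetical stand-in that stores the same numbers in two forms."""

    def __init__(self, weights):
        # Structured form (list of rows).
        self.weights = [list(row) for row in weights]
        # Flat copy of the same numbers, stored independently.
        self.variables = [w for row in self.weights for w in row]


model = DuplicatedState([[1.0, 2.0], [3.0, 4.0]])

# The user "updates" the flat form only.
model.variables[0] = 10.0

print(model.variables[0])   # 10.0
print(model.weights[0][0])  # 1.0, the structured copy is now stale
```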

Comments (6)

  1. Alireza Khorshidi

    I tried removing self.variables, but then the read/write tests did not pass. The problem is that:

    1- We want the list of variables to be written to and read from a json file. This requires self.variables inside my regression class (NeuralNetwork).

    2- At the same time, we want to let the user feed in weights and scalings directly, like regression=NeuralNetwork(weights=..., scalings=...). This requires self.weights and self.scalings inside my regression class (NeuralNetwork).

    Therefore, it seems that we have to keep both. It is not memory efficient (though of course the variables do not take much memory), but it avoids having to reshape the matrices multiple times.

    What do you think?

  2. andrew_peterson reporter

    What if weights and scalings were not stored then, but converted to a list of variables after the user feeds them?
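
A minimal sketch of that conversion, assuming weights is a dict mapping a layer index to a list of rows and scalings is a dict of named floats. Both layouts, and the function names ravel and unravel, are assumptions made for illustration; they are not taken from neuralnetwork.py.

```python
def ravel(weights, scalings):
    """Flatten weights and scalings into one list, in sorted key order."""
    variables = []
    for layer in sorted(weights):
        for row in weights[layer]:
            variables.extend(row)
    for name in sorted(scalings):
        variables.append(scalings[name])
    return variables


def unravel(variables, shapes, scaling_names):
    """Rebuild weights and scalings from the flat list.

    shapes maps each layer key to (n_rows, n_cols).
    """
    weights, pos = {}, 0
    for layer in sorted(shapes):
        n_rows, n_cols = shapes[layer]
        rows = []
        for _ in range(n_rows):
            rows.append(variables[pos:pos + n_cols])
            pos += n_cols
        weights[layer] = rows
    scalings = {}
    for name in sorted(scaling_names):
        scalings[name] = variables[pos]
        pos += 1
    return weights, scalings


# Made-up example values, only to exercise the round trip.
weights = {1: [[0.1, 0.2], [0.3, 0.4]], 2: [[0.5], [0.6]]}
scalings = {"intercept": 0.0, "slope": 1.0}
shapes = {layer: (len(rows), len(rows[0])) for layer, rows in weights.items()}

variables = ravel(weights, scalings)
restored_weights, restored_scalings = unravel(variables, shapes, scalings.keys())
```

With this layout only the flat list needs to be stored or written to json; the shapes are enough to recover the structured form on demand.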

  3. Alireza Khorshidi

    I guess that is possible. But then we would have to reshape the variables from a list into the weights and scalings matrices four times instead of just once, i.e. at the beginning of each of these functions: get_energy, get_forces, get_variable_der_of_energy, get_variable_der_of_forces.

    Which way do you think we should go? Storing self.weights and self.scalings in memory, which is faster, or not storing them, which takes less memory?
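
One possible middle ground, sketched under assumptions: keep the flat list as the only stored state and cache the matrix form, rebuilding it only when the variables change. The class, its method bodies, and the reshape_count counter are hypothetical and exist only to show the caching pattern; the placeholder calculations are not real energy or force expressions.

```python
class CachedModel:
    """Hypothetical sketch: the flat list is the stored state, the matrix is cached."""

    def __init__(self, variables, n_cols):
        self._n_cols = n_cols
        self._cache = None
        self.reshape_count = 0  # for illustration only
        self.set_variables(variables)

    def set_variables(self, variables):
        """Single entry point for updates; invalidates the cached matrix."""
        self._variables = list(variables)
        self._cache = None

    def _matrix(self):
        """Rebuild the matrix form only if the cache was invalidated."""
        if self._cache is None:
            self.reshape_count += 1
            n = self._n_cols
            self._cache = [
                self._variables[i:i + n]
                for i in range(0, len(self._variables), n)
            ]
        return self._cache

    def get_energy(self):
        # Placeholder calculation that needs the matrix form.
        return sum(sum(row) for row in self._matrix())

    def get_forces(self):
        # Placeholder calculation that needs the matrix form.
        return [-sum(row) for row in self._matrix()]


model = CachedModel([1.0, 2.0, 3.0, 4.0], n_cols=2)
energy = model.get_energy()
forces = model.get_forces()
reshapes_after_two_calls = model.reshape_count  # one reshape served both calls
```

The cache still costs memory, but it is derived from the flat list, so it cannot drift out of sync as long as every update goes through set_variables.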

  4. andrew_peterson reporter

    Maybe they should all be saved as self._weights, self._scalings, and self._variables so that the user does not think that they can modify these directly?
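
A minimal sketch of that idea, assuming underscore-prefixed attributes exposed through read-only properties. The class name and example values are hypothetical.

```python
class ReadOnlyParameters:
    """Hypothetical sketch of underscore attributes behind read-only properties."""

    def __init__(self, weights, scalings):
        self._weights = weights
        self._scalings = scalings

    @property
    def weights(self):
        return self._weights

    @property
    def scalings(self):
        return self._scalings


model = ReadOnlyParameters(weights=[[0.1, 0.2]], scalings={"slope": 1.0})

try:
    model.weights = [[9.9, 9.9]]  # rebinding through the property is rejected
    rebinding_allowed = True
except AttributeError:
    rebinding_allowed = False
```

A property without a setter blocks rebinding but not in-place mutation of the returned object, so this signals intent rather than enforcing it.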
