parallelization in analysis.py
In analysis.py, the Amp object is loaded from file rather than passed in. Since Amp.load does not allow the user to set the "cores" parameter, the user loses control over the degree of parallelization Amp is using.
e.g., if I do:
calc = Amp(descriptor=Gaussian(), model=NeuralNetwork(hiddenlayers=(5, 5)), cores=4)
calc.train(images='training_set.traj')
plot_parity('amp.amp', 'training_set.traj', plot_forces=False, plotfile='parity_plot_training.png')
then plot_parity does not run on 4 cores, but on however many cores Amp.load('amp.amp') detects. There are a few obvious ways to fix this; which one is best depends largely on the code maintainer's style preference.
Comments (4)
-
reporter -
repo owner I propose that we:
- Change the "load" keyword into a "calc" keyword that can take in either a string (in which case it behaves as it does now) or an instantiated Amp object.
- Eliminate the "cores" keyword, as the user now has control of this by the way Efrem originally suggested.
This applies to both plot_sensitivity and plot_parity_and_error in amp.analysis.
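The "string or instantiated object" idea in this proposal can be sketched roughly as follows. This is only an illustration under stated assumptions: StandInCalc and resolve_calc are hypothetical names invented here, and StandInCalc is a stand-in, not the real Amp class or the code from the pull request.

```python
class StandInCalc:
    """Hypothetical stand-in for an Amp calculator (not the real Amp class)."""

    def __init__(self, cores=None, source="in-memory"):
        self.cores = cores      # degree of parallelization chosen by the user
        self.source = source    # where this object came from

    @classmethod
    def load(cls, path):
        # Mimics loading from file: the caller has no say over cores here.
        return cls(cores=None, source=path)


def resolve_calc(calc):
    """Accept either a filename (str) or an already instantiated calculator."""
    if isinstance(calc, str):
        return StandInCalc.load(calc)  # string: load from file, as before
    return calc                        # object: use it exactly as given


# Passing an object keeps the user's cores setting; passing a path does not.
print(resolve_calc(StandInCalc(cores=4)).cores)
print(resolve_calc("amp.amp").cores)
```

With this pattern a separate "cores" keyword on the plotting functions becomes unnecessary, because whoever builds the calculator object has already chosen its parallelization.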
-
reporter Easy enough. See pull request #25.
-
reporter - changed status to resolved
Resolved in pull request #25.
I figured that the clearest way is just to add "cores" as another optional argument to the plotting functions, so I made the changes and put in a pull request. It now works well on my system. (My cluster uses Sun Grid Engine, so I needed this kind of fix.)
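A minimal sketch of the optional "cores" argument, assuming hypothetical names throughout: resolve_cores and plot_parity_sketch are invented for illustration and are not the functions in amp.analysis, and os.cpu_count() merely stands in for whatever core detection the library performs.

```python
import os


def resolve_cores(cores=None):
    """Return the worker count: an explicit value wins over auto-detection."""
    if cores is None:
        # Fallback stands in for whatever detection the library performs.
        return os.cpu_count() or 1
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return cores


def plot_parity_sketch(load, images, cores=None):
    """Hypothetical plotting entry point that accepts an optional cores value."""
    n = resolve_cores(cores)
    # A real implementation would load the calculator from `load`, set its
    # parallelization to n, and then compute predictions for `images`.
    return {"load": load, "images": images, "cores": n}


# The caller decides the degree of parallelization explicitly.
settings = plot_parity_sketch("amp.amp", "training_set.traj", cores=4)
print(settings["cores"])
```

Leaving cores unset keeps the old auto-detecting behavior, so existing calls would not need to change.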