parallelization in analysis.py

Issue #187 resolved
Efrem Braun created an issue

In analysis.py, the Amp object is loaded from file rather than passed in. Since Amp.load does not allow the user to set the "cores" parameter, the user loses control over the degree of parallelization Amp is using.

e.g., if I do:

calc = Amp(descriptor=Gaussian(), model=NeuralNetwork(hiddenlayers=(5, 5)), cores=4)
calc.train(images='training_set.traj')
plot_parity('amp.amp', 'training_set.traj', plot_forces=False, plotfile='parity_plot_training.png')

then plot_parity does not run on 4 cores, but on however many cores Amp.load('amp.amp') detects. There are a few obvious ways to fix this; which one is best depends largely on the code maintainer's style preference.

Comments (4)

  1. Efrem Braun reporter

    I figured that the clearest way is just to add "cores" as another optional argument to the plotting functions, so I made the changes and put in a pull request. It now works well on my system. (My cluster uses Sun Grid Engine, so I needed this kind of fix.)

  2. andrew_peterson repo owner

    I propose that we:

    • Change the load keyword into a calc keyword that can take in either a string (in which case it behaves as it does now) or an instantiated Amp object.

    • Eliminate the cores keyword, since the user would then have control of this in the way Efrem originally suggested.

    This applies to both plot_sensitivity and plot_parity_and_error in amp.analysis.
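
The first bullet above (a calc keyword that accepts either a string or an instantiated object) could be sketched roughly as follows. This is only an illustration under stated assumptions: resolve_calc is a hypothetical helper name, and FakeAmp is a stand-in class so the snippet runs without Amp installed. Neither is taken from the Amp source, and the real Amp.load behavior is not reproduced here.

```python
class FakeAmp:
    """Stand-in for an Amp calculator (hypothetical, for illustration only)."""

    def __init__(self, cores=None):
        self.cores = cores

    @classmethod
    def load(cls, filename):
        # Mimics the situation described in the issue: the caller cannot
        # choose cores when loading from file. None stands for
        # "whatever the loader detects".
        return cls(cores=None)


def resolve_calc(calc, loader=FakeAmp.load):
    """Return a calculator given either a filename or an instantiated object.

    A string is treated as a path and handed to the loader (current behavior);
    anything else is assumed to be an already configured calculator and is
    returned unchanged, so its cores setting is preserved.
    """
    if isinstance(calc, str):
        return loader(calc)
    return calc
```

With a dispatch like this, a user who needs a fixed core count would build the calculator with the desired cores and pass the object itself, while passing a filename keeps the existing load-from-file behavior.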
