Loss function issues (in parallel)

Issue #44 resolved
andrew_peterson repo owner created an issue

The loss function doesn't work, at least in parallel mode. I see at least three things:

(1) Note the output file tells you where to look for error messages. E.g.,

Establishing worker sessions.
 Session 0 (localhost): stderr written to /tmp/tmpHdIjCs.stderr
 Session 1 (localhost): stderr written to /tmp/tmpu30vpf.stderr
 Session 2 (localhost): stderr written to /tmp/tmpriok2I.stderr
 Session 3 (localhost): stderr written to /tmp/tmpqgzQK9.stderr

Look in those files to see why it fails; it does not fail "for some anonymous reason" — the actual error messages are recorded there.

(2) The new keywords ("energy_coefficient", "force_coefficient") need to be added to the parameters dictionary, since that dictionary is what is passed to the workers. See how the convergence keywords are handled for an example. Only parameters that you do not want passed to the workers (e.g., cores) should be stored outside this dictionary. You can see in the tmp stderr files above that this is the immediate reason it crashes.
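A minimal sketch of the intended pattern (the class and attribute names here are illustrative, not Amp's actual code): keywords the workers need go into the parameters dictionary, while master-only settings like cores stay as plain attributes.

```python
class LossFunction:
    """Illustrative sketch of separating worker-bound parameters
    from master-only settings."""

    def __init__(self, energy_coefficient=1.0, force_coefficient=0.04,
                 cores=1):
        # Everything the workers need goes inside self.parameters;
        # this dictionary is what gets sent to the worker sessions.
        self.parameters = {
            'energy_coefficient': energy_coefficient,
            'force_coefficient': force_coefficient,
        }
        # Settings the workers should NOT receive stay outside it.
        self.cores = cores
```

With this layout, serializing self.parameters for the workers carries the new coefficients along automatically, and adding a future keyword only requires one new dictionary entry.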

(3) The __call__ method was renamed; this method is how the workers invoke the loss function, so it cannot be renamed.
