two tests failing

Issue #156 resolved
John Kitchin created an issue

This is on the master branch. I killed the test run after about 10,000 seconds because it seemed to hang on the 9th test.

.FF.....

======================================================================
FAIL: train_test.non_periodic_0th_bfgs_step_test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jkitchin/anaconda3/envs/amp/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/jkitchin/vc/projects/neural-network/py2.7/amp/tests/CuOPd_test/gaussian_neural_test/train_test.py", line 206, in non_periodic_0th_bfgs_step_test
    'Calculated value of loss function is wrong!'
AssertionError: Calculated value of loss function is wrong!
-------------------- >> begin captured stdout << ---------------------
train-nonperiodic/False-1
diff at 204 = 0.00356458469469

--------------------- >> end captured stdout << ----------------------

======================================================================
FAIL: train_test.periodic_0th_bfgs_step_test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jkitchin/anaconda3/envs/amp/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/jkitchin/vc/projects/neural-network/py2.7/amp/tests/CuOPd_test/gaussian_neural_test/train_test.py", line 416, in periodic_0th_bfgs_step_test
    'Calculated value of loss function is wrong!'
AssertionError: Calculated value of loss function is wrong!
-------------------- >> begin captured stdout << ---------------------
train-periodic/False-1
diff at 414 = 0.00509043299462

Comments (17)

  1. Alireza Khorshidi

    For the record, Muammar and I are sitting side by side and running the same script, python tests/CuOPd_test/gaussian_neural_test/train_test.py (Python 2.7.12 for me, Python 2.7.13 for Muammar). The test passes on my system:

    alireza@alireza-ubuntu:~/Packages/amp/tests/CuOPd_test/gaussian_neural_test$ python train_test.py
    train-nonperiodic/False-1
    diff at 204 = 1.81898940355e-12
    train-nonperiodic/False-2
    diff at 204 = 1.81898940355e-12
    train-nonperiodic/False-3
    diff at 204 = 1.81898940355e-12
    train-nonperiodic/False-4
    diff at 204 = 2.72848410532e-12
    train-nonperiodic/False-5
    diff at 204 = 2.72848410532e-12
    train-nonperiodic/True-1
    diff at 204 = 1.81898940355e-12
    train-nonperiodic/True-2
    diff at 204 = 2.72848410532e-12
    train-nonperiodic/True-3
    diff at 204 = 9.09494701773e-13
    train-nonperiodic/True-4
    diff at 204 = 2.72848410532e-12
    train-nonperiodic/True-5
    diff at 204 = 1.81898940355e-12
    train-periodic/False-1
    diff at 414 = 5.45696821064e-12
    train-periodic/False-2
    diff at 414 = 5.45696821064e-12
    train-periodic/False-3
    diff at 414 = 5.45696821064e-12
    train-periodic/True-1
    diff at 414 = 5.45696821064e-12
    train-periodic/True-2
    diff at 414 = 5.45696821064e-12
    train-periodic/True-3
    diff at 414 = 5.45696821064e-12
    alireza@alireza-ubuntu:~/Packages/amp/tests/CuOPd_test/gaussian_neural_test$
    

    but for him it does not pass. He will post his error for our record.

  2. Muammar El Khatib

    The error I get on my computer is:

    train-nonperiodic/False-1
    diff at 204 = 0.00356458469469
    Traceback (most recent call last):
      File "train_test.py", line 467, in <module>
        non_periodic_0th_bfgs_step_test()
      File "train_test.py", line 206, in non_periodic_0th_bfgs_step_test
        'Calculated value of loss function is wrong!'
    AssertionError: Calculated value of loss function is wrong!
    

    which is the same as reported by John.

    I have set up a pipeline on Bitbucket to check whether Python 2.7.12 works, but the test failed there as well.

  3. John Kitchin reporter

    I get the same errors with Python 2.7.6 on a CentOS 5 machine, so I guess it is not related to the Python version.

  4. Muammar El Khatib

    Alireza and Muammar: We think we have found the issue. The EMT calculator was changed between ASE 3.11.0 and ASE 3.12.0 (probably by Jakob Schiøtz), and it now produces different energy and force values. If you run the following script with both versions:

    from ase import Atoms
    import numpy as np
    from ase.calculators.emt import EMT

    # Non-periodic Pt4 cluster used to compare EMT output between ASE versions.
    atoms = Atoms(symbols='Pt4',
                  pbc=np.array([False, False, False], dtype=bool),
                  cell=np.array([[1., 0., 0.],
                                 [0., 1., 0.],
                                 [0., 0., 1.]]),
                  positions=np.array([[0., 0., 0.],
                                      [0., 2., 0.],
                                      [0., 0., 3.],
                                      [1., 0., 0.]]))

    atoms.set_calculator(EMT())
    energy = atoms.get_potential_energy(apply_constraint=False)
    forces = atoms.get_forces(apply_constraint=False)
    # Under Python 2.7 these print as tuples, as in the output below.
    print("emt energy =", energy)
    print("emt forces =", forces)
    

    you will get these values with 3.11.0:

    ('emt energy =', 349.5272394713045)
    ('emt forces =', array([[ -1.19213658e+03,  -2.09440463e+01,   2.18092323e+00],
           [ -3.06745927e+00,   2.68974213e+01,   2.72315226e-01],
           [  4.21342441e-01,   1.81543484e-01,  -3.71726578e+00],
           [  1.19478269e+03,  -6.13491855e+00,   1.26402732e+00]]))
    

    but different values with 3.12.0:

    ('emt energy =', 349.5271296998319)
    ('emt forces =', array([[ -1.19213634e+03,  -2.09440302e+01,   2.18092302e+00],
           [ -3.06745630e+00,   2.68973994e+01,   2.72315136e-01],
           [  4.21342364e-01,   1.81543424e-01,  -3.71726525e+00],
           [  1.19478245e+03,  -6.13491261e+00,   1.26402709e+00]]))
    

    We ran the same test with the EAM calculator instead of EMT and got the same numbers in both versions, so we believe that the changes in EMT are responsible.

    Anyhow, the reference values of the tests should be updated to be consistent with the new values of EMT.
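
    To put a number on the discrepancy, the two outputs above can be compared directly. This is only a minimal sketch in plain Python using the values copied from this comment; the variable names are ours. The energy shifts by roughly 1e-4 eV and the largest force component by roughly 2.4e-4 eV/Angstrom, which is many orders of magnitude above the ~1e-12 differences seen in the passing run.

```python
# Values copied from the two outputs above (energies in eV, forces in eV/Angstrom).
energy_311 = 349.5272394713045
energy_312 = 349.5271296998319

forces_311 = [
    [-1.19213658e+03, -2.09440463e+01, 2.18092323e+00],
    [-3.06745927e+00, 2.68974213e+01, 2.72315226e-01],
    [4.21342441e-01, 1.81543484e-01, -3.71726578e+00],
    [1.19478269e+03, -6.13491855e+00, 1.26402732e+00],
]
forces_312 = [
    [-1.19213634e+03, -2.09440302e+01, 2.18092302e+00],
    [-3.06745630e+00, 2.68973994e+01, 2.72315136e-01],
    [4.21342364e-01, 1.81543424e-01, -3.71726525e+00],
    [1.19478245e+03, -6.13491261e+00, 1.26402709e+00],
]

# Absolute shift in the total energy between the two ASE versions.
energy_shift = abs(energy_311 - energy_312)

# Largest absolute shift over all force components.
max_force_shift = max(
    abs(a - b)
    for row_a, row_b in zip(forces_311, forces_312)
    for a, b in zip(row_a, row_b)
)

print("energy shift = %.3e" % energy_shift)
print("max force shift = %.3e" % max_force_shift)
```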

  5. andrew_peterson repo owner

    Yes, great detective work on this!

    Assuming this turns out to be the reason, we should probably make sure the test works on both the new and old versions of ASE. That is, we would store both versions' "correct" results within the test script, then choose which to compare against based on the ASE version number.
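
    One way this could look is sketched below. It is not the code in the Amp test suite: the function names and the dictionary are hypothetical, and the only real data are the two EMT energies posted earlier in this thread. It also assumes that 3.12.0 is the first release with the changed EMT.

```python
import re

# EMT energies (eV) for the Pt4 cluster, copied from the outputs in this thread.
REFERENCE_ENERGY = {
    "old": 349.5272394713045,  # ASE 3.11.0 and earlier (assumed)
    "new": 349.5271296998319,  # ASE 3.12.0 and later (assumed)
}


def version_tuple(version):
    """Turn '3.12.0' (or '3.12.0b1', or '3.12') into (3, 12, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)  # pad so that '3.12' compares equal to '3.12.0'
    return tuple(parts)


def reference_energy(ase_version):
    """Pick the stored reference matching the given ASE version string."""
    key = "new" if version_tuple(ase_version) >= (3, 12, 0) else "old"
    return REFERENCE_ENERGY[key]


def energy_matches(energy, ase_version, tol=1e-7):
    """Return True if `energy` agrees with the reference for this version."""
    return abs(energy - reference_energy(ase_version)) < tol
```

    In a real test the version string would come from `ase.__version__`, for example `energy_matches(energy, ase.__version__)`.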

  6. Muammar El Khatib

    As of commit 940a6df, one of the tests is still broken. I have run the tests more than three times at this HEAD, and this one fails every time:

    .
    ======================================================================
    FAIL: numeric_analytic_test.test
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
        self.test(*self.arg)
      File "/home/muammar/brown/git/amp/tests/misc_test/numeric_analytic_test.py", line 129, in test
        'image %i is wrong!' % (i, atom_no, image_no + 1)
    AssertionError: The calculated 1 force of atom 1 of image 2 is wrong!
    -------------------- >> begin captured stdout << ---------------------
    numeric_analytic_test/analytic-True-1
    numeric_analytic_test/analytic-True-2
    numeric_analytic_test/analytic-False-1
    numeric_analytic_test/analytic-False-2
    numeric_analytic_test/numeric-True-1
    numeric_analytic_test/numeric-True-2
    numeric_analytic_test/numeric-False-1
    numeric_analytic_test/numeric-False-2
    diff = 2.75636735658e-10
    diff = 3.24553217634e-10
    diff = 1.06334772584e-10
    diff = 3.09482994787e-12
    diff = 6.81192269258e-09
    
    --------------------- >> end captured stdout << ----------------------
    
    zernike_test.test: 107.6562s
    force_call_test.test: 70.7062s
    train_test.non_periodic_0th_bfgs_step_test: 39.8739s
    numeric_analytic_test.test: 28.6319s
    gaussian_test.test: 23.0520s
    force_call_test_tflow.non_periodic_test: 18.3521s
    train_test.periodic_0th_bfgs_step_test: 16.3144s
    force_call_test_tflow.periodic_test: 11.5164s
    train_test.test: 11.4086s
    test_gaussian_neural.train_test: 10.3602s
    test_gaussian_tflow.train_test: 8.2492s
    fpplot_test.test: 8.0066s
    force_call_test.periodic_test: 4.1937s
    test_NN_nodeplot.test_nodeplot: 1.2909s
    force_call_test.non_periodic_test: 0.7438s
    displaced_atom_test.test: 0.0882s
    rotated_atoms_test.test: 0.0060s
    ----------------------------------------------------------------------
    Ran 17 tests in 361.408s
    
    FAILED (failures=1)
    Makefile:8: recipe for target 'tests' failed
    make: *** [tests] Error 1
    

    But going back to d149cdb makes it work:

    zernike_test.test: 107.3407s
    force_call_test.test: 70.1402s
    train_test.non_periodic_0th_bfgs_step_test: 39.7613s
    numeric_analytic_test.test: 32.3257s
    gaussian_test.test: 22.7021s
    force_call_test_tflow.non_periodic_test: 18.4270s
    train_test.periodic_0th_bfgs_step_test: 16.3203s
    test_gaussian_neural.train_test: 11.5694s
    force_call_test_tflow.periodic_test: 11.4757s
    train_test.test: 11.3535s
    fpplot_test.test: 8.6003s
    test_gaussian_tflow.train_test: 8.2090s
    force_call_test.periodic_test: 4.1634s
    test_NN_nodeplot.test_nodeplot: 1.3652s
    force_call_test.non_periodic_test: 0.7502s
    displaced_atom_test.test: 0.0899s
    rotated_atoms_test.test: 0.0060s
    ----------------------------------------------------------------------
    Ran 17 tests in 365.576s
    
    OK
    

    It seems that changing the initial guess of the weights in neuralnetwork.py is causing this problem (?). I have now run the tests about 7 times, and they pass.

  7. Alireza Khorshidi

    @muammar : Is your issue with the tests fixed?

    @andrewpeterson : This is the second time I have fallen into the same trap. I generated a trained-parameters file using an old version of EMT and updated my ASE at some point afterwards. Now when I try the trained-parameters file, it gives untrained results, because EMT has been changed upstream. We could add another parameter, "parentcalculator": {"name": ..., "version": ...}, to our saved .amp files, and then, whenever someone makes a force/energy call, print a warning or note saying which calculator the file was trained against. Would you suggest that?
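
    A rough sketch of what this proposal might look like follows. It does not reflect the real .amp file format: the JSON layout, the key name, and both function names are hypothetical and only illustrate storing the parent calculator and warning on a mismatch.

```python
import json
import warnings


def save_parameters(path, parameters, calc_name, calc_version):
    """Write parameters together with a record of the parent calculator."""
    payload = {
        "parameters": parameters,
        # Hypothetical key recording what the parameters were trained against.
        "parentcalculator": {"name": calc_name, "version": calc_version},
    }
    with open(path, "w") as handle:
        json.dump(payload, handle)


def load_parameters(path, calc_name, calc_version):
    """Load parameters and warn if the current calculator differs."""
    with open(path) as handle:
        payload = json.load(handle)
    parent = payload.get("parentcalculator", {})
    current = {"name": calc_name, "version": calc_version}
    if parent != current:
        warnings.warn(
            "Parameters were trained against %s %s, but the current "
            "calculator is %s %s."
            % (parent.get("name"), parent.get("version"),
               calc_name, calc_version)
        )
    return payload["parameters"]
```

    Loading a file saved with ("EMT", "3.11.0") while passing ("EMT", "3.12.0") would emit the warning; loading with matching values stays silent.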

  8. Muammar El Khatib

    "@muammar : Is your issue with the tests fixed?"

    @akhorshi I did a fresh clone and ran the tests using python2. No issues here:

    ± % make py2tests
    rm -fr /tmp/py2_amptests
    mkdir -p /tmp/py2_amptests
    cd /tmp/py2_amptests && nosetests /tmp/tests/amp/amp/..
    ............../home/muammar/brown/git/ase/ase/lattice/surface.py:17: UserWarning: Moved to ase.build
      warnings.warn('Moved to ase.build')
    ...
    ----------------------------------------------------------------------
    Ran 17 tests in 293.005s
    
    OK
    

    However, python3 is again suffering from the problem we had before releasing 0.6:

    muammar   9776  2.0  0.3 537388 57060 pts/10   Sl+  09:53   0:08 /usr/bin/python3 /usr/bin/nosetests3 /tmp/tests/amp/amp/..
    muammar   9964  0.1  0.0      0     0 ?        Zs   09:53   0:00 [python3] <defunct>
    muammar   9970  0.1  0.0      0     0 ?        Zs   09:53   0:00 [python3] <defunct>
    

    I have checked that 0.6 is not affected by this. I need to go through the recent commits to our codebase and fix the issue.

  9. Alireza Khorshidi

    @muammar Okay, then I will leave this issue open.

    @andrewpeterson Not again! The same thing as we found before: EMT was changed between ASE v11 and v12. In a calculation I simply forgot that my trained parameters were from v10, and I am now doing energy/force calls in v14!

  10. andrew_peterson repo owner

    In that case, I don't think so. For EMT it might be straightforward, but for any "real" calculator, for consistency the user would also need to know whether the xc functionals were the same, the cutoff energies, the Fermi smearing, etc. So I think it is best to leave the user responsible for having a consistent data set.
