Amp says it has converged but RMSE from self calculation is much higher than convergence criterion

Issue #207 closed
Mingjie Liu created an issue

The code for the training of the NN.

from amp import Amp
from amp.descriptor.gaussian import Gaussian
from amp.model.neuralnetwork import NeuralNetwork

cutoff = 6.0
Gs ={"Au": [{"eta": 0.05, "type": "G2", "element": "Au"},
            {"eta": 0.05, "type": "G2", "element": "Pd"},
            {"eta": 4.0, "type": "G2", "element": "Au"}, 
            {"eta": 4.0, "type": "G2", "element": "Pd"}, 
            {"eta": 20.0, "type": "G2", "element": "Au"},
            {"eta": 20.0, "type": "G2", "element": "Pd"}, 
            {"eta": 80.0, "type": "G2", "element": "Au"}, 
            {"eta": 80.0, "type": "G2", "element": "Pd"}, 
            {"zeta": 1.0, "elements": ["Au", "Au"], "type": "G4", "gamma": 1.0, "eta": 0.005},
            {"zeta": 1.0, "elements": ["Au", "Pd"], "type": "G4", "gamma": 1.0, "eta": 0.005}, 
            {"zeta": 1.0, "elements": ["Pd", "Pd"], "type": "G4", "gamma": 1.0, "eta": 0.005}, 
            {"zeta": 1.0, "elements": ["Au", "Au"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 1.0, "elements": ["Au", "Pd"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 1.0, "elements": ["Pd", "Pd"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Au", "Au"], "type": "G4", "gamma": 1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Au", "Pd"], "type": "G4", "gamma": 1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Pd", "Pd"], "type": "G4", "gamma": 1.0, "eta": 0.005},
            {"zeta": 4.0, "elements": ["Au", "Au"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Au", "Pd"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Pd", "Pd"], "type": "G4", "gamma": -1.0, "eta": 0.005}],
     "Pd": [{"eta": 0.05, "type": "G2", "element": "Au"}, 
            {"eta": 0.05, "type": "G2", "element": "Pd"},
            {"eta": 4.0, "type": "G2", "element": "Au"}, 
            {"eta": 4.0, "type": "G2", "element": "Pd"}, 
            {"eta": 20.0, "type": "G2", "element": "Au"}, 
            {"eta": 20.0, "type": "G2", "element": "Pd"}, 
            {"eta": 80.0, "type": "G2", "element": "Au"}, 
            {"eta": 80.0, "type": "G2", "element": "Pd"}, 
            {"zeta": 1.0, "elements": ["Au", "Au"], "type": "G4", "gamma": 1.0, "eta": 0.005},
            {"zeta": 1.0, "elements": ["Au", "Pd"], "type": "G4", "gamma": 1.0, "eta": 0.005},
            {"zeta": 1.0, "elements": ["Pd", "Pd"], "type": "G4", "gamma": 1.0, "eta": 0.005}, 
            {"zeta": 1.0, "elements": ["Au", "Au"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 1.0, "elements": ["Au", "Pd"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 1.0, "elements": ["Pd", "Pd"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Au", "Au"], "type": "G4", "gamma": 1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Au", "Pd"], "type": "G4", "gamma": 1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Pd", "Pd"], "type": "G4", "gamma": 1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Au", "Au"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Au", "Pd"], "type": "G4", "gamma": -1.0, "eta": 0.005}, 
            {"zeta": 4.0, "elements": ["Pd", "Pd"], "type": "G4", "gamma": -1.0, "eta": 0.005}]}

descriptor = Gaussian(cutoff = cutoff, Gs = Gs)

hiddenlayers = {"Au": [3, 3], "Pd": [3, 3]}
activation = 'tanh'

mode = 'atom-centered'

model = NeuralNetwork(hiddenlayers = hiddenlayers, activation = activation)

calc = Amp(descriptor = descriptor, model = model,cores=1,label='amp-parameters')

calc.train('training-set.traj')

The log file shows that it has converged with energy RMSE<0.001.

Howerver, using the following line of codes to calculate RMSE, it shows that the RMSE is about 0.5, which is much higher than 0.001.

from ase.io.trajectory import Trajectory
from amp import Amp
import numpy as np

calc = Amp.load('amp-parameters.amp',cores = 1)
train = Trajectory('training-set.traj')
de = []

for atoms in train:
    e = atoms.get_potential_energy() 
    atoms.set_calculator(calc)
    nne = atoms.get_potential_energy()
    de += [(e-nne)/len(atoms)]

de = np.array(de)

print(np.sqrt((de**2).mean()))   

Am I not loading the calculator correctly or there is a bug in the code? The log.txt file is also attached here. Thanks!

Comments (8)

  1. Muammar El Khatib

    Hi,

    I got it right:

    muammar@nuc ~/tmp/issue207 
      % python rmse.py                                                       !10018
    ('RMSE energy', 0.0010014395636748197)
    

    This is the script I used:

    from ase.io import Trajectory
    from amp import Amp
    from sklearn.metrics import mean_squared_error
    import numpy as np
    
    traj = Trajectory('training-set.traj')
    
    calc = Amp.load('amp-parameters.amp')
    
    dft = [image.get_potential_energy() / len(image) for image in traj]
    amp = [calc.get_potential_energy(image) / len(image) for image in traj]
    
    rmse = np.sqrt(mean_squared_error(dft, amp))
    print('RMSE energy', rmse)
    

    I did it the laziest way which is populating arrays from DFT and Amp, and using metrics from sklearn. I think the problem in your code is that you have to take the square of de in the sum. Check here.

    muammar@nuc ~/tmp/issue207 
      % python report.py                                                     !10029
    0.00100143956367
    
    muammar@nuc ~/tmp/issue207 
      % cat report.py                                                        !10030
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    #
    
    from ase.io.trajectory import Trajectory
    from amp import Amp
    import numpy as np
    
    calc = Amp.load('amp-parameters.amp',cores = 1)
    train = Trajectory('training-set.traj')
    de = []
    
    for atoms in train:
        e = atoms.get_potential_energy()
        atoms.set_calculator(calc)
        nne = atoms.get_potential_energy()
        de += [((e-nne)/len(atoms)) ** 2]
    
    de = np.array(de)
    
    print(np.sqrt((de).mean()))
    

    Best,

  2. Mingjie Liu reporter

    Hi,

    What is the number you get if you do not take the square of de in the sum? I still got the same number after changing the code. Thanks!

    Best,

  3. Muammar El Khatib

    What is the number you get if you do not take the square of de in the sum? I still got the same number after changing the code. Thanks!

    I apologize. I should have checked that, too... Just forget what I mentioned above, I think I overlooked your script. Doing everything from scratch, meaning deleting the fingerprint databases, and using the script you posted above:

      % python report_orig.py                                                !10020
    0.00100143956367
    
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    #
    
    from ase.io.trajectory import Trajectory
    from amp import Amp
    import numpy as np
    
    calc = Amp.load('amp-parameters.amp',cores = 1)
    train = Trajectory('training-set.traj')
    de = []
    
    for atoms in train:
        e = atoms.get_potential_energy()
        atoms.set_calculator(calc)
        nne = atoms.get_potential_energy()
        de += [(e-nne)/len(atoms)]
    
    de = np.array(de)
    
    print(np.sqrt((de**2).mean()))
    

    It gets it right, too!. I don't understand what is making it fail in your installation. Just to be clean -- can you repeat the calculation in a separated directory in case you have not done it? Can you post the amp-log.txt, too?.

  4. andrew_peterson repo owner

    I also tried this with an independent script, below, and am correctly getting ~0.001 eV/atom.

    import numpy as np
    import ase.io
    from amp import Amp
    
    calc = Amp.load('amp-parameters.amp')
    
    images = ase.io.Trajectory('training-set.traj')
    
    sum_sq_error = 0.
    count = 0
    for image in images:
        print(count)
        true = image.get_potential_energy()
        predicted = calc.get_potential_energy(image)
        print(true, predicted, true - predicted)
        sq_error = ((true - predicted) / len(image))**2
        sum_sq_error += sq_error
        count += 1
        rmse = np.sqrt(sum_sq_error / count)
        print(rmse)
    
  5. Mingjie Liu reporter

    Hi,

    I think I get it now. The amp-fingerprints.ampdb and amp-neighborlists.ampdb do not get overwritten after a new training is done. Previously, I trained a NN with different symmetry functions.

    If I do it in a new directory, I am able to get the correct answer. Thank you very much for the help!

    Best, Mingjie

  6. Muammar El Khatib

    @lmj1029 glad you figured it out. What I normally do to avoid this issue is to use the dblabel when fingerprinting my images. That might be helpful for you, too. Best.

  7. Log in to comment