Adding a method to calculate the RMSE of a validation set

Issue #212 resolved
Efrem Braun created an issue

After training an NNP on a set of training images, it's necessary to find which NNP iteration minimizes a validation set's RMSE, since that iteration should be chosen as the final one. I currently do this manually with the following code:

import json
import math
import os

import numpy as np
from amp import Amp
from amp.analysis import plot_parity_and_error  # import path assumed

# cores, dblabel, and validation_set_filename_output are defined elsewhere.
nn_iteration_calculation_frequency = 10
energy_data_iter_num = []
energy_data_RMSE = []
RMSE_validation = []
num_checkpoints = len(os.listdir('amp-checkpoints'))
for nn_iter in range(0, num_checkpoints, nn_iteration_calculation_frequency):
    energy_data_iter_num.append(nn_iter)
    # Load the checkpoint saved at this training iteration.
    calc = Amp.load('amp-checkpoints/'+str(nn_iter)+'.amp', cores=cores, dblabel=dblabel)
    energy_data = plot_parity_and_error(calc, validation_set_filename_output, dblabel=dblabel, plot_forces=False, label_parity='parity-validation-'+str(nn_iter), label_error='error-validation-'+str(nn_iter), returndata=True)
    energy_data_RMSE.append(energy_data)
    # Rewrite the output files on every pass so partial results survive an interruption.
    with open('energy_data_RMSE.json', 'w') as fout:
        json.dump(energy_data_RMSE, fout)
    # Accumulate the squared energy error (index 3 of each entry) over all validation images.
    err_sq = 0
    for j in energy_data:
        err_sq += (energy_data[j][3])**2
    RMSE_validation.append(math.sqrt(err_sq / len(energy_data)))
    np.savetxt('RMSE-vs-iteration-validation.txt', np.column_stack((energy_data_iter_num, RMSE_validation)), header='Iteration, RMSE')

For one particular implementation, this code takes about 2 minutes to calculate the RMSE of the validation set at a single checkpoint, so calculating the RMSE for all checkpoints (say, around 1000) takes quite a long time. I'm sure that this can be sped up, since the neural net training itself needs far less time to calculate the RMSE of the training set. For the same implementation, Amp both came up with new neural net parameters and calculated the RMSE of the training data in about 5 seconds per step during training.

I should be able to figure this out by looking into the difference between what the code does while training the NNP and what it does while running the plot_parity_and_error() method. Having a separate method that does this kind of thing on its own might be beneficial for a lot of users. It's on my "to do" list.
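For reference, the RMSE bookkeeping in the loop above can be separated from the Amp calls. The following is only a minimal sketch: `rmse` and `best_checkpoint` are hypothetical helper names, not part of Amp, and the toy `energy_data` assumes the same layout the loop indexes (the energy error as the fourth element of each entry).

```python
import math


def rmse(errors):
    """Root-mean-square of an iterable of per-image errors."""
    errors = list(errors)
    if not errors:
        raise ValueError("need at least one error value")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def best_checkpoint(rmse_by_iteration):
    """Return the (iteration, rmse) pair with the lowest validation RMSE.

    rmse_by_iteration maps a checkpoint iteration number to its validation RMSE.
    """
    return min(rmse_by_iteration.items(), key=lambda item: item[1])


# Made-up illustrative data, laid out like the dict the loop above indexes:
# image key -> [..., ..., ..., energy error]. Only index 3 is used here.
energy_data = {
    "image-a": [0.0, 0.0, 1, 0.3],
    "image-b": [0.0, 0.0, 1, -0.4],
}
validation_rmse = rmse(entry[3] for entry in energy_data.values())
```

With one RMSE stored per checkpoint, `best_checkpoint({0: 0.5, 10: 0.2, 20: 0.3})` would pick iteration 10 as the final choice.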

Comments (6)

  1. Efrem Braun reporter

    @muammar just showed me that the reason it's slower is that plot_parity_and_error calls calc.model.calculate_energy() in a non-parallelized manner over all images, whereas during training this is parallelized.
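The general idea of spreading the per-image evaluations over workers can be sketched with the standard library. This is not how Amp parallelizes training internally. `parallel_rmse` and `error_of` are hypothetical names, and the toy run below uses plain numbers in place of images.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def parallel_rmse(error_of, images, executor):
    """RMSE over images, with the per-image errors evaluated via executor.map.

    error_of is a hypothetical callable mapping one image to its energy error
    (predicted minus reference). executor is any concurrent.futures.Executor.
    """
    errors = list(executor.map(error_of, images))
    if not errors:
        raise ValueError("need at least one image")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Toy stand-in: the "images" are numbers and the "error" is the number itself.
with ThreadPoolExecutor(max_workers=2) as pool:
    toy_rmse = parallel_rmse(float, [3, 4], pool)
```

Threads are used here only to keep the demonstration simple. For CPU-bound energy evaluations a `ProcessPoolExecutor` would be the more likely choice, which requires `error_of` and the images to be picklable.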

  2. Efrem Braun reporter

    @andrewpeterson said he'd prefer for there to be a method that calculates the RMSE of a validation set AFTER training rather than concurrently with it. So I'll get started on that.

  3. andrew_peterson repo owner

    I think something like the following will work, but I haven't tried it yet.

    calc = Amp.load('path/to/trained/calc.amp')
    calc.train('path/to/validation/images.traj')
    

    Then the first step (#0) in the output should give the convergence data (4 quantities) for the validation set with the parameters from the loaded file. If you set the convergence criteria very loose, it should exit after step #0.

    If that works, then we should either provide clear documentation or a simple function that replicates / calls this behavior.
