Neural Net Returning a NaN

Issue #226 open
Michael Waters created an issue

I loosely trained a neural net for Zr/O on some random atomic structures. When I try to apply that NN to some structures derived from polymorphs of Zr, I often get NaNs for energy.

I’ve attached two similar structures. The NN works fine for the first one but gives me NaN for the other.

Comments (14)

  1. Alireza Khorshidi

    Mike, could you put a couple print statements in the source code (say here) to see what exactly gets NaN in the code? Is it fingerprints (the input of the model) or the output of the model?

  2. Michael Waters reporter

    Hi Alireza,

    The NaNs occur in some of the G4 fingerprint calculations for the structure2 file I attached. The NN returns NaN if any fingerprint is NaN. I am continuing to investigate.

  3. Michael Waters reporter

    Hi Alireza,

    I found the issue!

    On lines 113 and 114 of gaussian.f90:

                    costheta = &
                    dot_product(Rij_vector, Rik_vector) / Rij / Rik
                    term = (1.0d0 + g_gamma * costheta)**g_zeta
    

    costheta can be -1.0000000000000002 due to a rounding error!

    If g_gamma = 1.0, the base 1.0d0 + g_gamma * costheta is then slightly negative, and the exponentiation for term can return NaN.

    Here’s my suggestion for a fix:

                    costheta = &
                    dot_product(Rij_vector, Rik_vector) / Rij / Rik
                    term = ABS(1.0d0 + g_gamma * costheta)**g_zeta
    
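The arithmetic behind this failure can be reproduced outside Fortran. The following is a minimal Python sketch, not code from Amp: the helper names `angular_base` and `try_power` are hypothetical, and the exponent 1.5 is an assumed illustrative value for g_zeta. Python's `math.pow` raises `ValueError` for a negative base with a non-integer exponent, so the sketch converts that error to NaN to mirror the behaviour reported above.

```python
import math


def angular_base(costheta, g_gamma=1.0):
    """Base of the angular term, 1 + g_gamma * costheta (hypothetical helper)."""
    return 1.0 + g_gamma * costheta


def try_power(base, g_zeta):
    """Return base**g_zeta as a real number, or NaN where it is undefined."""
    try:
        return math.pow(base, g_zeta)
    except ValueError:
        # Negative base with a non-integer exponent has no real result.
        return float("nan")


# Value reported in the issue: one rounding step below -1.
costheta = -1.0000000000000002
base = angular_base(costheta)          # slightly negative instead of exactly 0

raw = try_power(base, 1.5)             # undefined: negative base, assumed g_zeta = 1.5
patched = try_power(abs(base), 1.5)    # the ABS() workaround: tiny positive number
```

With the ABS() workaround the term comes out as a very small positive number rather than exactly zero, which is harmless at this magnitude.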
  4. andrew_peterson repo owner
    • changed status to open

    @Michael Waters wrote to the amp-users list:

    Hi Andrew,

    Most of my NaNs are gone. The remaining NaNs are fixed by ensuring that costheta stays within the range [-1, 1], like this:

                    if (costheta < -1.0d0) then
                        costheta = -1.0d0
                    end if
                    if (costheta >  1.0d0) then
                        costheta =  1.0d0
                    end if
    

    Maybe Fortran model version 13 was unlucky?

    Best, -Mike
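The clamp above translates directly to other languages. Here is a minimal Python sketch, assuming illustrative values g_gamma = ±1.0 and g_zeta = 1.5 (neither is taken from the reporter's model). The helper names `clamp_cosine` and `angular_term` are hypothetical, and this is not Amp's code.

```python
def clamp_cosine(costheta):
    """Force a cosine back into [-1, 1] after floating-point rounding."""
    return max(-1.0, min(1.0, costheta))


def angular_term(costheta, g_gamma=1.0, g_zeta=1.5):
    """(1 + g_gamma * cos(theta))**g_zeta with the cosine clamped first.

    The default g_gamma and g_zeta are illustrative assumptions.
    """
    base = 1.0 + g_gamma * clamp_cosine(costheta)
    return base ** g_zeta


# One rounding step outside the valid range on either side.
low = angular_term(-1.0000000000000002, g_gamma=1.0)
high = angular_term(1.0000000000000002, g_gamma=-1.0)
```

With the clamp in place the base is non-negative whenever |g_gamma| <= 1, so the power always has a real value. Unlike the ABS() variant, the boundary cases evaluate to exactly zero.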

  5. andrew_peterson repo owner

    Can you send a system that shows the problem? You are the only person who has reported this, so we need a system that reproduces the error. I think the other system must have only encountered the “-1” region. Also, it would be much appreciated if you could make the system minimal. We do our debugging in the pure-python version of Amp, and your previous structure was quite large, which made that process a bit cumbersome. If you could search through your problematic structure to find which combination of atoms is causing the problem, you could extract just those atoms and make a new, smaller system that hopefully still has the problem. Ok?

    P.S. I’m struggling to figure out why you would ever encounter costheta = 1.0. Doesn’t this mean that theta is 0? How would this happen other than having two atoms in the same place?

  6. Michael Waters reporter

    I think I know: the atoms are spaced like this: center ----> atom1 ----> atom2.

    How should I send you my files?

  7. andrew_peterson repo owner

    Oh right, they are in a line on the same side of the atom; if the cutoff radius is big enough this can occur. It's early in the morning in Copenhagen at the moment and my caffeine hadn't kicked in!

    You can just upload a trajectory to this issue page, like you did when you reported the issue originally.
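The collinear arrangement discussed above (center, atom1 and atom2 in a line on the same side) can be checked with a few lines of Python. This is a sketch with made-up coordinates, not data from the attached structures, and `cos_angle` is a hypothetical helper.

```python
import math


def cos_angle(center, atom_j, atom_k):
    """Cosine of the angle j-center-k for three Cartesian positions."""
    rij = [b - a for a, b in zip(center, atom_j)]
    rik = [b - a for a, b in zip(center, atom_k)]
    dot = sum(a * b for a, b in zip(rij, rik))
    norm_ij = math.sqrt(sum(a * a for a in rij))
    norm_ik = math.sqrt(sum(a * a for a in rik))
    return dot / norm_ij / norm_ik


center = (0.0, 0.0, 0.0)

# Both neighbors on the same side of the center: theta = 0.
same_side = cos_angle(center, (1.0, 0.0, 0.0), (2.0, 0.0, 0.0))

# Neighbors on opposite sides of the center: theta = 180 degrees.
opposite = cos_angle(center, (1.0, 0.0, 0.0), (-2.0, 0.0, 0.0))
```

For these simple coordinates the results are +1 and -1. With general coordinates the same division can land one rounding step outside [-1, 1], which is the case the clamp guards against.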

  8. Michael Waters reporter

    Oh, some info might help. This is an energy-volume scan for BCC Zr. The first 3 images should give NaNs.

  9. andrew_peterson repo owner

    I can’t open the trajectory file you attached. My error is below. I’m using the latest version of ASE. Does it open correctly on your end? Perhaps you can re-upload it and maybe save it as an extxyz file as backup, since that’s plain text.

    $ ase -T gui beta-Zr-NaN-test.traj 
    Traceback (most recent call last):
      File "/home/aap/Dropbox/repositories/ase/bin/ase", line 3, in <module>
        main()
      File "/home/aap/Dropbox/repositories/ase/ase/cli/main.py", line 99, in main
        f(args)
      File "/home/aap/Dropbox/repositories/ase/ase/gui/ag.py", line 68, in run
        images.read(args.filenames, args.image_number)
      File "/home/aap/Dropbox/repositories/ase/ase/gui/images.py", line 182, in read
        self.initialize(images, names)
      File "/home/aap/Dropbox/repositories/ase/ase/gui/images.py", line 125, in initialize
        self.maxnatoms = max(len(atoms) for atoms in self)
    ValueError: max() arg is an empty sequence
    
