- edited description
Bug/bad docstrings in gaussian.py for calculate_G2
I think there are some bugs in this code. On the Python side there is
Rij = np.linalg.norm(Rj - Ri)
However, Ri is not the position of the centered atom, it is an index of the atom, so it is an integer.
It should be
Rij = np.linalg.norm(Rj - neighborpositions[Ri])
with the way the code is currently documented.
Also the docstring suggests that this:
from amp.descriptor.gaussian import calculate_G2
print(calculate_G2(neighborsymbols=['O', 'O'],
neighborpositions=[[0, 0, 0],
[0, 0, 1.2]],
G_element={'O'},
eta=10,
cutoff=6,
Ri=0,
fortran=False))
should work, but it fails with
Traceback (most recent call last):
File "<stdin>", line 10, in <module>
File "/Users/jkitchin/vc/projects/neural-network/amp/amp/descriptor/gaussian.py", line 586, in calculate_G2
Rc = cutoff['kwargs']['Rc']
TypeError: 'int' object is not subscriptable
it looks like the code expects that cutoff is actually something like Cosine(6.5).todict().
If I set Fortran=True, I get a different set of errors:
Traceback (most recent call last):
File "<stdin>", line 10, in <module>
File "/Users/jkitchin/vc/projects/neural-network/amp/amp/descriptor/gaussian.py", line 561, in calculate_G2
G_number = [atomic_numbers[G_element]]
TypeError: unhashable type: 'set'
That suggests that G_element should be a string that is a chemical symbol, not a dictionary.
also, there is cutofffn = cutoff['name'], which suggests cutoff should be a dictionary, and not a float.
Also, with fortran True, Ri should be a position and not an integer.
Finally, is np.exp(-eta * (Rij 2.) / (Rc 2.)) correct? I have not seen Rij normalized by the cutoff radius in any equations in the literature. Is that done somewhere? Edit: I see this is the form used in eq 6 in A. Khorshidi, A.A. Peterson / Computer Physics Communications 207 (2016) 310–324. Still, it looks like it just modifies the eta parameter compared to the Behler notation.
Ok last edit: It looks like the loop over atoms counts the case where i=j. This means the G2 function is greater than it should be by one (where Rij=0).
Comments (5)
-
reporter -
reporter - edited description
-
reporter - edited description
-
@jkitchin, I am sorry, seems that we haven't done at all a good job in docstrings; in many places they have not been updated alongside with the changes in the code.
@andrewpeterson and @muammar: I tried to fix the issues in documentation pointed out by John. However, I guess we have an immediate need to reconsider the docstrings of all of the codes and update them. Our bad docstrings particularly highlight when the user wants to use just part of the code. Should I open a separate issue for updating docstrings?
Getting back to @jkitchin's questions, "Ri" here is position (a list of three floats) not index. The documentation is now corrected.
The correct way of your piece of code is
from amp.descriptor.gaussian import calculate_G2 from amp.descriptor.cutoffs import Cosine fp = calculate_G2(neighborsymbols=['O'], neighborpositions=[[0, 0, 1.246],], G_element='O', eta=1, cutoff=Cosine(6.5).todict(), Ri=[0, 0, 0], fortran=True) print(fp)
either with fortran=False or fortran=True. The above script calculates G2 for atom at position zero, the neighborpositions is just a single atom (in the above two-atom system) and the self-atom is not considered as neighbor. The above piece of code should give fp as 0.8791347761257509. If we now do the same with the code itself by something like:
# Make a series of images. from ase.structure import molecule from ase.visualize import view atoms = molecule('O2') view(atoms) atoms.set_pbc(False) images = [atoms] # Fingerprint using Amp. from amp.descriptor.gaussian import Gaussian descriptor = Gaussian(Gs = {'O': [{'type': 'G2', 'element': 'O', 'eta': 1.0}]}) # First approach for hashing images. from amp.utilities import hash_images images = hash_images(images) descriptor.calculate_fingerprints(hash_images(images)) descriptor.neighborlist.open() keys = descriptor.neighborlist.d.keys() print "neighborlist =", descriptor.neighborlist.d[keys[0]] descriptor.fingerprints.open() keys = descriptor.fingerprints.d.keys() print "fingerprints =", descriptor.fingerprints.d[keys[0]]
it should give
neighborlist = [(array([1]), array([[0, 0, 0]])), (array([0]), array([[0, 0, 0]]))]
fingerprints = [('O', [0.8791428635947317]), ('O', [0.8791428635947317])]
which is the same number as above, meaning that the code also does not consider self-atom as its neighbor.
Correct, we also did not see any division by Rc^2 in the literature. The default values of \eta here is about (6.5)^2 multiplied by the values of \eta in Behler. However, if one wants to choose a different value of cutoff, the \eta values should also be adjusted in the Behler approach, whereas here the \eta values do not need to be adjusted (please note that here \eta has no dimension, whereas in Behler it has the dimension of 1/(length**2)). Does that make sense to you @jkitchin?
BTW, in the above
calculate_G2
method, we should probably change the cutoff input argument such that it does not need atodict()
call. -
reporter Thanks for the clarifications. They all make sense. It makes sense the neighborpositions should not have the atom itself in it! Thanks for updating the docstrings for this!
- Log in to comment