CorrelationFunction behavior at 0
Currently, CorrelationFunction always sets the correlation function value at the first bin to be 0 (really, the value for the default constructor). This is acceptable if you're computing the self-correlation of one set to itself and don't want the first value, but if you're computing cross-correlation between two different sets, it is not necessarily what you want.
Consider the following code, which computes the correlation of one point at the origin (with associated value 1) to a randomly-generated set of points with associated values of the magnitude of their radial distance from the origin. The correlation function should then be identity.
import numpy as np
import numpy.testing as npt
from freud import trajectory, density
ref = np.array([[0, 0, 0]], dtype=np.float32)
refVals = np.array([1], dtype=np.float32)
rmax = 10.0
dr = 1.0
num_points = 10000
box_size = rmax*3.1
points = np.random.random_sample((num_points,3)).astype(np.float32)*box_size - box_size/2
pointRs = np.sqrt(np.sum(points**2, axis=-1))
cf = density.FloatCF(trajectory.Box(box_size), rmax, dr)
cf.compute(ref, refVals, points, pointRs)
import matplotlib, matplotlib.pyplot as pp
pp.plot(cf.getR(), cf.getRDF())
pp.show(block=True)
Currently, however, master
sets the value at the first bin to be 0 unconditionally.
Comments (28)
-
reporter -
reporter The referenced commit computes the correlation function for all bins instead of skipping the first one. This would mean that, when computing the correlation between a set and itself (i.e. cf.compute(points, vals, points, vals)), we get <vals[i]**2> in the first bin.
-
reporter - edited description
clarified naming in example script
-
reporter A few resolutions that spring to mind:
- Leave things as they are on master
- Compare pointers or some other sort of test to determine whether the arrays are the same and self/self correlations should be ignored
- Take a function argument to determine whether self/self correlations should be ignored
- Always compute self/self correlations and let people set bin 0 to be 0 if they want it
-
Would simply seeing if delta < 1e-6 or something be sufficient? That would mean that it's two diff points.
I would also be ok with the last one as that seems like reasonable behavior.
-
Pushed the rsq < 1e-6 check. Let me know if you like it.
-
I am also in favor of having the user reset the array[0] condition as needed.
-
reporter So far, the predominant use case is for the point arrays to be the same; I think just comparing the pointers at the start might be the most reasonable course of action.
-
You mean:
if (&points == &ref_points) // bin[0] = 0 else // bin[0] is calc'd
-
reporter Well, more properly, encasing the summation while looping over particles within something like
if(i != j || points != ref_points)
. -
I can go about implementing this suggestion.
-
We can document that when ref_points != points, ref_points should not contain any points that are in points. I'm not sure how often users would even want to do that anyways. But best to have it noted how that is handled.
-
- changed status to resolved
Implementing changes to fix issue
#3→ <<cset b9031219c14d>>
-
reporter Stylistically, I much prefer
if(!x && y) foo;
to
if(x) continue; if(y) foo;
but maybe that's just me...
-
It's more
if (x) continue foo
I guess I don't really care. If you want to change it go for it.
-
-
Implementing changes to fix issue
#3→ <<cset b9031219c14d>>
-
-
-
Implementing changes to fix issue
#3→ <<cset 19b988682159>>
-
-
Implementing changes to fix issue
#3→ <<cset bd317f29fd60>>
-
-
Implementing changes to fix issue
#3→ <<cset bd317f29fd60>>
-
-
Implementing changes to fix issue
#3→ <<cset 7c1af58ef57f>>
-
-
Implementing changes to fix issue
#3→ <<cset 19b988682159>>
- Log in to comment
Compute correlation function on all bins
refs
#3→ <<cset db1bf6ffcc68>>