RDF accumulate leaks memory and crashes

Issue #169 resolved
Åsmund Ervik created an issue

I am trying to make a very accurate computation of the RDF from a big-ish system of hard spheres (3D). I am using HOOMD-blue for the HPMC, and it works well, with ~22k spheres and 1e7 sweeps. I'm trying to compute the RDF using data every 1000 steps, with a cutoff (rmax) of 8 and a bin width (dr) of 0.001.

But when I try to compute the RDF using Freud following the tutorials, either online in the simulation through a callback or on a saved trajectory with many frames, Freud uses up all available memory as it iterates through the snapshots, then crashes with "MemoryError: std::bad_alloc" after 74 of the 1e4 snapshots in my trajectory. To my understanding, the memory should be freed once one snapshot is finished.

Currently I'm working around this by running one script with an outer loop that uses subprocess.Popen to spawn another Python script that runs the RDF computation on 10 snapshots at a time, then saves the result to a .npy file and exits. In the outer loop, I then add up the RDFs and average at the end. This works fine, but it's of course an ugly hack.
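For reference, the averaging in the outer loop can be sketched like this. The arrays and snapshot counts here are fabricated placeholders standing in for what each subprocess would write to its .npy file:

```python
import numpy as np

# Hypothetical per-chunk results: each subprocess would np.save() an RDF
# averaged over its 10 snapshots; here we fabricate two chunks in memory.
chunk_rdfs = [np.array([1.0, 2.0, 3.0]), np.array([3.0, 2.0, 1.0])]
chunk_counts = [10, 10]  # snapshots per chunk

# Weighting each chunk's RDF by its snapshot count recovers the
# average over the full trajectory.
total = sum(n * rdf for n, rdf in zip(chunk_counts, chunk_rdfs))
average_rdf = total / sum(chunk_counts)
```

As long as every chunk covers the same number of snapshots, this weighted average equals the plain mean over all snapshots.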

I'm not doing anything special, so you should be able to replicate just by taking the example linked below, switching to 3D, increasing the number of spheres, setting rmax=8.0, dr=0.001, and doing enough callbacks (around 74 on my 64GB machine).

https://github.com/joaander/hoomd-examples/blob/master/Analysis%20-%20Quantitative%20-%20Online%20analysis%20with%20Freud.ipynb

Comments (8)

  1. Matthew Spellings

    Some notes for @vramasub and @bdice :

    The extra memory usage is due to creating the default neighbor list. In particular, the NeighborList object doesn't seem to be garbage collected, because it points back to the CellList that created it through its base attribute. This was intended to prevent garbage-collection problems, but it seems to have introduced its own (possibly due to some unintuitive behavior of how refcounting works inside Cython). Adding something like:

    if nlist is None:
        nlist_.base = None


    to the end of RDF.accumulate seems to fix the problem.

  2. Vyas Ramasubramani

    Thanks for reporting!

    Thanks for the info, Matt. I'll try to reproduce the problem and confirm that this fixes it.

  3. Vyas Ramasubramani

    @asmunder thanks again for finding this. We now have a fix on the master branch. We'll aim to make a bugfix release soon, but if you would like something immediately then feel free to clone the repo and confirm that this fix works.

  4. Matthew Spellings

    To clarify what we learned: the problem wasn't actually the circular references themselves. Rather, not enough was happening at the Python level to trigger garbage collection, which is required to clean up objects with circular references (so adding periodic calls to gc.collect() would fix the observed behavior even without updating freud). The solution we opted for is to explicitly break the circular reference for automatically generated neighbor lists, so they are cleaned up immediately by reference counting.
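    The distinction can be reproduced in plain Python. `Node` here is a hypothetical stand-in for the auto-generated NeighborList and its CellList, not freud code:

    ```python
    import gc
    import weakref

    class Node:
        """Stand-in for one of a pair of objects in a reference cycle,
        like the auto-generated NeighborList whose .base points at the
        CellList that created it."""
        def __init__(self):
            self.base = None

    gc.disable()  # make the demonstration deterministic

    # A cycle survives plain reference counting...
    a, b = Node(), Node()
    a.base, b.base = b, a
    alive = weakref.ref(a)
    del a, b
    assert alive() is not None   # still alive: refcounts never hit zero
    gc.collect()                 # ...until the cycle detector runs
    assert alive() is None

    # Breaking the cycle up front (the fix freud adopted) lets plain
    # refcounting free everything immediately, no collector needed.
    a, b = Node(), Node()
    a.base, b.base = b, a
    alive = weakref.ref(a)
    a.base = None                # analogous to `nlist_.base = None`
    del a, b
    assert alive() is None       # freed at once by refcounting

    gc.enable()
    ```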
