Memory management

Issue #152 resolved
Jens Glaser created an issue

I believe there is an issue with memory management for numpy arrays that are returned from C++ objects.

Here is a typical example I modeled after CubaticOrderParameter ( the function call results in undefined return values)

    def get_nematic_tensor(self):
        """
        :return: 3x3 matrix corresponding to the average particle orientation
        :rtype: :class:`numpy.ndarray`, shape= :math:`\\left(3, 3 \\right)`, dtype= :class:`numpy.float32`
        """
        nematic_tensor = self.thisptr.getNematicTensor()
        cdef np.npy_intp nbins[2]
        nbins[0] = <np.npy_intp>3
        nbins[1] = <np.npy_intp>3
        print(nematic_tensor.get()[0])
        cdef np.ndarray[np.float32_t,ndim=2] result = np.PyArray_SimpleNewFromData(2, nbins, np.NPY_FLOAT32, nematic_tensor.get())
        return result

the reason is that the numpy array is wrapped around the contents of the std::shared_ptr (nematic_tensor), which destroys its contents after it goes out of scope. So the array will actually be pointing to an invalid memory region.

The proper ways to pass arrays around between C++ and Cython are discussed on Stackoverflow

I haven't investigated how many methods in freud are affected.

Comments (5)

  1. Matthew Spellings

    The current model (that is used by virtually everything in freud, as far as I recall) is that the compute objects themselves own the memory associated with the arrays they will yield and that Compute.getX() will return a numpy array that references this memory stored by the compute. This could be made more robust (see #151), but at the moment if users want to keep the array around after the compute object owning it goes out of scope, they should instead store Compute.getX().copy() to explicitly make their own separate copy of the data. This should definitely be documented somewhere that is very easy to find and I don't see it explicitly mentioned anywhere with a quick scan of the documentation, so the documentation should certainly be improved.

    For your specific example, the local shared_ptr variable increases the reference count to 2, so it doesn't get deleted when that local shared_ptr goes out of scope, right? Am I perhaps misunderstanding your point?

  2. Jens Glaser reporter

    Here is an example (cpp/order/CubaticOrderParameter.cc)

    std::shared_ptr<float> CubaticOrderParameter::getCubaticTensor()
        {
        std::shared_ptr<float> cubatic_tensor = std::shared_ptr<float>(new float[81], std::default_delete<float[]>());
        memcpy(cubatic_tensor.get(), (void*)&m_cubatic_tensor.data, sizeof(float)*81);
        return cubatic_tensor;
        }
    

    The shared_ptr holds an array which is then handed over to python, where it will go out of scope.

    If other classes maintain ownership of their arrays these would of course be OK.

  3. Vyas Ramasubramani

    This specific case has been resolved on master. See Issue #151 for discussion of how to fix this more generally across all freud classes.

  4. Log in to comment