segfault with rdf.accumulate()

Issue #190 closed
Jens Glaser created an issue

When using the RDF module with a PyPI-installed freud, I get a segfault:

        import gsd.pygsd
        import gsd.hoomd
        import freud

        gsdfile = gsd.pygsd.GSDFile(open('out.gsd', 'rb'))
        traj = gsd.hoomd.HOOMDTrajectory(gsdfile)
        box = freud.box.Box(*traj[0].configuration.box)
        rdf = freud.density.RDF(rmax=10, dr=0.1)
        for frame in traj[::100]:
            rdf.accumulate(box, frame.particles.position[:])
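For reference, the computation that crashes here can be sketched in pure NumPy without a gsd file: `RDF.accumulate` essentially histograms all pairwise distances (under the minimum-image convention) below a cutoff. This is an illustrative stand-in, not freud's actual implementation; `rdf_histogram` and `box_length` are hypothetical names, and a cubic box is assumed:

```python
import numpy as np

def rdf_histogram(points, box_length, r_max, dr):
    """Histogram all pairwise distances below r_max into bins of width dr,
    using the minimum-image convention in a cubic periodic box."""
    points = np.asarray(points, dtype=float)
    edges = np.arange(0.0, r_max + dr, dr)
    counts = np.zeros(len(edges) - 1)
    for i in range(len(points)):
        d = points - points[i]
        d -= box_length * np.round(d / box_length)  # minimum image
        r = np.linalg.norm(d, axis=1)
        r = r[(r > 0.0) & (r < r_max)]  # drop self-distance and far pairs
        hist, _ = np.histogram(r, bins=edges)
        counts += hist
    return edges, counts
```

Running this on the same positions that crash `accumulate` can help rule out bad input data (NaNs, positions outside the box) as the cause.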

gdb output

(gdb) run analyze.py exec plot_rdf
Starting program: /opt/packages/python/Python-3.5.2-icc-mkl/bin/python3 analyze.py exec plot_rdf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Detaching after fork from child process 29715.
Detaching after fork from child process 29716.
Detaching after fork from child process 29718.
warning: File "/opt/packages/gcc/6.3.0/lib64/libstdc++.so.6.0.22-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load:/usr/bin/mono-gdb.py".
Detaching after fork from child process 29794.
Computing RDF for 6f4fcdb5dfb1bc43461631efe6508846

analyze.py:166: FreudDeprecationWarning: The computeCellList function is deprecated in favor of the compute method and will be removed in a future version of freud.
  rdf.accumulate(box, frame.particles.position)
[New Thread 0x7fffce763700 (LWP 29795)]
[New Thread 0x7fffcdf61700 (LWP 29797)]
[New Thread 0x7fffce362700 (LWP 29796)]
[New Thread 0x7fffcd75f700 (LWP 29798)]
[New Thread 0x7fffcd35e700 (LWP 29800)]
[New Thread 0x7fffcdb60700 (LWP 29799)]
[New Thread 0x7fffccf5d700 (LWP 29801)]
[New Thread 0x7fffcc75b700 (LWP 29802)]
[New Thread 0x7fffbfbfe700 (LWP 29805)]
[New Thread 0x7fffbffff700 (LWP 29804)]
[New Thread 0x7fffccb5c700 (LWP 29803)]
[New Thread 0x7fffbf7fd700 (LWP 29806)]
[New Thread 0x7fffbeffb700 (LWP 29808)]
[New Thread 0x7fffbf3fc700 (LWP 29807)]
[New Thread 0x7fffbe7f9700 (LWP 29809)]
[New Thread 0x7fffbebfa700 (LWP 29810)]
[New Thread 0x7fffbdff7700 (LWP 29814)]
[New Thread 0x7fffbe3f8700 (LWP 29812)]
[New Thread 0x7fffbdbf6700 (LWP 29817)]
[New Thread 0x7fffbd7f5700 (LWP 29815)]
[New Thread 0x7fffbd3f4700 (LWP 29818)]
[New Thread 0x7fffbcff3700 (LWP 29816)]
[New Thread 0x7fffbcbf2700 (LWP 29819)]
[New Thread 0x7fff97fff700 (LWP 29820)]
[New Thread 0x7fffbc7f1700 (LWP 29821)]
[New Thread 0x7fff97bfe700 (LWP 29822)]
[New Thread 0x7fff977fd700 (LWP 29823)]

Program received signal SIGSEGV, Segmentation fault.
operator() (r=<optimized out>, __closure=0x7fffce947960) at cpp/density/RDF.cc:250
250 cpp/density/RDF.cc: No such file or directory.
(gdb) bt
#0  operator() (r=<optimized out>, __closure=0x7fffce947960) at cpp/density/RDF.cc:250
#1  run_body (r=..., this=0x7fffce947940) at /opt/intel/compilers_and_libraries_2017.4.196/linux/tbb/include/tbb/parallel_for.h:102
#2  work_balance<tbb::interface9::internal::start_for<tbb::blocked_range<long unsigned int>, freud::density::RDF::accumulate(freud::box::Box&, const freud::locality::NeighborList*, const vec3<float>*, unsigned int, const vec3<float>*, unsigned int)::<lambda(const tbb::blocked_range<long unsigned int>&)>, const tbb::auto_partitioner>, tbb::blocked_range<long unsigned int> > (range=..., start=..., this=0x7fffce947980)
    at /opt/intel/compilers_and_libraries_2017.4.196/linux/tbb/include/tbb/partitioner.h:444
#3  execute<tbb::interface9::internal::start_for<tbb::blocked_range<long unsigned int>, freud::density::RDF::accumulate(freud::box::Box&, const freud::locality::NeighborList*, const vec3<float>*, unsigned int, const vec3<float>*, unsigned int)::<lambda(const tbb::blocked_range<long unsigned int>&)>, const tbb::auto_partitioner>, tbb::blocked_range<long unsigned int> > (range=..., start=..., this=0x7fffce947980)
    at /opt/intel/compilers_and_libraries_2017.4.196/linux/tbb/include/tbb/partitioner.h:255
#4  tbb::interface9::internal::start_for<tbb::blocked_range<long unsigned int>, freud::density::RDF::accumulate(freud::box::Box&, const freud::locality::NeighborList*, const vec3<float>*, unsigned int, const vec3<float>*, unsigned int)::<lambda(const tbb::blocked_range<long unsigned int>&)>, const tbb::auto_partitioner>::execute(void) (this=0x7fffce947940) at /opt/intel/compilers_and_libraries_2017.4.196/linux/tbb/include/tbb/parallel_for.h:127
#5  0x00007fffd416c6c9 in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all (this=0x3, parent=..., 
    child=0x2faa02fd0) at ../../src/tbb/custom_scheduler.h:501
#6  0x00007fffd416a1ba in tbb::internal::generic_scheduler::local_spawn_root_and_wait (this=0x3, first=0x7fffce94ff00, 
    next=@0x2faa02fd0: <error reading variable>) at ../../src/tbb/scheduler.cpp:677
#7  0x00007fffd3536510 in spawn_root_and_wait (root=...) at /opt/intel/compilers_and_libraries_2017.4.196/linux/tbb/include/tbb/task.h:749
#8  run (partitioner=..., body=..., range=...) at /opt/intel/compilers_and_libraries_2017.4.196/linux/tbb/include/tbb/parallel_for.h:90
#9  parallel_for<tbb::blocked_range<long unsigned int>, freud::density::RDF::accumulate(freud::box::Box&, const freud::locality::NeighborList*, const vec3<float>*, unsigned int, const vec3<float>*, unsigned int)::<lambda(const tbb::blocked_range<long unsigned int>&)> > (body=..., range=...)
    at /opt/intel/compilers_and_libraries_2017.4.196/linux/tbb/include/tbb/parallel_for.h:186
#10 freud::density::RDF::accumulate (this=0x9dd610, box=..., nlist=<optimized out>, ref_points=0xa02fd0, ref_points@entry=0x7ffff0673320, 
    Nref=Nref@entry=716, points=points@entry=0xa02fd0, Np=716) at cpp/density/RDF.cc:276
#11 0x00007fffd354139c in __pyx_pf_5freud_7density_3RDF_6accumulate (__pyx_v_nlist=<optimized out>, __pyx_v_points=0x604520, 
    __pyx_v_ref_points=0x7fffcec1b120, __pyx_v_box=<optimized out>, __pyx_v_self=0x7fffcf31f8e8) at freud/density.cpp:13275
#12 __pyx_pw_5freud_7density_3RDF_7accumulate (__pyx_v_self=0x7fffcf31f8e8, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>)
    at freud/density.cpp:12779
#13 0x00007fffd43b80d4 in __Pyx_CyFunction_CallMethod (kw=0x0, arg=0x7fffcf716108, self=0x7ffff0673320, func=0x7fffd377f608) at freud/parallel.cpp:3457
#14 __Pyx_CyFunction_CallAsMethod (func=0x7fffd377f608, args=<optimized out>, kw=0x0) at freud/parallel.cpp:3520
#15 0x00007ffff7709e6a in PyObject_Call (func=func@entry=0x7fffd377f608, arg=arg@entry=0x7fffcf70fab0, kw=kw@entry=0x0) at Objects/abstract.c:2165
#16 0x00007ffff77f25f6 in do_call (nk=<optimized out>, na=<optimized out>, pp_stack=0x7fffffffb8e0, func=<optimized out>) at Python/ceval.c:4936
#17 call_function (oparg=<optimized out>, pp_stack=0x7fffffffb8e0) at Python/ceval.c:4732
#18 PyEval_EvalFrameEx (f=f@entry=0x9e33b8, throwflag=throwflag@entry=0) at Python/ceval.c:3236
#19 0x00007ffff77f6cce in fast_function (nk=<optimized out>, na=<optimized out>, n=1, pp_stack=0x7fffffffba60, func=<optimized out>) at Python/ceval.c:4803
#20 call_function (oparg=<optimized out>, pp_stack=0x7fffffffba60) at Python/ceval.c:4730
#21 PyEval_EvalFrameEx (f=f@entry=0x9e02b8, throwflag=throwflag@entry=0) at Python/ceval.c:3236
#22 0x00007ffff77f90f6 in _PyEval_EvalCodeWithName (_co=0x7fffeac5dc90, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, 
    argcount=2, kws=0x9c2228, kwcount=0, defs=0x0, defcount=0, kwdefs=kwdefs@entry=0x0, closure=0x0, name=name@entry=0x7fffeba6a2f0, 
    qualname=0x7fffeac59a50) at Python/ceval.c:4018
#23 0x00007ffff77f61a3 in fast_function (nk=<optimized out>, na=<optimized out>, n=2, pp_stack=0x7fffffffbc90, func=<optimized out>) at Python/ceval.c:4813
#24 call_function (oparg=<optimized out>, pp_stack=0x7fffffffbc90) at Python/ceval.c:4730
#25 PyEval_EvalFrameEx (f=f@entry=0x9c2008, throwflag=throwflag@entry=0) at Python/ceval.c:3236
#26 0x00007ffff77f90f6 in _PyEval_EvalCodeWithName (_co=0x7fffeac5ded0, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, 
    argcount=1, kws=0x7ffff7f7d9a8, kwcount=0, defs=0x7fffeac51220, defcount=2, kwdefs=kwdefs@entry=0x0, closure=0x0, name=name@entry=0x7ffff7ef47d8, 
---Type <return> to continue, or q <return> to quit--- q
qualnaQuit
(gdb) quit

Comments (4)

  1. Jens Glaser (reporter)

    I did some extended analysis and couldn't spot obvious problems with the offending code. I did notice that memory management is inconsistent (e.g. shared_ptrs are used as class member variables inside NeighborList.cc instead of unique_ptrs); fixing those may have to wait until the lifetime of C++ objects in freud is double-checked.

    However, I was using GCC 6.3.0. Switching to GCC 8.2.0 solved the problem for me. This raises the question of whether some part of the code, e.g. LinkedCell.cc inside a parallel_for region, triggered a compiler bug in GCC 6. I can supply more details upon request.

  2. Bradley Dice

    Thanks for the bug report, @jens_glaser. I see you're using TBB 2017.4.196. The PyPI wheel of freud v0.10.0 was built with TBB 2018.4, I think. Could that be part of the problem? Did you use the same TBB version with GCC 6.3 and 8.2?
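    One way to check which TBB library a compiled extension actually resolves at runtime (on Linux) is to run `ldd` on the shared object. A diagnostic sketch; `linked_tbb` is a hypothetical helper, and the freud module path depends on your install:

    ```python
    import subprocess

    def linked_tbb(ext_path):
        """Return the ldd output lines mentioning TBB for a shared object (Linux)."""
        out = subprocess.run(["ldd", ext_path], capture_output=True, text=True).stdout
        return [line.strip() for line in out.splitlines() if "tbb" in line.lower()]

    # Example (path is an assumption; use freud.density.__file__ on your system):
    # print(linked_tbb("/path/to/site-packages/freud/density.cpython-35m-x86_64-linux-gnu.so"))
    ```

    If the resolved libtbb differs from the one the wheel was built against, an ABI mismatch could plausibly explain a crash like this.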
