ResumeDataDirectReader_getvirtual_ptr assertion error when profiling

Issue #2454 on hold
Pierre Tardy created an issue

Hello, I am running buildbot_profiler within a pypy docker image. The docker image I use is built here:

It is just a standard library/pypy image with buildbot installed in it, as well as buildbot_profiler:

When I start the profiler everything is fine until I add a bunch of load. Then pypy crashes with the following traceback.

RPython traceback:
  File "rpython_jit_metainterp_1.c", line 17685, in handle_jitexception_39
  File "rpython_jit_metainterp_4.c", line 1010, in execute_assembler__star_2_24
  File "rpython_jit_metainterp.c", line 2418, in ResumeGuardForcedDescr_handle_fail
  File "rpython_jit_metainterp.c", line 4616, in resume_in_blackhole
  File "rpython_jit_metainterp.c", line 11306, in blackhole_from_resumedata
  File "rpython_jit_metainterp.c", line 38639, in ResumeDataDirectReader_consume_one_section
  File "rpython_jit_codewriter.c", line 351, in enumerate_vars__unique_id
  File "rpython_jit_metainterp_2.c", line 12302, in ResumeDataDirectReader__callback_r
  File "rpython_jit_metainterp.c", line 12778, in ResumeDataDirectReader_getvirtual_ptr
Fatal RPython error: AssertionError
Aborted (core dumped)

The profiler is using the following code to collect stack traces with a classical statistical profiling approach (SIG_TIMER based):

I haven't investigated yet; I am posting the bug first. After some feedback, I will see how I can make an easier-to-reproduce setup.

Comments (9)

  1. Pierre Tardy reporter

    Sorry, I was confused: it's actually pypy:2-5.6, which means it's python2 and pypy 5.6. Any idea what this assertion is about?

  2. Pierre Tardy reporter

    I was able to reproduce this on osx with the official 5.6 binaries (the brew binaries won't work because of the gettime issue).

    It requires a high level of concurrency to trigger. I need to connect at least 15 workers to the buildbot master to trigger the problem. Then it is very quick to crash, as soon as I enable the profiling on the master.

    Also, when I enable fewer workers and the profiling works, I see a lot of bad stack traces:

    only one frame, with co_filename=="?" and co_name=="?". Maybe it's because the frame is inside the jit; I am not familiar with pypy.

    The method used is to call sys._current_frames() to get the stack traces of all the threads.
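
A minimal sketch of the sampling approach described in this comment, not the actual buildbot_profiler code. It assumes the timer is SIGALRM driven through signal.setitimer (the report only says "SIG_TIMER based"), and the names sample_stacks, is_placeholder, and StatisticalSampler are hypothetical. It also counts the single-frame "?" traces separately instead of recording them.

```python
import collections
import signal
import sys


def sample_stacks():
    """Return {thread_id: [(filename, function, lineno), ...]} for all threads.

    Each stack is listed innermost frame first.
    """
    stacks = {}
    for thread_id, frame in sys._current_frames().items():
        stack = []
        while frame is not None:
            code = frame.f_code
            stack.append((code.co_filename, code.co_name, frame.f_lineno))
            frame = frame.f_back
        stacks[thread_id] = stack
    return stacks


def is_placeholder(stack):
    """True for the 'bad' traces: a single frame whose file and name are '?'."""
    return len(stack) == 1 and stack[0][0] == "?" and stack[0][1] == "?"


class StatisticalSampler(object):
    """Hypothetical helper: counts the stacks seen on each timer tick."""

    def __init__(self, interval=0.01):
        self.interval = interval  # seconds between samples
        self.counts = collections.Counter()
        self.skipped = 0  # number of placeholder traces that were ignored

    def _on_tick(self, signum, frame):
        for stack in sample_stacks().values():
            if is_placeholder(stack):
                self.skipped += 1
            else:
                self.counts[tuple(stack)] += 1

    def start(self):
        # Assumption: SIGALRM with a real-time interval timer (Unix only,
        # and signal handlers can only be installed from the main thread).
        signal.signal(signal.SIGALRM, self._on_tick)
        signal.setitimer(signal.ITIMER_REAL, self.interval, self.interval)

    def stop(self):
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, signal.SIG_DFL)
```

Calling start() on the main thread begins sampling and stop() ends it; counts.most_common() then lists the most frequently observed stacks. Comparing skipped against the total gives a rough idea of how many samples were lost to placeholder frames.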

  3. Armin Rigo

    Trying to reproduce. I installed docker and ran docker pull buildbot/buildbot-master-pypy, but I can't start it with docker run buildbot/buildbot-master-pypy:

    No master.cfg found nor $BUILDBOT_CONFIG_URL !
    Please provide a master.cfg file in /usr/src/app or provide a $BUILDBOT_CONFIG_URL variable via -e

    About sys._current_frames(): try help(sys._current_frames) in PyPy. It explains the limitations of this particular function in PyPy. You should look at vmprof if you are trying to profile Python code in PyPy.
