Initializing a global PyObject * crashes while loading C++ extension

Issue #2798 new
Hrvoje Nikšić created an issue

The attached module crashes when compiled and imported under PyPy 5.10.0.

The crash is apparently triggered by global variable initialization:

// global variable - crash goes away without it
PyObject *global = PyInt_FromLong(42);

This kind of global initialization is frequently used in our in-house CPython C++ extensions[1]. It is safe for dynamically loaded extensions because by the time the shared object gets dlopen-ed, Python has been initialized and the GIL acquired. In PyPy 5.10.0 it consistently crashes before executing the module initialization function.

Is this a bug in cpyext, or is the above pattern forbidden? Is there a workaround to make it work?

[1] The example is, of course, simplified - in actual code, the object would be a RAII wrapper object whose constructor invokes PyInt_FromLong. It is used in many different extensions and cannot be easily replaced with, say, a function that lazily constructs and returns an object.
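
To illustrate the ordering involved, here is a minimal self-contained sketch. It is a simulation only and does not use the Python C API: fake_int_from_long, IntHolder, fake_module_init, runtime_ready and event_log are hypothetical names standing in for PyInt_FromLong, our RAII wrapper, the module initialization function and whatever state the interpreter needs to have set up. It assumes nothing about how cpyext is actually implemented; it only shows that a namespace-scope object is constructed before any initialization function can run.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins: none of these names come from CPython or PyPy.
// They only record the order in which things run.
std::vector<std::string>& event_log() {
    // Function-local static, so the log exists before any global constructor uses it.
    static std::vector<std::string> log;
    return log;
}

// Constant-initialized, so it is already false when global constructors run.
bool runtime_ready = false;

// Stand-in for PyInt_FromLong(): notes whether the runtime was ready when called.
long fake_int_from_long(long value) {
    event_log().push_back(runtime_ready ? "create: runtime ready"
                                        : "create: runtime NOT ready");
    return value;
}

// RAII wrapper in the spirit of the footnote: the constructor creates the object.
class IntHolder {
public:
    explicit IntHolder(long value) : value_(fake_int_from_long(value)) {}
    long get() const { return value_; }

private:
    long value_;
};

// Namespace-scope object: its constructor runs during loading, before any
// function such as fake_module_init() can be called.
IntHolder global_holder(42);

// Stand-in for the module initialization function, which runs only after
// loading has finished.
void fake_module_init() {
    runtime_ready = true;
    event_log().push_back("module init");
}
```

In this sketch the creation call is always logged before "module init", which mirrors the crash happening before the module initialization function is executed.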

Comments (8)

  1. Armin Rigo

    Ah, it doesn't work right now in cpyext. The issue is that we need custom initialization code after calling dlopen(). The call to PyInt_FromLong(), as I understand it, occurs before dlopen() returns. You can't easily do the same in C, so we kind of assumed it wouldn't occur...

    Maybe the only initialization that is relevant is that of cpyext itself. Does it work if you first import another module using cpyext, before you import the problematic module?

  2. Hrvoje Nikšić reporter

    > The issue is that we need custom initialization code after calling dlopen(). The call to PyInt_FromLong(), as I understand it, occurs before dlopen() returns.

    That is correct. Could the custom initialization code simply be moved to before dlopen()?

    > Does it work if you first import another module using cpyext, before you import the problematic module?

    Yes, then it works, thanks!

  3. Armin Rigo

    I see. The problem is that before dlopen() we don't know whether it is a cpyext or a cffi module. Initializing cpyext anyway could hurt the performance of small scripts that just need to import a few cffi modules, though how much is unclear. More fundamentally, we should at some point give the cffi modules a different name, which would help for versioning anyway.

  4. Hrvoje Nikšić reporter

    Agreed. For my use case at work the provided workaround will be sufficient, but it would be better yet if this were fixed as you suggested.
