Resources release issues

Issue #340 resolved
Gerard Marull-Paretas
created an issue

Hi,

I am facing some weird problems when releasing resources (cffi 1.11.0). My C library has paired functions to allocate/free multiple types of objects. In some cases, object B uses A internally, so what I do in Python is keep a reference to A in B so that B is garbage-collected first (no refcounting is implemented in C). I have tried these two ways to release the resources:

  1. Use ffi.gc:
a = lib.create_a()
assert(a != ffi.NULL)
self._a = ffi.gc(a, lib.destroy_a)
  2. Use __del__:
def __init__(self):
    self._a = ffi.NULL
    self._a = lib.create_a()
    assert(self._a != ffi.NULL)

def __del__(self):
    if self._a != ffi.NULL:
        lib.destroy_a(self._a)

However, in some cases I get errors like this (note: Servo holds a reference to Network):

Exception ignored in: <bound method Servo.__del__ of <ingenialink.Servo object at 0x7fb262c83f28>>
Traceback (most recent call last):
  File "/home/int.ingeniamc.com/gmarull/ws/ingenia/libs/ingenialink-python/ingenialink/__init__.py", line 498, in __del__
AttributeError: 'NoneType' object has no attribute 'il_servo_destroy'
Exception ignored in: <bound method Network.__del__ of <ingenialink.Network object at 0x7fb262c83ef0>>
Traceback (most recent call last):
  File "/home/int.ingeniamc.com/gmarull/ws/ingenia/libs/ingenialink-python/ingenialink/__init__.py", line 384, in __del__
AttributeError: 'NoneType' object has no attribute 'il_net_destroy'

From this I understand that "lib" is None at that point. In some other cases I also get the following (after the net/servo destructors have been called):

*** Error in `python3': corrupted double-linked list: 0x0000000002dfe6d0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f9987aed7e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x80c71)[0x7f9987af6c71]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f9987afa53c]
python3[0x559c15]
python3[0x50d1cd]
python3(_PyGC_CollectNoFail+0x27)[0x600e17]
python3(PyImport_Cleanup+0x354)[0x51af74]
python3(Py_Finalize+0x5e)[0x602e1e]
python3(Py_Exit+0x8)[0x602f18]
python3[0x60300a]
python3(PyErr_PrintEx+0x36)[0x603076]
python3(PyRun_SimpleFileExFlags+0x1d9)[0x603d39]
python3(Py_Main+0x456)[0x63e756]
python3(main+0xe1)[0x4cfbd1]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f9987a96830]
python3(_start+0x29)[0x5d46c9]

Any clue on this issue? Am I doing something wrong, or could it be a cffi bug?

Comments (10)

  1. Armin Rigo

    You're likely hitting the common CPython issue where code in __del__ is called during shutdown of the interpreter. For some bogus reason (which I know about but won't explain here because it is not relevant), CPython patches all names in all the modules with None. It means that your __del__ methods need to be written very carefully in order to avoid getting error messages at shutdown. This is not related to CFFI.

    The "corrupted double-linked list" issue I have no clue about. I cannot do anything about that without getting a reproducer first, sorry.

  2. Armin Rigo

    Also, note that the first issue is specific to __del__. If you were to use ffi.gc() in all cases, then you should not hit that issue. As a general rule I'd recommend using ffi.gc() in all cases anyway, for several reasons.

  3. Gerard Marull-Paretas reporter

    Hi Armin,

    Thanks for your replies. I was not aware of all the __del__ oddities. So yes, the 'NoneType' issue occurs only when using __del__. I have read that keeping a reference to the module in your class, or passing it as a parameter, could overcome this issue, but there is still no guarantee that __del__ will run. However, I occasionally see segfaults/aborts when using ffi.gc(). I will investigate it further...

    Thanks!
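
    The "keep a reference" workaround mentioned above could look roughly like the sketch below. It is only an illustration: FakeLib is a hypothetical stand-in for the real cffi lib object so that the snippet runs without a C library, and create_a/destroy_a follow the names used in the original report. As noted, this does not guarantee that __del__ runs at all.

```python
class FakeLib:
    """Hypothetical stand-in for the cffi `lib` object (assumption)."""

    def __init__(self):
        self.destroyed = []

    def create_a(self):
        return object()  # stands in for a non-NULL C pointer

    def destroy_a(self, handle):
        self.destroyed.append(handle)


lib = FakeLib()


class A:
    def __init__(self):
        self._a = lib.create_a()
        # Bind the destructor on the instance now, so __del__ does not have
        # to look up the module-level name `lib`, which may already be None
        # when the interpreter shuts down.
        self._destroy = lib.destroy_a

    def __del__(self):
        # getattr guards against __init__ having failed part-way through.
        handle = getattr(self, "_a", None)
        destroy = getattr(self, "_destroy", None)
        if handle is not None and destroy is not None:
            self._a = None  # make a second call a no-op
            destroy(handle)
```

    Clearing self._a before calling the destructor keeps the release idempotent, so an explicit call followed by the garbage collector's call does not free the object twice.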

  4. Gerard Marull-Paretas reporter

    Hi,

    I have observed that, even though I keep a reference to "A" in "B", when CPython exits the "A" destructor is actually called before the "B" destructor. Since "B" unsubscribes from some "A" resources, this leads to a crash. I am now using ffi.gc(). Is this expected?

    Thanks!

  5. Armin Rigo

    I am not sure, but I think that what you're seeing is this: there are two instances a and b, and two cffi pointers _a and _b on them. The instance b has a reference to the instance a. However, at shutdown, what occurs is that both a and b go away roughly at the same time, leaving _a and _b dangling. That's why the two ffi.gc()-installed finalizers are called in a random order. You can't really ensure an ordering with ffi.gc()... Not sure how to solve this...

  6. Gerard Marull-Paretas reporter

    Hi,

    As I have control of the C library, I ended up implementing reference counting on the C side. This way, the destruction order does not matter and I see no more errors. You can probably close the issue. If you want, I can describe such situations in the docs.

    Thanks! Gerard
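
    To illustrate why reference counting makes the destruction order irrelevant, here is a small pure-Python model of the idea. It is not the actual C implementation from the library; NativeA, NativeB, retain, release and run are hypothetical names chosen for this sketch.

```python
class NativeA:
    """Pure-Python model of a refcounted C object (hypothetical)."""

    def __init__(self, log):
        self.refs = 1  # the creator holds the first reference
        self.freed = False
        self._log = log

    def retain(self):
        assert not self.freed, "use after free"
        self.refs += 1

    def release(self):
        assert not self.freed, "double free"
        self.refs -= 1
        if self.refs == 0:
            self.freed = True
            self._log.append("free A")


class NativeB:
    def __init__(self, a, log):
        a.retain()  # B keeps A alive for as long as B exists
        self._a = a
        self._log = log

    def destroy(self):
        # A is guaranteed to be alive here, so B can still unsubscribe from it.
        assert not self._a.freed
        self._log.append("free B")
        self._a.release()


def run(order):
    """Destroy A and B in the given order ("ab" or "ba") and return the log."""
    log = []
    a = NativeA(log)
    b = NativeB(a, log)
    for name in order:
        if name == "a":
            a.release()  # what destroy_a would do in this model
        else:
            b.destroy()
    return log
```

    In this model both run("ab") and run("ba") end up freeing B before A, because the release requested by the owner of A only drops a count while B still holds its own reference.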

  7. Armin Rigo

    Thanks! I'll write a quick note.

    Note another hack that I just thought about:

    def create_a_and_b():
        _a = ffi.gc(lib.make_a(), lib.destroy_a)

        def my_destructor(_b):
            lib.destroy_b(_b)
            _a    # reference to make sure that _a is destroyed after _b
        _b = ffi.gc(lib.make_b(), my_destructor)
        return _a, _b
    