mechanism for global library initialization in the presence of subinterpreters

Issue #233 resolved
Glyph
created an issue

Cryptography has been consistently plagued by the problem of subinterpreters.

In CPython, you can create a "subinterpreter", which is an entirely new namespace for modules to be imported. However, critically, this new namespace does not include a new namespace for extension modules; all extension modules and all their shared libraries are assumed to be global.

Cryptography needs to initialize its backends at import time. Speaking generally and hypothetically, this could be any number of shared libraries; speaking specifically, it's OpenSSL, because of course it's OpenSSL. OpenSSL needs to have some global state, which points at Python callbacks, filled out at library initialization time.

It's presently (maybe?) possible to hack this together by using imp.acquire_lock, but this API is deprecated, and not all global state in libraries is necessarily initialized or manipulated exclusively at import time. Relying implicitly on the interpreter lock in Python 2.7 has not been robust so far, since bug reports continue to flow in.

Can CFFI expose a static-library-initialization lock that can be used for these circumstances so that it is unambiguously shared between all subinterpreters? Is there already a mechanism for doing this? Would perhaps a decorator saying "don't release the GIL when I call this" be a possible way forward?

Comments (13)

  1. Armin Rigo

    Just to make it clear, the problem is only partly related to CFFI. You'd have the same problem if, say, you needed to import a random C library (written using the CPython C API) and then call some of its functions to initialize it: the GIL can be released in the middle.

    There are possible hacks, like for example using a shared lock object that is attached to a random shared place. What you're asking for might look like an official way to share objects between subinterpreters. Then it's a question for python-dev, or alternatively a tiny C extension module whose purpose is only that. Such a module could even be written with CFFI, but I suspect it is easier to write it as a regular CPython C extension.

  2. Glyph reporter

    I would say that the problem I'm describing here is entirely related to CFFI. If you write a Python/C extension module, you write an initmodule that initializes the library, entirely in C; you don't call out to any Python code and you don't release the GIL while you are doing it. This is what all the examples I've found while looking for ways to do this do. CFFI is "unique" in that it requires you to initialize your library entirely in Python code.

    The other problem - that writing library initialization entirely in C, in a static initialization hook, is super gross - is a bigger problem with both C and the Python C API and not something we are likely to solve here :).

  3. Armin Rigo

    I am just guessing here, but:

    • I really think the existing import lock should prevent even two Python modules from initializing the same CFFI module in parallel.

    • But I can see another problem: if subinterpreter 1 has been running for a while and subinterpreter 2 starts, then it will import its own copy of the Python modules, which will re-initialize OpenSSL. This can crash OpenSSL, even more so in case subinterpreter 1 is currently running an OpenSSL function in its own thread. That case would explain crashes and also means the GIL is not what we need to look at here (subinterpreter 1 doesn't have the GIL or any lock, which is correct, as it is running some OpenSSL function).

    I would need to see the original issues you mention to know if this theory is correct, but at least, it looks like one problem. It doesn't occur with a CPython C extension module because the initmodule is run once for the whole process, not because it runs atomically.

    I can't think of a general fix to add to CFFI for that, but you can easily fix OpenSSL: add static int initialized=0; in the set_source() and int initialized; in the cdef, and check that flag to know if OpenSSL needs to be initialized now. (There is only one copy of that flag across subinterpreters.)

  4. Glyph reporter

    Is there an atomic compare-and-swap function available to set initialized and test it at the same time, though? If you want a lock to isolate initialized you have to initialize it in Python, and then the problem reverts to the previous case where you have no lock for initializing the lock.

    If you do everything at import time (cryptography didn't use to do this, although now it does, in an attempt to get locking protections) the global import lock will theoretically protect you in 2.7; however, the import locks in importlib are more fine-grained in 3.x and will not protect you, either globally within an interpreter or across interpreters.
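
    To make the test-and-set concern concrete, here is a minimal sketch of the race. It is only a model: the flag is a plain Python global standing in for the C-level initialized variable, and naive_init, init_calls and both_checked are hypothetical names introduced for illustration. The barrier exists purely to force the unlucky interleaving in which both threads test the flag before either one sets it.

```python
import threading

initialized = False      # stands in for the C-level flag (assumption: modelled in Python)
init_calls = []          # records every time the "library" gets initialized
both_checked = threading.Barrier(2, timeout=5)  # forces the unlucky interleaving


def naive_init():
    """Check-then-set without a lock: the test and the set are separate steps."""
    global initialized
    if not initialized:                  # step 1: test the flag
        both_checked.wait()              # both threads are now past the test
        init_calls.append(threading.current_thread().name)  # "initialize the library"
        initialized = True               # step 2: set the flag, too late


threads = [threading.Thread(target=naive_init, name="t%d" % i) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(init_calls))   # the library was initialized once per thread
```

    Without the barrier the window is much narrower, but nothing closes it: the flag only helps if testing and setting it happen as one atomic step, or under a lock that already exists before any initialization code runs.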

  5. Armin Rigo

    How about ffi.init_once(func, tag), which would work like this: it executes func(), which is supposed to initialize some things. If it is called again, it does nothing, except when it is called in a different thread while the previous call has not finished; in that case it blocks until that call is done.

    This would provide a clean solution to a few problems, like multiple subinterpreters, and also initializing parts of the library lazily (for that case the 'tag' argument is an object whose equality is used to distinguish unrelated init_once() calls; it can be a string, like the name of the part of the library).

    Note that the initialization routines would still be potentially called in a random thread, with other things going on in parallel using the already-initialized parts of the library. You have to be careful of that case if you use multiple initialization routines.

    As far as I can tell, this would be a clean solution to the problem. Does it make sense? What am I missing? :-)

  6. Armin Rigo

    (fwiw: the inspiration is the C function pthread_once(). It may be me not reading the man page correctly, but I didn't find a mention that a call should block if another thread is currently in the pthread_once() call; but that's how it works on Linux and the only useful way to implement it, so it's really the same as what I'm suggesting.)

  7. Armin Rigo

    Yes, typically you'd replace this code:

    from _xyz_cffi import ffi, lib
    
    lib.init_my_library()     # possibly more messy
    

    with:

    from _xyz_cffi import ffi, lib
    
    def initlib():
        lib.init_my_library()
    ffi.init_once(initlib, "init")
    

    or possibly this kind of code for lazy initialization:

    from _xyz_cffi import ffi, lib
    
    def initlib():
        lib.init_my_library()
    
    def make_new_foo():
        ffi.init_once(initlib, "init")
        return lib.make_foo()
    

    i.e. in all cases, use a global function.

    I'm also thinking about caching the value returned by initlib() and returning it from all future ffi.init_once() calls. Given that initlib() is called only once even if there are multiple subinterpreters, this would give a way to share init-time data among subinterpreters.

  8. Armin Rigo

    ffi.init_once() as described here is now in cffi's trunk. (Fwiw the static-callback branch is not merged yet, as it's a deeper change still worth consideration, but it will likely be merged too.)

    (Both are still open to changes, of course, either immediately or until the 1.4 release.)
