Proposal: clearer separation between API and ABI access

Issue #277 new
jbarlow NA
created an issue

I'd like to propose some changes to CFFI that create more conceptual separation between the ABI (in-line/out-of-line) and API access modes.

I can never remember how to set up API vs ABI access to a library in CFFI and have to check the cookbook every time. It's a great library but the current interface does not seem intuitive, at least not to me. It's possible to get an API when you wanted an ABI if you're not paying attention.

Part of the problem in my opinion is that the FFI object has 4 jobs. It creates both types of ABI access, API access, and also provides methods to marshal data between C and Python.

Requiring a C compiler is a pretty major decision for a package maintainer. It creates a lot of extra work and has big implications for testing, containers, and such. (Is the package going to be installed in Docker containers? Can I add a working compiler to that container?) manylinux1 has some potential to improve this, but all the same, I think it makes sense for CFFI's users to clearly indicate "yes, I want an API / ABI".

I think this can be done with a few fairly simple changes:

1) Move the filename of the output module from .set_source() to .compile(). All of the other inputs to .set_source() are inputs for a C compiler, so I think this is a more logical location. Unsurprisingly, .set_source() does not do any work; it just stores its arguments for .compile().

2) Offer APIBuilder and ABIBuilder classes. Deprecate cffi.FFI as a FFI builder, and give it the data marshalling job.

3) Rename ABIBuilder.compile() to .bytecompile(), as a way of indicating to the user that this function does not invoke the C compiler, but rather compiles the cdefs to a binary format and saves this as a Python module. Of course this is an abuse of terminology to some extent – perhaps .serialize() instead would also emphasize that this stores data for quicker access.

4) Remove ABIBuilder.set_source(). Right now, the presence of C source in the second parameter is the magic that determines whether CFFI builds an API or ABI.
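For reference, here is a minimal sketch of how the two out-of-line modes are selected in current CFFI: the only difference is whether the second argument to set_source() is None or a C source string. The helper names configure_abi and configure_api are hypothetical and exist only for illustration; the ffibuilder argument is expected to be a cffi.FFI instance.

```python
def configure_abi(ffibuilder, module_name):
    """Current out-of-line ABI mode: source is None, so no C compiler is invoked."""
    ffibuilder.cdef("int printf(const char *format, ...);")
    ffibuilder.set_source(module_name, None)
    return ffibuilder


def configure_api(ffibuilder, module_name):
    """Current out-of-line API mode: a C source string makes compile() run the C compiler."""
    ffibuilder.cdef("int getpid(void);")
    ffibuilder.set_source(module_name, "#include <unistd.h>")
    return ffibuilder
```

With cffi installed, configure_abi(FFI(), "_example").compile(verbose=True) would write a pure-Python _example.py, while the configure_api variant would build a C extension module.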

Simple example (ABI level, in-line)

from cffi import ABIBuilder
abi = ABIBuilder()
abi.cdef("""
  int printf(const char *format, ...);   // copy-pasted from the man page
""")
lib = abi.dlopen(None)

Real example (API level, out-of-line)

from cffi import APIBuilder
apibuilder = APIBuilder()

apibuilder.set_source(""" /* passed to C compiler */ """, libraries=[])

apibuilder.cdef("""
    int getpid(void);
""")

if __name__ == '__main__':
    apibuilder.compile("_example", verbose=True)

ABI, out of line

from cffi import ABIBuilder
abi = ABIBuilder()
abi.cdef("""
  int printf(const char *format, ...);   // copy-pasted from the man page
""")
abi.bytecompile("_example", verbose=True)

Comments (4)

  1. Armin Rigo

    In big and complicated cases, the build script needs a full-featured ffi object to do some testing. For example, it might call ffi.cast(); or in the more extreme examples, it calls ffi.new() and then invokes some C functions from a ffi.dlopen(), to inspect what it gives. I recently renamed ffi to ffibuilder in the documentation, but this ffibuilder is still an instance of FFI. If we go for a different solution involving completely separate classes like ABIBuilder and APIBuilder, we need a new way to achieve that result too...

    I agree in theory about putting the file name in compile() instead of set_source(), and calling the ABI version bytecompile(). Of course it's a bit of a mess to break backward compatibility without also adding the new A?IBuilder classes...

  2. jbarlow NA reporter

    I think it can be done without breaking backward compatibility. APIBuilder and ABIBuilder become the new interface and would be wrappers around the existing ffi object, which would become the "low-level interface".
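    A minimal sketch of what such wrappers could look like, assuming they simply delegate to an existing cffi.FFI instance. Everything here is hypothetical: the _BuilderBase helper, the optional ffi constructor argument (added so a pre-built object can be passed in), and the error messages are illustrative, not part of CFFI.

```python
class _BuilderBase:
    """Hypothetical shared base: holds the low-level ffi object."""

    def __init__(self, ffi=None):
        if ffi is None:
            from cffi import FFI  # imported lazily; needs cffi installed
            ffi = FFI()
        self.ffi = ffi  # still reachable for ffi.cast(), ffi.new(), ...

    def cdef(self, csource):
        self.ffi.cdef(csource)


class ABIBuilder(_BuilderBase):
    """ABI mode only: deliberately has no set_source()."""

    def dlopen(self, name):
        return self.ffi.dlopen(name)

    def bytecompile(self, module_name, **compile_kwds):
        # source=None is what selects ABI mode in the existing interface
        self.ffi.set_source(module_name, None)
        return self.ffi.compile(**compile_kwds)


class APIBuilder(_BuilderBase):
    """API mode only: C source is mandatory."""

    _source = None
    _kwds = None

    def set_source(self, source, **kwds):
        if not isinstance(source, str):
            raise TypeError("APIBuilder needs C source; use ABIBuilder for ABI mode")
        self._source, self._kwds = source, kwds

    def compile(self, module_name, **compile_kwds):
        if self._source is None:
            raise RuntimeError("call set_source() before compile()")
        self.ffi.set_source(module_name, self._source, **self._kwds)
        return self.ffi.compile(**compile_kwds)
```

    Because the wrappers keep the underlying object as .ffi, existing code that uses cffi.FFI directly would keep working unchanged.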

  3. Armin Rigo

    Looks interesting. Note that there are already two types: _cffi_backend.FFI is a built-in type containing only what is needed at runtime, and cffi.FFI is a Python type that exposes the same interface plus methods like cdef() and set_source() and compile(). So I guess what you're proposing would be to deprecate all usages of the cffi.FFI type, and replace it with ABI/APIBuilder. It would have abi.ffi and api.ffi objects, in order to get a low-level _cffi_backend.FFI object. That leaves the Python-defined cffi.FFI type as the backward-compatibility interface, not to be used any more: a mixture of both _cffi_backend.FFI and API/ABIBuilder...

    It's an interesting idea, but at the same time, it is yet another (slight) change of perspective, so I have to weigh issues like needing to write code that runs on older versions. It's not really a problem on CPython, but PyPy comes with a frozen version of cffi. Maybe it is time to think about splitting the cffi PyPI package in two, with a separate _cffi_backend package (which is built into PyPy), so that cffi itself can still be upgraded (and is not needed in some cases)...

  4. Armin Rigo

    Here is a more detailed point-to-point comment on your proposal. Starting from the API solution:

    Real example (API level, out-of-line)

    from cffi import APIBuilder
    apibuilder = APIBuilder()
    
    apibuilder.set_source(""" /* passed to C compiler */ """, libraries=[])
    
    apibuilder.cdef("""
        int getpid(void);
    """)
    
    if __name__ == '__main__':
        apibuilder.compile("_example", verbose=True)
    

    You are moving "_example" to a place that isn't executed at all if the module above is merely imported instead of being executed. That's a problem for setup.py. Right now the way it works in setup.py is to name the ffibuilder global variable (now the apibuilder), but it doesn't contain any "_example".

    Maybe give "_example" elsewhere, e.g. in the constructor? We would say apibuilder = APIBuilder("_example"), which makes it clear that we're building the APIBuilder for the _example module.

    I guess that the default value of verbose in the .compile() call could be True.

    ABI, out of line

    Same comment about abibuilder = ABIBuilder("_example").

    About abibuilder.bytecompile(): I agree that abibuilder.serialize() would be clearer, as that's really what that method does. Usually "byte-compilation" means something different: "turn .py into .pyc files".
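    To make the constructor idea concrete, here is a minimal sketch, assuming the builder wraps an existing cffi.FFI instance. The class body, the optional ffi argument and the serialize() defaults are hypothetical; only set_source(name, None) and compile(tmpdir=..., verbose=...) are calls from the current interface.

```python
class ABIBuilder:
    """Hypothetical builder that knows its target module name from the start."""

    def __init__(self, module_name, ffi=None):
        if ffi is None:
            from cffi import FFI  # imported lazily; needs cffi installed
            ffi = FFI()
        self.module_name = module_name  # visible as soon as the script is imported
        self.ffi = ffi

    def cdef(self, csource):
        self.ffi.cdef(csource)

    def serialize(self, tmpdir=".", verbose=True):
        # No C compiler involved: source=None selects ABI mode
        self.ffi.set_source(self.module_name, None)
        return self.ffi.compile(tmpdir=tmpdir, verbose=verbose)


def make_builder(ffi=None):
    """Module-level setup, run on import; serialize() stays under __main__."""
    abibuilder = ABIBuilder("_example", ffi=ffi)
    abibuilder.cdef("int printf(const char *format, ...);")
    return abibuilder
```

    Since the name is stored at construction time, a tool that only imports the build script and looks up the abibuilder global can still find "_example" without serialize() ever being called.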

    Simple example (ABI level, in-line)

    from cffi import ABIBuilder
    abi = ABIBuilder()
    abi.cdef("""
      int printf(const char *format, ...);   // copy-pasted from the man page
    """)
    lib = abi.dlopen(None)
    

    I think that for inline usage we should continue to use from cffi import FFI and not an ABIBuilder. In this case it's not really a builder anyway, in the sense that it's not about generating a file from an out-of-line module. You really need an ffi anyway, too.

    Do you want to do a prototype of this refactoring?
