The C-centric design philosophy makes CFFI a pain to use

Issue #104 wontfix
Isaac Freeman
created an issue

Ok, this is going to be more of a rant about the overall design philosophy of CFFI than a specific feature request. More of a general "design smell" complaint. Please bear with me while I get some gripes off my chest.

First, I want to say I love CFFI for what it is on the surface. It makes wrapping C libraries simple and fast. But it involves a lot of repetitive typing and boiler-plate to get up and running, much of which could easily be avoided with just a little bit of pythonic introspection. And yes, I KNOW "you can't do that in C either". But that's the point! That's why I'm writing in python instead of C! :P

People program in python because of the high-level nature of the language. Introspection and a dynamic coding style are some of the most important strengths of python over lower-level languages. Sure, the syntax is nice and clean, and the standard library is very convenient, but most of that is icing on the cake. The real power comes from being able to dynamically manipulate objects at runtime and inspect objects to alter behavior without a lot of intermediate boiler-plate or near-duplicate code.

But the default design philosophy of CFFI, it seems, has been essentially "make it as difficult and inflexible as C wherever possible."

For example, there's no clear way to be able to tell if an object is a CFFI datatype. There is no clear class which can be passed to isinstance(). If you look at myctype.__class__ you get _cffi_backend.CData. Of course, this doesn't actually seem to exist anywhere. The closest I could find is cffi.FFI.CData. Why is this a class attribute instead of something easily found in the cffi module? Of course, this doesn't appear to be documented anywhere, you just have to dir() around in cffi to find it.

In #pypy on Freenode, it was suggested I could do ffi.typeof(obj) to test if it's a CData object, but that just raises a TypeError if it's not a CFFI object. Seems like overkill to have a try/except just to check the type of something.
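
For reference, a minimal sketch of an isinstance() check against the ffi.CData attribute mentioned above, which avoids the try/except. It assumes the cffi package is installed; the helper name is_cdata and the struct point_t are made up for this illustration and are not part of CFFI.

```python
import cffi

ffi = cffi.FFI()
# Hypothetical struct, declared only so there is something to allocate.
ffi.cdef("typedef struct { int x; int y; } point_t;")


def is_cdata(obj):
    """Return True if obj is a cffi cdata object (illustrative helper)."""
    return isinstance(obj, ffi.CData)


point = ffi.new("point_t *")   # cdata pointing to a zero-initialized struct
int_ptr = ffi.new("int *", 7)  # cdata pointing to a single int
```

Note that the check still goes through an FFI instance (or the FFI class), which is the part being complained about here.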

Or, the lack of any way to get the type, argument list, etc from a CFFI C function. You have to have a reference to the original FFI instance to do ffi.typeof(myfunc). (I wrote a function which accepts a verifier object, does a dir() on it, and adds a .ffi instance attr to all the functions, just so I could have a reference to the ffi object and do "myfunc.ffi.typeof(myfunc)" which just feels wrong...)
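
To make the kind of introspection under discussion concrete, here is a small sketch of what ffi.typeof() exposes for a function type. It assumes cffi is installed and uses a function-pointer type written as a string, so that no C library has to be loaded; with a real C function object one would pass the function itself to ffi.typeof(). The variable names are illustrative only.

```python
import cffi

ffi = cffi.FFI()

# A function-pointer type spelled out as a string. ffi.typeof(cfunc) on a
# real C function object returns the same kind of ctype object.
ctype = ffi.typeof("int(*)(int, double)")

kind = ctype.kind                              # category of the ctype
arg_names = [arg.cname for arg in ctype.args]  # C names of the argument types
result_name = ctype.result.cname               # C name of the return type
```

The FFI instance is still needed to make the call, which is exactly the limitation described in the paragraph above.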

Oh, and this brings me to my next point. Want to write the same wrapper code for all your C structs? Well, you can either copy-and-paste the same boiler-plate for each struct, or iterate over the values in the myffi._parser._declarations dict. And the keys in this dict are useless because they're like "struct My_Somestruct" instead of just "My_Somestruct", so you either have to parse/mangle that string yourself, or do

for ctype in myffi._parser._declarations.itervalues():
    if isinstance(ctype, (cffi.model.StructType, cffi.model.UnionType)):
        cobjs[ctype.get_c_name()] = CStructType(ffi, ctype)

(Actual code from my project. And again, most of this is undocumented, just have to dir() around to find it...)

Also, I tried opening a feature request a while ago asking that the automatic type coercion for passing Python objects to C functions be extended in a generic way (issue #101) but it was rejected because "you can't do that in C either". Just seems like a pointless design philosophy to me. I'm not expecting a fully ctypes-level of pythonicness, but some accommodation for, you know, python programmers, would be nice.

Comments (8)

  1. Isaac Freeman reporter

    One more very significant difference between python and C which makes the "C-like at all costs" philosophy broken, is in python we don't have multimethods, C++-like templates or macros, things which are usually used in C/C++ to make the rigidity of C bearable. What we do have in python is introspection, duck-typing and dynamic types. So trying to do it "the C way" in python ends up being more difficult in a lot of cases than doing it "the C way" in C. So, I propose CFFI try to do things "the Python way", at least in the areas where it wouldn't impose a measurable performance hit or overly complicate things.

  2. Armin Rigo

    We might have an issue with the documentation being hard to browse:

    • You're looking for ffi.CData, which is really documented.

    • You can get the type, argument list, etc from a CFFI C function: ffi.typeof(cfunc), ffi.typeof(cfunc).args, ffi.typeof(cfunc).result, and so on. This is also (succinctly) documented.

    • You are supposed to have a reference to the ffi instance from anywhere. In most cases, you are supposed to have just a single global instance of the FFI class, so you don't need to attach it to objects left and right. The fact that FFI is a class that you're supposed to instantiate is mainly a way to handle multiple completely independent usages of cffi in the same program.

    • You are not supposed to use ffi._parser._declarations directly. If you want to iterate over some structs, there is indeed no nicer way than writing down an extra list of the names of these structs. I still think that it is a very low amount of code repetition that you need to do, for a far greater benefit, which is to handle only the structs that you need, rather than just all the structs found in your cdef. I'm sure that a loop over all the structs in the cdef will eventually contain logic like "ah but if it's that struct then I don't want actually to mangle it" and so on.

    • automatic type coercion for passing Python objects to C functions: I'm sitting on my position about that, sorry. I believe that extending the cfuncs to automatically invoke various callback methods on the Python objects does not actually give anything more than doing the same yourself in wrapper functions --- and is an extra constraint. Why should the users of your library have to define a special method called __cffi_to_cdata__() because your library happens to be implemented with cffi? This feels very wrong to me. Instead, use metaprogramming and define generically (e.g. in a decorator, or with a loop, or whatever) the exact wrapping logic that makes sense for you.

    • the rigidity of C is indeed made bearable with macros. Different kinds of macros need different ways to port to Python. For a hard example, take this macro:

      #define MAX(a,b) ((a)<(b)?(b):(a))

    I don't see any way to generically wrap this macro, short of waiting until the call occurs, getting the type of the arguments, and compiling a bit extra C code to handle this case. This cannot work in the compilation model that we're targeting, which is to compile things only once in advance. (This has a lot of advantages, like redistribution, but makes a few things like this one harder.)

    Considering a particular subset of all the hard cases: C++-like overloaded functions, i.e. macros that can be called with different types, but where we know in advance what the possible type combinations are. This can already be defined as cfuncs of different names, with a custom Python wrapper that selects the correct one. I tend to believe that it's not CFFI's job to impose on you a particular choice of how the correct version should be selected, although we could indeed choose the C++ rules about overloaded function calls. I'm ready to at least consider it, if you have examples.
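
    A minimal sketch of the "cfuncs of different names plus a Python wrapper" idea from the last paragraph. The two functions below are plain-Python stand-ins for what would be separately named C functions (hypothetically lib.max_int and lib.max_double), and the selection rule is just one possible choice, not something CFFI prescribes.

```python
# Stand-ins for two C functions that would be declared under different
# names in the cdef (e.g. lib.max_int and lib.max_double); hypothetical.
def max_int(a, b):
    return b if a < b else a


def max_double(a, b):
    return float(b) if a < b else float(a)


def c_max(a, b):
    """Pick the variant from the Python argument types (one possible rule)."""
    if isinstance(a, int) and isinstance(b, int):
        return max_int(a, b)
    return max_double(a, b)
```

    Here c_max(2, 3) goes to the integer variant, while any call involving a float goes to the double variant.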

  3. Isaac Freeman reporter
    1. In no other library that I'm aware of is the class of some type only available as an attribute of some other class or instance. Is CData different for each FFI instance? If not, why make it an instance attribute? Why not just isinstance(myobj, cffi.CData)?

    2. I'm not saying attaching the FFI instance to all the functions is the right way to do it. The right way would be to have the typeof, args, etc available on the function instance. This is called encapsulation, and is one of the big benefits of an OO language like Python. Having to pass an object to a method of another object just to get information intrinsic to the first object seems unnecessary, confusing and complicated.

    3. Making the FFI instance available everywhere requires either a global variable, or extra arguments to every function. Again this breaks encapsulation and needlessly complicates CFFI-using code. Information about an object should be available in that object.

    4. There are plenty of simple, consistent use cases for iterating over all the structs. For example, I want to create generator functions for all the structs, so users of my library can just Somestruct(fld1=4, fld2=2). It's a very straightforward thing. And yes, I do want to do this for all of the structs in my cdef. Why wouldn't I? The alternative is to expect users of my libs to access the FFI instance in my lib and call ffi.new('Somestruct *'). I think this does much more to expose the underlying implementation details. And the corner case of "ah but if it's that struct..." can still be handled. Refusing to implement a general case just because there may be special cases here and there does not seem like a good design decision to me.

    5. The point of automatic type conversions is not that I want users of my lib to implement the methods. I want to use it internally in my lib. And sure, I could just add some code to my wrapper functions, but most of the C functions for my lib don't need wrappers, they work fine as they are. The only thing I would need a wrapper for would be to convert my Python objects to C structs/etc. So it would be basically the same code in all my wrappers, wrappers which I wouldn't even need to write at all if there was just some generalized type conversion. The difference between several dozen wrapper functions, and none is not "a very low amount of code", it's a major pain.

    6. And you missed my point about macros in C. I'm not talking about wrapping C macros with CFFI. I'm saying, the "as C-like as possible" design philosophy is broken because you simply can't make Python C-like without making it more difficult to use. When coding in C, much of the tedium can be abstracted away using macros. But the tedium of using CFFI cannot be abstracted away using macros in Python because Python doesn't have macros.
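
    To illustrate the struct generator functions from point 4, here is a rough sketch that builds them from an explicit list of struct names (the approach suggested in the previous comment) rather than from ffi._parser._declarations. It assumes cffi is installed; the structs point_t and rect_t, the STRUCT_NAMES list and the make_constructor helper are all invented for this example.

```python
import cffi

ffi = cffi.FFI()
# Hypothetical structs for illustration.
ffi.cdef("""
    typedef struct { int x; int y; } point_t;
    typedef struct { double w; double h; } rect_t;
""")

# The explicit list of names, instead of walking ffi._parser._declarations.
STRUCT_NAMES = ["point_t", "rect_t"]


def make_constructor(name):
    """Build a function that allocates one struct from keyword arguments."""
    def constructor(**fields):
        # ffi.new() accepts a dict as the initializer of a struct;
        # fields that are not mentioned stay zero-initialized.
        return ffi.new(name + " *", fields)
    constructor.__name__ = name
    return constructor


structs = {name: make_constructor(name) for name in STRUCT_NAMES}

p = structs["point_t"](x=4, y=2)
```

    The cost is the duplicated name list; the open question in this thread is whether that list could come from a documented interface instead.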

  4. Armin Rigo

    It's all a conscious design decision to use a C-like approach rather than a fully OO approach. Hence ffi.new('MyStruct *', x=5) and not MyStruct(x=5). Yes, you are supposed to make the FFI instance a global: a module that uses internally CFFI is supposed to start with import cffi; ffi=cffi.FFI(), and then use ffi everywhere, just as if ffi was the module instead of cffi.

    ..."because Python doesn't have macros." I think you're confusing various levels here. Obviously Python doesn't have the same macros as the C language. But it has far more powerful ways to do most things. I doubt there is anything you'd like to do that you could do in C using macros, and that you can't do in Python at all. For example, writing several dozen wrapper functions is only a pain if you don't know the basics of metaprogramming: you just need to generate the wrapper functions in a loop, or use decorators.
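
    A minimal sketch of generating such wrappers in a loop with a decorator. Everything here is a stand-in: Vec and its to_c() method are hypothetical, and vec_dot and vec_scale are plain-Python placeholders for C functions, so the example runs without any C library.

```python
import functools


class Vec:
    """Hypothetical Python-side object that knows its C representation."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def to_c(self):
        # In a real project this might return ffi.new("vec_t *", ...).
        return (self.x, self.y)


def converting(func):
    """Wrap func so that any argument with a to_c() method is converted."""
    @functools.wraps(func)
    def wrapper(*args):
        return func(*[a.to_c() if hasattr(a, "to_c") else a for a in args])
    return wrapper


# Stand-ins for C functions that expect the converted representation.
def vec_dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def vec_scale(a, k):
    return (a[0] * k, a[1] * k)


# Generate every wrapper in one loop instead of writing each by hand.
wrapped = {f.__name__: converting(f) for f in (vec_dot, vec_scale)}
```

    With this in place, wrapped["vec_dot"](Vec(1, 2), Vec(3, 4)) converts both arguments before calling the underlying function, and the conversion logic lives in one place.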

  5. Armin Rigo

    Note that if we allowed MyStruct(x=5), it would open an endless stream of further questions: how do you create an array of MyStructs then? What if I want a pointer to a pointer to a MyStruct? And so on. We could go the full ctypes way and invent Python syntax to express everything you can do in C. That's not the way we chose for CFFI. That's why there is no easy OO-ish shortcut for some operations. If you prefer it, feel free to use ctypes, obviously.

    There are two reasons for needing to write ffi.typeof(cdata) rather than cdata.gettype() or something similar. The first is just that it feels more Pythonic to me (Python also has functions like type(), len(), etc. that are not methods). The second is that reading a method out of a cdata can be confused with reading a field out of a cdata representing a C structure.

  6. Armin Rigo

    Closing this as a Won't Fix too, for much the same reasons as issue #101. I believe that cffi indeed offers only minimal nicenesses, but you can easily add more of them --- geared for a particular project rather than generic --- because Python is flexible enough for that. Please feel free to open other issues about specific problems where no clear pure Python solution can be found.

  7. Isaac Freeman reporter

    You're still completely missing my point in most cases.

    Firstly, there is a fundamental difference between using an FFI instance and using a module. A module can just be imported anywhere, whereas to get a reference to an FFI instance you either have to already have the FFI instance somewhere, or have it passed in to you. For example, in my cffiwrap module I can't import the user's FFI instance from anywhere so I have to rely on a wrapall function and the CFunction class to store the info I need. This is the difference between ffi.typeof and Python's built-in type. type is available everywhere, but ffi.typeof is only available if you can get a reference to the FFI instance.

    Maybe a compromise here would be to make it so that the object returned by type(my_cdata_obj) has the info that ffi.typeof gives. That way you can write modules which are abstracted away from the particular FFI.

    You mention metaprogramming a lot, but that's exactly what I'm trying to do. It's just that CFFI makes it difficult since I can't get type information about functions without having a reference to the FFI instance. In my cffiwrap module, the entire CFunction class and wrapall functions could go away if I could get type info without having to have the user pass in an FFI instance.

    In other words, when you say "Obviously Python doesn't have the same macros as the C language. But it has far more powerful ways to do most things." that's exactly what I'm trying to say. And that's exactly why I'm suggesting just a little bit of added introspection would be really helpful. Look over my cffiwrap module, you'll see that I am employing lots of metaprogramming and utilizing exactly those "more powerful ways to do things" in python. It's just that there are a few areas that are kind of ugly because I have to work around limitations of CFFI. Limitations that seem arbitrary and unnecessary to me.

    In fact, even my CObject class and a lot of other code could pretty much go away if CFFI had some mechanism for automatic coercion of python objects.

    As for your concerns with the MyStruct(x=5) notation, firstly, again, you're missing my point. I wasn't saying CFFI should offer that notation, I was saying I am making that notation. All I'm asking of CFFI is to have a cleaner way to iterate over the structs in an FFI.

    But anyway, the concerns you raise about the notation I'm using confuse the class with the instance. MyStruct() would return an instance of the struct, but creation of arrays, etc, would be classmethods. In my cffiwrap I use instances of CStructType with array() instance methods, but they just return regular CFFI struct instances. So I can create arrays of arbitrary shape as well as pointers to pointers easily without interfering with element access. The only reason I'm using CStructType instances as struct generators instead of classes for each struct is because CFFI already provides a pretty good struct object. The only problem is not being able to iterate over a list of them without digging in to myffi._parser._declarations.

    So, all I'm really asking for is

    1. Some mechanism to make it easier to automatically coerce python objects to C types (doesn't have to be a _get_cdata() method, could be anything you think is better). Again, I think my point above still stands about lots of identical, unnecessary wrapper functions just to convert the same object to the same C type for each function.

    2. A cleaner way to get a list of structs and unions from an ffi object other than myffi._parser._declarations. Maybe just myffi.typedefs. It could either be just a list of StructType objects or a dictionary mapping actual struct type names to StructType objects. Or anything else, as long as it's an exposed and documented interface.

    3. Some way to get relevant information from a function object without needing to pass an ffi object around everywhere. Again, it doesn't have to be attributes on the function object. Maybe if type(myfunc), or even something like cffi.typeof(myfunc) (as a module function) returned something with that information, if that seems more pythonic to you. It seems to make sense that information about the function should be retrievable without needing a reference to the FFI instance it came from.

    I'm not asking you to support any of the other higher-level stuff I talked about, those were just examples, and stuff I'm more than happy to implement myself. I'm just asking for a few basic things to make it easier to implement that higher-level stuff.

    If you decide you want to allow any of the bullet points above I'd be happy to open separate feature requests or even to implement them myself and send pull requests for them.

  8. Armin Rigo
    1. ...and again, my answer above still stands about not understanding why you need to write a lot of identical wrapper functions. Please provide concrete examples in this (or a different) feature request.

    2. I guess we could have a list of names of structs and unions declared. I'm ready to accept such a patch, but not if it means exposing the StructType or any other internal type from the internal module "cffi/".

    3. I don't understand your point, sorry. Maybe there is something wrong with the documented solution of starting your module with "import cffi; ffi=cffi.FFI()" and then using this global variable everywhere. But until you explain what, I can only assume that you simply dislike it out of a different philosophy of programming, or something. Please provide concrete examples directly in the feature request. (Just pointing to existing code bases doesn't really help: I need a focused, to-the-point example)
