A way to allocate and initialize unmanaged memory with ffi.new-like syntax

Issue #115 resolved
Tyler Wade created an issue

I would like to see cffi provide a way to allocate and initialize memory in a way similar to ffi.new, but without the memory being freed automatically by cffi. My use case is returning a string from a callback. Using ffi.new won't work because cffi will free the string automatically. While I can use malloc and copy the string manually, being able to simply write return ffi.unmanaged_new('char*', some_str) would be much easier and more readable. (I'm not suggesting the name unmanaged_new.)
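For reference, the manual workaround mentioned above (malloc plus an explicit copy) might look roughly like the sketch below. It assumes a POSIX system where ffi.dlopen(None) exposes the C library, and a cffi version that provides ffi.memmove(); the helper name malloc_string is hypothetical.

```python
from cffi import FFI

ffi = FFI()
ffi.cdef("""
    void *malloc(size_t size);
    void free(void *ptr);
""")
libc = ffi.dlopen(None)  # assumption: POSIX, where None loads the C library


def malloc_string(data):
    """Copy a bytes object into malloc'ed memory (hypothetical helper).

    cffi does not own the returned pointer, so it is never freed
    automatically; someone must eventually call free() on it.
    """
    raw = libc.malloc(len(data) + 1)      # +1 for the trailing NUL
    if raw == ffi.NULL:
        raise MemoryError("malloc failed")
    p = ffi.cast("char *", raw)
    ffi.memmove(p, data, len(data))       # copy the payload
    p[len(data)] = b"\0"                  # terminate the C string
    return p
```

The caller (or the C code receiving the pointer) is then responsible for releasing it with libc.free(p).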

Comments (17)

  1. Armin Rigo

    It opens a lot of questions: would the memory be allocated by the C malloc so that the C code can use free on it? What about systems that use a different pair of functions (more common on Windows but exists on Linux too)? Or would you have to use ffi.unmanaged_free()? In the latter case it might be enough to maintain the result of ffi.new() alive by storing it somewhere temporarily.

  2. Tyler Wade reporter

    Fair enough. A couple of possible alternatives/compromises:

    Have ffi.unmanaged_new take a malloc parameter: ffi.unmanaged_new('char*', lib.malloc, some_str)


    Provide a function for initializing a chunk of memory: ffi.memcpy(some_ptr, 'char*', some_str). This is potentially more useful, but it may be hard for the user to know how big the chunk of memory needs to be.

    Either would work fine for me.

  3. Armin Rigo

    Assignment already supports most of the same syntax:

    from cffi import FFI
    ffi = FFI()

    ffi.cdef("struct s { int x, y, z; };")
    p = ffi.new("struct s *")    # or any other way
    p[0] = [4, 5, 6]             # positional: x, y, z
    p[0] = {'x': 4, 'z': 6}      # by field name
    a = ffi.new("char[]", 10)
    a[0:6] = b"foobar"           # bytes, so it also works on Python 3
    a = ffi.new("int[]", 10)
    a[0:3] = [10, 20, 30]

  4. Glyph

    I have run into this issue a couple of times in various ways - populating an array of pointers to structures, for example, where each pointer is malloced, or in a couple of cases allocating immortal objects that are handed off to a library as global values as part of initialization. I've generally worked around it with the WeakKeyDictionary or assigning the cdata to a global in Python.

  5. Armin Rigo

    I'm sorry if I'm slow, but I don't understand what you'd like to improve. I can imagine from your description that your code looks like this (first case):

        lst = [ffi.malloc("struct foo *") for i in range(10)]
        list_of_pointers_p[0:10] = lst

    How is it really? How do you suggest it could be improved?

    Same questions in the 2nd case you describe.

  6. Glyph

    What I want for that example is to write this:

    list_of_pointers[0:n] = [ffi.unmanaged_new("struct foo *", dict(int_member=i, char_member=str(i))) for i in range(n)]

    The improvement over your malloc example is that this can be constructed as an expression rather than a function which uses assignment to loop over the pointers. If there are members to struct foo other than int_member and char_member, ffi.[unmanaged_]new will zero them; malloc won't.

    I didn't realize that assignment had these same options (my eyes glazed over your last comment there before I posted mine, I did not see the dictionary syntax there). Assignment is not an expression though, which makes it difficult to nest these invocations. I guess what you're saying is that unmanaged_new is so easy to implement with assignment, why should it be provided by cffi? If that is the question then the answer is "because everything in C is so incredibly easy to screw up :)"


    Does p[0] = {'x': 4, 'z': 6} zero p[0].y?

  7. Armin Rigo

    "Assignment is not an expression though, which makes it difficult to nest these invocations."

    Yes, but it is unclear if it is a good thing to provide this easy syntax. People will not realize that this variant needs explicit free (or can only be used at import time, where you don't care about leaks). Freeing nested structures cannot be done with a generic helper; custom code is needed. In some cases people want to use custom allocators, not the plain malloc(). In summary I see very little to turn into a general function. How about adding this helper into the documentation instead:

    def malloc_and_set(ctype, init):
         p = lib.malloc(ffi.sizeof(ctype))
         if not p: raise MemoryError
         p[0] = init
         return p

  8. Armin Rigo

    Or maybe a pair of optional keyword arguments to ffi.new() (sorry if I'm going in circles), allowing you to say:

    ffi.new("foo_t *", {'x': x, 'y': y}, alloc=lib.malloc, free=lib.free)
    ffi.new("foo_t *", {'x': x, 'y': y}, alloc=lib.malloc, free=None)   # no automatic free

  9. Glyph

    That latter one actually sounds great. That would also address an issue in OS X where some things need weird custom allocators sometimes. (I've never seen this be a practical issue, but all the C-level APIs, like CoreFoundation, take an "allocator" and it would be nice to be able to be consistent with that.)

  10. Glyph

    That would make the helper function:

    def malloc_and_set(ctype, init):
        p = lib.malloc(ffi.sizeof(ctype))
        if p == ffi.NULL:
            raise MemoryError()
        lib.memset(p, 0, ffi.sizeof(ctype))
        p[0] = init
        return p


    (and this is exactly why it should be included library-wise, if even you can't get all the details right on the first try it is probably not good to have people copy/pasting it)

  11. Armin Rigo

    There are even more problems with this malloc_and_set: it needs to cast p to something like item *, and it is missing the ffi.gc(p, lib.free).
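Putting the corrections from this thread together (NULL check, zeroing, the cast to a typed pointer, and the optional ffi.gc), the helper might look like the sketch below. It assumes a POSIX system where ffi.dlopen(None) exposes malloc, free and memset; the struct foo layout and the auto_free parameter are hypothetical additions for illustration.

```python
from cffi import FFI

ffi = FFI()
ffi.cdef("""
    struct foo { int x, y, z; };
    void *malloc(size_t size);
    void free(void *ptr);
    void *memset(void *s, int c, size_t n);
""")
libc = ffi.dlopen(None)  # assumption: POSIX, where None loads the C library


def malloc_and_set(ctype, init, auto_free=True):
    """Allocate one zeroed `ctype` with malloc and initialize it from `init`."""
    size = ffi.sizeof(ctype)
    raw = libc.malloc(size)
    if raw == ffi.NULL:
        raise MemoryError("malloc(%d) failed" % size)
    libc.memset(raw, 0, size)                     # fields not in init stay 0
    p = ffi.cast(ffi.getctype(ctype, "*"), raw)   # void * -> typed pointer
    p[0] = init
    if auto_free:
        p = ffi.gc(p, libc.free)                  # free() when p is collected
    return p
```

With auto_free=False the pointer is never released by cffi, which is the "unmanaged" behaviour requested in the issue; the caller must then call libc.free() explicitly.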

    I'm thinking about recognizing "officially" that there are alternative allocators, with the following way to access them:

    allocator = ffi.new_allocator(alloc, free, should_clear)
    p = allocator("struct foo_s")
    q = allocator("int[5]", [50, 60, 70, 80, 90])

    The new_allocator() can be created with free=None if needed. The should_clear argument is an optional bool flag: if True (the default), the memory is zeroed after the allocator is called; if False, it is not (assumed to be either already zeroed, or else you don't mind getting the final pointer containing uninitialized data). In all cases, if the alloc() function returns NULL, a MemoryError is raised.

    This is easy to implement at the level of C, by wrapping the result of alloc() inside the same kind of object as ffi.gc().
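A sketch of how this proposal reads in use, assuming a cffi release that ships ffi.new_allocator() with alloc and free keyword arguments and zeroing enabled by default. In that form the returned allocator takes a pointer or array type, as ffi.new() does. It also assumes a POSIX system where ffi.dlopen(None) exposes malloc and free; the names managed_new and unmanaged_new are hypothetical.

```python
from cffi import FFI

ffi = FFI()
ffi.cdef("""
    struct foo_s { int x, y, z; };
    void *malloc(size_t size);
    void free(void *ptr);
""")
libc = ffi.dlopen(None)  # assumption: POSIX, where None loads the C library

# Memory comes from malloc() and is released with free() when the cdata
# object is garbage collected.
managed_new = ffi.new_allocator(alloc=libc.malloc, free=libc.free)

# Same allocator, but with no automatic free: the caller owns the memory.
unmanaged_new = ffi.new_allocator(alloc=libc.malloc, free=None)

p = managed_new("struct foo_s *", {"x": 4, "z": 6})   # y is zeroed
q = unmanaged_new("int[5]", [50, 60, 70, 80, 90])

# q is never freed by cffi; call libc.free(q) once nothing uses it anymore.
```

Because the allocator is an ordinary callable, it can be used inside expressions such as list comprehensions, which was the nesting concern raised earlier in the thread.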
