pyGet() doesn't clean up __R__.namespace after function goes out of scope

Issue #5 closed
Kevin Jin created an issue
    pyExecp("__R__.namespace")
    print <- pyGet("print")
    pyExecp("__R__.namespace")
    rm(print)
    gc()
    pyExecp("__R__.namespace")

I expected <built-in function print> to disappear from __R__.namespace after the gc() call, as objects do. Instead it stays there and leaks memory in Python. My workaround right now is to execute this code before the call to rm(print):

    e <- new.env()
    e$print <- print
    reg.finalizer(e, function(x) pyExec(sprintf("del(%s)", attr(print, "name"))))

Note: this also occurs with user-defined functions, e.g.

    pyExecp("__R__.namespace")
    pyExecp("test = lambda x: x")
    test <- pyGet("test")
    pyExecp("__R__.namespace")
    rm(test)
    gc()
    pyExecp("__R__.namespace")

Comments (6)

  1. Florian Schwendinger repo owner

    Thank you for pointing this out! A function assigned to __R__.namespace should definitely be deleted. I have to look into it, since it's possible there was a reason, but it's also possible I just forgot. So it will occur for every callable Python object.

    What you could do to avoid it is use pyFunction, so the function never ends up in __R__.namespace.

    pyExecp("__R__.namespace")
    pyExecp("test = lambda x: x")
    test <- pyFunction("test")
    pyExecp("__R__.namespace")
    

    But notice that print gets somewhat special treatment in Python. To see this, try

    dir(__builtins__)
    __builtins__.range
    ## and
    __builtins__.print
    ## which results in an error
    

    Also, pyGet is a very general function; a perhaps more accurate name would be pyExecuteSingleLineAndReturnResult. So you could also do

    pyExecp("__R__.namespace")
    test <- pyGet("lambda x: x")
    ## or
    pyGet("[i for i in range(0, 10) if i > 5]")
    

    which still would not resolve the issue of the finalizer.

    So thank you, I will fix it in the next release.

  2. Kevin Jin reporter

    I was originally doing that, but the problem is that when you reassign the name that the function is assigned to, you "lose" the function. E.g.

    pyExecp("__R__.namespace")
    pyExecp("test = lambda x: x")
    test <- pyFunction("test")
    pyExecp("test = None")
    test(98) # throws an error
    

    I suppose that's a valid use case for avoiding pyFunction().
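The difference described above can be illustrated in plain Python. This is only a sketch of the general idea, not PythonInR's implementation: `globals_ns`, `by_name`, and `by_reference` are hypothetical names, and it assumes that a wrapper which looks a name up on every call behaves like the pyFunction example, while a wrapper that holds a reference behaves like the pyGet one.

```python
# Hypothetical stand-in for the Python global namespace; not a PythonInR name.
globals_ns = {"test": lambda x: x}


def by_name(name):
    """Return a wrapper that looks the name up again on every call."""
    def call(*args):
        return globals_ns[name](*args)
    return call


def by_reference(name):
    """Return a wrapper that captures the object the name refers to right now."""
    func = globals_ns[name]

    def call(*args):
        return func(*args)
    return call


late = by_name("test")
early = by_reference("test")

# Rebind the name, analogous to pyExecp("test = None") in the example above.
globals_ns["test"] = None

print(early(98))  # the captured reference still works

try:
    late(98)
except TypeError as err:
    # The name now refers to None, which is not callable.
    print("wrapper that looks up the name failed:", err)
```

Holding a reference protects the wrapper from later rebinding, but it also means something has to release that reference again, which is where the cleanup problem in this issue comes from.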

    The thing about returning lists in pyGet() is that lists are not stored in __R__.namespace, so there's no problem with the finalizer. Only the results of dispatches to pyObject() and pyFunction() are stored there (I imagine because lists and scalars are easily translated from Python to R, whilst objects and functions must be wrapped, so the Python copies must be stored somewhere). If, however, you execute a line that creates an object, it will be stored in __R__.namespace, and once that object goes out of scope in R it will be deleted from __R__.namespace. Only functions are stored in __R__.namespace without being deleted when they fall out of scope in R.

    I suppose that it may have been a concern to delete functions, but recall that __R__.namespace only stores aliases of the functions rather than the functions themselves. If a function is deleted in __R__.namespace, it can still exist elsewhere.
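The aliasing argument can be checked in plain Python. This is a minimal sketch in which a hypothetical dict named `namespace` stands in for __R__.namespace. It relies on CPython's reference counting, so the final check may behave differently on other interpreters.

```python
import gc
import weakref


def identity(x):
    return x


# Hypothetical stand-in for __R__.namespace: it holds an alias, not a copy.
namespace = {"f0": identity}

# A weak reference lets us observe the function without keeping it alive.
ref = weakref.ref(identity)

# Deleting the alias does not destroy the function; the name `identity`
# still refers to it.
del namespace["f0"]
alive_after_alias_removed = ref() is not None

# Only when the last strong reference goes away is the function collected.
del identity
gc.collect()
alive_after_last_reference_removed = ref() is not None

print(alive_after_alias_removed, alive_after_last_reference_removed)
```

Removing the alias is therefore safe for any function that is still referenced elsewhere in Python.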

  3. Florian Schwendinger repo owner

    I added a finalizer to pyFunction, and pyGet will now also clean up functions.
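For readers unfamiliar with the finalizer pattern, here is a rough Python analogue using `weakref.finalize`. It is not the code added to the package: `namespace` and `Handle` are hypothetical, with `Handle` playing the role of the R-side wrapper whose collection should remove the registry entry.

```python
import gc
import weakref

# Hypothetical stand-in for __R__.namespace.
namespace = {}


class Handle:
    """Hypothetical proxy that registers a target and unregisters it on collection."""

    def __init__(self, key, target):
        namespace[key] = target
        self.key = key
        # The callback must not reference `self`, or the handle would never
        # become collectable. It simply pops the key from the registry.
        weakref.finalize(self, namespace.pop, key, None)

    def __call__(self, *args):
        return namespace[self.key](*args)


handle = Handle("f0", lambda x: x)
result = handle(98)
registered = "f0" in namespace

# Dropping the handle triggers the finalizer, much as rm() followed by gc()
# would on the R side.
del handle
gc.collect()
cleaned = "f0" not in namespace

print(result, registered, cleaned)
```

The registry entry lives exactly as long as the handle, which is the behaviour the original report expected for functions.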
