rpy2 fails to convert internal objects to numpy/pandas objects

Issue #227 resolved
christian roth
created an issue

I am working with the latest snapshot on OS X.

I see rather serious issues with the conversion between internal objects and python objects:

In [28]: robjects.r("1+1")
Out[28]: 
<FloatVector - Python:0x1084519e0 / R:0x100b225d8>
[2.000000]

Manual conversion works perfectly fine:

In [29]: import rpy2.robjects.numpy2ri as rpyn
In [30]: rpyn.ri2py(robjects.r("1+1"))
Out[30]: array([ 2.])

I have tried to track down this issue, but the ri2py conversion logic seems to be scattered across the code. Surprisingly, this rather severe issue does not seem to be covered by any test.

Comments (15)

  1. Laurent Gautier

    This is not a bug, but a result of having made the conversion system a little more coherent than it used to be.

    The conversion to "non-rpy2" objects is potentially very expensive (it requires a copy), so none is performed implicitly, although the implementation of such conversions can be provided.

    So what you are seeing is that calling R with robjects.r("1+1") returns a robjects object. If you want to get "non-rpy2" objects:

    import rpy2.robjects as robjects
    import rpy2.robjects.numpy2ri as rpyn
    rpyn.activate()

    from rpy2.robjects import conversion
    conversion.ri2py(robjects.r("1+1"))

    # obviously, one can write a wrapper
    def my_r(string):
        return conversion.ri2py(robjects.r(string))
    
  2. christian roth reporter

    I see that this might be the desired behavior when working on scripts, but do you intend to keep this behavior in the ipython magic as well? I find it a little bit unintuitive especially when working with the notebook.

  3. Laurent Gautier

    That's a tricky one. Changing the configuration of the conversion can be made easy (ri2py can be a pass-through by default, and switching to a potentially expensive one can be one register away).

    However, @Dav Clark will know better than me what the rmagic should be doing.

  4. Dav Clark

    I think this is a documentation bug. Specifically, @christian roth, if you inject the following line before executing your arithmetic, I think you get the behavior you prefer?

    rpyn.activate()
    

    This logic is also activated by the rpy2.ipython magic:

    In [1]: %load_ext rpy2.ipython
    
    In [2]: %R 1
    Out[2]: array([ 1.])
    

    The %load_ext line is effectively calling whichever activate function is most general, starting with pandas, then trying numpy, then settling on rpy2 proxy objects.
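A minimal sketch of that fallback order, not the actual rpy2.ipython code: it assumes that rpy2.robjects.pandas2ri and rpy2.robjects.numpy2ri each expose an activate() function, and the helper names first_importable and activate_most_general are hypothetical.

```python
import importlib


def first_importable(module_names):
    """Return the first module in module_names that imports cleanly, else None."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This converter (or one of its dependencies) is missing; try the next.
            continue
    return None


def activate_most_general():
    """Hypothetical helper: activate the most general converter available.

    Tries pandas first, then numpy. Returning None means nothing was
    activated, so results stay as rpy2 proxy objects.
    """
    module = first_importable(
        ["rpy2.robjects.pandas2ri", "rpy2.robjects.numpy2ri"]
    )
    if module is not None:
        module.activate()  # assumed to exist on both modules
    return module
```

Calling activate_most_general() once at start-up would mimic the described behaviour; inspecting its return value shows which converter, if any, was picked.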

    So, would better documentation of that fix the current bug?

    I'm aware that folks have desires to have a bit more control over the conversion, and that the current system is complex. But before I didn't even understand it and parts of it weren't working. Next step is to clean things up, and I'd certainly welcome your input on that. I'll likely be resuming work over here:

    https://github.com/ContinuumIO/blaze/pull/502

    I'll give out commit rights to my blaze fork in a heartbeat!

  5. christian roth reporter

    Thanks for the reply. It seems that the conversion logic in the ipython magic does not work for me.

    In [1]:
    %load_ext rpy2.ipython
    %R 1
    
    Out[1]:
    <FloatVector - Python:0x10b5571b8 / R:0x108047598>
    [1.000000]
    

    I actually like the new conversion strategy - it really makes sense. From what I read in your comments, I think the design is perfectly fine and I'm just hitting a bug here.

  6. Dav Clark

    So, I can confirm that the exact code you have above should result in a numpy array (and on my machines it does). The next step here is the standard debugging stuff:

    1. Running rpy2 2.4.4?
    2. System architecture?
    3. numpy installed?
    4. Any other details you think would be relevant?

  7. christian roth reporter

    My rpy2 version is the latest snapshot of this bitbucket repo. This is on OS X in a conda environment with both numpy and pandas installed. Could you check for me whether you can reproduce this error with the latest rpy2 snapshot? If it does not occur, I will try to debug this issue in the next few days. Unfortunately, I do not have much time right now.

  8. Laurent Gautier

    @Dav Clark : What @christian roth is observing with the branch default is what 2.5.0 will be like unless changed (not counting "rmagic", default is quite close to a 2.5.0 release). The old behaviour (obtaining non-rpy2 objects through ri2ro) was not very coherent, and this is why the magic now returns an robjects object.

    Having the R magic go through an ri2py conversion (rather than ri2ro) would return non-rpy2 objects (whenever possible). Thanks to the use of a generic/single dispatch, it would become easier than before for anyone to add customization.
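To illustrate how single dispatch makes such customization a matter of registering one function, here is a self-contained sketch using functools.singledispatch from the standard library. FakeFloatVector and ri2py_sketch are hypothetical stand-ins, not rpy2 classes or functions; only the pattern (pass-through by default, conversion added by registration) is the point.

```python
from functools import singledispatch


class FakeFloatVector:
    """Hypothetical stand-in for an rpy2 vector; not the real class."""

    def __init__(self, values):
        self.values = list(values)


@singledispatch
def ri2py_sketch(obj):
    # Default rule: pass the object through untouched, so no copy is made.
    return obj


@ri2py_sketch.register(FakeFloatVector)
def _(obj):
    # Opt-in rule: copy the data out into a plain Python list.
    return list(obj.values)
```

With only the default rule, every object comes back unchanged; registering a rule for one type changes the result for that type alone and leaves the rest as pass-through.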

  9. Dav Clark

    So, I wish I had more time to be responsive on this, but I've got some pressing deadlines. It is my intention to get this sorted out for future versions, but I might not start on this until November. And so it seems like 2.5.0 will thus likely not have full-auto-conversion.

  10. Laurent Gautier

    @christian roth I am not sure that rmagic is /broken/. I think that this is more a change in the API (resulting from a stricter behavior of the conversion function). Here it means that the *2ro conversions return robjects-level objects (and not sometimes rpy2 objects, sometimes not, as used to be the case). Now, the move to single dispatch/generics also means that customizing what the conversion returns is becoming simpler.

    @Dav Clark I have been thinking a bit about that conversion business with the rmagic. What would you think of having conversion functions just for it? Something like ipython2ri and ri2ipython (names can change). It would make conversion rules specific to the rmagic possible. For example, higher costs of translation might be an acceptable trade-off for convenience during interactive work in ipython but not in a script. The specific conversion would make that clear and remove the need for an activation/deactivation dance within R magic. For rpy2-2.5.0, I could easily add such functions to rmagic as "pass-through" and show how to register a new converter so users like @christian roth get what they want with minimal effort. The usage experience collected would go into rpy2-2.6.0. What do you think?

    [I wrote a proposal here: #230]
