Tests fail with AttributeError: 'numpy.ndarray' object has no attribute 'typeof'

Issue #218 closed
Liang-Bo Wang created an issue

Originally reported by @jcwright in Issue #194; the tests failed on both Debian and OSX.

All failing tests can be reproduced with:

$ pip install numpy ipython nose
$ nosetests rpy2.ipython.tests.test_rmagic.TestRmagic -v
# skipped ...
ERROR: Test that Rpush looks for variables in the local scope first.
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../rpy2/ipython/tests/test_rmagic.py", line 70, in test_push_localscope
    result = self.ip.user_ns['result']
nose.proxy.KeyError: 'result'
# skipped ...
AttributeError: 'numpy.ndarray' object has no attribute 'typeof'
# more similar tests ...
FAILED (SKIP=3, errors=4)

Full log on pastebin.

However, after pandas was installed, all of these tests passed.

$ pip install pandas
$ nosetests rpy2.ipython.tests.test_rmagic.TestRmagic -v
Ran 10 tests in 0.697s

OK (SKIP=2)

Comments (14)

  1. Laurent Gautier

    @Dav Clark
    I do not remember clearly whether rmagic requires pandas to function or not. If it does, it seems that a resolution would be to just skip the test.

  2. Dav Clark

    More specifically, we should use @unittest.skipIf. I'll plan to do this, but I'm putting it here in case anyone decides to submit a pull request first!
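A minimal sketch of the @unittest.skipIf approach mentioned above, assuming the test only needs to know whether pandas is importable. The names HAS_PANDAS, TestNeedsPandas and run_suite are hypothetical and are not taken from rpy2's test suite.

```python
import importlib.util
import unittest

# True when pandas can be found in the current environment (nothing is imported yet).
HAS_PANDAS = importlib.util.find_spec("pandas") is not None


class TestNeedsPandas(unittest.TestCase):
    """Hypothetical test case; stands in for a pandas-dependent rmagic test."""

    @unittest.skipIf(not HAS_PANDAS, "pandas is not installed")
    def test_dataframe_length(self):
        import pandas  # imported lazily so the module loads without pandas
        frame = pandas.DataFrame({"x": [1, 2, 3]})
        self.assertEqual(len(frame), 3)


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNeedsPandas)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Without pandas the test is reported as skipped rather than as an error, which is the behaviour being asked for here.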

  3. Dav Clark

    Thanks for pinging me on this - I've got a lot on my plate and I'd thought this was a test-only issue! But I see that there are in fact relatively easy-to-encounter bugs.

    In any case, the unofficial decision that I recall from a conversation with @lgauthier is that we would try to keep rmagic working for any of: pandas + numpy, just numpy, or neither. But it was enough of a PITA to make the tests cover all of these cases that I didn't do that - the tests for rmagic mostly target pandas + numpy being installed.

    In any case, I don't want full-on bugs like the one referenced in the stack-overflow above. So, I'll make that a priority. That said, there's no separate issue for that currently, right?

  4. Dav Clark

    Also, I should add that rpy2's default behavior is to return proxy objects for arrays and dataframes (instead of numpy / pandas objects) because that's more efficient, and the proxies support the array protocol already (right?).

    It may make more sense for the default behavior of the ipython magic to be different, though.

    I've got a stalled project to implement conversion using blaze.into, which I think is the right way to make a lot of this more sane... not sure where the best focus of my limited energy would be, but again, for now I'll focus on the actual bug.

  5. Dav Clark

    Ok - this is now fixed in the tip of version_2.4.x (549a1ac). There was an issue where pandas2ri was routing around numpy2ri. There was a doubled attempt at conversion only for line magics (which I never use, so haven't seen). But the existence of pandas was (incorrectly) disabling the initial conversion to a numpy array.

    In any case, things make a lot more sense now, and I suspect rmagic users will be very happy with this update. Someone else have a look and if all's well, let's roll this one out?

  6. Laurent Gautier

    Hi @Dav Clark, I am cool with any requirement the ipython magic might have, although I might not want all of them to bubble up as requirements to run rpy2 in general.

    rpy2 returns proxy objects because this is the most efficient approach. All proxy vectors/arrays implement Python's buffer protocol, so they should work directly when passed to numpy functions. Arrays of strings are more problematic and will require a copy. Also, R and numpy use different systems for NAs, and I think that it is better to let a user explicitly do a conversion (so that any problems can be traced to that step).
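To illustrate what "implements the buffer protocol" buys, here is a standard-library sketch. It uses array.array as a stand-in for a proxy vector (an assumption for illustration only; no rpy2 or numpy objects are involved) and shows that a consumer of the buffer sees the same memory without a copy.

```python
from array import array

# Stand-in for a proxy vector of doubles; array.array exposes the buffer protocol.
r_like_vector = array("d", [1.0, 2.0, 3.0])

# A memoryview is a zero-copy window onto the exporter's memory.
view = memoryview(r_like_vector)

# Writing through the view changes the original object: no copy was made.
view[0] = 10.0

first_value = r_like_vector[0]
element_format = view.format  # "d" means C double
```

Any consumer that understands the buffer protocol can read the data in the same way, which is why no explicit conversion step is needed for numeric vectors.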

    About conversion, I have looked at using generics (see #197) and I have quickly made a drop-in replacement for the current system. The only quirk is that deactivate is gone unless I either use what seems to be a not-so-orthodox trick (http://stackoverflow.com/questions/25951651/unregister-for-singledispatch) or make a sort of stack of generics.

    blaze.into would be better than generics, but using generics would simplify the current system and may buy time until something better is done.
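The "stack of generics" idea can be sketched with functools.singledispatch as follows. This is only an illustration under stated assumptions: ConverterStack, make_default_converter and the activate/deactivate signatures are hypothetical and are not rpy2's actual API.

```python
from functools import singledispatch


def make_default_converter():
    """Build a fresh generic whose fallback returns the object unchanged."""
    @singledispatch
    def convert(obj):
        return obj
    return convert


class ConverterStack:
    """Hypothetical stack of generics, so that deactivate() is a simple pop."""

    def __init__(self):
        self._stack = [make_default_converter()]

    def activate(self, registrations):
        """Push a new generic carrying the given {type: function} rules."""
        generic = make_default_converter()
        for cls, func in registrations.items():
            generic.register(cls, func)
        self._stack.append(generic)

    def deactivate(self):
        """Pop the most recent generic, restoring the previous behaviour."""
        if len(self._stack) > 1:
            self._stack.pop()

    def convert(self, obj):
        """Dispatch on the type of obj using the generic on top of the stack."""
        return self._stack[-1](obj)


def list_to_tuple(obj):
    """Toy conversion rule used only for this illustration."""
    return tuple(obj)


stack = ConverterStack()
stack.activate({list: list_to_tuple})
converted = stack.convert([1, 2])    # handled by the activated rule
stack.deactivate()
unchanged = stack.convert([1, 2])    # falls back to the identity default
```

In this sketch each pushed generic starts from the identity default rather than inheriting the rules of the layer below it; a real implementation would have to decide which of the two is wanted.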

  7. Dav Clark

    To clarify in this same thread:

    1. Currently, rpy2.ipython.rmagic doesn't incur any dependency on numpy, or pandas.
      Rather, an attempt is made to import pandas2ri, and failing that, numpy2ri (else, nothing).
    2. The issue is that after calling pandas2ri.activate(), conversion is done automatically for numpy-ish
      objects, but not for DataFrames.
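The fallback described in point 1 might look roughly like the sketch below. The helper pick_first_importable is hypothetical, and the default module names are given on the assumption that the converters live under rpy2.robjects; the helper is only defined here, not called with those defaults.

```python
import importlib


def pick_first_importable(candidates=("rpy2.robjects.pandas2ri",
                                      "rpy2.robjects.numpy2ri")):
    """Return the first module in candidates that imports, or None if none does."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This converter's dependency is missing; try the next one.
            continue
    return None
```

A caller would then activate whichever converter module was returned and do nothing when the result is None.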

    In any case, I've got example code to exercise this in #206. Let's continue that conversation there. The current system otherwise appears to be working fine (there were logic errors, and I fixed them), so I don't know that it's worth a regression on .deactivate() to get a cleaner system.

  8. Laurent Gautier

    @Dav Clark

    The issue is that after calling pandas2ri.activate(), conversion is done automatically for numpy-ish objects, but not for DataFrames.

    The current implementation of the conversion is mixing things up a little. I am cleaning this up while moving to generics / single dispatch.

    Currently the conversion has 3 functions:

    • ri2ro: from rinterface (low-level interface to R) to robjects (high-level interface to R)

    • py2ri: from Python (as in "non-R") object to rinterface-level

    • py2ro: from Python (as in "non-R") object to robjects-level. In theory this function could be omitted by applying py2ri and then ri2ro

    This essentially means that there are 3 domains to convert to and from: rinterface, robjects and Python.

    The conversion system has been doing alright with only 3 conversion functions, rather than 6, because:

    • ro2ri: this function is probably not needed because robjects-level objects inherit from rinterface-level classes.

    • ro2py: with ro2ri not really needed for the reason above, ri2py will be enough

    • ri2py: from rinterface-level to a Python (as in "non-R") object. Would be needed, but it is absent. The confusion that exists in some parts of the code comes from this (the conversion system grew rather "organically").
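The relationship between the three domains can be sketched with toy classes. RIVector and ROVector are hypothetical stand-ins, not rpy2 classes; the function names follow the ones listed above, and ri2py is included only to show what the missing piece would do.

```python
class RIVector:
    """Stand-in for an rinterface-level (low-level) object."""

    def __init__(self, values):
        self.values = list(values)


class ROVector(RIVector):
    """Stand-in for a robjects-level object; it inherits from the low-level class."""


def py2ri(obj):
    """Python (non-R) object -> rinterface level."""
    return RIVector(obj)


def ri2ro(obj):
    """rinterface level -> robjects level."""
    if isinstance(obj, ROVector):
        return obj
    return ROVector(obj.values)


def py2ro(obj):
    """Python -> robjects level, obtained by composing the two functions above."""
    return ri2ro(py2ri(obj))


def ri2py(obj):
    """rinterface level -> Python; the conversion described as absent."""
    return list(obj.values)


high_level = py2ro([1, 2, 3])
# No ro2ri is needed: a robjects-level object already is an rinterface-level one.
is_also_low_level = isinstance(high_level, RIVector)
# For the same reason ri2py accepts it directly, so no separate ro2py is needed.
round_trip = ri2py(high_level)
```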

    The use of generics will provide much-needed clarity (if you feel curious, the default branch has it with only one unit test currently failing, and the failure is linked to that last point).
