Error converting pandas `Series` with dtype `pandas.Int32Dtype` or `pandas.Int64Dtype`.

Issue #544 resolved
Laurent Gautier created an issue

These (relatively new?) pandas types appear to be object / int hybrids that accommodate the possibility of having “NA” values in an integer vector.

Comments (9)

  1. Gijs Molenaar

    I think I might still be experiencing this issue. This code:

    import numpy as np
    from rpy2 import __version__
    from rpy2.robjects import r, pandas2ri
    import pandas as pd
    
    print(__version__)
    pandas2ri.activate()
    p = r['print']
    df = pd.DataFrame([(np.NaN,)], dtype=pd.Int64Dtype())
    print(df)
    p(df)
    

    gives:

    R[write to console]: During startup - 
    R[write to console]: Warning messages:
    R[write to console]: 1: Setting LC_COLLATE failed, using "C" 
    R[write to console]: 2: Setting LC_TIME failed, using "C" 
    R[write to console]: 3: Setting LC_MESSAGES failed, using "C" 
    R[write to console]: 4: Setting LC_MONETARY failed, using "C" 
    3.0.4
         0
    0  NaN
    /Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/robjects/pandas2ri.py:62: UserWarning: Error while trying to convert the column "0". Fall back to string conversion. The error is: 'Int64Dtype' object has no attribute 'isnative'
      % (name, str(e)))
    Traceback (most recent call last):
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/robjects/pandas2ri.py", line 57, in py2rpy_pandasdataframe
        od[name] = conversion.py2rpy(values)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/functools.py", line 827, in wrapper
        return dispatch(args[0].__class__)(*args, **kw)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/robjects/pandas2ri.py", line 144, in py2rpy_pandasseries
        res = func(obj)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/robjects/numpy2ri.py", line 56, in numpy2rpy
        if not o.dtype.isnative:
    AttributeError: 'Int64Dtype' object has no attribute 'isnative'
    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/rinterface_lib/sexp.py", line 361, in from_object
        mv = memoryview(obj)
    TypeError: memoryview: a bytes-like object is required, not 'Series'
    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3296, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-2-90913d81bcd8>", line 1, in <module>
        runfile('/Users/gijs/Library/Preferences/PyCharm2019.1/scratches/scratch_1.py', wdir='/Users/gijs/Library/Preferences/PyCharm2019.1/scratches')
      File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
        pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
      File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "/Users/gijs/Library/Preferences/PyCharm2019.1/scratches/scratch_1.py", line 11, in <module>
        p(df)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/robjects/functions.py", line 192, in __call__
        .__call__(*args, **kwargs))
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/robjects/functions.py", line 113, in __call__
        new_args = [conversion.py2rpy(a) for a in args]
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/robjects/functions.py", line 113, in <listcomp>
        new_args = [conversion.py2rpy(a) for a in args]
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/functools.py", line 827, in wrapper
        return dispatch(args[0].__class__)(*args, **kw)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/robjects/pandas2ri.py", line 63, in py2rpy_pandasdataframe
        od[name] = StrVector(values)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/robjects/vectors.py", line 379, in __init__
        super().__init__(obj)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/rinterface_lib/sexp.py", line 288, in __init__
        super().__init__(type(self).from_object(obj).__sexp__)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/rinterface_lib/sexp.py", line 365, in from_object
        res = cls.from_iterable(obj)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/rinterface_lib/conversion.py", line 28, in _
        cdata = function(*args, **kwargs)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/rinterface_lib/sexp.py", line 314, in from_iterable
        cast_in)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/rinterface_lib/sexp.py", line 239, in _populate_r_vector
        set_elt(r_vector, i, cast_value(v))
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/rinterface_lib/sexp.py", line 424, in _as_charsxp_cdata
        return conversion._str_to_charsxp(x)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/rinterface_lib/conversion.py", line 120, in _str_to_charsxp
        cchar = _str_to_cchar(val)
      File "/Users/gijs/Work/thingy/.venv/lib/python3.7/site-packages/rpy2/rinterface_lib/conversion.py", line 99, in _str_to_cchar
        b = s.encode(encoding)
    AttributeError: 'float' object has no attribute 'encode'
    R[write to console]: Warning: stack imbalance in 'lazyLoadDBfetch', 22 then 23
    
  2. Gijs Molenaar

    Correct me if I’m wrong, but it looks like the actual handling of pandas extension types is not implemented?

    https://pandas.pydata.org/pandas-docs/stable/development/extending.html#extension-types

    Currently, rpy2 seems to try to interpret the array as a native numpy array, but various attributes and methods such as .isnative and .ravel() are not implemented for these types:

    https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.arrays.IntegerArray.html#pandas.arrays.IntegerArray
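
To illustrate the point above, here is a minimal sketch using only pandas and numpy (no rpy2). It shows that the dtype of a nullable-integer Series is not a numpy dtype, and one possible way to get a plain numpy array out of it. The helper name `to_float_array` is hypothetical, and this is not necessarily how rpy2 does or should handle the conversion.

```python
import numpy as np
import pandas as pd

# A nullable-integer Series is backed by a pandas extension array,
# not by a numpy array.
s = pd.Series([1, None, 3], dtype=pd.Int64Dtype())

# The dtype is a pandas extension dtype rather than a numpy dtype, which is
# why code that expects numpy-only dtype attributes can fail on it.
is_numpy_dtype = isinstance(s.dtype, np.dtype)


def to_float_array(series):
    """Hypothetical helper: materialize a nullable-integer Series as float64.

    Missing values become NaN, so the integer type is not preserved.
    """
    return series.to_numpy(dtype="float64", na_value=np.nan)


as_float = to_float_array(s)
print(is_numpy_dtype)   # whether the Series dtype is a numpy dtype
print(as_float.dtype)   # dtype of the materialized numpy array
```

The cost of this workaround is that the column arrives as floating point, so an integer column with missing values would not stay an integer vector on the R side.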

  3. Gijs Molenaar

    I’ve managed to insert R NA values, it is in the PR.

    Have you seen this discussion? If you let me know how the round trip back to numpy should be handled (R NA values back to numpy NaN), I can give it a shot.

  4. Laurent Gautier reporter

    Correct me if I’m wrong, but it looks like the actual handling of pandas extension types is not implemented?

    Unfortunately they are not yet fully supported (the integer types might be the only ones): these are quite recent additions to pandas, and I am learning about them through issue reports or unit tests suddenly breaking.

    I’ve managed to insert R NA values, it is in the PR.

    Have you seen this discussion? If you let me know how the round trip back to numpy should be handled (R NA values back to numpy NaN), I can give it a shot.

    I am a bit behind on PR reviews. I should be able to look at it soon. The round trip will likely introduce loss of information (since R has both NaN and NA), but I am increasingly considering that conversion can’t escape this (meaning the loss of information).
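
The kind of information loss mentioned above can be sketched on the Python side alone. This is only an illustration with pandas and numpy, under the assumption that missing values end up as NaN in a plain float array; it does not reproduce rpy2's actual conversion.

```python
import numpy as np
import pandas as pd

# Two conceptually different inputs: a missing integer (NA) and a float NaN.
with_na = pd.Series([1, pd.NA], dtype="Int64")
with_nan = pd.Series([1.0, np.nan], dtype="float64")

# Flatten both to plain numpy float arrays, mapping NA to NaN.
a = with_na.to_numpy(dtype="float64", na_value=np.nan)
b = with_nan.to_numpy()

# After flattening, nothing distinguishes "missing" from "not a number".
same = np.array_equal(a, b, equal_nan=True)
print(same)
```

Once both inputs collapse to the same array, a conversion back cannot tell which of the two it started from, whatever convention is chosen.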
