Cannot convert Python Pandas Dataframe to R

Issue #16 new
Yichi Liu created an issue

Hi,

I wanted to convert a pandas DataFrame to an R data.frame using pyGet, but I got the error below:

Error in as.data.frame.default(xi, optional = TRUE, stringsAsFactors = stringsAsFactors) : cannot coerce class ""PythonObject"" to a data.frame

The text.csv file contains only two columns, name (string) and value (int); there is no timestamp data in it. I printed the pandas DataFrame inside pyExec, along with the type of py_df; the results are in the attached screenshot (Capture.PNG).

Can you suggest how to convert a pandas DataFrame to an R data.frame?

Thanks a lot!!
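
One possible workaround, not confirmed anywhere in this thread, is to reduce the DataFrame to plain Python containers on the Python side before fetching it, so that the bridge does not have to convert a pandas object at all. The sketch below uses the toy data from the comments; `cols` is a hypothetical variable name, and whether pyGet then returns something that `as.data.frame()` accepts on the affected setup is an assumption that would need checking.

```python
import pandas as pd

# Toy data taken from the comments in this thread.
d = {'name': ('aaaa', 'bbbb', 'cccc', 'dddd'), 'value': (3, 5, 8, 2)}
df = pd.DataFrame(d)

# Reduce the DataFrame to a dict of plain Python lists, one list per column.
# 'cols' is a hypothetical name chosen for this sketch.
cols = df.to_dict(orient='list')

print(cols['name'])
print(cols['value'])
```

A dict of lists carries no pandas-specific types, which is why it may be easier to transfer than the DataFrame itself.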

Comments (9)

  1. Florian Schwendinger repo owner

    Thank you for the report! I tried to reproduce the error, but it works on my laptop.

    library(PythonInR)
    
    pyExec("
    import pandas as pd
    
    d = {'name': ('aaaa', 'bbbb', 'cccc', 'dddd'), 'value': (3, 5, 8, 2)}
    df = pd.DataFrame(d)
    ")
    
    pyExecp("df")
    pyPrint("df")
    pyExecp("type(df)")
    
    pyGet("df")
    

    I guess something goes wrong with the encoding. Could you also provide the session information (sessionInfo()) and, if possible, the file testtest.csv? The output of pyExecp("py_df.apply(type)") would also help.
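
Since the guess above points at encoding and environment differences, it may also help to report the Python side of the setup next to sessionInfo(). The following is a minimal sketch under that assumption; `python_side_info` is a hypothetical helper, not part of PythonInR or pandas, and it reports `None` for pandas if the import fails.

```python
import platform
import sys


def python_side_info():
    """Collect basic facts about the Python interpreter in use."""
    info = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "default_encoding": sys.getdefaultencoding(),
    }
    try:
        import pandas as pd
        info["pandas"] = pd.__version__
    except ImportError:
        # pandas is not installed in this interpreter.
        info["pandas"] = None
    return info


for key, value in sorted(python_side_info().items()):
    print("%s: %s" % (key, value))
```

Comparing this output between a machine where pyGet("df") works and one where it fails would show whether the Python or pandas versions differ.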

  2. wysh1989

    I also have the same issue.

    pyExec("d = {'name': ('aaaa', 'bbbb', 'cccc', 'dddd'), 'value': (3, 5, 8, 2)}")
    pyExec("df = pd.DataFrame(d)")
    pyExecp("df")
       name  value
    0  aaaa      3
    1  bbbb      5
    2  cccc      8
    3  dddd      2
    pyPrint("df")
       name  value
    0  aaaa      3
    1  bbbb      5
    2  cccc      8
    3  dddd      2
    pyExecp("type(df)")
    <class 'pandas.core.frame.DataFrame'>
    pyExecp("df.apply(type)")
    name     <class 'pandas.core.series.Series'>
    value    <class 'pandas.core.series.Series'>
    dtype: object

    pyGet("df")
    cannot coerce class ""PythonObject"" to a data.frame
    [1] "cannot coerce class \"\"PythonObject\"\" to a data.frame"
    attr(,"class")
    [1] "errorMessage"

  3. Florian Schwendinger repo owner

    As I tried to explain before, it works on my PC. Unless you provide information such as sessionInfo(), I cannot see why it doesn't work on yours. Please run the following script.

    library(PythonInR)
    
    sessionInfo()
    
    pyExec("import pandas as pd")
    
    pyExec("d = {'name': ('aaaa', 'bbbb', 'cccc', 'dddd'), 'value': (3, 5, 8, 2)}") 
    pyExec("df = pd.DataFrame(d)") 
    pyExecp("df") 
    pyPrint("df")
    pyExecp("type(df)")
    pyExecp("df.apply(type)") 
    
    df <- pyGet("df") 
    class(df)
    df
    

    and post the output. On my PC it looks like this:

    R version 3.4.0 (2017-04-21) -- "You Stupid Darkness"
    Copyright (C) 2017 The R Foundation for Statistical Computing
    Platform: x86_64-pc-linux-gnu (64-bit)
    
    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.
    
    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.
    
    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.
    
    [Previously saved workspace restored]
    
    > library(PythonInR)
    
    Initialize Python Version 2.7.9 (default, Mar  1 2015, 13:01:26) 
    [GCC 4.9.2]
    
    > 
    > sessionInfo()
    R version 3.4.0 (2017-04-21)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: Debian GNU/Linux 8 (jessie)
    
    Matrix products: default
    BLAS: /home/florian/bin/R_dev/lib/libRblas.so
    LAPACK: /home/florian/bin/R_dev/lib/libRlapack.so
    
    locale:
     [1] LC_CTYPE=de_AT.UTF-8       LC_NUMERIC=C              
     [3] LC_TIME=de_AT.UTF-8        LC_COLLATE=de_AT.UTF-8    
     [5] LC_MONETARY=de_AT.UTF-8    LC_MESSAGES=de_AT.UTF-8   
     [7] LC_PAPER=de_AT.UTF-8       LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    [11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C       
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     
    
    other attached packages:
    [1] PythonInR_0.1-3
    
    loaded via a namespace (and not attached):
    [1] compiler_3.4.0 R6_2.2.1       pack_0.1-1    
    > 
    > pyExec("import pandas as pd")
    > 
    > pyExec("d = {'name': ('aaaa', 'bbbb', 'cccc', 'dddd'), 'value': (3, 5, 8, 2)}") 
    > pyExec("df = pd.DataFrame(d)") 
    > pyExecp("df") 
       name  value
    0  aaaa      3
    1  bbbb      5
    2  cccc      8
    3  dddd      2
    > pyPrint("df")
       name  value
    0  aaaa      3
    1  bbbb      5
    2  cccc      8
    3  dddd      2
    > pyExecp("type(df)")
    <class 'pandas.core.frame.DataFrame'>
    > pyExecp("df.apply(type)") 
    name     <property object at 0x7f822c824730>
    value    <property object at 0x7f822c824730>
    dtype: object
    > 
    > df <- pyGet("df") 
    > class(df)
    [1] "data.frame"
    > df
      name value
    0 aaaa     3
    1 bbbb     5
    2 cccc     8
    3 dddd     2
    > 
    > proc.time()
           user      system     elapsed
          0.928       0.188       1.075 
    
  4. wysh1989

    > sessionInfo()
    R version 3.2.2 (2015-08-14)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 7 x64 (build 7601) Service Pack 1
    
    locale:
    [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    
    
    attached base packages:
    [1] stats     graphics  grDevices datasets  tools     utils     methods   base     
    
    other attached packages:
    [1] PythonInR_0.1-3 devtools_1.12.0
    
    loaded via a namespace (and not attached):
    [1] R6_2.2.0      withr_1.0.2   memoise_1.1.0 pack_0.1-1    digest_0.6.12
    