Segmentation fault when feeding stats.var_test with two pandas series of different dtype

Issue #264 resolved
Kawing Chiu created an issue

The following script, fewer than 10 lines of code, triggers a segmentation fault:

from pandas import read_csv
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri

stats = importr("stats")

pandas2ri.activate()

data = read_csv("core.csv")
var1 = data.iloc[:,0]  # mistakenly selects the date column instead of a data column
var2 = data.iloc[:,1]

stats.var_test(var1, var2)

Contents of core.csv:

$ cat core.csv 
date,var1,var2
2015-01-10,5.10,4.76
2015-01-11,5.84,4.88
2015-01-12,5.10,4.81
2015-01-13,6.01,5.87
2015-01-14,3.69,3.41
2015-01-15,3.79,3.81
2015-01-16,4.53,4.64
2015-01-17,5.04,4.94

Environment: Python 3.4.2 and rpy2 2.5.5 on Arch Linux.
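
The crash is triggered by passing the date column to R by mistake. As a minimal sketch of a guard on the Python side (it does not fix the underlying segfault), the column's dtype can be checked before anything is handed to rpy2. The helper `numeric_column` is hypothetical and exists only for this illustration, and the CSV text is a shortened copy of core.csv:

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype

# Shortened copy of core.csv from the report above.
CSV_TEXT = """date,var1,var2
2015-01-10,5.10,4.76
2015-01-11,5.84,4.88
2015-01-12,5.10,4.81
"""

data = pd.read_csv(io.StringIO(CSV_TEXT))


def numeric_column(frame, position):
    """Return the column at `position`, or raise if it is not numeric.

    Hypothetical helper for illustration; not part of pandas or rpy2.
    """
    column = frame.iloc[:, position]
    if not is_numeric_dtype(column):
        raise TypeError(
            "column %r has dtype %s, expected a numeric dtype"
            % (column.name, column.dtype)
        )
    return column


var1 = numeric_column(data, 1)  # the 'var1' column
var2 = numeric_column(data, 2)  # the 'var2' column
```

With this check in place, `numeric_column(data, 0)` raises a `TypeError` in Python instead of letting the date column reach the conversion layer.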

Comments (4)

  1. Laurent Gautier

    This happens when trying to convert the Series that holds the dates:

    from rpy2 import robjects
    robjects.conversion.py2ri(var1)
    
    # segfault
    

    The specific conversion function called is obtained as follows:

    robjects.conversion.py2ri.dispatch(type(var1))
    # answer: rpy2.robjects.pandas2ri.py2ri_pandasseries
    

    The root of the problem appears to be the assignment of a "names" slot to the object (the following line in the conversion function):

    res.do_slot_assign('names', ListVector({'x': conversion.py2ri(obj.index)}))
    

    I suspect an upstream issue (in R and R's C API): "names" seems to have acquired a special status internally in recent versions, and the SET_SLOT macro no longer handles it like any other slot name.
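
The `.dispatch(type(var1))` lookup in the comment above is the pattern offered by `functools.singledispatch`. The sketch below assumes the rpy2 converter behaves like such a generic function, which the `.dispatch` call suggests but which this thread does not confirm. `FakeSeries`, `to_r`, and `to_r_series` are hypothetical stand-ins, not rpy2 or pandas names:

```python
from functools import singledispatch


class FakeSeries:
    """Hypothetical stand-in for pandas.Series; not part of pandas or rpy2."""


@singledispatch
def to_r(obj):
    # Fallback used when no converter is registered for type(obj).
    raise NotImplementedError("no conversion for %s" % type(obj).__name__)


@to_r.register(FakeSeries)
def to_r_series(obj):
    # Hypothetical converter, playing the role of py2ri_pandasseries.
    return "converted series"


# dispatch() returns the implementation a call would select for a type,
# without calling it, so the lookup itself does not run the conversion.
chosen = to_r.dispatch(FakeSeries)
print(chosen.__name__)
```

Looking up the implementation this way identifies which conversion function is involved without executing it, which is useful when the conversion itself crashes the interpreter.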
