segfaulting when wrapping an R function for CV in sklearn through joblib

Issue #608 new
Jonathan Taylor created an issue

For pedagogical reasons, I’m trying to illustrate how one might wrap an R function with a parameter that is automatically cross-validated using sklearn. The example I’m using is choosing i in leaps::regsubsets(..., nvar=i). I have written a function to compute MSEs for a fixed i, and it seems to run on its own, but when I use _delayed and Parallel from joblib I get a segfault.

Example here:

This function previously worked when pandas2ri.activate / pandas2ri.deactivate was the method for converting DataFrames back and forth between pandas and R. I have tried a few different approaches with ContextManager, but the segfault persists.
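For readers unfamiliar with the pattern described above, the following is a minimal sketch of per-fold MSE computation dispatched through joblib, with the pandas-to-R conversion scoped by rpy2's `localconverter` context manager instead of `pandas2ri.activate`. It is not the reporter's code and not a confirmed fix for the segfault. The names `kfold_indices`, `fold_mse`, `cv_mse`, and the `r_source` argument are hypothetical; `r_source` is assumed to be R source text for a function `function(train, test, i)` that returns a single numeric MSE.

```python
from statistics import mean


def kfold_indices(n_rows, n_folds):
    """Yield (train, test) row-index lists; row j is tested in fold j % n_folds."""
    for k in range(n_folds):
        test = [j for j in range(n_rows) if j % n_folds == k]
        train = [j for j in range(n_rows) if j % n_folds != k]
        yield train, test


def fold_mse(r_source, df, train_idx, test_idx, i):
    """Score one fold by calling a (hypothetical) R function on converted frames."""
    # rpy2 is imported inside the worker so that each joblib worker process
    # imports it itself (an assumption of this sketch, not a verified remedy).
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # Conversion rules apply only inside this block; nothing is activated globally.
    with localconverter(ro.default_converter + pandas2ri.converter):
        r_func = ro.r(r_source)  # evaluates the R source to an R function
        result = r_func(df.iloc[train_idx], df.iloc[test_idx], i)
    # Return a plain Python float so no R object crosses the process boundary.
    return float(result[0])


def cv_mse(r_source, df, i, n_folds=5, n_jobs=2):
    """Average the per-fold MSEs for a fixed value of the tuning parameter i."""
    # Deferred import so the fold-splitting helper works without joblib installed.
    from joblib import Parallel, delayed

    scores = Parallel(n_jobs=n_jobs)(
        delayed(fold_mse)(r_source, df, train, test, i)
        for train, test in kfold_indices(len(df), n_folds)
    )
    return mean(scores)
```

The design choice here is that only picklable Python values (a source string, a pandas DataFrame, index lists, an integer) are passed to the workers, and only a float comes back. Choosing i would then amount to calling `cv_mse` once per candidate value and keeping the smallest result.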

In [2]: rpy2.__version__
Out[2]: '3.2.0'

In [3]: import pandas

In [4]: pandas.__version__
Out[4]: '0.25.1'

In [5]: import numpy

In [6]: numpy.__version__
Out[6]: '1.17.2'

Comments (1)

  1. Jonathan Taylor reporter

    Hmm… after a little fiddling, the segfault doesn’t seem to happen anymore. I honestly cannot say what I did to fix this.

    I find conversion between pandas and rpy2 frustratingly complex at times, and it does seem to change fairly frequently. Is this ContextManager conversion scheme stable now?
