readline, free() invalid pointer and interactive python freeze

Issue #10 resolved
Laurent Gautier created an issue

Joseph Xu reported:

Hi:

I've just started using rpy2, and am running into a strange crash that involves the readline module. The attached test case should sufficiently explain the problem. If you run it you should see something like

*** glibc detected *** python: free(): invalid pointer: 0xb79c6840 ***

and the terminal will stop responding.

I'm using python version 2.6.2 and rpy2 version 2.0.4 on Arch Linux.

Comments (11)

  1. Laurent Gautier reporter

    It seems to work with python-2.6.2 and both rpy2-2.0.5-dev and rpy2-2.1-dev on OS X.

    Can you check whether either of those solves your problem?

  2. Former user Account Deleted

    I tried again with code checked out from hg, and I still get the same problem. So maybe the conflict is with the Linux readline library?

  3. Laurent Gautier reporter
    • changed status to open

    I tried on Linux and I see the problem. Going through your snippet of code, I noticed that there are two "import readline" statements; why is that?

    import readline
    from rpy2.robjects import r
    import readline
    
    readline.set_completer_delims("")
    

    After a bit of experimentation, it seems that the problem occurs whenever rpy2 is imported *between* "import readline" and "readline.set_completer_delims". Importing rpy2 before the former or after the latter seems to be just fine (and this currently seems to be a workaround).

    At this point, I don't know whether this is an issue with rpy2 or with readline.
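
    The ordering constraint can be turned into a small guard. This is only a sketch: `risky_import_order` is a hypothetical helper (not part of rpy2 or readline) that inspects `sys.modules` to detect that readline was loaded before rpy2, which is the ordering described above as the trigger.

```python
import sys


def risky_import_order(loaded=None):
    """Return True if readline is already loaded but rpy2 is not.

    Hypothetical helper: in that state, importing rpy2 and then calling
    readline.set_completer_delims() matches the failing sequence.
    """
    loaded = sys.modules if loaded is None else loaded
    return "readline" in loaded and "rpy2" not in loaded


if __name__ == "__main__":
    if risky_import_order():
        print("readline was imported first; avoid calling "
              "readline.set_completer_delims() after importing rpy2")
```

    The check only reports the state; it does not change what either library does with readline.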

  4. Joseph Xu

    Hmm, I guess when I was trying to isolate a test case I didn't realize the second readline was not necessary. The real situation I encountered this bug in was a little more complicated. I imported a module that itself imported readline (it was handling tab-completion). Then I imported rpy2. Then I called a function in my module that made the set_completer_delims call. Something like this:

    -------------------- test.py -------------------------

    import test2
    from rpy2.robjects import r
    test2.f()
    

    -------------------- test2.py -------------------------

    import readline
    def f():
        readline.set_completer_delims("")
    

    -------------------------------------------------------

    You're also right about there not being a problem if I import rpy2 before readline. That solves my problem, but it still seems like it would be a good idea to get to the bottom of this issue.

  5. Laurent Gautier reporter

    My guess is that something funky is going on with either the readline package or R's use of readline.

    Setting the delimiter from Python's readline is probably making an (unsafe) assumption about something that happened during the import of readline, and importing R in between breaks it all.

    The issue is left as "open" until someone finds the time to trace the issue in the readline package (since this is where it bombs): this would start by tracing the code in set_completer_delims at the Python level, then, I suspect, continue at the C level.

    Contributions to get there are welcome.

  6. Joseph Xu

    I think you're exactly right.

    A quick look through readline.c in the python source suggests what's happening in set_completer_delims is that python is calling free() on the external variable rl_completer_word_break_characters. This is a variable defined in gnu readline's header file readline.h, and is also set by R when it's started (or in our case, imported via rpy2):

    $ ipython
    
    In [1]: import readline
    
    In [2]: readline.get_completer_delims()
    Out[2]: ' \t\n`!@#$^&*()=+[{]}\\|;:\'",<>?'  # python readline sets rl_completer_word_break_characters on import
    
    In [3]: from rpy2.robjects import r
    
    In [4]: readline.get_completer_delims()
    Out[4]: ' \t\n"\\\'`><=%;,|&{()}'   # R has set rl_completer_word_break_characters to point to something else
    
    In [5]: readline.set_completer_delims('')   # python tries to free the string it originally set
    *** glibc detected *** /usr/bin/python: free(): invalid pointer: 0xb771a840 ***
    

    So my guess is that the segfault is from python trying to free memory that is local to the R library somehow?

    It seems to me like the correct solution to this is to prevent R from setting rl_completer_word_break_characters when rpy2 is imported, since it doesn't actually have a user interface, and to let the parent python process maintain full control of readline.
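
    The sequence described above can be illustrated with a toy model. This is a sketch only, assuming (as described in this comment) that Python stores a heap-allocated copy of the delimiters on import and frees the current pointer in set_completer_delims. `ToyHeap`, `STATIC_R_BUFFER` and `simulate` are hypothetical names, and nothing here touches the real readline library.

```python
class ToyHeap:
    """Toy allocator that remembers which addresses it handed out."""

    def __init__(self):
        self._live = set()
        self._next = 0x1000

    def malloc(self):
        addr = self._next
        self._next += 0x10
        self._live.add(addr)
        return addr

    def free(self, addr):
        # A real allocator may abort the process here; the model raises.
        if addr not in self._live:
            raise RuntimeError("free(): invalid pointer: %#x" % addr)
        self._live.remove(addr)


# Stands in for a static buffer owned by R (never returned by malloc).
STATIC_R_BUFFER = 0xB0000000


def simulate(r_imported_in_between):
    heap = ToyHeap()
    # "import readline": the shared global points at a heap copy.
    word_break_ptr = heap.malloc()
    if r_imported_in_between:
        # R initialisation repoints the shared global at its own buffer.
        word_break_ptr = STATIC_R_BUFFER
    # set_completer_delims(): free the current pointer, store a new copy.
    heap.free(word_break_ptr)
    word_break_ptr = heap.malloc()
    return word_break_ptr
```

    In this model `simulate(False)` completes normally, while `simulate(True)` raises because the pointer being freed was never allocated by the heap.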

  7. Laurent Gautier reporter

    Preventing R from setting rl_completer_word_break_characters does not appear straightforward at first sight (if possible at all).

    The setting happens here: src/unix/sys-std.c

    attribute_hidden
    void set_rl_word_breaks(const char *str)
    {
        static char p1[201], p2[203];
        strncpy(p1, str, 200); p1[200]= '\0';
        strncpy(p2, p1, 200); p2[200] = '\0';
        strcat(p2, "[]");
        rl_basic_word_break_characters = p2;
        rl_completer_word_break_characters = p1;
    }
    

    This function is called from

    void attribute_hidden InitOptions(void)
    

    located in src/main/options.c

        /* value from Rf_initialize_R */
        SET_TAG(v, install("rl_word_breaks"));
        SETCAR(v, mkString(" \t\n\"\\'`><=%;,|&{()}"));
        set_rl_word_breaks(" \t\n\"\\'`><=%;,|&{()}");
    

    InitOptions is itself called in main.c, from setup_Rmainloop()...

    Rewriting setup_Rmainloop() is certainly possible (although a likely maintenance headache), but it would still stumble on the InitOptions() part (at least).

    Submitting a request to the R development team is possible (but from experience has *very* limited chances of success, even when a patch is provided).

  8. Laurent Gautier reporter

    The problem is likely on the R side.

    I have a patch for R-2.12 (attached). It appears to work on 64-bit Linux but seems to stir up more problems on 32-bit Linux (malloc.c:3096: sYSMALLOc: ...etc... error).

  9. Laurent Gautier reporter
    • changed milestone to 2.2.0

    I have committed a change that solves the problem without having to patch R (and seems to work on 32-bit Linux): 408bae913653.

    The potential downside is that now <readline/readline.h> is required for rpy2 to compile.

    The fix is still somewhat rough, as nothing is in place yet for compiling in the absence of readline/readline.h (either through automated detection or an explicit switch).
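
    One possible shape for the automated detection is a header-presence check in the build script. The following is a sketch under assumptions, not what the commit does: `find_header`, the candidate directory list and the `HAS_READLINE` macro name are all hypothetical, and a real build should also honor the compiler's own include paths.

```python
import os


def find_header(relative_path, include_dirs):
    """Return the first directory containing relative_path, or None."""
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, relative_path)):
            return directory
    return None


# Candidate locations are assumptions for illustration only.
CANDIDATE_DIRS = ["/usr/include", "/usr/local/include", "/opt/local/include"]

HAVE_READLINE_H = find_header(
    os.path.join("readline", "readline.h"), CANDIDATE_DIRS
) is not None

# A setup script could pass this (hypothetical) macro to the C compiler
# so that the readline-specific code is only compiled when available.
define_macros = [("HAS_READLINE", "1")] if HAVE_READLINE_H else []
```

    An explicit switch could simply override `HAVE_READLINE_H` from a command-line option or an environment variable.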
