PyObject_GetBuffer fails on arrays

Issue #2909 invalid
Louis Sautier
created an issue

Hello, the regex package fails tests with PyPy.

Apparently PyObject_GetBuffer(string, &str_info->view, PyBUF_SIMPLE) fails on array.array objects. The relevant code is in regex's _regex.c.

Test programme:

import regex
import array

a = array.array("c")
regex.compile("bla").match(a)

Result:

pypy test.py
PyObject_GetBuffer failed!
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    regex.compile("bla").match(a)
TypeError: expected string or buffer

Version info:

$ pypy --version
Python 2.7.13 (ab0b9caf307db6592905a80b8faffd69b39005b8, Oct 13 2018, 23:33:37)
[PyPy 6.0.0 with GCC 8.2.0]

Comments (5)

  1. Armin Rigo

    I tested, and I get the same error on top of CPython 2.7:

    PyObject_GetBuffer(some_array_object, &view, PyBUF_SIMPLE)
    TypeError: 'array.array' does not have the buffer interface
    

    so it seems that PyPy correctly implements the same behaviour as CPython. The difference is likely inside regex itself: the line just before the call is #if defined(PYPY_VERSION). Nowadays the PyPy special case should probably be removed completely. You can check whether that works: if it doesn't, then there is a real PyPy-CPython difference for us to fix; if it does, then I suggest opening a pull request on the regex package to remove this code.

  2. Louis Sautier reporter

    Hello @Armin Rigo, I've tried to do that but it fails with another error:

    Traceback (most recent call last):
      File "test.py", line 5, in <module>
        regex.compile("bla").match(a)
    SystemError: An exception was set, but function returned a value
    

    Here's what I did to the regex package:

    diff -r 92a74b206f0d regex_2/_regex.c
    --- a/regex_2/_regex.c  Thu Nov 22 02:09:43 2018 +0000
    +++ b/regex_2/_regex.c  Tue Nov 27 12:27:59 2018 +0100
    @@ -17539,27 +17539,6 @@
         }
    
     #endif
    -#if defined(PYPY_VERSION)
    -    /* Get pointer to string buffer. */
    -    if (PyObject_GetBuffer(string, &str_info->view, PyBUF_SIMPLE) != 0) {
    -        printf("PyObject_GetBuffer failed!\n");
    -        PyErr_SetString(PyExc_TypeError, "expected string or buffer");
    -        return FALSE;
    -    }
    -
    -    if (!str_info->view.buf) {
    -        PyBuffer_Release(&str_info->view);
    -        PyErr_SetString(PyExc_ValueError, "buffer is NULL");
    -        return FALSE;
    -    }
    -
    -    str_info->should_release = TRUE;
    -
    -    str_info->characters = str_info->view.buf;
    -    str_info->length = str_info->view.len;
    -    str_info->charsize = 1;
    -    str_info->is_unicode = FALSE;
    -#else
         /* Get pointer to string buffer. */
         buffer = Py_TYPE(string)->tp_as_buffer;
         str_info->view.len = -1;
    @@ -17620,7 +17599,6 @@
    
         str_info->length = size;
         str_info->is_unicode = FALSE;
    -#endif
    
         return TRUE;
     }
    
  3. Armin Rigo

    The C code contains a bug: it calls (*buffer->bf_getbuffer)(string, &str_info->view, PyBUF_SIMPLE), but if that returns -1, it never clears the exception. I think that's why it ends up with the error SystemError: An exception was set, but function returned a value. My guess is that something it calls afterwards happens to swallow the exception on CPython but not on PyPy. If this is correct, then it is really a bug in _regex.c that should be fixed in general.

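The behaviour described in comment 1 can be probed from Python without compiling anything. The sketch below is a minimal check, assuming that memoryview(obj) goes through the same new-style buffer protocol as PyObject_GetBuffer. It requests different flags than PyBUF_SIMPLE, so it is an approximation rather than an exact reproduction. The helper name supports_new_buffer and the "b" typecode are illustrative choices, not part of regex or the original report; "b" is used because the "c" typecode exists only on Python 2.

```python
import array


def supports_new_buffer(obj):
    """Return True if obj can be wrapped in a memoryview.

    Hypothetical helper for illustration: memoryview() relies on the
    new-style buffer protocol, so a TypeError here suggests that a
    PyObject_GetBuffer call on the same object would fail as well.
    """
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True


# "b" (signed char) is an assumption chosen so the script runs on both
# Python 2 and Python 3; the original report used array.array("c").
a = array.array("b", [98, 108, 97])
print("array.array exports the new buffer protocol:", supports_new_buffer(a))
```

On CPython 2.7 this should report False for the array, matching the TypeError quoted in comment 1. On Python 3, where array.array implements the new buffer protocol, it reports True.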