PyObject_GetBuffer fails on arrays
Hello, the regex package fails its tests with PyPy. Apparently PyObject_GetBuffer(string, &str_info->view, PyBUF_SIMPLE) does not work on array.array types. The relevant code is here
Test programme:
    import regex
    import array

    a = array.array("c")
    regex.compile("bla").match(a)
Result:
    $ pypy test.py
    PyObject_GetBuffer failed!
    Traceback (most recent call last):
      File "test.py", line 5, in <module>
        regex.compile("bla").match(a)
    TypeError: expected string or buffer
Version info:
    $ pypy --version
    Python 2.7.13 (ab0b9caf307db6592905a80b8faffd69b39005b8, Oct 13 2018, 23:33:37)
    [PyPy 6.0.0 with GCC 8.2.0]
Comments (5)
-
-
I tested, and I get the same error on top of CPython 2.7:
    PyObject_GetBuffer(some_array_object, &view, PyBUF_SIMPLE)
    TypeError: 'array.array' does not have the buffer interface
so it seems that PyPy is correctly implementing the same behaviour as CPython. The difference is likely inside regex itself: the line just before is #if defined(PYPY_VERSION). Likely, nowadays we should just completely remove the PyPy special case. You can check if that works; if it doesn't, then there is some real PyPy-CPython difference for us to fix; if it does work, then I suggest opening a pull request on the regex package to remove this code.
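For anyone who wants to check this from Python rather than C, here is a minimal sketch. It assumes that memoryview(obj) goes through the same new-style buffer request as PyObject_GetBuffer, so a TypeError from memoryview suggests the C call would fail too. The helper name supports_new_buffer is made up for this illustration and is not part of regex or PyPy.

```python
import array


def supports_new_buffer(obj):
    """Return True if memoryview() accepts obj.

    Hypothetical helper for illustration only: memoryview() needs the
    new-style buffer protocol, so a TypeError here indicates that the
    object does not export it.
    """
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    # The "b" typecode is used because "c" only exists on Python 2.
    samples = [b"bla", bytearray(b"bla"), array.array("b", b"bla"), 42]
    for sample in samples:
        print(type(sample).__name__, supports_new_buffer(sample))
```

On a 2.7 interpreter the array.array entry should come out False, in line with the TypeError quoted above. On Python 3, array.array does export the new buffer protocol, so the result differs there.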
- changed status to invalid
-
Hello @Armin Rigo, I've tried to do that but it fails with another error:
    Traceback (most recent call last):
      File "test.py", line 5, in <module>
        regex.compile("bla").match(a)
    SystemError: An exception was set, but function returned a value
Here's what I did to the regex package:
    diff -r 92a74b206f0d regex_2/_regex.c
    --- a/regex_2/_regex.c  Thu Nov 22 02:09:43 2018 +0000
    +++ b/regex_2/_regex.c  Tue Nov 27 12:27:59 2018 +0100
    @@ -17539,27 +17539,6 @@
         }
     #endif
     
    -#if defined(PYPY_VERSION)
    -    /* Get pointer to string buffer. */
    -    if (PyObject_GetBuffer(string, &str_info->view, PyBUF_SIMPLE) != 0) {
    -        printf("PyObject_GetBuffer failed!\n");
    -        PyErr_SetString(PyExc_TypeError, "expected string or buffer");
    -        return FALSE;
    -    }
    -
    -    if (!str_info->view.buf) {
    -        PyBuffer_Release(&str_info->view);
    -        PyErr_SetString(PyExc_ValueError, "buffer is NULL");
    -        return FALSE;
    -    }
    -
    -    str_info->should_release = TRUE;
    -
    -    str_info->characters = str_info->view.buf;
    -    str_info->length = str_info->view.len;
    -    str_info->charsize = 1;
    -    str_info->is_unicode = FALSE;
    -#else
         /* Get pointer to string buffer. */
         buffer = Py_TYPE(string)->tp_as_buffer;
         str_info->view.len = -1;
    @@ -17620,7 +17599,6 @@
         str_info->length = size;
         str_info->is_unicode = FALSE;
     
    -#endif
         return TRUE;
     }
-
The C code contains a bug: it calls (*buffer->bf_getbuffer)(string, &str_info->view, PyBUF_SIMPLE), but if that returns -1, then it never clears the exception. I think that's why it ends up with the error "SystemError: An exception was set, but function returned a value". I guess that the difference is that some other things that it calls happen to swallow the exception on CPython but not on PyPy. If this is correct then it's really a bug of _regex.c that should be fixed in general.