Got an exception using PyPy

Issue #221 resolved
Chuancong Gao created an issue

When using the latest version of PyPy on OS X El Capitan, the following exception occurs. The regex module is imported by another module, agate. It seems like a bug.

  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/regex.py", line 345, in compile
    return _compile(pattern, flags, kwargs)
  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/regex.py", line 535, in _compile
    req_offset, req_chars, req_flags = _get_required_string(parsed, info.flags)
  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/_regex_core.py", line 4236, in _get_required_string
    req_offset, required = parsed.get_required_string(bool(flags & REVERSE))
  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/_regex_core.py", line 1910, in get_required_string
    return self.max_width(), None
  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/_regex_core.py", line 2411, in max_width
    return max(b.max_width() for b in self.branches)
  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/_regex_core.py", line 2411, in <genexpr>
    return max(b.max_width() for b in self.branches)
  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/_regex_core.py", line 3454, in max_width
    return sum(s.max_width() for s in self.items)
  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/_regex_core.py", line 3454, in <genexpr>
    return sum(s.max_width() for s in self.items)
  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/_regex_core.py", line 3903, in max_width
    self.info.kwargs[self.name])
  File "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/_regex_core.py", line 3902, in <genexpr>
    return max(len(_regex.fold_case(fold_flags, i)) for i in
SystemError: An exception was set, but function returned a value

Comments (7)

  1. Matthew Barnett repo owner

    What was the pattern?

    Could you edit "/Users/genaminer/.virtualenv/pypy-5.4.1/site-packages/regex.py" to print out (using 'ascii') the pattern, the flags and kwargs just before calling _compile?

    It might be difficult to track down the bug if I can't reproduce it.
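
One way to capture the arguments without editing regex.py by hand is to wrap the function being called. The following is only a minimal sketch: `log_calls` and `fake_compile` are hypothetical names introduced here for illustration, and `fake_compile` merely stands in for the real compile helper. The `%r` formatting prints the `repr()` of each argument; on Python 3, `ascii()` could be used instead if non-ASCII characters must be escaped.

```python
import functools


def log_calls(func):
    """Return a wrapper that prints the repr() of all arguments before calling func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Print positional and keyword arguments exactly as they were received.
        print("%s called with args=%r kwargs=%r" % (func.__name__, args, kwargs))
        return func(*args, **kwargs)
    return wrapper


# Hypothetical stand-in for the function under investigation (not part of regex).
def fake_compile(pattern, flags=0, **kwargs):
    return (pattern, flags, kwargs)


traced = log_calls(fake_compile)
result = traced(u"[^\\p{AlNum}]+", 2, stop_words=("a", "an", "the"))
```

Assuming the module looks the function up by name at call time, the same wrapper could be assigned over the real function before the failing import runs; whether that applies to a given regex release is not confirmed in this thread.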

  2. Chuancong Gao reporter

    Sorry for the very late reply; I was travelling recently. I just reproduced this problem on macOS Sierra using PyPy 5.4.1. Python 2.7.12 and PyPy 5.3.0 work fine.

    I added the following statement in _compile (line 417) in regex.py:

    print pattern, flags, kwargs
    

    and got the following output:

     0 {}
    
        (
                \p{Uppercase_Letter} {2,}                          # 2 or more adjacent letters - UP always
            |
                \p{Uppercase_Letter}                               # target one uppercase letter, then
                    (?=
                        [^\p{Lowercase_Letter}…\p{Term}--,،﹐,]+    # not chars breaks possible UP (…abc.?!:;)
                        \p{Uppercase_Letter} {2}                   # and 2 uppercase letters
                    )
            |
                (?<=
                    \p{Uppercase_Letter} {2}                       # 2 uppercase letters
                    [^\p{Lowercase_Letter}…\p{Term}--,،﹐,]+       # not chars breaks possible UP (…abc.?!:;), then
                )
                \p{Uppercase_Letter}                               # target one uppercase letter, then
                (?!
                        \p{Lowercase_Letter}                       # not lowercase letter
                    |
                        […\p{Term}--,،﹐,]\p{Uppercase_Letter}      # and not dot (.?…!:;) with uppercase letter
                )
        )
         320 {}
    [^\p{AlNum}]+ 2 {}
    [^\p{AlNum}]+ 2 {}
    [^\p{AlNum}]+ 2 {}
    [^\p{AlNum}]+ 2 {}
    [^\p{AlNum}]+ 2 {}
    [^\p{AlNum}]+ 2 {}
    [^\p{AlNum}]+ 2 {}
    [^\p{AlNum}]+ 2 {}
    [^\p{AlNum}]+ 2 {}
    [^\p{AlNum}]+|(?<!\p{AlNum})(?:\L<stop_words>)(?!\p{AlNum}) 2 {'stop_words': ('a', 'an', 'the')}
    
  3. Matthew Barnett repo owner

    I've updated the sources in this repository (but not on PyPI) with the hope that it'll reveal what the exception is, because I have no idea what the problem is!

  4. Chuancong Gao reporter

    I installed the version from the source repository. Now I get this error:

    debug: OperationError:
    debug:  operror-type: TypeError
    debug:  operror-value: exceptions must be old-style classes or derived from BaseException, not NotImplemented
    

    The issue can be reproduced easily with the code below. It only happens when re.IGNORECASE is set.

    import regex as re
    
    x = r'(?:\L<stop_words>)'
    y = ('test',)
    
    re.compile(x, re.IGNORECASE, stop_words=y)
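
For context on the TypeError in the debug output above: Python reports a message of that kind when code raises the `NotImplemented` sentinel, which is not an exception class, instead of `NotImplementedError`. The sketch below only illustrates that general behaviour in plain Python. Whether this is what happened inside PyPy is an assumption; the thread does not confirm it. `raise_sentinel` is a hypothetical name.

```python
def raise_sentinel():
    # Deliberate mistake for illustration: NotImplemented is a sentinel value,
    # not an exception class. The correct exception is NotImplementedError.
    raise NotImplemented


try:
    raise_sentinel()
except TypeError as exc:
    # The interpreter rejects the raise itself and reports a TypeError instead.
    message = str(exc)

print(message)
```

The exact wording of the message differs between interpreters and versions, but it names BaseException in each case I am aware of.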
    
  5. Matthew Barnett repo owner

    Fixed in regex 2016.10.22.

    Bytestrings are usually handled via the buffer protocol, but PyPy was complaining for some reason (very strange!), so I've coded around it...
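
As background on the buffer protocol mentioned above: it lets code read an object's underlying bytes without copying them, and `memoryview` is the Python-level way to see it. This is a minimal sketch of the protocol in general, not of the regex module's C code or of the actual fix, which is not shown in this thread.

```python
data = b"test"

# bytes objects expose their contents through the buffer protocol.
view = memoryview(data)
same_bytes = view.tobytes() == data

# bytearray supports the protocol as well.
mutable_view = memoryview(bytearray(data))
same_length = len(mutable_view) == len(data)

# Text strings do not export a buffer, so memoryview() rejects them.
try:
    memoryview(u"test")
    text_has_buffer = True
except TypeError:
    text_has_buffer = False
```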
