Comment logic different between Re and Regex

Issue #271 resolved
Former user created an issue

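It appears that Regex handles (?#comment) comments a bit differently than Re. Specifically, Regex ends the comment at the first ) it sees, even when that ) is escaped with a backslash, so the final ) in the pattern below is reported as unbalanced.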

>>> import regex
>>> regex.match(r'test(?#comment\))', 'test', regex.V0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/regex.py", line 251, in match
    return _compile(pattern, flags, kwargs).match(string, pos, endpos,
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/regex.py", line 510, in _compile
    raise error("unbalanced parenthesis", pattern, source.pos)
_regex_core.error: unbalanced parenthesis at position 16

Re, by contrast, seems to treat an escaped closing parenthesis \) as part of the comment.

>>> import re
>>> re.match(r'test(?#comment\))', 'test')
<_sre.SRE_Match object; span=(0, 4), match='test'>
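
For anyone who wants to check this outside the REPL, here is a minimal sketch using only the standard-library re module. It assumes re behaves as in the session above (Python 3.6); try_compile is a hypothetical helper written for this illustration, not part of re or regex. The second pattern is my own addition: if re really treats \) as comment text, that pattern has no unescaped ) left to close the comment and should be rejected.

```python
import re


def try_compile(pattern):
    """Hypothetical helper: return the compiled pattern, or None if re rejects it."""
    try:
        return re.compile(pattern)
    except re.error:
        return None


# Pattern from the report: an escaped \) inside the comment, then the real closing ).
escaped = try_compile(r'test(?#comment\))')

# Assumption being tested: with only an escaped \) present, the comment never closes.
unclosed = try_compile(r'test(?#comment\)')

if escaped is not None:
    print("escaped pattern matched span:", escaped.match('test').span())
print("unclosed pattern rejected:", unclosed is None)
```

Running the same two patterns through regex.compile should show whether the installed Regex version agrees with Re. I have not included that here, since the result depends on the Regex release.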

I'm not sure whether this is intentional, but I thought I'd at least bring it to your attention. I don't really use this style of comment myself, so I can't say which behavior is more desirable.

Comments (4)

  1. Isaac Muse

    I looked into this further and compared the behavior of other regular expression engines, and it appears that Python's Re is in the minority with its logic on this one. Regex seems to mirror what many others do, so maybe Regex is "more" correct here.

  2. Matthew Barnett repo owner

    As the regex module is aiming to be backwards-compatible with the re module, I'll fix this anyway in the next release.
