Unexpected matching difference with .*? between re and regex

Issue #212 resolved
Walter Farrell created an issue

Using regex-2016.06.02-cp35-none-win_amd64

Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import re, regex
>>> s = r'.sr  h |<nw>|<span class="locked">|'
>>> re.match(r"\.sr (.*?) (.)(.*)\2(.*)\2(.*)", s)
<_sre.SRE_Match object; span=(0, 35), match='.sr h |<nw>|<span class="locked">|'>
>>> regex.match(r"\.sr (.*?) (.)(.*)\2(.*)\2(.*)", s)

Note that s contains 2 consecutive spaces, between the h and the | character. With re, the lazy (.*?) consumes only one of those spaces and the match succeeds. With regex, the same call prints nothing (it returns None), as if (.*?) were matching greedily and consuming both spaces.
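
To make the difference easier to pin down, here is a minimal sketch that prints what each group captures under the standard-library re module. It assumes the two consecutive spaces in s sit between the h and the | character, as the note above states. The names `pattern`, `m` and `groups` are only illustrative.

```python
import re

# Assumption: the two consecutive spaces are between "h" and "|",
# as described in the note above.
s = r'.sr h  |<nw>|<span class="locked">|'
pattern = r"\.sr (.*?) (.)(.*)\2(.*)\2(.*)"

m = re.match(pattern, s)
groups = m.groups()

# Group 1 is the lazy (.*?); group 2 is the delimiter reused by \2.
print(m.span())   # (0, 35)
print(groups)     # ('h ', '|', '<nw>', '<span class="locked">', '')
```

With re, group 1 is 'h ' (one space), the literal space in the pattern takes the second space, and group 2 becomes the | delimiter. Calling .groups() on the result of regex.match, whenever that result is not None, should allow a direct comparison.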
