AssertionError in parse_ebnf when nesting + and []

Issue #2459 new
Vierkantor created an issue

I was using the rlib.parsing module to parse a simple language and ran into an AssertionError that I can't explain, at least not from the documentation. It seems to occur whenever a repetition operator such as + or * is used inside [] brackets. Here is an example file:

from rpython.rlib.parsing.ebnfparse import parse_ebnf
regexes, rules, ToAST = parse_ebnf("""
test: ["foo"+] "bar";
""")

When I run this code, both on the default branch and on the release-pypy2.7-v5.6.0 branch, the following error occurs:

test/test_parser.py:10: in <module>
    """)
pypy/rpython/rlib/parsing/ebnfparse.py:57: in parse_ebnf
    s.visit(visitor)
pypy/rpython/rlib/parsing/tree.py:86: in visit
    return getattr(visitor, "visit_" + self.symbol)(self)
pypy/rpython/rlib/parsing/ebnfparse.py:110: in visit_file
    return node.children[0].visit(self)
pypy/rpython/rlib/parsing/tree.py:86: in visit
    return getattr(visitor, "visit_" + self.symbol)(self)
pypy/rpython/rlib/parsing/ebnfparse.py:114: in visit_list
    child.visit(self)
pypy/rpython/rlib/parsing/tree.py:86: in visit
    return getattr(visitor, "visit_" + self.symbol)(self)
pypy/rpython/rlib/parsing/ebnfparse.py:131: in visit_production
    expansions = node.children[2].visit(self)
pypy/rpython/rlib/parsing/tree.py:86: in visit
    return getattr(visitor, "visit_" + self.symbol)(self)
pypy/rpython/rlib/parsing/ebnfparse.py:146: in visit_body
    expansion = child.visit(self)
pypy/rpython/rlib/parsing/tree.py:86: in visit
    return getattr(visitor, "visit_" + self.symbol)(self)
pypy/rpython/rlib/parsing/ebnfparse.py:153: in visit_expansion
    expansion = child.visit(self)
pypy/rpython/rlib/parsing/tree.py:86: in visit
    return getattr(visitor, "visit_" + self.symbol)(self)
pypy/rpython/rlib/parsing/ebnfparse.py:161: in visit_enclosed
    assert change == " " or change == newchange
AssertionError

Commenting out the assertion that is failing doesn't seem to cause any problems and the resulting parser has the right behavior. My current workaround is to make a new terminal symbol that behaves like "foo"+ in the example code, which appears to work but is somewhat less readable.
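
For illustration, the terminal-symbol workaround might look roughly like the sketch below. The terminal name FOOS, its regex "(foo)+", and the helper build_parser are assumptions made for this example, not the exact code used. Note also that a single regex terminal is not fully equivalent to "foo"+: if the grammar defines an IGNORE rule, ignored characters between repeated "foo" tokens would no longer be accepted.

```python
# Workaround sketch: move the repetition into a regex terminal so that
# no "+" operator appears inside the [] brackets.
# FOOS and its regex are hypothetical names chosen for this example.
GRAMMAR = """
FOOS: "(foo)+";
test: [FOOS] "bar";
"""


def build_parser(grammar=GRAMMAR):
    # Hypothetical helper. The import is done lazily so that this module
    # can still be loaded where RPython is not installed.
    from rpython.rlib.parsing.ebnfparse import parse_ebnf
    regexes, rules, ToAST = parse_ebnf(grammar)
    return regexes, rules, ToAST
```

Calling build_parser() requires an RPython checkout on the Python path; the returned triple is the same one that parse_ebnf returns in the failing example above.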

Did I miss something or is this an actual bug in RPython and/or its documentation?

Comments (1)

  1. Armin Rigo

    You should ask on the pypy-dev@python.org mailing list, as the author of rlib.parsing might not read this.
