coverage of multiline generator/set/dict comprehensions is wrong when run with pytest's assertion rewriting

Issue #515 closed
Andy Freeland
created an issue

Code/config to reproduce available as a gist. Fails on Python 2.7 but not Python 3.5.

Essentially, given this test:

def test_foo():
    # covered!
    assert {i for i in range(10)} == {i for i in range(10)}

    # "didn't finish the set comprehension"
    assert {i for i in range(10)} == {
        i for i in range(10)
    }

    # covered!
    assert True

When run under pytest with assertion rewriting (the default), the multiline set comprehension is reported as partially covered, even though the comprehension on oneline is fully covered. I think this is a bug in coverage, not pytest's assertion rewriting, because this code passes:

import ast
from _pytest.assertion.rewrite import rewrite_asserts

oneline = """assert {i for i in range(10)} == {i for i in range(10)}"""
multiline = """assert {i for i in range(10)} == {
    i for i in range(10)
}"""

# Parse the expressions
oneline_tree = ast.parse(oneline)
multiline_tree = ast.parse(multiline)

# Dump the pre-assertion rewrite ASTs
multiline_dump_prerewrite = ast.dump(multiline_tree)
oneline_dump_prerewrite = ast.dump(oneline_tree)

# The ASTs should be the same
assert multiline_dump_prerewrite == oneline_dump_prerewrite

# Rewrite the asserts
rewrite_asserts(oneline_tree)
rewrite_asserts(multiline_tree)

# Dump the rewritten ASTs
oneline_dump_rewrite = ast.dump(oneline_tree)
multiline_dump_rewrite = ast.dump(multiline_tree)

# The ASTs should be the same
assert oneline_dump_rewrite == multiline_dump_rewrite

Comments (24)

  1. Andy Freeland reporter

    Included the output in the gist, but that test is fully covered if running pytest without assertion rewriting, tox -- --assert=plain. The snippet above though suggests that the two expressions are equivalent after the rewrite, so not sure what's going on.

  2. Artem Dayneko

    Issue is reproducible under 2.7.12, 3.3.6, 3.4.5 (and all supported pypy's - 2.4, 2.6, 3-2.4). Under 3.5.2 missing coverage is not observed.

    With assert rewriting results looks like:

    Name          Stmts   Miss Branch BrPart  Cover   Missing
    ---------------------------------------------------------
    test_foo.py       4      0      5      1    89%   6->exit
    

    Without assert rewriting (--assert=plain)it looks like

    Name          Stmts   Miss Branch BrPart  Cover   Missing
    ---------------------------------------------------------
    test_foo.py       4      0      5      0   100%
    

    One thing that is strange to me is that branch count is not zero. test_foo.py does not have branching and if we change set comprehension to list comprehension then report correctly shows 0 branches in both cases:

    Name          Stmts   Miss Branch BrPart  Cover   Missing
    ---------------------------------------------------------
    test_foo.py       4      0      0      0   100%
    

    FYI: under python 3.5.2 where issue is not observed, branch count is 4, not 5 (all covered)

  3. Andy Freeland reporter

    One thing that is strange to me is that branch count is not zero

    The branches here are the possibility of exit before the generator is evaluated. In Python 2, set/dict comprehensions are implemented as generators I believe, so won't necessarily be evaluated, vs. list comprehensions which always are. If you look at the HTML coverage, the partial lines are annotated "line N didn't finish the set comprehension on line N"

  4. Scott Colby

    I'd like to +1 this issue with what I think is a similar situation (Python 2.7.12 only, I haven't yet tried to reproduce with 3.5):

    # in my module
    data = {'a': {'a': 1, 'b': 2}, 'b': {'a': 1, 'b': None}}
    if any([score is None for row in data.values() for score in row.values()]):
        raise ValueError
    
    # in a test function
    with pytest.raises(ValueError):
        function_that_was_in()
    

    this reports a missed branch and "line 44 didn't run the list comprehension on line 44" or similar.

    Replacing with a generator epxression:

    if any(score is None for row in data.values() for score in row.values()):
        raise ValueError
    

    reports full coverage.

    Messing with --assert=plain didn't change anything, here. In this case, the comprehension/generator is in the code under test, not the test suite itself.

  5. Loic Dachary

    Here is a minimal reproducer with test_foo.py as

    def test_foo():
        assert {i for i in range(10)} == {
            i for i in range(10)
        }
    
    coverage run --branch --source=test_foo.py -m pytest test_foo.py
    

    using python-3.5 or python-2.7.12 both create a .coverage file containing

    [[2,-1],[-1,1],[-1,2],[2,-2],[-2,2],[2,2],[1,-1]]
    

    With --assert=plain the content of the .coverage file is different when using python-2.7.12 but remains the same with python-3.5

    coverage run --branch --source=test_foo.py -m pytest--assert=plain test_foo.py
    
    [[-3,3],[-1,1],[-1,2],[3,3],[3,-3],[2,-2],[3,-1],[2,3],[-2,2],[2,2],[1,-1]]
    
  6. Loic Dachary

    Could it be that starting from python-3.5 sys.settrace claims that all event == line that belong to a multiline assert are from the first line of the assert ? If pytest rewrites assertions as if they were on a single line even if they are multiline, it would explain why it only breaks python versions before 3.5. It's just a theory at this point ;-)

  7. Bruno Oliveira

    Hi @Loic Dachary,

    Here's the rewritten AST of your examplem, courtesy of pytest-ast-back-to-python:

    ============================= test session starts =============================
    platform win32 -- Python 3.5.0, pytest-3.0.5, py-1.4.31, pluggy-0.4.0
    rootdir: c:\pytest, inifile: tox.ini
    plugins: ast-back-to-python-0.1.0
    collected 1 items
    
    c:\pytest\.tmp\cov_test.py .
    =========================== Rewritten AST as Python ===========================
    import builtins as @py_builtins
    import _pytest.assertion.rewrite as @pytest_ar
    
    def test_foo():
        @py_assert0 = {i for i in range(10)}
        @py_assert3 = {i for i in range(10)}
        @py_assert2 = @py_assert0 == @py_assert3
        if not @py_assert2:
            @py_format5 = @pytest_ar._call_reprcompare(('==',), (@py_assert2,), ('%(py1)s == %(py4)s',), (@py_assert0, @py_assert3)) % {'py1': @pytest_ar._saferepr(@py_assert0), 'py4': @pytest_ar._saferepr(@py_assert3)}
            @py_format7 = ('' + 'assert %(py6)s') % {'py6': @py_format5}
            raise AssertionError(@pytest_ar._format_explanation(@py_format7))
        @py_assert0 = @py_assert2 = @py_assert3 = None
    
    ========================== 1 passed in 0.01 seconds ===========================
    

    (I have the same AST in Python 2.7 and Python 3.4)

    (Also note that to reproduce this, you either have to use pytest~=2.8 or after use the branch from this PR).

    It seems to me the rewritten expression is correct, not sure what could be the problem here.

  8. Loic Dachary

    I did not know about pytest-ast-back-to-python, cool :-) My theory is that while the rewritten assert is correct it was transformed into a onliner instead of multiline. Could it be the root cause of this confusion ?

  9. Loic Dachary

    The test_foo.py file is below and virtualenv is set with python-2.7.12 and pytest-2.8.7 and coverage.py at 24aff3d7bfd5

    def test_foo():
        assert {i for i in range(10)} == {
            i for i in range(10)
        }
    
    $ coverage run --branch --source=test_foo.py -m pytest test_foo.py
    $ cat .coverage ; echo
    ...[[2,-1],[-1,1],[-1,2],[2,-2],[-2,2],[2,2],[1,-1]]..
    $ coverage report
    Name          Stmts   Miss Branch BrPart  Cover
    -----------------------------------------------
    test_foo.py       2      0      3      1    80%
    

    This could be explained because pytest rewrites the assert as a single line instead of multiline. The content of the .coverage file is created from line events sent via sys.settrace to the coverage.py trace function and reflect the code re-written by pytest. However, coverage report compiles the unmodified source and finds that some lines are not covered.

  10. Bruno Oliveira

    This could be explained because pytest rewrites the assert as a single line instead of multiline. The content of the .coverage file is created from line events sent via the sys.setttrace to the coverage.py trace function and reflect the code re-written by pytest. However, coverage report compiles the unmodified source and finds that some lines are not covered.

    This makes sense, but wouldn't this type of assert also cause the same problem?

    def func_call(): return 1
    def func_call2(): return 1
    
    def test_foo():
        assert func_call() == \
            func_call2()
    

    ?

    Pytest will break each operand into its own statement in that case as well:

    =========================== Rewritten AST as Python ===========================
    import builtins as @py_builtins
    import _pytest.assertion.rewrite as @pytest_ar
    
    def func_call():
        return 1
    
    def func_call2():
        return 1
    
    def test_foo():
        @py_assert1 = func_call()
        @py_assert5 = func_call2()
        @py_assert3 = @py_assert1 == @py_assert5
        if not @py_assert3:
            @py_format7 = @pytest_ar._call_reprcompare(('==',), (@py_assert3,), ('''%(py2)s
    {%(py2)s = %(py0)s()
    } == %(py6)s
    {%(py6)s = %(py4)s()
    }''',), (@py_assert1, @py_assert5)) % {'py6': @pytest_ar._saferepr(@py_assert5), 'py4': @pytest_ar._saferepr(func_call2) if 'func_call2' in @py_builtins.locals() or @pytest_ar._should_repr_global_name(func_call2) else 'func_call2', 'py2': @pytest_ar._saferepr(@py_assert1), 'py0': @pytest_ar._saferepr(func_call) if 'func_call' in @py_builtins.locals() or @pytest_ar._should_repr_global_name(func_call) else 'func_call'}
            @py_format9 = ('' + 'assert %(py8)s') % {'py8': @py_format7}
            raise AssertionError(@pytest_ar._format_explanation(@py_format9))
        @py_assert1 = @py_assert3 = @py_assert5 = None
    
  11. Loic Dachary

    Hum, you're right.

    def func_call():
        pass
    
    def func_call2():
        pass
    
    def test_foo():
        assert func_call() == \
            func_call2()
    

    does not cause any problem although line 9 does not show at all in the .coverage file.

    $ coverage run --branch --source=test_bar.py -m pytest test_bar.py
    $ cat .coverage
    ...[[-4,5],[8,-7],[-1,1],[2,-1],[-1,2],[4,7],[1,4],[7,-1],[-7,8],[5,-4]]...
    $ coverage report
    Name          Stmts   Miss Branch BrPart  Cover
    -----------------------------------------------
    test_bar.py       6      0      0      0   100%
    $ coverage run --branch --source=test_bar.py -m pytest --assert=plain test_bar.py
    $ cat .coverage
    ...[[-4,5],[-1,1],[2,-1],[-1,2],[4,7],[1,4],[9,-7],[8,9],[7,-1],[-7,8],[5,-4]]...
    $ coverage report
    Name          Stmts   Miss Branch BrPart  Cover
    -----------------------------------------------
    test_bar.py       6      0      0      0   100%
    
  12. Loic Dachary

    @Bruno Oliveira

    def test_foo():
        assert func_call() == \
            func_call2()
    

    works because it does not contain any branch and coverage.py does not care about the last line. It records the assert as a single line which, if not run, indicates the function never returned (see arc 8, -7 below and notice the absence of arc involving line 9 which is the call to func_call2):

    $ coverage run --branch --source=test_bar.py -m pytest test_bar.py
    $ COVERAGE_TRACK_ARCS=1 coverage report
    ..
    Adding arc: (-1, 1): None, None
    Adding arc: (1, 4): None, None
    Adding arc: (4, 7): None, None
    Adding arc: (7, -1): None, "didn't exit the module"
    Adding arc: (-1, 2): None, None
    Adding arc: (2, -1): None, "didn't return from function 'func_call'"
    Adding arc: (-4, 5): None, None
    Adding arc: (5, -4): None, "didn't return from function 'func_call2'"
    Adding arc: (-7, 8): None, None
    Adding arc: (8, -7): None, "didn't return from function 'test_foo'"
    ..
    

    By contrast the assert from the original bug description contains arcs and coverage.py expects them to be run (see the arcs -3, 3 and 3, -3 which are about the second part of the assert):

    $ coverage run --branch --source=test_foo.py -m pytest test_foo.py
    $ COVERAGE_TRACK_ARCS=1 coverage report
    ..
    Adding arc: (-1, 1): None, None
    Adding arc: (1, -1): None, "didn't exit the module"
    Adding arc: (-1, 2): None, None
    Adding arc: (2, -1): None, "didn't return from function 'test_foo'"
    Adding arc: (-2, 2): None, "didn't run the set comprehension on line 2"
    Adding arc: (2, -2): None, "didn't finish the set comprehension on line 2"
    Adding arc: (-3, 3): None, "didn't run the set comprehension on line 3"
    Adding arc: (3, -3): None, "didn't finish the set comprehension on line 3"
    
  13. Ned Batchelder repo owner

    I'm not following all the details here, but I love that you are digging into it! When you have something for me to do, leave a clear message telling me what it is!

  14. Loic Dachary

    @Ned Batchelder I think it happens because multiline re-numbering does not happen for negative line numbers and it could be fixed with

    --- a/coverage/parser.py    Mon Dec 12 15:07:48 2016 +0100
    +++ b/coverage/parser.py    Wed Dec 14 16:31:35 2016 +0100
    @@ -183,6 +183,7 @@
                         # so record a multi-line range.
                         for l in range(first_line, elineno+1):
                             self._multiline[l] = first_line
    +                        self._multiline[-l] = -first_line                        
                     first_line = None
                     first_on_line = True
    

    Without this patch the (3,-3) and (-3,3) arcs are renumbered (2,-3) and (-3,2) which is incorrect because there is no arc between line 2 and line 3. With this patch the (3,-3) and (-3,3) arcs are renumbered (2,-2) and (-2,2) and the code coverage is correct.

    Do you think it could be the root cause of the problem ? If you're not sure I can try to explain why it accounts for all the symptoms.

  15. Loic Dachary

    During coverage report the arcs found by analysing the AST tree of a source file should show one line per multiline statement because they are remapped to the first line of the statement.

    However, the first_line function relies on the multiline data member that only remaps positive line numbers.

    For example, the set comprehension at line 3 in

    def test_foo():
        assert {i for i in range(10)} == {
            i for i in range(10)
        }
    

    is added as two arcs as (3, -3) and (-3, 3) as shown by

    $ coverage run --branch --source=test_foo.py -m pytest test_foo.py
    $ COVERAGE_ASTDUMP=1 COVERAGE_TRACK_ARCS=1 coverage report
    ...
    Adding arc: (-3, 3): None, "didn't run the set comprehension on line 3"
    Adding arc: (3, -3): None, "didn't finish the set comprehension on line 3"
    

    The arcs are then renumbered by _analyze_ast and

    [(-3, 3), (-2, 2), (-1, 1), (-1, 2), (1, -1), (2, -2), (2, -1), (3, -3)]
    

    becomes

    [(-3, 2), (-2, 2), (-1, 1), (-1, 2), (1, -1), (2, -2), (2, -1), (2, -3)]
    

    As if the (-3, 2) and (2, -3) arcs existed, although they do not. When the arcs_missing function compares that to the actual arcs collected by coverage run which are stored in the .coverage file:

    [(-2, 2), (-1, 1), (-1, 2), (1, -1), (2, -2), (2, -1), (2, 2)]
    

    and concludes that the following arcs have not be executed

    [(-3, 2), (2, -3)]
    

    Note: this can be verified by printing the local variables of the arcs_missing function.

  16. Log in to comment