Issue #303 resolved
Douglas Figueiredo created an issue
Traceback (most recent call last):
  File "/usr/local/bin/coverage", line 8, in <module>
    load_entry_point('coverage==3.7.1', 'console_scripts', 'coverage')()
  File "/Library/Python/2.7/site-packages/coverage/cmdline.py", line 721, in main
    status = CoverageScript().command_line(argv)
  File "/Library/Python/2.7/site-packages/coverage/cmdline.py", line 461, in command_line
  File "/Library/Python/2.7/site-packages/coverage/control.py", line 662, in html_report
    return reporter.report(morfs)
  File "/Library/Python/2.7/site-packages/coverage/html.py", line 113, in report
    self.report_files(self.html_file, morfs, self.config.html_dir)
  File "/Library/Python/2.7/site-packages/coverage/report.py", line 84, in report_files
    report_fn(cu, self.coverage._analyze(cu))
  File "/Library/Python/2.7/site-packages/coverage/html.py", line 253, in html_file
    html = html.decode(encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3418: ordinal not in range(128)

Comments (18)

  1. Ned Batchelder repo owner
    • edited description

    Can you please provide more detail? In particular, a reproducible test case would be appreciated. My guess is that you have a non-ASCII character in a comment in your source file, and that if you add a coding comment to the top, coverage will work fine:

    # -*- coding: utf8 -*-
  2. Douglas Figueiredo reporter

    I tried this solution, but it didn't work here. So I tried the proposed solution. This problem occurred on Fedora 18 using Python 2.7.3.

  3. Douglas Figueiredo reporter

    Another detail: I'm running Django unit tests like this: coverage run --source='.' manage.py test client.tests. With a simple program it works fine, like this: coverage run myprogram.py. The problem occurs only when I call "coverage html".

  4. Douglas Figueiredo reporter

    It's part of a private IPTV project, and it only works as a whole. I would like to share it, but I can't. Thank you for your time, you are so helpful.

  5. Robert Sussland

    I am having the same issue. There are no non-ASCII characters in any of the Python source files. However, the test suite I am running coverage on downloads files with unicode file names and processes files with unicode characters. I cannot send you my code as it is pulling data from an on-site database. The stack trace doesn't show which file triggered the error:

      File "/Users/rsussland/pop/bin/coverage", line 9, in <module>
        load_entry_point('coverage==4.0a0', 'console_scripts', 'coverage')()
      File "/Users/rsussland/pop/lib/python2.7/site-packages/coverage-4.0a0-py2.7-macosx-10.6-intel.egg/coverage/cmdline.py", line 747, in main
        status = CoverageScript().command_line(argv)
      File "/Users/rsussland/pop/lib/python2.7/site-packages/coverage-4.0a0-py2.7-macosx-10.6-intel.egg/coverage/cmdline.py", line 467, in command_line
      File "/Users/rsussland/pop/lib/python2.7/site-packages/coverage-4.0a0-py2.7-macosx-10.6-intel.egg/coverage/control.py", line 679, in html_report
        return reporter.report(morfs)
      File "/Users/rsussland/pop/lib/python2.7/site-packages/coverage-4.0a0-py2.7-macosx-10.6-intel.egg/coverage/html.py", line 109, in report
        self.report_files(self.html_file, morfs, self.config.html_dir)
      File "/Users/rsussland/pop/lib/python2.7/site-packages/coverage-4.0a0-py2.7-macosx-10.6-intel.egg/coverage/report.py", line 81, in report_files
        report_fn(cu, self.coverage._analyze(cu))
      File "/Users/rsussland/pop/lib/python2.7/site-packages/coverage-4.0a0-py2.7-macosx-10.6-intel.egg/coverage/html.py", line 241, in html_file
        html = html.decode(encoding)
      File "/Users/rsussland/pop/lib/python2.7/encodings/utf_8.py", line 16, in decode
        return codecs.utf_8_decode(input, errors, True)
    UnicodeDecodeError: 'utf8' codec can't decode byte 0xf6 in position 231299: invalid start byte
  6. Robert Sussland

    The workaround -- ignore errors -- works just fine for me. Also, both the original reporter and myself are on Macs.

  7. Ian Cordasco

    The problem with ignoring errors when encoding unicode as ascii is that you're going to lose data. I'm not certain what data is involved, but @rsussland could you please put your traceback in a fenced code block, i.e., precede it with three backticks and a newline, and follow it by a newline and three backticks. That will make it much easier to read.

  8. Robert Sussland

    Made the change. I understand that there may be some loss of data, but before I was getting no data at all, since the HTML failed to generate.
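To make the trade-off in the comments above concrete, here is a minimal sketch in Python 3 (the thread concerns Python 2.7, where the same error handlers exist). The sample bytes and the helper name `try_decode` are made up for illustration and are not part of coverage.py. It suggests that a strict ascii decode fails on byte 0xc3, that `errors="ignore"` silently drops the character, and that decoding with the right codec and then encoding with `xmlcharrefreplace` keeps it.

```python
# Illustrative only: the sample text is an assumption, not data from the reports.
source_bytes = u"# caf\u00e9 menu\n".encode("utf-8")  # contains the bytes 0xc3 0xa9


def try_decode(data, encoding, errors="strict"):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode(encoding, errors)
    except UnicodeDecodeError as exc:
        return exc


# Strict ascii decoding fails on the first non-ASCII byte.
strict_result = try_decode(source_bytes, "ascii")

# "ignore" drops the offending bytes; "replace" substitutes U+FFFD for each one.
ignored = try_decode(source_bytes, "ascii", "ignore")
replaced = try_decode(source_bytes, "ascii", "replace")

# Decoding with the codec the file was written in keeps the character, and
# xmlcharrefreplace turns it into an HTML character reference on the way out.
correct = try_decode(source_bytes, "utf-8")
html_safe = correct.encode("ascii", "xmlcharrefreplace")

print(repr(strict_result))
print(repr(ignored))
print(repr(replaced))
print(repr(html_safe))
```

Under these assumptions, ignoring errors does produce a report, but any non-ASCII character in the rendered source is lost, whereas choosing the correct codec avoids the loss.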

  9. Ian Cordasco

    So I'm looking at this code, and I'm wondering why decode is even called. It seems to be a special case for Python 2.6 and 2.7 but the string API in those versions is different than Python 3. You can do 'foo bar bogus'.encode('ascii') with confidence on Python 2. I'm not sure how well it will work with xmlcharrefreplace, but I suspect that is possibly the only problem. I'm going to investigate a bit.

  10. Robert Sussland

    I'm not familiar enough with this code to determine whether the policy is to work with code points or bytestrings. But unless you import unicode_literals, 'foo bar bogus' is already an ascii-encoded byte string, so calling encode on it will first decode it with the ascii codec and then re-encode it with the same codec. If you intend to work with code points, then you do need to decode, but I don't see any of the standard patterns for doing that in the html.py file. For example, you are calling with open instead of with codecs.open, so your other strings are already bytestrings and not code points. In that case there is no need to decode at all, but care must be taken when combining bytestrings of different encodings (they are all ascii byte strings).
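The distinction between byte strings and code points described above can be sketched as follows. This is an illustrative Python 3 example, not code from html.py; the temporary file and its contents are made up. It uses `io.open`, which also exists in Python 2.7 and accepts an `encoding` argument, as one possible way to get decoded text at the file boundary.

```python
import io
import os
import tempfile

# Create a throwaway source file containing one non-ASCII character.
fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as f:
    f.write(u"# na\u00efve comment\n".encode("utf-8"))

# Binary mode returns the raw bytes exactly as stored on disk.
with io.open(path, "rb") as f:
    raw = f.read()

# Text mode with an explicit encoding returns decoded code points.
with io.open(path, "r", encoding="utf-8") as f:
    text = f.read()

os.remove(path)

# The byte string is one element longer because the accented
# character occupies two bytes in UTF-8.
print(type(raw).__name__, len(raw))
print(type(text).__name__, len(text))
```

Mixing the two kinds of string is where implicit ascii decoding tends to occur in Python 2, so picking one representation and converting only at the boundaries is a common convention.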

  11. Ned Batchelder repo owner

    @rsussland The line of code in question is dealing with the HTML version of your source files. I find it very hard to believe that you have no non-ascii characters in your source file. Perhaps in a comment? A curly apostrophe? The data you download for your tests doesn't matter, that isn't part of the HTML report.

    In Python 2.7, do this:


    Does it succeed, or raise an exception? Also, it looks like your file is really large? Can you share any of the code with me?

  12. Ned Batchelder repo owner

    Sorry, I see that you say the file isn't in the error message. At the very least, I can add some information there so these problems are easier to diagnose while we decide on an approach.
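Until the error message names the file, one way to narrow the problem down is to try decoding each source file separately. The sketch below is a hypothetical stand-alone helper (`find_undecodable` is not a coverage.py function), written for Python 3, and the two demo files are made up. It assumes the files are expected to be UTF-8.

```python
import os
import tempfile


def find_undecodable(paths, encoding="utf-8"):
    """Return (path, byte offset, offending byte) for each file that fails to decode."""
    failures = []
    for path in paths:
        with open(path, "rb") as f:
            data = f.read()
        try:
            data.decode(encoding)
        except UnicodeDecodeError as exc:
            failures.append((path, exc.start, data[exc.start:exc.start + 1]))
    return failures


# Demo: the same text saved once as UTF-8 and once as Latin-1.
tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, "good.py")
bad_path = os.path.join(tmp_dir, "bad.py")
with open(good_path, "wb") as f:
    f.write(u"# sch\u00f6n\n".encode("utf-8"))
with open(bad_path, "wb") as f:
    f.write(u"# sch\u00f6n\n".encode("latin-1"))  # stores the single byte 0xf6

failures = find_undecodable([good_path, bad_path])
for path, offset, byte in failures:
    print("%s: cannot decode byte %r at offset %d" % (path, byte, offset))
```

In the demo, only the Latin-1 file is reported, with the same 0xf6 byte that appears in the second traceback. Whether that matches the actual cause in the reported cases is an assumption.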
