CodeParser not opening source files with proper decoder

Issue #107 resolved
Brett Cannon created an issue

In CodeParser.__init__() you will notice that it opens a source file and then reads it, relying on the default encoding for open(). On Python 3, this can trigger a UnicodeDecodeError if the source file declares an explicit encoding that differs from that default.

For example, in Python's stdlib, Lib/sqlite3/test/ has a specified encoding of ISO-8859-1. But because CodeParser doesn't use something like tokenize.detect_encoding(), the read fails: the file contains some bytes that are not allowed under UTF-8 but are valid under ISO-8859-1.
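To illustrate the kind of fix being suggested, here is a minimal sketch of reading a source file using its declared encoding. The helper name `read_python_source` is hypothetical and is not taken from coverage.py or from the attached patch; only `tokenize.detect_encoding()` is a real stdlib call (Python 3.0 and later).

```python
import io
import tokenize


def read_python_source(path):
    """Return the text of a Python source file, decoded with its declared encoding.

    Hypothetical helper for illustration only; not the project's actual code.
    """
    # Read raw bytes so that no decoding happens with the platform default.
    with open(path, "rb") as f:
        raw = f.read()
    # detect_encoding() inspects the BOM and the coding cookie on the first
    # two lines, and falls back to UTF-8 when neither is present.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw).readline)
    return raw.decode(encoding)
```

Unlike a text-mode open(), this sketch does not translate line endings. On Python 3.2 and later, `tokenize.open()` wraps the same detection in a single call and returns a text-mode file object.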

Comments (4)

  1. Brett Cannon reporter

    Attached is a patch that uses Python 3.2's when available. A solution that works for Python 3.0 and 3.1 could be created by copying the implementation of, but I went the easier route. =)

    BTW, Ned, do you prefer patches or pull requests?
