Explain how to use the PEP 263 coding hint to open a file correctly before parsing.

Issue #7 new
Former user created an issue

Thanks for your great docs on the Python ast, that really helped me to get started with Transcrypt!

I think the following should be added:

To open a Python file for parsing by the ast module, use the following code:

import tokenize

open_file = tokenize.open(file_name)  # file_name: path to the Python source file

rather than the ordinary open function.

It correctly takes the PEP 263 coding hint into account, allowing parsing of files that aren't in the default (UTF-8) encoding.
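
For illustration, here is a minimal sketch of that approach. The helper name `parse_with_declared_encoding` and the temporary cp1252 file are made up for this example; only `tokenize.open()` and `ast.parse()` come from the standard library.

```python
import ast
import os
import tempfile
import tokenize


def parse_with_declared_encoding(path):
    """Hypothetical helper: read `path` as text via its PEP 263 hint, then parse it."""
    # tokenize.open() detects the coding hint (or a BOM) and returns a
    # read-only text file object that decodes with that encoding.
    with tokenize.open(path) as source_file:
        return ast.parse(source_file.read(), filename=path)


# Demo: write a small source file encoded as cp1252, not UTF-8.
demo_source = '# -*- coding: cp1252 -*-\na = "\u00b6"\n'
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as tmp:
    tmp.write(demo_source.encode("cp1252"))

try:
    text_tree = parse_with_declared_encoding(tmp.name)
finally:
    os.remove(tmp.name)

print(ast.dump(text_tree))
```

Reading the text first like this is mainly useful when the source needs to be inspected or modified as a string before it is parsed.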

(Remark: I spent hours looking for this and eventually found it in the source of the Python interpreter; having it in your docs would probably save others time.)

Comments (1)

  1. Thomas Kluyver repo owner

    I'm glad it's useful to you :-)

    I think there's a simpler way to handle the encoding if you're reading a file to pass it directly to ast: open it in binary mode and pass the bytes to ast.parse(). I just experimented, and it seems to handle encoding comments correctly:

    import ast
    ast.dump(ast.parse('# encoding: cp1252\na = "¶"'.encode('cp1252')))
    

    tokenize.open() may still be useful if you want to manipulate the code as text before parsing it, though.

    Do you want to open a pull request to add this? I think it would make sense on the "Getting to and from ASTs" page.

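
The binary-mode approach suggested in the comment above could look roughly like this for a file on disk. This is only a sketch: the helper name `parse_file_as_bytes` and the temporary cp1252 file are invented for the example, and it relies on `ast.parse()` accepting bytes and honouring the encoding comment, as described in the comment.

```python
import ast
import os
import tempfile


def parse_file_as_bytes(path):
    """Hypothetical helper: hand the raw bytes of `path` straight to ast.parse()."""
    # No decoding happens here; the parser reads the encoding comment itself.
    with open(path, "rb") as source_file:
        return ast.parse(source_file.read(), filename=path)


# Demo: a cp1252-encoded source file with an encoding comment on line 1.
demo_bytes = '# encoding: cp1252\na = "\u00b6"\n'.encode("cp1252")
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as tmp:
    tmp.write(demo_bytes)

try:
    bytes_tree = parse_file_as_bytes(tmp.name)
finally:
    os.remove(tmp.name)

print(ast.dump(bytes_tree))
```

This avoids the extra import when the source text itself is not needed, which seems to be the simpler route if the file goes directly to ast.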