Issue #799 closed

encoding argument ignored for input on stdin

Reuben Thomas
created an issue

According to the documentation:

  • If you give an encoding option, it will be used as the input and output encoding.
  • If you give an outencoding option, it will override encoding as the output encoding.

So, I run:

$ pygmentize -O encoding=iso-8859-1,outencoding=UTF-8 -l latex microtype.dtx

and it works fine. (Incidentally, the file I am using for testing is available at <http://anorien.csc.warwick.ac.uk/mirrors/CTAN/macros/latex/contrib/microtype/microtype.dtx>.) But if instead I run:

$ cat microtype.dtx | pygmentize -O encoding=iso-8859-1,outencoding=UTF-8 -l latex

then I get the following error:

Traceback (most recent call last):
  File "/home/rrt/.local/bin/pygmentize", line 9, in <module>
    load_entry_point('Pygments==1.5', 'console_scripts', 'pygmentize')()
  File "/home/rrt/.local/lib/python3.2/site-packages/Pygments-1.5-py3.2.egg/pygments/cmdline.py", line 395, in main
    code = sys.stdin.read()
  File "/usr/lib/python3.2/codecs.py", line 300, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 17827: invalid start byte

Apparently when reading from stdin, pygments is ignoring the encoding argument and using the terminal encoding (I'm on a UTF-8 terminal).
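
For illustration, here is a minimal sketch of the behaviour I would expect: read raw bytes from stdin and decode them with the requested encoding, rather than letting Python 3 decode them with the default text encoding. The helper name read_source and the UTF-8 fallback are my own assumptions for this example; this is not the actual code in pygments/cmdline.py.

```python
import io


def read_source(stream, encoding=None):
    """Read all of `stream` and return text decoded with `encoding`.

    Hypothetical helper, not part of Pygments.
    """
    # A text stream such as sys.stdin exposes the underlying binary stream
    # as `.buffer`; reading from it avoids the implicit default decoding.
    raw = getattr(stream, "buffer", stream)
    data = raw.read()
    if isinstance(data, bytes):
        # Assumption: fall back to UTF-8 when no encoding option was given.
        return data.decode(encoding or "utf-8")
    # The stream had no binary layer and already produced text.
    return data


# Simulate a UTF-8 text stdin carrying ISO-8859-1 bytes (0xa0 is a
# non-breaking space in ISO-8859-1 and an invalid start byte in UTF-8).
fake_stdin = io.TextIOWrapper(io.BytesIO(b"caf\xe9\xa0tex"), encoding="utf-8")
text = read_source(fake_stdin, "iso-8859-1")
```

With something like this, the piped invocation would call read_source(sys.stdin, "iso-8859-1") and get the same text as when the file name is passed on the command line.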
