Fix issue 10: UnicodeDecodeError from pip

#5 Merged at f79fcdc
  1. Marc Abramowitz


Setting

os.environ['PYTHONIOENCODING'] = 'utf_8'

before calling pip solves the problem.
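
As a rough illustration of the workaround, the sketch below sets PYTHONIOENCODING for a child Python process and reads its output back as UTF-8. The names `run_with_utf8_io` and `demo_child_command` are hypothetical and are not part of tox or pip. The child process is only a stand-in for pip. Unlike the one-liner above, the sketch passes a copy of the environment to the child rather than modifying `os.environ` in place.

```python
import os
import subprocess
import sys


def run_with_utf8_io(cmd):
    """Run cmd with PYTHONIOENCODING=utf_8 and return its stdout as text.

    Hypothetical helper for illustration only; not part of tox or pip.
    """
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = "utf_8"  # the setting from this pull request
    result = subprocess.run(cmd, env=env, stdout=subprocess.PIPE, check=True)
    # The child encoded its stdout as UTF-8, so decode it the same way.
    return result.stdout.decode("utf-8")


def demo_child_command():
    """Build a stand-in for pip: a child Python that prints curly quotes."""
    # The raw string keeps the \u escapes literal so the child interprets them.
    code = r"print('return type of \u2018main\u2019 is not \u2018int\u2019')"
    return [sys.executable, "-c", code]
```

Calling `run_with_utf8_io(demo_child_command())` should return the message with its curly quotes intact, whatever the locale of the parent process.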

Comments (6)

    1. Marc Abramowitz author

      OK, I took a stab at a simple test. It's kind of a silly test, because all it does is check that the environment variable is set. A better test would simulate the pip command returning UTF-8 text, but I haven't figured out how to do that kind of mocking in py.test yet.

      1. Marc Abramowitz author

        Here's a possible way to simulate gcc returning UTF-8:

        $ echo 'void main(){}' | gcc -xc -c -o /dev/null -
        <stdin>: In function ‘main’:
        <stdin>:1: warning: return type of ‘main’ is not ‘int’
    2. Marc Abramowitz author

      As far as I can tell, this is mostly a bug in pip on Python 3, and we are simply working around it, so writing a test is challenging. A proper test, it seems, would belong in pip rather than in tox. Here's the beginnings of an approach for a test in pip, which I will hopefully polish and then send to the pip folks:

      $ python -V
      Python 3.2.2
      $ cat broken_emits_utf8/ 
      # -*- coding: utf-8 -*-
      from distutils.core import setup
      import sys
      class FakeError(Exception):
          pass
      if sys.argv[1] == 'install':
          sys.stdout.buffer.write(b'\nThis package prints out UTF-8 stuff like:\n')
          sys.stdout.buffer.write('* return type of ‘main’ is not ‘int’\n'.encode('utf-8'))
          sys.stdout.buffer.write('* Björk Guðmundsdóttir [ˈpjœr̥k ˈkvʏðmʏntsˌtoʊhtɪr]'.encode('utf-8'))
          raise FakeError('this package designed to fail on install')
      $ pip install broken_emits_utf8/ | grep -v 'xxx'
        File "/Users/marc/python/virtualenvs/py3.1-phpserialize/lib/python3.2/site-packages/pip-1.0.2-py3.2.egg/pip/", line 230, in call_subprocess
          line = console_to_str(stdout.readline())
        File "/Users/marc/python/virtualenvs/py3.1-phpserialize/lib/python3.2/site-packages/pip-1.0.2-py3.2.egg/pip/", line 60, in console_to_str
          return s.decode(console_encoding)
      UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 17: ordinal not in range(128)
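
The failure in the traceback can be reproduced without pip. The sketch below is a minimal stand-in, assuming the console encoding was 'ascii' as the error message suggests. Here `console_to_str` is a simplified version of the function named in the traceback, not pip's actual code, and `try_decode` is a hypothetical helper.

```python
def console_to_str(s, console_encoding):
    """Simplified stand-in for the pip function named in the traceback."""
    return s.decode(console_encoding)


# The second line written by the sample setup script, as UTF-8 bytes.
# \u2018 and \u2019 are the curly quotes from that line.
LINE = "* return type of \u2018main\u2019 is not \u2018int\u2019\n".encode("utf-8")


def try_decode(encoding):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return console_to_str(LINE, encoding)
    except UnicodeDecodeError as exc:
        return exc
```

With 'ascii', `try_decode` returns an error whose `start` attribute is 17 and whose offending byte is 0xe2, the first byte of the opening curly quote, which matches the message above. With 'utf-8', the same bytes decode cleanly.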