"easy_install pyLSM" fails with UnicodeEncodeError

Issue #17 open
Sridhar Ratnakumar
created an issue

{{{

!python

Searching for pyLSM
Reading http://pypi.python.org/simple/pyLSM/
Reading http://www.embl-heidelberg.de/~roduit/
Reading https://launchpad.net/pylsm/+download
Reading http://www.freesbi.ch/pylsm
Traceback (most recent call last):
File "bin/easy_install", line 8, in <module>
load_entry_point('setuptools==0.6c9', 'console_scripts', 'easy_install')()
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/command/easy_install.py", line 1671, in main
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/command/easy_install.py", line 1659, in with_ei_usage
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/command/easy_install.py", line 1675, in <lambda>
File "/opt/ActivePython-2.6/lib/python2.6/distutils/core.py", line 152, in setup
dist.run_commands()
File "/opt/ActivePython-2.6/lib/python2.6/distutils/dist.py", line 975, in run_commands
self.run_command(cmd)
File "/opt/ActivePython-2.6/lib/python2.6/distutils/dist.py", line 995, in run_command
cmd_obj.run()
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/command/easy_install.py", line 211, in run
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/command/easy_install.py", line 433, in easy_install
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 462, in fetch_distribution
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 303, in find_packages
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 617, in scan_url
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 201, in process_url
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 278, in process_index
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 617, in scan_url
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 188, in process_url
File "/tmp/fo/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 652, in info
File "/opt/ActivePython-2.6/lib/python2.6/distutils/log.py", line 38, in info
self._log(INFO, msg, args)
File "/opt/ActivePython-2.6/lib/python2.6/distutils/log.py", line 28, in _log
print msg % args
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2329' in position 86: ordinal not in range(128)

}}}

Comments (9)

  1. Sridhar Ratnakumar reporter

    I get a different traceback when using the setuptools API to download the same package (but as an sdist):

      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/command/easy_install.py", line 211, in run
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/command/easy_install.py", line 433, in easy_install
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 462, in fetch_distribution
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 303, in find_packages
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 617, in scan_url
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 201, in process_url
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 278, in process_index
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 617, in scan_url
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 189, in process_url
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 579, in open_url
      File "/home/apy/as/pypm/scratch/t1/lib/python2.6/site-packages/setuptools-0.6c9-py2.6.egg/setuptools/package_index.py", line 717, in open_with_auth
      File "/home/apy/ActivePython-2.6/lib/python2.6/urllib2.py", line 124, in urlopen
        return _opener.open(url, data, timeout)
      File "/home/apy/ActivePython-2.6/lib/python2.6/urllib2.py", line 383, in open
        response = self._open(req, data)
      File "/home/apy/ActivePython-2.6/lib/python2.6/urllib2.py", line 401, in _open
        '_open', req)
      File "/home/apy/ActivePython-2.6/lib/python2.6/urllib2.py", line 361, in _call_chain
        result = func(*args)
      File "/home/apy/ActivePython-2.6/lib/python2.6/urllib2.py", line 1130, in http_open
        return self.do_open(httplib.HTTPConnection, req)
      File "/home/apy/ActivePython-2.6/lib/python2.6/urllib2.py", line 1102, in do_open
        h.request(req.get_method(), req.get_selector(), req.data, headers)
      File "/home/apy/ActivePython-2.6/lib/python2.6/httplib.py", line 874, in request
        self._send_request(method, url, body, headers)
      File "/home/apy/ActivePython-2.6/lib/python2.6/httplib.py", line 911, in _send_request
        self.endheaders()
      File "/home/apy/ActivePython-2.6/lib/python2.6/httplib.py", line 868, in endheaders
        self._send_output()
      File "/home/apy/ActivePython-2.6/lib/python2.6/httplib.py", line 740, in _send_output
        self.send(msg)
      File "/home/apy/ActivePython-2.6/lib/python2.6/httplib.py", line 719, in send
        self.sock.sendall(str)
      File "<string>", line 1, in sendall
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u2329' in position 61: ordinal not in range(128)
    
  2. Tarek Ziadé repo owner

    It seems this was fixed for PyLSM.

    I can still reproduce it with ServPDF.

    It happens because the scanned URL is a unicode string containing a character that cannot be encoded as ASCII.

    What is required is converting the original page to ASCII before processing its URLs.

  3. Tarek Ziadé repo owner

    OK, so the reason is that the URL contains "lang", and the code translates it to an HTML entity code point using htmlentitydefs.name2codepoint.

    So 'lang' becomes htmlentitydefs.name2codepoint['lang'], which is 0x2329 (the character U+2329, the same one reported in the traceback).

    I don't know why all URLs found on pages are processed like that. As far as I understand, we should remove all htmlencode() attempts on URLs found on pages.

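The mechanism described in the comments above can be reproduced with a short sketch. It is an illustration under stated assumptions, not the setuptools code itself: the pattern `ENTITY`, the helpers `decode_entity` and `lenient_htmldecode`, and the example URL are all hypothetical, and the key assumption is that the decoder accepts an entity reference without its trailing `;`. The sketch is written for Python 3, where `htmlentitydefs` is available as `html.entities`.

```python
import re
from html.entities import name2codepoint  # Python 3 name for htmlentitydefs

# Assumption: the entity pattern treats the trailing ";" as optional,
# so "&lang" inside a query string looks like the entity "&lang;".
ENTITY = re.compile(r"&(#(\d+|x[\da-fA-F]+)|[\w.:-]+);?")


def decode_entity(match):
    """Replace one entity reference with the character it names."""
    what = match.group(1)
    if what.startswith("#x"):
        return chr(int(what[2:], 16))
    if what.startswith("#"):
        return chr(int(what[1:]))
    code = name2codepoint.get(what)
    # Unknown names are left untouched.
    return chr(code) if code is not None else match.group(0)


def lenient_htmldecode(text):
    """Hypothetical decoder that is applied to URLs scraped from a page."""
    return ENTITY.sub(decode_entity, text)


# Hypothetical URL with an ordinary "lang" query parameter.
url = "http://example.invalid/page?id=1&lang=en"
decoded = lenient_htmldecode(url)

# "&lang" has been swallowed and replaced by U+2329.
print(repr(decoded))

try:
    decoded.encode("ascii")
except UnicodeEncodeError as exc:
    # Same class of failure as in the tracebacks above.
    print("encode failed:", exc)
```

Under these assumptions, a harmless query parameter named `lang` is enough to produce a URL that can no longer be encoded as ASCII, which matches both the logging failure and the socket failure reported in this issue.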