Unencodeable character (in programme title presumably) crashes Python

Issue #5 resolved
Joe Green created an issue

Currently occurring when I choose Yahoo Plus7:

 50) Live Well
 51) The Logie Awards
 52) Losing It With John Stamos
 53) Lyndey Milan's Taste Of Australia
Traceback (most recent call last):
  File "D:\Documents and Settings\xxx\My Documents\webdl\grabber.py", line 56, in <module>
    main()
  File "D:\Documents and Settings\xxx\My Documents\webdl\grabber.py", line 41, in main
    result = choose(options, allow_multi=will_download)
  File "D:\Documents and Settings\xxx\My Documents\webdl\grabber.py", line 10, in choose
    print "%3d) %s" % (i+1, key)
  File "C:\Python27\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019' in position 29: character maps to <undefined>

This is using Python 2.7.3 on WinXP SP3.

Of course I can work round this with a try/except for now.

Mind you, I get the same error in the interactive Python interpreter when I try to execute:

print u'\u0420\u043e\u0441\u0441\u0438\u044f'

which should print

Россия

So you could argue that it's a function of the DOS shell's inability to cope, rather than anything Pythonesque. But I wonder if there's some simple function to sanitise Unicode strings (or maybe I just have to write my own high-byte-clearing lambda or something).
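
For what it's worth, one way such a sanitising function could look is sketched below. It is only a sketch: the name `console_safe` is made up for illustration, and the `cp850` default is taken from the traceback above (in practice the console's encoding could be read from `sys.stdout.encoding` instead). It relies only on the standard `errors` argument of `encode`, so nothing is "cleared" by hand; characters the code page lacks are simply replaced.

```python
def console_safe(text, encoding="cp850", errors="replace"):
    """Return text with characters the target encoding lacks swapped out.

    Hypothetical helper: round-trips the string through the console
    encoding so that whatever comes back is guaranteed to be printable.
    """
    return text.encode(encoding, errors).decode(encoding)


# U+2019 (right single quotation mark) is not in cp850, so it becomes "?".
print(console_safe(u"Lyndey Milan\u2019s Taste Of Australia"))

# With "backslashreplace" the missing characters are shown as escapes instead.
print(console_safe(u"\u0420\u043e\u0441\u0441\u0438\u044f", errors="backslashreplace"))
```

Characters that cp850 does contain (accented Latin letters, for example) pass through unchanged, so only the unprintable ones are affected.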

Comments (7)

  1. delx repo owner
    • assigned issue to delx
    • edited description

    Hi,

    Could you try changing the following line in grabber.py so that it encodes the episode title as UTF-8 before printing to stdout?

    Original:

            print "%3d) %s" % (i+1, key)
    

    New value:

            print "%3d) %s" % (i+1, key.encode('utf-8'))
    

    Please let me know if that fixes the problem.

  2. Tyler Durden

    @delx can you please upload the fix to the download section and also the git repository?

    Also, I tried changing that line as suggested but get this error:

    File "grabber.py", line 9
        print "%3d) %s" % (i+1, key.encode('utf-8'))
                      ^
    SyntaxError: invalid syntax
    

    I followed the instructions in the overview ("git clone https://bitbucket.org/delx/webdl") and still encounter this problem, so it seems the repository was not updated with the fix.

  3. delx repo owner

    Hi @madmax3, your problem is unrelated to this issue. Could you please open a new one?

    It appears that you're using an old version of the code, so I would double-check that you're up to date by running git pull.

  4. Tyler Durden

    @delx I just did a git pull and it says it is "Already up-to-date."

    What do you mean unrelated to this issue? I am getting this error, which looks very similar to the OP's, that's why I found this thread.

    I am also running Python 3.5.2 on Windows 10.

    C:\webdl>venv\Scripts\activate.bat
    
    (venv) C:\webdl>python grabber.py
      1) ABC iView
      2) Nine
      3) SBS
      4) Ten
      0) Back
    Choose> 4
      1) Art Without Borders
      2) Australia By Design
      3) Australian Fishing Championships Series XIII
         * 
     78) St Francis
     79) Studio 10
     80) TEN Eyewitness News First At Five
     81) Todd Sampson's Body Hack
    Traceback (most recent call last):
      File "grabber.py", line 55, in <module>
        main()
      File "grabber.py", line 40, in main
        result = choose(options, allow_multi=will_download)
      File "grabber.py", line 9, in choose
        print("%3d) %s" % (i+1, key))
      File "C:\webdl\venv\lib\encodings\cp850.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_map)[0]
    UnicodeEncodeError: 'charmap' codec can't encode character '\u2019' in position 17: character maps to <undefined>
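
For the Python 3 traceback above, a possible workaround (a sketch only, not the fix that went into webdl) is to wrap the output stream so that characters missing from the console's code page are replaced rather than raising. The helper name `tolerant_writer` is hypothetical, and the in-memory `BytesIO` below merely stands in for a cp850 console so the example behaves the same anywhere.

```python
import io


def tolerant_writer(binary_stream, encoding):
    """Wrap a binary stream so unencodable characters are written as '?'.

    Hypothetical helper built on the standard io.TextIOWrapper class.
    """
    return io.TextIOWrapper(binary_stream, encoding=encoding, errors="replace")


# Stand-in for a cp850 console; a real script might instead do
#   sys.stdout = tolerant_writer(sys.stdout.buffer, sys.stdout.encoding)
fake_console = io.BytesIO()
out = tolerant_writer(fake_console, "cp850")

# Same format string as grabber.py's choose(); U+2019 no longer raises.
print("%3d) %s" % (81, "Todd Sampson\u2019s Body Hack"), file=out)
out.flush()
```

A similar effect can probably be had without touching the code by setting the PYTHONIOENCODING environment variable to something like cp850:replace before running grabber.py.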
    