Accessing worksheets with unicode names

Issue #515 resolved
Cheng Guo
created an issue

I have an Excel file with unicode names for the worksheets:

wb.get_sheet_names()

output:

[u'\u4efb\u52a11',
 u'\u4efb\u52a12',
 u'\u4efb\u52a13A',
 u'\u4efb\u52a13B',
 u'\u4efb\u52a14-A',
 u'\u4efb\u52a14-B',
 u'\u4efb\u52a15-A',
 u'\u4efb\u52a15-B']

Then I did this to access the worksheet:

sheet1 = wb.get_sheet_names()[0]
wb[sheet1]

Here is the error:

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
/Users/cheng/.virtualenvs/edge/lib/python2.7/site-packages/IPython/core/formatters.pyc in __call__(self, obj)
    695                 type_pprinters=self.type_printers,
    696                 deferred_pprinters=self.deferred_printers)
--> 697             printer.pretty(obj)
    698             printer.flush()
    699             return stream.getvalue()

/Users/cheng/.virtualenvs/edge/lib/python2.7/site-packages/IPython/lib/pretty.pyc in pretty(self, obj)
    381                             if callable(meth):
    382                                 return meth(obj, self, cycle)
--> 383             return _default_pprint(obj, self, cycle)
    384         finally:
    385             self.end_group()

/Users/cheng/.virtualenvs/edge/lib/python2.7/site-packages/IPython/lib/pretty.pyc in _default_pprint(obj, p, cycle)
    501     if _safe_getattr(klass, '__repr__', None) not in _baseclass_reprs:
    502         # A user-provided repr. Find newlines and replace them with p.break_()
--> 503         _repr_pprint(obj, p, cycle)
    504         return
    505     p.begin_group(1, '<')

/Users/cheng/.virtualenvs/edge/lib/python2.7/site-packages/IPython/lib/pretty.pyc in _repr_pprint(obj, p, cycle)
    683     """A pprint that just redirects to the normal repr function."""
    684     # Find newlines and replace them with p.break_()
--> 685     output = repr(obj)
    686     for idx,output_line in enumerate(output.splitlines()):
    687         if idx:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 12-13: ordinal not in range(128)

Then I tried this:

wb[name1.encode('utf8')]

The output:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-72-6648e632677d> in <module>()
----> 1 wb[name1.encode('utf8')]

/Users/cheng/.virtualenvs/edge/lib/python2.7/site-packages/openpyxl/workbook/workbook.pyc in __getitem__(self, key)
    231             if sheet.title == key:
    232                 return sheet
--> 233         raise KeyError("Worksheet {0} does not exist.".format(key))
    234 
    235     def __delitem__(self, key):

KeyError: 'Worksheet \xe4\xbb\xbb\xe5\x8a\xa11 does not exist.'

I know this has something to do with unicode vs. str. What am I supposed to do to access the worksheet?

Thanks!

P.S. Versions I used: openpyxl 2.2.5, Python 2.7.9

Comments (4)

  1. CharlieC

    Thanks for the report. This is similar to something that has cropped up on the mailing list recently. You can actually work with the worksheet, but the repr for the sheet is causing a problem in your environment, which in your case looks like IPython.

    Could you try patching the repr_format so that it doesn't try to decode the title? When reading a file, the title will always be unicode. We just need to make sure that it is always unicode when we set it.
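
A minimal sketch of the behaviour discussed in this thread, using hypothetical stand-in classes (`FakeSheet`, `FakeWorkbook`) rather than openpyxl itself. The lookup loop mirrors the `__getitem__` lines visible in the traceback above. The `__repr__` deliberately forces an ASCII encode, as an assumption, to imitate what Python 2 does implicitly when a repr contains non-ASCII unicode. Under those assumptions it suggests that the lookup with the unicode key succeeds, that only displaying the object fails, and that a UTF-8 encoded key never matches a unicode title.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-ins for illustration only; these are NOT openpyxl classes.


class FakeSheet(object):
    def __init__(self, title):
        self.title = title

    def __repr__(self):
        # Assumption: simulate the implicit ASCII encode that Python 2
        # applies to a unicode repr. The encode is explicit here so the
        # sketch behaves the same way on Python 3.
        self.title.encode('ascii')
        return '<FakeSheet>'


class FakeWorkbook(object):
    def __init__(self, titles):
        self.sheets = [FakeSheet(t) for t in titles]

    def __getitem__(self, key):
        # Same shape as the lookup shown in the traceback:
        # compare each sheet title with the key, else raise KeyError.
        for sheet in self.sheets:
            if sheet.title == key:
                return sheet
        raise KeyError("Worksheet {0!r} does not exist.".format(key))


wb = FakeWorkbook([u'\u4efb\u52a11', u'\u4efb\u52a12'])
name = u'\u4efb\u52a11'

# Lookup with the unicode key succeeds; assigning the result avoids
# having the interactive shell call repr() on it.
ws = wb[name]
title = ws.title

# Displaying the object is the step that fails.
try:
    repr(ws)
    repr_failed = False
except UnicodeEncodeError:
    repr_failed = True

# A UTF-8 encoded key is a byte string and never equals the unicode title.
try:
    wb[name.encode('utf8')]
    bytes_key_failed = False
except KeyError:
    bytes_key_failed = True
```

In an interactive session, this suggests assigning the result (`ws = wb[sheet1]`) rather than evaluating `wb[sheet1]` as a bare expression, so that IPython does not try to print the repr.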
