'load_workbook' fails on opening a workbook because of a protected sheet

Issue #488 resolved
David Thenon
created an issue

Hi,

I'm trying to open a workbook with some protected sheets and it raises an exception:

Traceback (most recent call last):
File "/home/emencia/projects/inserdiag/django-apps-src/autodiag/autodiag/api/parser/inspector.py", line 371, in inspect
    wb.open(filename)
File "/home/emencia/projects/inserdiag/django-apps-src/autodiag/autodiag/api/parser/inspector.py", line 132, in open
    self.workbook_formula_mode = load_workbook(filename=filepath)
File "/home/emencia/projects/inserdiag/eggs/openpyxl-2.2.4-py2.7.egg/openpyxl/reader/excel.py", line 149, in load_workbook
    _load_workbook(wb, archive, filename, read_only, keep_vba)
File "/home/emencia/projects/inserdiag/eggs/openpyxl-2.2.4-py2.7.egg/openpyxl/reader/excel.py", line 236, in _load_workbook
    color_index=wb._colors)
File "/home/emencia/projects/inserdiag/eggs/openpyxl-2.2.4-py2.7.egg/openpyxl/reader/worksheet.py", line 327, in read_worksheet
    fast_parse(ws, xml_source, shared_strings, style_table, color_index)
File "/home/emencia/projects/inserdiag/eggs/openpyxl-2.2.4-py2.7.egg/openpyxl/reader/worksheet.py", line 315, in fast_parse
    parser.parse()
File "/home/emencia/projects/inserdiag/eggs/openpyxl-2.2.4-py2.7.egg/openpyxl/reader/worksheet.py", line 94, in parse
    dispatcher[tag_name](element)
File "/home/emencia/projects/inserdiag/eggs/openpyxl-2.2.4-py2.7.egg/openpyxl/reader/worksheet.py", line 283, in parse_sheet_protection
    self.ws.protection = SheetProtection(**values)
TypeError: __init__() got an unexpected keyword argument 'hashValue'
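
The last frame comes down to a keyword-argument mismatch: the attributes read from the sheet protection element are unpacked into the constructor with `**values`, so any attribute the constructor does not name raises TypeError. A minimal sketch of that mechanism without openpyxl, where `StrictProtection` and its parameters are hypothetical stand-ins and not the library's real class:

```python
class StrictProtection:
    """Hypothetical stand-in: accepts only the attributes it names."""

    def __init__(self, sheet=False, password=None):
        self.sheet = sheet
        self.password = password


def build(values):
    """Mimic the parser: unpack the XML attributes as keyword arguments."""
    return StrictProtection(**values)


# Attributes that the signature knows about are accepted.
old_style = build({"sheet": "1", "password": "CC1A"})

# An attribute missing from the signature is rejected.
try:
    build({"sheet": "1", "hashValue": "abc="})
    error = None
except TypeError as exc:
    error = str(exc)
```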

This is one of six workbooks I'm opening with openpyxl. They all have protection and are somewhat similar, but this occurs with only one of them. All of these workbooks were created with MS Office.

None of our users reported any bugs with them.

From my tests, I saw that opening the workbook with "read_only" mode enabled doesn't raise the exception about the 'hashValue' key, but I can't use this mode because I need shared_formula (there are a lot of them in our workbooks) and, from what I saw in openpyxl 2.2.4, shared_data is always empty in the sheet attributes.

I can't really provide the workbook file here as it's a private file from our customer, but I could send it to the developer(s) at their private email if needed.
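
When the file itself can't be shared, one option might be to share only the protection attributes. An .xlsx file is a zip archive and worksheet parts normally live under xl/worksheets/, so the sheetProtection elements can be listed with the standard library alone. A rough sketch: the function name `protection_attributes` is made up for illustration, and it assumes the usual SpreadsheetML main namespace.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def protection_attributes(xlsx):
    """Map each worksheet part to the attributes of its sheetProtection element.

    `xlsx` is a path or a binary file-like object. Sheets without a
    sheetProtection element are left out of the result.
    """
    found = {}
    with zipfile.ZipFile(xlsx) as archive:
        for name in archive.namelist():
            if not (name.startswith("xl/worksheets/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(name))
            element = root.find(NS + "sheetProtection")
            if element is not None:
                found[name] = dict(element.attrib)
    return found
```

Running it over each of the six workbooks and comparing the attribute names should show which sheets carry hashValue.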

Comments (11)

  1. CharlieC

    This is interesting as we've recently been looking at the worksheet protection stuff. @Hamza Khchine's fork adds support for this, I think.

    2.3 should now unpack shared formulae into the relevant ones for individual cells, so you might want to give that a spin: pip install -U --pre openpyxl

    Could you send me a file to look at? And what version of MS Office are you using?

  2. CharlieC

    E-mail addresses are best passed via PM, I reckon. Though the spammers normally know them anyway.

    I suspect this is coming from a newer version of Office as hashValue has replaced password as the attribute in the spec. We've been debating how much of the other attributes we should keep around – anything related to the encryption algorithm is essentially useless when you can read the source and I'd rather not have them in the class signature.

    https://bitbucket.org/khchine5/openpyxl/src/154112f3ad873a8f5b75aa73ad0c32946f05ef65/openpyxl/worksheet/chartsheet.py?at=2.3#cl-231

    That fork should be able to chomp this kind of file. Maybe we'll get away with a simple alias for hashValue.

  3. CharlieC

    Naively add missing hashValue attribute. This is fine for roundtripping. Khchine's version has more sophisticated definitions, but the ability to implement different hashing algorithms and salts isn't necessarily something we want to do. Resolves #488
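
For illustration, a naive fix of this kind could look like the sketch below: the class accepts hashValue, stores it as-is and writes it back unchanged, without computing or verifying any hash. The class name, the `to_attributes` method and the parameter list are assumptions for the example, not the code in the changeset.

```python
class RoundTripProtection:
    """Hypothetical protection class that tolerates hashValue."""

    def __init__(self, sheet=False, password=None, hashValue=None):
        self.sheet = sheet
        self.password = password
        # Stored verbatim; no hashing algorithm or salt is interpreted.
        self.hashValue = hashValue

    def to_attributes(self):
        """Return only the attributes that were set, ready to serialise."""
        candidates = {
            "sheet": self.sheet,
            "password": self.password,
            "hashValue": self.hashValue,
        }
        return {k: v for k, v in candidates.items() if v not in (None, False)}


# Attributes as read from the file go in and come out unchanged.
values = {"sheet": "1", "hashValue": "abc="}
protection = RoundTripProtection(**values)
round_tripped = protection.to_attributes()
```

Any other attribute present in the file would need the same treatment before the constructor accepts it.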

    → <<cset 61831555662d>>
