Opening and resaving this spreadsheet causes it to become corrupt

Issue #968 resolved
Daniel Bereznyi
created an issue

Just doing

import openpyxl as xl
wb = xl.load_workbook("sheet.xlsx")
wb.save("sheet_out.xlsx")

causes sheet_out.xlsx to be saved incorrectly: Excel 2016 reports the file as corrupt, although LibreOffice 6.0.0.3 (x64) opens it correctly. Validating sheet_out.xlsx in the Open XML SDK 2.5 Productivity Tool generates this: openxml.PNG

Comments (4)

  1. CharlieC

    Thanks for the report and the file. Unfortunately the validator is giving you false positives as the order of font child elements is not fixed by the specification. You'll see other incorrect errors if you look at the original file.

    As there are a lot of worksheets, it's a bit tricky to investigate this, but at least part of the problem is related to the worksheet protection, which uses an encryption scheme that openpyxl doesn't support (and we have no plans to support it either). If the protection is removed, the tabs will at least load, but they will be empty. The underlying problem, however, seems to be related to the defined names used. These are perfectly preserved but also seem to refer to an external file.

    The following code will at least allow you to read the file.

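    from openpyxl.worksheet.protection import SheetProtection

    # Replace each sheet's protection with a default, unprotected one.
    for n in wb.sheetnames:
        ws = wb[n]
        ws.protection = SheetProtection()
    # Drop all defined names, including those that refer to the external file.
    wb.defined_names.definedName = []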
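
To check whether a given file has the two features mentioned above (sheet protection and defined names that point at another workbook), the package can be inspected with the standard library alone. This is only a rough sketch, not part of openpyxl: the function name `inspect_xlsx` is made up for illustration, it assumes the workbook part is stored at the conventional path `xl/workbook.xml` and uses the transitional SpreadsheetML namespace, and the `[1]`-style pattern used to spot external references is a heuristic.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Transitional SpreadsheetML namespace (assumption: the file is not "Strict" OOXML).
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Heuristic: references into another workbook usually look like [1]Sheet1!$A$1.
EXTERNAL_REF = re.compile(r"\[\d+\]")


def inspect_xlsx(source):
    """Return defined names that look external and the protected sheet parts.

    `source` is a path or a binary file-like object holding an .xlsx package.
    """
    report = {"external_names": [], "protected_parts": {}}
    with zipfile.ZipFile(source) as archive:
        # Defined names are stored in the workbook part.
        workbook = ET.fromstring(archive.read("xl/workbook.xml"))
        for node in workbook.iter(f"{{{MAIN_NS}}}definedName"):
            value = node.text or ""
            if EXTERNAL_REF.search(value):
                report["external_names"].append((node.get("name"), value))

        # Each worksheet part may carry a <sheetProtection> element.
        for part in archive.namelist():
            if not (part.startswith("xl/worksheets/") and part.endswith(".xml")):
                continue
            sheet = ET.fromstring(archive.read(part))
            protection = sheet.find(f"{{{MAIN_NS}}}sheetProtection")
            if protection is not None:
                # algorithmName is present for hash-based protection and
                # None when only the legacy `password` attribute is used.
                report["protected_parts"][part] = protection.get("algorithmName")
    return report
```

Calling `inspect_xlsx("sheet.xlsx")` returns a dictionary with the suspicious defined names and, for each protected worksheet part, the declared hash algorithm (or `None`). An empty result does not prove the file is safe to round-trip; it only means neither feature was detected by these simple checks.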
    