Openpyxl corrupts Excel *.xlsm files if they contain worksheets that are copied from an external workbook using Excel 2007

Issue #546 resolved
Philipp Freyer
created an issue

Issue was already discussed here: openpyxl-users Mailing-List

I need to update information in a lot of Excel worksheets. They are created with Excel 2007. Editing the data works fine but I have problems if someone using these sheets inserts another worksheet into them by copying them from another workbook with the Excel "move or copy" functionality on worksheets.

I am using the latest openpyxl library from PyPi (2.3.0) and Python 2.7.8 32 bit on Windows 7 (64 bit).

Here a short example script that already shows the problem:

import openpyxl

update_sheet=u"testfile.xlsm"

wb = openpyxl.load_workbook(update_sheet, keep_vba=True)
wb.save(update_sheet)

The script just opens and saves the file. This works fine when editing a normal Excel workbook, but if it contains a worksheet copied into there from another Excel workbook, it produces invalid files that need repairing.

I attached sample workbooks: TESTBOOK_okay.xlsm - here, the script works fine TESTBOOK_broken.xlsm - here, the script produces invalid sheets.

Excel says: Excel found unreadable content in 'TESTBOOK_broken.xlsm'. Do you want to recover the contents of this workbook? If you trust the source of this workbook, click Yes.

Recover log file unfortunately does not say much:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"><logFileName>error086080_04.xml</logFileName><summary>Errors were detected in file 'C:\[path]\TESTBOOK_broken.xlsm'</summary><additionalInfo><info>Excel completed file level validation and repair. Some parts of this workbook may have been repaired or discarded.</info></additionalInfo></recoveryLog>

Copying and Inserting is meaning the function in Microsoft Excel itself, meaning doing a right click on the worksheet tab in the bottom of excel and using the "move or copy" option there for copying the sheet over.

By playing a little bit with the Debugger, I found a workaround that works in my case but may not be the best option for this in general:

wb._external_links = []

With this option, I delete the reference to the external sheet and Excel opens the file just fine.

Comments (1)

  1. Log in to comment