Error while opening .xlsm file written by openpyxl

Issue #467 resolved
Marcial Messmer
created an issue

I try to use open and save a .xlsm file with openpyxl. However, the .xlsm file can't be opened by excel.The ultimate goal is to add content to an existing worksheet in that specific workbook. I use the following code in a python 3.4 , win7 and excel 2010 envirnoment.

    from openpyxl import load_workbook
    fileDir = 'Z:\'
    wb = load_workbook(filename = fileDir + 'temp.xlsm', keep_vba=True)
    wb.save(fileDir + 'temp.xlsm')

The error message i receive when i try to open the file with excel is (excel error 101648):

"Excel found unreadable content in 'filename.xlsx'. Do you want to recover teh contents of this workbook? If you trust the source of this workbook, click Yes. (Yes) (No)"

Comments (42)

  1. Charlie Clark

    Unfortunately, the Excel error messages are worse than useless. There is probably a problem with a reference somewhere within the file but without a file it's difficult to say more. Does the file work okay if you don't try and preserve macros?

  2. Charlie Clark

    Thanks for the file. It looks like a reference to a part of the file is being preserved while the part isn't. Do you also get a warning that the header/footer can't be preserved?

  3. Charlie Clark

    Looks like the Excel error is related to the handling of the comments. If the ActiveX stuff is relevant to your use then its worth noting that it will all be lost if the file is processed by openpyxl.

  4. Marcial Messmer reporter

    Sorry, i might have miss used the word "copy". I meant when I simply load the workbook with openpyxl and save it without changing anything I cannot access the file anymore from excel - basically if I simply execute the code above.

  5. John Bovey

    @Charlie Clark This looks like another example of the vml conflict between comments and vba controls. One solution would be for me to simply suppress the creation of the commentsDrawing1.vml file when in keep_vba mode. Many versions of excel (including mine and Johannes Me's) do not need it. I can generate an exception instead but that will simply mean that users will find out there is a problem at an earlier stage.

  6. Charlie Clark

    Just to set expectations here: the specification is thousands of pages long and we don't support all of it, what we don't support, we discard sometimes with warnings. The VB support is a hack that works well enough, even in this case, but is far from perfect. The issue you're seeing seems to be related to something else, which is why it happens with or without VB being preserved. It might be the ActiveX (I don't remember seeing a file with embedded ActiveX stuff) before, but it might be something completely different. I hope to have time to investigate it properly when I get back to my office.

  7. Charlie Clark

    FWIW the current 2.3 branch will now allow you to save the file if you can live without VBA (keep_vba=False). When keeping VBA there is a duplicate entry in content types for comments but that, alas, is not the main reason why the file cannot be used.

  8. Charlie Clark

    The transitional specification limits the number of legacyDrawing elements in a worksheet to 1. Manually removing the one for commentsvml allows a file to be created that Excel will open. However, the macros seem unavailable. This may well be down to the ActiveX component or other extensions.

  9. John Bovey

    Could I clarify this. In my testing, the macros are available but not not the ActiveX controls. The problem with legacyDrawing elements is caused by the fact that openpyxl uses one in the sheet to link to any controls and one to support commented cells. If there are two in a sheet then excel rejects it, which means it breaks on sheets that contain both comments and controls. I am working on a fix (ready in a day or two) to deal with the conflict between comments and controls, but ActiveX controls will probably not be handled for a while. Form controls should work, though.

  10. Charlie Clark

    That's good to hear! I can't see the macros on my machine but I think that happens quite a lot on MacOS. The ActiveX controls all seem to be linked indirectly from the worksheet relationships. I suspect it might be possible to shoehorn these in if required.

    There are also quite a lot of dependencies on extensions which we certainly won't support for a while.

  11. Marcial Messmer reporter

    It calls the updated version, I've checked the writer/excel.py and it is up to date. I am now using through pandas.to_excel() with 'engine = 'openpyxl', but it should not make a difference.

  12. Marcial Messmer reporter

    I've manually updated openpyxl via pip and bitbucket. it also seems to call the updated version of openpyxl (i checked via an included print statement inside writer/excel.py)

  13. Marcial Messmer reporter

    I understand. To limit the problem, I just re-run the code from the original issue with a different file and somehow strange behaviour

        from openpyxl import load_workbook
        fileDir = 'Z:\\'
        wb = load_workbook(filename = fileDir + 'test_openpyxl.xlsm',
                           keep_vba=True)
        wb.save(fileDir + 'temp1.xlsm')
    

    I can open the file only after excel asks to repair the file. The content of the sheets is there, however, the testmacro is missing.

    If I run the code and change to keep_vba=False, I cannot open the file at all, causing the same error as initially described.

  14. Charlie Clark

    What version of Excel do you have? I have tested with Excel 2011 and 2016 for MacOS. Do you have lxml installed? I'll add one of the files I created. Can you open this okay?

  15. John Bovey

    I am out at the moment but I'll have a look when I get back, probably tomorrow. The problem is almost certainly the ActiveX controls, though, which are not supported at present. John

  16. Charlie Clark

    @John Bovey actually, I shoehorned the ActiveX controls into the archive. Excel complains normally when it can't resolve the dependency graph properly. Still, it won't hurt to have you look over the changes I made… so thanks again.

  17. John Bovey

    I have just tried loading and saving the file test_openpyxl.xlsm (with keep_vba=true) and excel 2010 on Windows opens the result just fine. It also seems to be ok with the output from Adv_xl1.xlsm file other than complaining about missing libraries. This is using branch 2.2 from the current version of openpyxl on bitbucket, run on debian linux.

    I did do a quick visual comparison between the xml files in my output file and those in temp_vba_true.xlsm (which my excel also rejects). The main differences seem to be in the use of namespace prefixes.

  18. Marcial Messmer reporter

    I have just run the code on a different machine (win 8, excel 2013, openpyxl 2.2.4) and it works for me. One exception, I cannot open the file when I save the workbook loaded with keep_vba=False as a .xlsm. It works when I save it as .xlsx, which makes in most cases more sense, but is this intended to be like this? On monday I'll try again on the machine, where I have the problem, with 2.2.4 (last time I used verions 2.4 and 2.0.2).

  19. Charlie Clark

    Excel has a synthetic difference between .xlsx and .xlsm file extensions which is nothing to do with this issue. You haven't been working with checkouts which is why you're getting such weird files.

  20. Marcial Messmer reporter

    Thanks for your effort. Now I run everything with version 2.2.4, and the test files work all perfecty fine. The original file (which I cannot share) works only with keep_vba=False and save as xlsx. I can live with that (it also works with the pandas.DataFrame.to_excel() function). However, using the option keep_vba=True and save as xlsm works only after excel "repairs" the file. Then the repaired file contains all the content (vba + sheets). If you are interested, I can try to adjust the file in a way such that I could share it with you. + I am sorry for using the wrong version initially.

  21. Charlie Clark

    We can fix what we can't see. In the supplied file we did not some additional dependencies between the macro and the ActiveX plugin but they don't seem to affect the way it works.

    Tip: use xlwings instead of macros and plugins to augment your Excel files.

  22. Log in to comment