"We found a problem with some content in copy.xlsx" Openpyxl 3.0.3, open and save with no changes to file

Issue #1430 new
Ryan created an issue

openpyxl versions tested: 3.0.0 and 3.0.3

python versions tested: 3.6 and 3.7

Code to reproduce (I’ve attached the referenced files to this ticket):

import openpyxl
wb = openpyxl.load_workbook('/Users/ryan/Downloads/original.xlsx')
wb.save('/Users/ryan/Downloads/copy.xlsx')

After running this code, if I attempt to open copy.xlsx I get the following error:

“We found a problem with some content in copy.xlsx”

and here’s the XML “Repair result” report generated by Excel:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <logFileName>Repair Result to copy0.xml</logFileName>
  <summary>Errors were detected in file '/Users/ryan/Downloads/copy.xlsx'</summary>
  <removedParts summary="Following is a list of removed parts:">
    <removedPart>Removed Part: Drawing shape.</removedPart>
  </removedParts>
  <repairedRecords summary="Following is a list of repairs:">
    <repairedRecord>Repaired Records: Drawing from /xl/drawings/drawing1.xml part (Drawing shape)</repairedRecord>
  </repairedRecords>
</recoveryLog>

I’ve seen vaguely similar reports, both in the official issue tracker and more commonly in old Stack Overflow posts, but every one that was resolved was fixed by upgrading openpyxl to versions much older than the ones I’ve tried.
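Since the repair log points at /xl/drawings/drawing1.xml, one way to narrow this down is to compare which drawing- and comment-related parts exist in each archive before and after the round trip. The following is only a minimal sketch using the standard library: the helper names `drawing_parts` and `diff_parts` are hypothetical, and matching on the substrings "drawing" and "comment" is an assumption about how the parts are named, not something openpyxl or Excel guarantees.

```python
import zipfile


def drawing_parts(path):
    """Return sorted names of drawing- and comment-related parts in an .xlsx archive.

    An .xlsx file is a zip archive, so the part names can be read directly.
    The substring match is a heuristic (assumption), not a rule from the spec.
    """
    with zipfile.ZipFile(path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if "drawing" in name.lower() or "comment" in name.lower()
        )


def diff_parts(original_path, copy_path):
    """Return (parts only in the original, parts only in the copy)."""
    before = set(drawing_parts(original_path))
    after = set(drawing_parts(copy_path))
    return sorted(before - after), sorted(after - before)


# Usage with the files from this report (paths as given above):
# missing, added = diff_parts('/Users/ryan/Downloads/original.xlsx',
#                             '/Users/ryan/Downloads/copy.xlsx')
# print("only in original:", missing)
# print("only in copy:", added)
```

A difference in the listing would not by itself explain the error, but it shows which parts openpyxl dropped or renamed when saving.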

Comments (4)

  1. Ryan reporter

    On further testing I’ve discovered that if I do

    for sheet in wb:
        sheet._legacy_drawing = None

    and then save the workbook, the error message does not appear, so it’s something to do with _legacy_drawing.

  2. CharlieC

    Thanks for the update. I haven’t had a chance to look at the files, but your description suggests there is a problem relating to the use of comments. The specification here is really tricky because you end up writing the same information in three different places: the comments.xml files, the drawing.xml files and the shapes.vml files. This last one is a legacy from pre-OOXML and officially deprecated, but in practice it is the most widely understood format. And, unless you enforce “strict” OOXML, it is required if anyone is going to see the comments (openpyxl itself never uses the VML to see what the comments say). The files are also used for other things, so problems arise quickly, especially because some of the stuff required for VBA macros is sometimes also in these files.

  3. Charlie Heitzig

    Hi Ryan, did you ever solve this issue? It seems like I’m having the same or similar issue with #1455.

    Mine isn’t solved by your _legacy_drawing loop, though.

    I’d love to hear if you resolved it.

  4. Lucas Thiago Zane

    Hi guys,

    Have you solved this, or do you have any workaround? I am also having this problem.

    Thanks

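For anyone trying the reporter’s workaround from comment 1, it can help to know which sheets actually carried a legacy drawing, since comment 3 suggests the loop does not fix every file. The following is a minimal sketch: the helper name `clear_legacy_drawings` is hypothetical, and `_legacy_drawing` is a private openpyxl attribute taken from the workaround above, so it may change between versions. This is a workaround, not a fix; whether discarding the legacy drawing loses content from the original file is not established in this thread.

```python
def clear_legacy_drawings(workbook):
    """Set _legacy_drawing to None on every sheet of the workbook.

    Returns the titles of the sheets that had a legacy drawing set, so you
    can tell whether the workaround touched anything at all.
    """
    affected = []
    for sheet in workbook:
        # _legacy_drawing is private to openpyxl (assumption: it exists on
        # each worksheet, as in the workaround from comment 1).
        if getattr(sheet, "_legacy_drawing", None) is not None:
            affected.append(sheet.title)
        sheet._legacy_drawing = None
    return affected


# Usage with openpyxl (paths as given in the report):
# import openpyxl
# wb = openpyxl.load_workbook('/Users/ryan/Downloads/original.xlsx')
# print(clear_legacy_drawings(wb))
# wb.save('/Users/ryan/Downloads/copy.xlsx')
```

If the returned list is empty and Excel still reports the error, the cause is probably somewhere other than the legacy drawing.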