shape issue in load_workbook

Issue #1109 resolved
Jeremie Magnette created an issue

Hello, as asked in #971, I'm creating a new issue for my problem.

I encounter an exception when loading one of my Excel files (a very simple example is attached). It seems to be related to DrawingML. I'm using openpyxl 2.5.8. If I remove the shapes/arrows, the issue disappears.

I've seen that this issue comes up a lot here (mainly because the DrawingML API is not open?). Would it be feasible, instead of crashing, to add an optional warning/error mechanism that still lets users read problematic files? Such a system would still let users give you feedback on what is missing, while they continue to use openpyxl 2.5.x.

Right now, because I cannot delete those shapes (it is a client's file), I will downgrade to openpyxl 2.4.x.
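
The opt-in behaviour requested above could look roughly like the sketch below. It is not openpyxl code: `convert_int` and its `strict` flag are hypothetical names, used only to show a conversion that either raises on bad input or emits a warning and carries on.

```python
import warnings


def convert_int(value, strict=True):
    """Convert value to int. Hypothetical helper, not part of openpyxl.

    strict=True  -> raise TypeError on bad input (loading stops).
    strict=False -> emit a warning and return None so loading can continue.
    """
    try:
        return int(value)
    except (TypeError, ValueError):
        if strict:
            raise TypeError("expected <class 'int'>, got %r" % (value,))
        warnings.warn("ignoring unsupported value %r" % (value,))
        return None


# Lenient mode: the bad value is reported through the warnings module.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = convert_int("not-a-number", strict=False)

print(result)       # None
print(len(caught))  # 1
```

Because the report goes through the standard `warnings` module, a user could still turn it back into an error with a warnings filter, which keeps the strict behaviour available.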

Comments (4)

  1. Jeremie Magnette reporter

    And here is the traceback I got:

    openpyxl.load_workbook("shape_error.xlsx", data_only=True)
    
    
    Traceback (most recent call last):
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 57, in _convert
        value = expected_type(value)
    ValueError: invalid literal for int() with base 10: 'T0'
    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
      File "/opt/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2961, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-27-c8ce5ba0e956>", line 1, in <module>
        wk = openpyxl.load_workbook("shape_error.xlsx", data_only=True)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/reader/excel.py", line 276, in load_workbook
        for c in find_charts(archive, rel.target):
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/chart/reader.py", line 50, in find_charts
        drawing = SpreadsheetDrawing.from_tree(tree)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 84, in from_tree
        obj = desc.expected_type.from_tree(el)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 84, in from_tree
        obj = desc.expected_type.from_tree(el)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 84, in from_tree
        obj = desc.expected_type.from_tree(el)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 84, in from_tree
        obj = desc.expected_type.from_tree(el)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 84, in from_tree
        obj = desc.expected_type.from_tree(el)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 84, in from_tree
        obj = desc.expected_type.from_tree(el)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 84, in from_tree
        obj = desc.expected_type.from_tree(el)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 100, in from_tree
        return cls(**attrib)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/drawing/geometry.py", line 418, in __init__
        self.x = x
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 69, in __set__
        value = _convert(self.expected_type, value)
      File "/home/magnette/.local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 59, in _convert
        raise TypeError('expected ' + str(expected_type))
    TypeError: expected <class 'int'>
    
  2. Jeremie Magnette reporter

    And the answer you already gave me:

    As noted above support for DrawingML is incomplete and that is the source of the problem here:

    <a:cxn ang="T8">
        <a:pos x="T0" y="T1"/>
    </a:cxn>
    

    openpyxl expects an integer for the x value, but it turns out the specification accepts other values. :-/ This is probably fixable, but in the meantime you'll have to use another shape.
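
For illustration, a value of this kind could be read as "an integer if possible, otherwise a name". This is a minimal sketch, not the fix that went into openpyxl: `parse_adj_coordinate` is a hypothetical helper, and treating the non-numeric form as a name (such as `T0`) is an assumption based on the snippet above.

```python
def parse_adj_coordinate(value):
    """Return an int for numeric text, otherwise the text itself as a name.

    Hypothetical helper, not an openpyxl function.
    """
    text = str(value).strip()
    try:
        return int(text)
    except ValueError:
        if not text:
            # An empty attribute is neither a number nor a usable name.
            raise ValueError("empty coordinate")
        return text


# "T0" comes from the <a:pos> element quoted above; "21600" is an
# illustrative numeric value.
print(parse_adj_coordinate("T0"))     # T0
print(parse_adj_coordinate("21600"))  # 21600
```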

  3. CharlieC

    Thanks for the report. The API is open but the work just needs doing and there is a lot of it. It might be possible to do what Excel does and just delete objects where this happens.
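
The "just delete objects where this happens" idea can be sketched at a small scale with the standard library alone. The code below is an assumption-laden illustration, not openpyxl's reader: `read_positions` is a hypothetical function, and the sample XML combines the `<a:cxn>` element quoted in the comments above with one invented, fully numeric sibling.

```python
import xml.etree.ElementTree as ET

# DrawingML main namespace, bound to the "a:" prefix in the sample.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"

SAMPLE = """
<a:cxnLst xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
  <a:cxn ang="0"><a:pos x="10" y="20"/></a:cxn>
  <a:cxn ang="T8"><a:pos x="T0" y="T1"/></a:cxn>
</a:cxnLst>
"""


def read_positions(xml_text):
    """Parse <a:pos> elements strictly and drop the ones that fail.

    Returns (kept, dropped): kept is a list of (x, y) int tuples, dropped
    is a list of the raw attribute dicts that could not be converted.
    """
    root = ET.fromstring(xml_text.strip())
    kept, dropped = [], []
    for pos in root.iter("{%s}pos" % A_NS):
        try:
            kept.append((int(pos.get("x")), int(pos.get("y"))))
        except (TypeError, ValueError):
            # Unsupported content is set aside instead of aborting the load.
            dropped.append(dict(pos.attrib))
    return kept, dropped


kept, dropped = read_positions(SAMPLE)
print(kept)     # [(10, 20)]
print(dropped)  # [{'x': 'T0', 'y': 'T1'}]
```

Returning the dropped items, rather than discarding them silently, leaves room to report what was skipped so that users can still send feedback about unsupported content.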
