DrawingML objects -- parsing a:rect coordinates

Issue #1279 new
Peter Banks created an issue

Steps to reproduce: Load the attached workbook ‘input_image_bad_arrow.xlsx’ using openpyxl, then save it to a new file.

Expected result: A duplicate of the original file

Actual result: An empty file.

This is caused by the a:rect element in the xl/drawings/drawing1.xml file. Removing this element from the input file (as in ‘input_image_good_arrow.xlsx’) leads to the duplicate file containing the images from the original file (the drawings are still not copied; however this appears to be a separate issue/feature request).

The offending section of the file is:

<a:custGeom>
  <a:avLst/>
  <a:gdLst/>
  <a:ahLst/>
  <a:rect l="l" t="t" r="r" b="b"/>
  <a:pathLst>
    <a:path w="21600" h="21600">
      <a:moveTo>
        <a:pt x="0" y="0"/>
      </a:moveTo>
      <a:lnTo>
        <a:pt x="21600" y="21600"/>
      </a:lnTo>
    </a:path>
  </a:pathLst>
</a:custGeom>
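For anyone who needs a stopgap, removing the a:rect element (as was done by hand for ‘input_image_good_arrow.xlsx’) could be scripted roughly as below. This is only a sketch: strip_rect is a hypothetical helper, not part of openpyxl, and it assumes the transitional DrawingML namespace URI and that the drawing XML has already been read out of the .xlsx archive as a string.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the transitional DrawingML main namespace.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def strip_rect(xml_text):
    """Return xml_text with every a:rect element removed (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    rect_tag = f"{{{A_NS}}}rect"
    # Materialise the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == rect_tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

Note that ElementTree may rewrite the namespace prefix on output (for example ns0 instead of a); the namespace itself stays the same.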

The ECMA documentation specifies that the attributes l, t, r and b of the a:rect element should be of type ST_AdjCoordinate (defined in [1], p. 2924, section 20.1.10.2), which is defined to be the union of the types ST_Coordinate and ST_GeomGuideName. The latter of these can be any token. However, openpyxl expects these to be integers:

Traceback (most recent call last):
  File "/home/pbanks/src/openpyxl-testing/venv/lib/python3.7/site-packages/openpyxl/descriptors/base.py", line 57, in _convert
    value = expected_type(value)
ValueError: invalid literal for int() with base 10: 'l'
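To illustrate the union type, a converter that accepts either form might look like the sketch below. parse_adj_coordinate is a hypothetical name and this is not how openpyxl currently converts the value; it also only handles plain integer coordinates and treats anything else as a guide name.

```python
def parse_adj_coordinate(value):
    """Return an int for a numeric coordinate, else the guide name as a str.

    Hypothetical helper: models ST_AdjCoordinate as "integer or token".
    """
    text = str(value).strip()
    if not text:
        raise ValueError("empty coordinate")
    try:
        return int(text)
    except ValueError:
        # Not an integer, so treat it as an ST_GeomGuideName token such as 'l'.
        return text
```

With this, the attributes from the file above (l="l", t="t", r="r", b="b") would be kept as strings, while values such as "21600" would still become integers.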

See also https://groups.google.com/forum/#!msg/openpyxl-users/rDzV9y-DOfE/iHRGKHuiAgAJ.

[1] Office Open XML File Formats — Fundamentals and Markup Language Reference, https://www.ecma-international.org/publications/files/ECMA-ST/ECMA-376,%20Fifth%20Edition,%20Part%201%20-%20Fundamentals%20And%20Markup%20Language%20Reference.zip

Comments (2)

  1. CharlieC

    In the LibreOffice file the a:rect element is simply missing, so maybe the fastest workaround would be to ignore the element.

    Regarding the ST_GeomGuideName: obviously the specification needs updating here. The best thing to do would be to ask MS what they think should be the valid characters.
