Namespace issue in core.xml

Issue #1257 resolved
Carsten Byrman
created an issue

Our Django application exports data to xlsx. When a user tries to open a generated file, Excel alerts:

We found a problem with some content in 'data.xlsx'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes.

Even empty workbooks result in an alert, so it's not the data. After some debugging, I found the problem to be in core.xml.

Workbooks with alert have this:

<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">openpyxl</dc:creator><dct:created xmlns:dct="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="dcterms:W3CDTF">2019-04-09T15:38:19Z</dct:created><dct:modified xmlns:dct="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="dcterms:W3CDTF">2019-04-09T15:38:19Z</dct:modified></cp:coreProperties>

Workbooks without alert have this:

<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">openpyxl</dc:creator><dcterms:created xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="dcterms:W3CDTF">2019-04-09T16:14:20Z</dcterms:created><dcterms:modified xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="dcterms:W3CDTF">2019-04-09T16:14:20Z</dcterms:modified></cp:coreProperties>

1) Note that the two differ only in the namespace prefix used for http://purl.org/dc/terms/: dct vs. dcterms.

2) Also note that "dcterms:W3CDTF" is a hard-coded string: https://bitbucket.org/openpyxl/openpyxl/src/3640394bff97564a07eb7ecb0cf68f57aaaeac67/openpyxl/packaging/core.py#lines-49

I am able to repair a bad workbook manually, by either changing "dct" into "dcterms" or by changing "dcterms:W3CDTF" into "dct:W3CDTF". Both changes make the alert go away.
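
Both repairs point the same way: the value of xsi:type is itself a prefixed name, so its prefix has to be declared in scope just like an element prefix. Below is a minimal sketch that checks for this using only the Python standard library. The names undeclared_type_prefixes and sample are hypothetical helpers written for this report, not part of openpyxl, and the sample document is a shortened version of the core.xml shown above.

```python
import io
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def undeclared_type_prefixes(xml_text):
    """Return xsi:type values whose prefix is not declared in scope."""
    in_scope = []   # prefixes declared by the currently open elements
    problems = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, value in ET.iterparse(source, ("start-ns", "start", "end-ns")):
        if event == "start-ns":
            in_scope.append(value[0])   # value is a (prefix, uri) pair
        elif event == "end-ns":
            in_scope.pop()              # a declaration went out of scope
        else:
            type_name = value.get(XSI_TYPE, "")
            prefix, sep, _ = type_name.partition(":")
            if sep and prefix not in in_scope:
                problems.append(type_name)
    return problems


def sample(prefix):
    """Shortened core.xml using the given prefix for the dcterms namespace."""
    return (
        '<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/'
        'package/2006/metadata/core-properties">'
        f'<{prefix}:created xmlns:{prefix}="http://purl.org/dc/terms/" '
        'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
        f'xsi:type="dcterms:W3CDTF">2019-04-09T15:38:19Z</{prefix}:created>'
        '</cp:coreProperties>'
    )


bad = undeclared_type_prefixes(sample("dct"))
good = undeclared_type_prefixes(sample("dcterms"))
```

With the dct prefix, the hard-coded "dcterms:W3CDTF" refers to a prefix that is never declared, so it is reported; with the dcterms prefix nothing is reported. Whether this is exactly what Excel objects to is my assumption, but it matches the two manual repairs.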

One strange thing is that creating an empty workbook interactively from the Django shell produces a good workbook. If the same is done from within our application, however, the namespace problem shows up. The same library versions are used in both cases.

Comments (8)

  1. CharlieC

    The prefix doesn't matter as long as it is declared in the scope. If there are differences then it's possible that some other code in the same process is setting the prefixes differently when registering the namespaces. But I'd be surprised if this is really the problem. Can't really say any more without a file.
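
    To illustrate the kind of interference meant here, the following is a minimal sketch using only the standard library's xml.etree.ElementTree, whose register_namespace() call changes a process-wide prefix registry. It is an assumption that the reporter's application hits this particular registry; the helper serialize_created is hypothetical and is not openpyxl code.

```python
import xml.etree.ElementTree as ET

DCTERMS_NS = "http://purl.org/dc/terms/"


def serialize_created():
    """Serialize a single element in the dcterms namespace."""
    elem = ET.Element(f"{{{DCTERMS_NS}}}created")
    elem.text = "2019-04-09T15:38:19Z"
    return ET.tostring(elem, encoding="unicode")


# One piece of code registers the prefix it expects.
ET.register_namespace("dcterms", DCTERMS_NS)
first = serialize_created()

# Unrelated code in the same process registers a different prefix for the
# same URI; the registry is global, so later serialization is affected.
ET.register_namespace("dct", DCTERMS_NS)
second = serialize_created()

# Restore the original mapping so the rest of the process is unaffected.
ET.register_namespace("dcterms", DCTERMS_NS)
```

    The element names in first use dcterms and those in second use dct, although the serializing code did not change. This would be harmless on its own, but a hard-coded string such as "dcterms:W3CDTF" in an attribute value is not rewritten to follow the new prefix.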
