openpyxl creates broken files if mimetypes.init() is called after openpyxl has been imported

Issue #1221 resolved
Hans-Jakob Holtz
created an issue

Whenever mimetypes.init() is called after openpyxl has been imported, every subsequent save call (save, save_workbook, save_virtual_workbook, etc.) produces a broken file.

Code to reproduce:

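from openpyxl import Workbook
import mimetypes

if __name__ == '__main__':
    # Saved before mimetypes.init() is called: this file opens normally.
    Workbook().save('good.xlsx')

    # Re-initialising the global mime database after the import triggers the bug.
    mimetypes.init()

    # Saved after mimetypes.init(): this file is broken.
    Workbook().save('bad.xlsx')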

The same symptom (a broken xlsx file with the invalid entry '<Default ContentType="text/xml" Extension="xml"/>' in [Content_Types].xml) was already described in #1077, but there it was dismissed as the result of a broken or unusual mime-types configuration. Since calling mimetypes.init() at a later point is explicitly allowed and is sufficient to reproduce the problem, I consider this a valid bug.
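To check a saved file for this symptom without opening it in a spreadsheet application, the manifest can be read straight from the archive. The following is a minimal sketch using only the standard library; the helper name default_content_types is made up for illustration and is not part of openpyxl.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by the [Content_Types].xml part of an OPC package.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"


def default_content_types(xlsx):
    """Return {extension: content type} for the <Default> entries of the manifest.

    `xlsx` may be a file name or a binary file-like object.
    (Hypothetical helper, not an openpyxl function.)
    """
    with zipfile.ZipFile(xlsx) as archive:
        root = ET.fromstring(archive.read("[Content_Types].xml"))
    return {
        node.get("Extension"): node.get("ContentType")
        for node in root.iter(CT_NS + "Default")
    }
```

With the two files from the reproduction script, default_content_types('bad.xlsx').get('xml') should report 'text/xml' if the file shows the symptom described here, while 'good.xlsx' should report a different type.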

As a fix, I propose using a private mime database in manifest.py so that openpyxl is independent of the global mime database (which may contain incompatible extension-to-type mappings).
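The fragility of the global database can be shown with the standard library alone. The sketch below assumes that a library registers its types globally with mimetypes.add_type() at import time; the function name and the content type and extension passed to it are invented for the demonstration.

```python
import mimetypes


def registration_survives_init(content_type, extension):
    """Register a type globally, re-run init(), and return the lookups before and after."""
    mimetypes.add_type(content_type, extension)
    before = mimetypes.guess_type("example" + extension)[0]

    # init() rebuilds the global database from the built-in defaults and the
    # system's mime files, so earlier add_type() registrations are discarded.
    mimetypes.init()
    after = mimetypes.guess_type("example" + extension)[0]
    return before, after


# Hypothetical type and extension, chosen so that no system file defines them.
before, after = registration_survives_init("application/x-hypothetical", ".hypoext")
```

Here before holds the registered type and after is None, because the registration does not survive the second init(). For an extension such as .xml, the lookup would instead fall back to whatever the defaults or system files map it to.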

I shall try to implement this proposed fix and send a pull request as soon as I have a working solution.

Comments (1)

  1. CharlieC

    manifest: use private mime type database

    This fixes #1221

    The private mime database starts out as a copy of the global mime database. In particular, it contains all well-known types like image/jpeg and image/png. We add/overwrite our own types into that database instead of the global one. This private database is immune to subsequent changes to the global mime database, such as mimetypes.init() or conflicting add_type(...) calls.

    → <<cset a3229c138081>>
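The approach described in this commit message can be sketched with mimetypes.MimeTypes, which provides a database instance separate from the module-level one. This is an illustration under assumptions, not the code from the changeset: build_private_db is a hypothetical helper, and the override of .xml to application/xml is assumed rather than taken from manifest.py.

```python
import mimetypes


def build_private_db(overrides):
    """Return a private MimeTypes instance with the given extension-to-type overrides.

    (Hypothetical helper; the actual changeset may be structured differently.)
    """
    db = mimetypes.MimeTypes()  # starts out with the standard well-known types
    for extension, content_type in overrides.items():
        db.add_type(content_type, extension)
    return db


# Assumed override for illustration.
private_db = build_private_db({".xml": "application/xml"})

# Later changes to the global database do not reach the private instance.
mimetypes.add_type("text/xml", ".xml")
mimetypes.init()

xml_type = private_db.guess_type("sheet1.xml")[0]
png_type = private_db.guess_type("image1.png")[0]
```

After the conflicting add_type() call and the extra init(), the private instance still resolves .xml to the override and still knows well-known types such as image/png.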
