DOCX rewritten as recoverable file

Issue #1335 resolved
YvesS created an issue

Some DOCX files that open OK in Word, and get processed without problem by the Okapi filter, have a problem when they are re-written: Word sees them as recoverable files and prompts you to let it recover them. If you answer Yes, the file is read fine. The attached test1.docx is such a file. If I process it with Rainbow (Text Rewriting) to change its text, we get back the attached test1.out.docx, which opens fine, but only after a prompt. This is with the latest 1.46-snapshot, but it is also true for every previous version I have tried. If I open test1.docx in Word and save it without any change, the resulting file processes fine and opens fine. A possible difference I can see is that the original test1.docx seems to have some Mac Word namespaces.

Comments (4)

  1. Denis Konovalyenko

    @YvesS thank you for documenting this! The issue can be reproduced with the translatePowerpointDocProperties.b=true parameter. After the merge, docProps/core.xml is corrupted. Original:

    <?xml version='1.0' encoding='UTF-8' standalone='yes'?>
    <cp:coreProperties
      xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
      xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/"
      xmlns:dcmitype="http://purl.org/dc/dcmitype/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <dc:title/>
      <dc:subject/>
      <dc:creator>python-docx</dc:creator>
      <cp:keywords/>
      <dc:description>generated by python-docx</dc:description>
      <cp:lastModifiedBy/>
      <cp:revision>1</cp:revision>
      <dcterms:created xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:created>
      <dcterms:modified xsi:type="dcterms:W3CDTF">2013-12-23T23:15:00Z</dcterms:modified>
      <cp:category/>
    </cp:coreProperties>
    

    Merged:

    <?xml version='1.0' encoding='UTF-8' standalone='yes'?>
    <cp:coreProperties
      xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
      xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/"
      xmlns:dcmitype="http://purl.org/dc/dcmitype/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <dc:title/>
      <dc:subject/>
      <dc:creator>ƥŷţĥōń-ďōćx</dc:creator>
      <cp:keywords/>
      <dc:description>ĝēńēŕàţēď ƀŷ ƥŷţĥōń-ďōćx</dc:description>
      <cp:revision>
        <dcterms:created xsi:type="dcterms:W3CDTF">
          <dcterms:modified xsi:type="dcterms:W3CDTF">
    
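The truncation shown above can be detected without opening Word by checking that every XML part in the package still parses. The following is only a minimal sketch, not part of Okapi: the helper names `find_malformed_parts` and `build_package` and the two sample strings are hypothetical, and the samples are shortened stand-ins for the real docProps/core.xml.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def find_malformed_parts(docx_bytes):
    """Return {part_name: parser message} for each XML part that fails to parse."""
    problems = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        for name in package.namelist():
            # A DOCX is a ZIP package; its XML parts end in .xml or .rels.
            if not name.endswith((".xml", ".rels")):
                continue
            try:
                ET.fromstring(package.read(name))
            except ET.ParseError as error:
                problems[name] = str(error)
    return problems


def build_package(parts):
    """Build an in-memory ZIP from {part_name: text}; a stand-in for a real .docx."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, text in parts.items():
            package.writestr(name, text)
    return buffer.getvalue()


# Shortened, hypothetical samples modelled on the two listings above.
HEADER = (
    "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>\n"
    '<cp:coreProperties '
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
)
GOOD_CORE = HEADER + "<dc:title/><cp:revision>1</cp:revision></cp:coreProperties>"
# Same part, cut off after an unclosed element, as in the merged output.
TRUNCATED_CORE = HEADER + "<dc:title/><cp:revision>"

good_report = find_malformed_parts(build_package({"docProps/core.xml": GOOD_CORE}))
bad_report = find_malformed_parts(build_package({"docProps/core.xml": TRUNCATED_CORE}))
```

For a file on disk, the same check would be `find_malformed_parts(open("test1.out.docx", "rb").read())`. A parse check only catches parts that are not well-formed; it says nothing about whether a well-formed part is valid against the schema Word expects.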
