OpenXML: DOCX corruption in files with deletion change tracking

Issue #473 resolved
Chase Tingley created an issue

At least, I think the cause is the deletion change-tracking marker. Round-tripping the attached file through the openxml filter truncates word/document.xml after the paragraph containing the timecode.
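One way to confirm the symptom without opening Word is to check whether word/document.xml still parses as XML after the round trip. This is a minimal sketch in Python using only the standard library. The function name `document_xml_is_well_formed` is made up for illustration and is not part of the filter.

```python
import zipfile
import xml.etree.ElementTree as ET


def document_xml_is_well_formed(docx):
    """Return True if word/document.xml inside the DOCX parses as XML.

    `docx` may be a filesystem path or a binary file-like object.
    A truncated part fails to parse, so this catches the symptom
    described above, though not every kind of corruption.
    """
    with zipfile.ZipFile(docx) as archive:
        data = archive.read("word/document.xml")
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True
```

Running this on the file before and after the round trip should show which step introduces the truncation.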

Comments (1)

  1. Chase Tingley reporter

    Fix Issue 458, Fix Issue 467, and Fix Issue 473 in the openxml filter

    This rewrites OpenXMLContentFilter.combineRepeatedFormat() and
    splits the markup simplification logic out into a new class called
    ParagraphSimplifier.  This fixes many issues in the old code with
    multiple <t> elements in a single run, as well as issues with tabs
    and line breaks that were being lost when interspersed with text.
    This covers Issue 458 and re-fixes Issue 467 in a better way.
    
    Additional fixes were to Issue 473 and an unfiled problem with
    entities in deleted text that weren't being re-escaped in target
    output.
    
    This has caused some changes to placeholder creation in segments.
    

    → <<cset 2445b887857a>>
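The entity problem mentioned in the commit message can be illustrated with a small sketch. It assumes deleted text is written back inside a `w:delText` element and shows the re-escaping step that the target output needs. It is written in Python for brevity and is not the filter's actual Java code. The helper name `del_text_element` is hypothetical.

```python
from xml.sax.saxutils import escape


def del_text_element(text):
    """Serialize deleted run text as a w:delText element.

    The text is assumed to be already unescaped (as a filter would
    hold it after parsing), so &, < and > must be escaped again
    before writing; otherwise the output is no longer valid XML.
    """
    return '<w:delText xml:space="preserve">%s</w:delText>' % escape(text)


print(del_text_element("AT&T <draft>"))
# prints <w:delText xml:space="preserve">AT&amp;T &lt;draft&gt;</w:delText>
```

Writing the raw string instead of the escaped one would put a bare `&` into word/document.xml, which an XML parser rejects.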
