OpenXML: corrupt formatting of consecutive runs where the second uses default formatting

Issue #614 resolved
Jan Pascal Maas created an issue

We discovered a bug that corrupts formatting in a merged document if adjacent text passages in the original document are formatted differently. More precisely, it occurs if the first of two consecutive runs has explicit formatting (for example, a specific font) and the second run uses the default formatting of the document. An example can be found in the attachments.

Further investigation led to the following observation in the document.xml part of the document:

In the original document, the following XML snippet can be found:

<w:p w:rsidR="00623D7E" w:rsidRPr="007E1190" w:rsidRDefault="007E1190">
    <w:pPr>
        <w:rPr>
            <w:lang w:val="en-US"/>
        </w:rPr>
    </w:pPr>
    <w:r w:rsidRPr="007E1190">
        <w:rPr>
            <w:rFonts w:ascii="MS Gothic" w:eastAsia="MS Gothic" w:hAnsi="MS Gothic"/>
            <w:lang w:val="en-US"/>
        </w:rPr>
        <w:t>Hello, I’m formatted.</w:t>
    </w:r>
    <w:r w:rsidRPr="007E1190">
        <w:rPr>
            <w:lang w:val="en-US"/>
        </w:rPr>
        <w:t xml:space="preserve"> Hello, I’m not.</w:t>
    </w:r>
    <w:bookmarkStart w:id="0" w:name="_GoBack"/>
    <w:bookmarkEnd w:id="0"/>
</w:p>

As observed, the first run has properties that describe the font (w:rFonts), while the second does not. In the merged document, the above snippet is converted to the following:

<w:p>
    <w:r>
        <w:rPr>
            <w:rFonts w:ascii="MS Gothic" w:eastAsia="MS Gothic" w:hAnsi="MS Gothic"/>
        </w:rPr>
        <w:t xml:space="preserve">Hello, I’m formatted. Hello, I’m not.</w:t>
    </w:r>
</w:p>

As noted before, the two runs are collapsed into one, so the font of the first run is also applied to the text of the second run, which is obviously not intended.
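To illustrate the behaviour we would expect instead, here is a minimal sketch in Python (not the framework's actual code, and not the planned fix). It assumes that two adjacent runs may only be merged when their w:rPr children are identical. The helper names rpr_key and merge_adjacent_runs are hypothetical, and nested property elements are not compared, so treat it as an illustration only.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace and the xml:space attribute name.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def rpr_key(run):
    """Summarise the direct children of a run's <w:rPr> as a comparable tuple.

    Hypothetical helper: only tag names and attributes are compared.
    """
    rpr = run.find(f"{{{W}}}rPr")
    if rpr is None:
        return ()
    return tuple(sorted(
        (child.tag, tuple(sorted(child.attrib.items()))) for child in rpr
    ))


def merge_adjacent_runs(paragraph):
    """Merge neighbouring <w:r> elements only if their run properties match."""
    prev = None
    for child in list(paragraph):
        if child.tag != f"{{{W}}}r":
            # Bookmarks and other non-run elements break adjacency.
            prev = None
            continue
        if prev is not None and rpr_key(prev) == rpr_key(child):
            prev_t = prev.find(f"{{{W}}}t")
            cur_t = child.find(f"{{{W}}}t")
            if prev_t is not None and cur_t is not None:
                prev_t.text = (prev_t.text or "") + (cur_t.text or "")
                # Keep leading/trailing spaces of the joined text significant.
                prev_t.set(XML_SPACE, "preserve")
                paragraph.remove(child)
                continue
        prev = child
    return paragraph
```

With this rule, the two runs from the snippet above stay separate, because only the first one carries w:rFonts, while two consecutive default-formatted runs would still be joined.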

We started investigating this with a very specific case and then discovered that it applies in general, so we want to inform you about this bug. We are currently working on a fix, which will be submitted as a PR as soon as we have finished fixing and testing it. It will involve changes to the provided test data, since that data suffers from the same issue.

Comments (4)

  1. Chase Tingley

    Thanks! When you submit the PR, please include @DenisKonovalyenko as a reviewer; he knows the code best. (Also include me.)
