OpenXML Filter: always expose one of the runs with implicit formatting

Issue #887 resolved
Denis Konovalyenko created an issue

Consider an example in which a paragraph contains runs with different formatting, none of which can be refactored into the paragraph formatting.

Below is a corresponding document structure:

<w:p>
    <w:pPr>
        <w:rPr>
            <w:sz w:val="28"/>
        </w:rPr>
    </w:pPr>
    <w:r>
        <w:rPr>
            <w:sz w:val="26"/>
        </w:rPr>
        <w:t xml:space="preserve">Run 13pt.</w:t>
    </w:r>
    <w:r>
        <w:rPr>
            <w:sz w:val="24"/>
        </w:rPr>
        <w:t xml:space="preserve">Run 12pt.</w:t>
    </w:r>
    <w:r>
        <w:rPr>
            <w:sz w:val="28"/>
        </w:rPr>
        <w:t xml:space="preserve">Run 14pt.</w:t>
    </w:r>
</w:p>
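
For reference, `w:sz` values are expressed in half-points, so 26, 24 and 28 correspond to the 13pt, 12pt and 14pt named in the run texts. Below is a minimal Python sketch that reads the structure above and lists each run's size. The `xmlns:w` declaration is added here only so that the fragment parses on its own, and `run_sizes` is a hypothetical helper, not part of the filter.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w" prefix.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# The paragraph from the issue, with the namespace declared (an addition
# made for this sketch so the fragment is well-formed by itself).
PARAGRAPH = f"""
<w:p xmlns:w="{W}">
  <w:pPr><w:rPr><w:sz w:val="28"/></w:rPr></w:pPr>
  <w:r><w:rPr><w:sz w:val="26"/></w:rPr><w:t xml:space="preserve">Run 13pt.</w:t></w:r>
  <w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:t xml:space="preserve">Run 12pt.</w:t></w:r>
  <w:r><w:rPr><w:sz w:val="28"/></w:rPr><w:t xml:space="preserve">Run 14pt.</w:t></w:r>
</w:p>
"""


def run_sizes(paragraph_xml):
    """Return (text, size in points) for each run that has an explicit w:sz."""
    root = ET.fromstring(paragraph_xml.strip())
    result = []
    for run in root.findall("w:r", NS):
        sz = run.find("w:rPr/w:sz", NS)
        if sz is None:
            continue  # run without an explicit size: skipped in this sketch
        text = run.findtext("w:t", default="", namespaces=NS)
        half_points = int(sz.get(f"{{{W}}}val"))
        result.append((text, half_points / 2))
    return result


for text, points in run_sizes(PARAGRAPH):
    print(text, points)
```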

The current extraction is:

<g id="1">Run 13pt.</g><g id="2">Run 12pt.</g><g id="3">Run 14pt.</g>

This segment contains positions that lie outside any formatting, i.e. before the first <g>, between a </g> and the next <g>, and after the final </g>. The styling at those positions is not defined, which makes such segments hard for translators to work with.

A better segment would be one in which the formatting of one of the runs is assumed implicitly and only the other runs are marked with tags. For instance:

Run 13pt.<g id="1">Run 12pt.</g><g id="2">Run 14pt.</g>

In this case, the styling of every position in the segment is known. Either the position is inside a tag pair, in which case it has the styling of that run, or it is outside a tag pair, in which case it has the styling of the "implicit" tag pair.
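
The proposal can be sketched in Python as follows. This is only an illustration, not the filter's actual code: `render_segment` and its `implicit` parameter are hypothetical names, and choosing the first run as the implicit one is an assumption made to reproduce the example above.

```python
def render_segment(run_texts, implicit=0):
    """Wrap every run except the implicit one in a numbered <g> pair.

    `implicit` is the index of the run whose formatting is assumed for all
    text outside tag pairs; pass None to wrap every run (current behaviour).
    """
    parts = []
    next_id = 1
    for index, text in enumerate(run_texts):
        if index == implicit:
            # Implicit run: emitted as bare text, no tag pair.
            parts.append(text)
        else:
            parts.append(f'<g id="{next_id}">{text}</g>')
            next_id += 1
    return "".join(parts)


runs = ["Run 13pt.", "Run 12pt.", "Run 14pt."]
print(render_segment(runs))
# Run 13pt.<g id="1">Run 12pt.</g><g id="2">Run 14pt.</g>
```

Calling `render_segment(runs, implicit=None)` wraps all three runs and so reproduces the current extraction. How the implicit run should be selected in practice is left open here.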

The example document is attached.
