OpenXML: Consecutive runs containing tabs can be incorrectly merged
Issue #467
resolved
The attached file contains some text with two embedded tabs, divided across multiple runs. The pattern looks like this:
<w:r>
<w:tab/>
<w:t>-</w:t>
</w:r>
<w:r>
<w:tab/>
<w:t>R</w:t>
</w:r>
Because the runs have the same properties, they are merged, but the tabs are ignored, so you end up with a single run like this:
<w:r>
<w:tab/>
<w:t>-R</w:t>
</w:r>
As a result, one of the two tabs is lost during translation.
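The loss can be reproduced outside the filter with a small sketch. The `naive_merge` function below is a hypothetical stand-in for the faulty behaviour described above, not the actual Okapi code: it concatenates the `<w:t>` text of consecutive runs and discards everything else in the later runs. The paragraph is a minimal example built from the pattern in this report, not the attached file.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Minimal paragraph following the pattern in the report (not the attached file)
PARAGRAPH = f"""
<w:p xmlns:w="{W}">
  <w:r><w:tab/><w:t>-</w:t></w:r>
  <w:r><w:tab/><w:t>R</w:t></w:r>
</w:p>
""".strip()


def tag(name):
    """Return the namespace-qualified tag name used by ElementTree."""
    return f"{{{W}}}{name}"


def naive_merge(paragraph):
    """Hypothetical faulty merge: keeps only the <w:t> text of later runs."""
    runs = paragraph.findall(tag("r"))
    first = runs[0]
    for later in runs[1:]:
        first.find(tag("t")).text += later.find(tag("t")).text
        paragraph.remove(later)  # the later run's <w:tab/> goes with it
    return paragraph


def count_tabs(paragraph):
    """Count <w:tab/> elements anywhere inside the paragraph."""
    return sum(1 for _ in paragraph.iter(tag("tab")))


paragraph = ET.fromstring(PARAGRAPH)
tabs_before = count_tabs(paragraph)
merged = naive_merge(paragraph)
tabs_after = count_tabs(merged)
merged_text = merged.find(f"{tag('r')}/{tag('t')}").text

print(tabs_before, tabs_after, merged_text)
```

Comparing the tab count before and after the merge is a cheap regression check for this kind of bug: the text comes out as expected, so only the element count reveals the problem.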
Comments (3)
-
reporter changed status to resolved
-
reporter Fix Issue 458, Fix Issue 467, and Fix Issue 473 in the openxml filter
This rewrites OpenXMLContentFilter.combineRepeatedFormat() and moves the markup simplification into a new class, ParagraphSimplifier. The rewrite fixes many problems in the old code with multiple <t> elements in a single run, as well as tabs and line breaks that were lost when interspersed with text. This covers Issue 458 and re-fixes Issue 467 in a better way. It also fixes Issue 473 and an unfiled problem in which entities in deleted text were not re-escaped in the target output. As a side effect, placeholder creation in segments has changed in some cases.
→ <<cset 2445b887857a>>
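One way to merge runs without losing tabs or line breaks is to treat each run as an ordered list of children and append them all to the preceding run. The sketch below illustrates that idea only. It is not the ParagraphSimplifier implementation, and `merge_runs` and `run_properties` are hypothetical names. It assumes two runs are mergeable when the direct children of their `<w:rPr>` elements have the same tags and attributes.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def tag(name):
    """Return the namespace-qualified tag name used by ElementTree."""
    return f"{{{W}}}{name}"


def run_properties(run):
    """Comparable summary of a run's <w:rPr> (simplified: direct children only)."""
    rpr = run.find(tag("rPr"))
    if rpr is None:
        return ()
    return tuple((prop.tag, tuple(sorted(prop.attrib.items()))) for prop in rpr)


def merge_runs(paragraph):
    """Merge consecutive runs with equal properties, keeping child order."""
    previous = None
    for child in list(paragraph):
        if child.tag != tag("r"):
            previous = None  # a non-run element ends the mergeable sequence
            continue
        if previous is not None and run_properties(previous) == run_properties(child):
            for item in list(child):
                if item.tag != tag("rPr"):
                    previous.append(item)  # tabs, breaks and text keep their order
            paragraph.remove(child)
        else:
            previous = child
    return paragraph


paragraph = ET.fromstring(
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:tab/><w:t>-</w:t></w:r>"
    "<w:r><w:tab/><w:t>R</w:t></w:r>"
    "</w:p>"
)
merged = merge_runs(paragraph)
runs = merged.findall(tag("r"))
children = [c.tag.split("}")[1] for c in runs[0]]
text = "".join(t.text for t in runs[0].iter(tag("t")))

print(len(runs), children, text)
```

The merged run here holds two `<w:t>` elements with a tab between them, so both tabs survive. Collapsing adjacent `<w:t>` elements into one would be a separate step that only applies when nothing sits between them.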
Fix issue #467 - openxml filter could lose some tabs during extraction
→ <<cset acde79aeea03>>