OpenXML: Strip <w:lang> run properties during merging
The attached DOCX file shows extracted text interspersed with inline tags because some runs carry a <w:lang> run property. From the ECMA Office OpenXML Reference:
17.3.2.20 lang (Languages for Run Content) This element specifies the languages which shall be used to check spelling and grammar (if requested) when processing the contents of this run. If this element is not present, the default value is to leave the formatting applied at previous level in the style hierarchy. If this element is never applied in the style hierarchy, then the languages for the contents of this run shall be automatically determined based on their contents using any method desired.
This element can be safely discarded when extracting for translation, as it is only relevant to the source content.
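To illustrate the request, here is a minimal Python sketch of stripping <w:lang> before merging adjacent runs. It is not the code from the eventual changeset; the helper names (strip_lang, rpr_key, text_node, merge_runs) and the sample paragraph are made up for illustration. It assumes run properties are flat elements and only merges runs that hold a single <w:t>.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def strip_lang(run):
    """Remove <w:lang> from the run's <w:rPr>; drop the <w:rPr> if it ends up empty."""
    rpr = run.find(W + "rPr")
    if rpr is None:
        return
    for lang in rpr.findall(W + "lang"):
        rpr.remove(lang)
    if len(rpr) == 0:
        run.remove(rpr)


def rpr_key(run):
    """Comparable summary of a run's properties (assumes flat property elements)."""
    rpr = run.find(W + "rPr")
    if rpr is None:
        return ()
    return tuple(sorted((c.tag, tuple(sorted(c.attrib.items()))) for c in rpr))


def text_node(run):
    """Return the run's single <w:t>, or None if the run holds anything else."""
    content = [c for c in run if c.tag != W + "rPr"]
    if len(content) == 1 and content[0].tag == W + "t":
        return content[0]
    return None


def merge_runs(paragraph):
    """Strip <w:lang>, then merge adjacent text-only runs with equal properties."""
    previous = None  # last run that is still a merge candidate
    for run in list(paragraph):
        if run.tag != W + "r":
            previous = None  # any non-run element breaks the sequence
            continue
        strip_lang(run)
        if (previous is not None
                and text_node(previous) is not None
                and text_node(run) is not None
                and rpr_key(previous) == rpr_key(run)):
            target = text_node(previous)
            target.text = (target.text or "") + (text_node(run).text or "")
            target.set(XML_SPACE, "preserve")  # keep inner whitespace intact
            paragraph.remove(run)
        else:
            previous = run


# Hypothetical sample: two bold runs that differ only in <w:lang>.
SAMPLE = (
    '<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:r><w:rPr><w:b/><w:lang w:val="en-US"/></w:rPr>'
    '<w:t xml:space="preserve">Hello </w:t></w:r>'
    '<w:r><w:rPr><w:b/><w:lang w:val="fr-FR"/></w:rPr><w:t>world</w:t></w:r>'
    '</w:p>'
)

paragraph = ET.fromstring(SAMPLE)
merge_runs(paragraph)
```

Without the strip step, the two runs in the sample would compare as different and stay separate, which is what surfaces as an inline tag in the middle of the segment.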
Comments (2)
reporter - changed status to resolved
Fix Issue 482 - Strip <w:lang> during run consolidation
This produces cleaner segments in some cases.
→ <<cset ed40a2db503e>>