OpenXML: Inline markups for spell and grammar checking in xlf document
Original issue 440 created by s.kar...@24technology.de on 2015-02-02T08:40:53.000Z:
What steps will reproduce the problem?
1. Take a docx/pptx document with at least one word marked as a spelling/grammar error, e.g. "Sentence with an Eror."
2. Convert this document to xlf
3. The converted xlf document has additional g- and/or x-markups
What is the expected output? What do you see instead?
Expected output: No markup around the misspelled word.
Current output: Markup around the misspelled word.
converted docx:
<source xml:lang="en-us"><x id="1"/><g id="2">Sentence with an </g><x id="3"/><g id="4">Eror</g><x id="5"/><g id="6">.</g><x id="7"/></source>
converted pptx:
<source xml:lang="de-de"><g id="1">Sentence with an </g><g id="2">Eror</g><g id="3">.</g></source>
What version of the product are you using? On what operating system?
Okapi version: M27 (January 25 2014)
Operating System: Windows 7 Enterprise
Please provide any additional information below.
The XML document inside the docx contains this sentence as follows:
Sentence with an </w:t></w:r><w:proofErr w:type="spellStart"/><w:r><w:t>Eror</w:t></w:r><w:proofErr w:type="spellEnd"/><w:r><w:t xml:space="preserve">.
Comments (7)
Account Deleted - attached spelling.docx
Comment 2. originally posted by @ysavourel on 2015-03-26T21:23:38.000Z:
Sample file with a spelling error or two, but I don't think there's grammatical error markup in there.
-
- changed title to OpenXML: Inline markups for spell and grammar checking in xlf document
-
assigned issue to
- edited description
-
- attached spelling.docx
-
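The markup in question looks like this: <w:proofErr w:type="spellStart"/>
This element anchors the start and the end of both spelling and grammar errors, distinguished by its w:type value (spellStart, spellEnd, gramStart, gramEnd). Because the markers sit between runs, they keep otherwise adjacent runs from being merged. The fix is to strip them before merging runs.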
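To illustrate the idea, here is a minimal Python sketch of "strip, then merge" on a single paragraph. It is not the Okapi code (Okapi is written in Java, and the actual change is in the changeset referenced in this issue). The helper names strip_proof_errors, merge_plain_runs and clean_paragraph are hypothetical, the sample paragraph is reconstructed from the snippet in the description, and only runs that are direct children of the paragraph and hold a single w:t with no run properties are merged.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace and the predefined xml: namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def strip_proof_errors(paragraph):
    """Remove every w:proofErr that is a direct child of the paragraph."""
    for marker in paragraph.findall(f"{{{W}}}proofErr"):
        paragraph.remove(marker)


def merge_plain_runs(paragraph):
    """Merge neighbouring w:r children that hold only one w:t (no w:rPr)."""
    previous = None
    for child in list(paragraph):  # iterate over a copy, we remove children
        mergeable = (
            child.tag == f"{{{W}}}r"
            and len(child) == 1
            and child[0].tag == f"{{{W}}}t"
        )
        if mergeable and previous is not None:
            target = previous[0]
            target.text = (target.text or "") + (child[0].text or "")
            # Keep whitespace significant in the merged text.
            target.set(XML_SPACE, "preserve")
            paragraph.remove(child)
        elif mergeable:
            previous = child
        else:
            # Any other element (e.g. a leftover marker) breaks the sequence.
            previous = None


def clean_paragraph(xml):
    """Hypothetical helper: parse a w:p, strip markers, then merge runs."""
    paragraph = ET.fromstring(xml)
    strip_proof_errors(paragraph)
    merge_plain_runs(paragraph)
    return paragraph


# Sample paragraph modelled on the snippet in the issue description.
SAMPLE = (
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:t xml:space="preserve">Sentence with an </w:t></w:r>'
    '<w:proofErr w:type="spellStart"/>'
    '<w:r><w:t>Eror</w:t></w:r>'
    '<w:proofErr w:type="spellEnd"/>'
    '<w:r><w:t>.</w:t></w:r>'
    '</w:p>'
)

if __name__ == "__main__":
    cleaned = clean_paragraph(SAMPLE)
    print(ET.tostring(cleaned, encoding="unicode"))
```

If merge_plain_runs is called without stripping first, the w:proofErr elements reset the merge sequence and the three runs stay separate, which corresponds to the extra inline codes reported above.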
-
- marked as major
-
- changed status to resolved
Fix issue #440 - Strip computed spelling and grammar markup
This streamlines trans-units by dropping the <w:proofErr> tags from DOCX files before merging text runs.
→ <<cset 5575305e30bb>>
Comment 1. originally posted by @ysavourel on 2015-02-21T23:57:58.000Z: