DOCX/OpenXML - tag corruption in document.xml
Original issue 379 created by bailo... on 2013-11-22T10:12:26.000Z:
I'm a user of Okapi and I had trouble opening a document in Microsoft Office 2007: File.docx. The error message is: /Word / document.xml line 6294 column 6293. The problem doesn't occur in OpenOffice, which opens file.docx without any error message.
tikal.sh -lm file.docx -totrg -from aftertest
Comments (4)
Comment 2. originally posted by @ysavourel on 2014-01-29T18:44:04.000Z:
The document.xml file in the attached file.out.docx has been corrupted.
At the offset (line 2, column 6294) there is some very weird tag structure in which a run (<w:r>) has been embedded directly within the <w:t> of another run. It looks like this:
<w:p>
<w:pPr>
<!-- snipped for space -->
</w:pPr>
<w:r>
<w:rPr>
<w:rStyle w:val="CharAttribute0"/>
<w:rFonts w:eastAsia="Batang"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
<w:t xml:space="preserve">
<w:r> <!-- What? A run opens directly inside the w:t above -->
<w:rPr>
<w:rStyle w:val="CharAttribute0"/>
<w:rFonts w:eastAsia="Batang"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
<w:u w:val="single"/>
</w:rPr>
<w:t> légende</w:t>
</w:r>
</w:t> <!-- What? This closes the w:t that wraps the nested run -->
</w:r>
</w:p>
This isn't invalid XML, but I'm pretty sure it's illegal in OpenXML. (I'd need to check.)
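For anyone who wants to look for this pattern in another file, here is a minimal sketch in Python, assuming the content of document.xml is well-formed enough to parse. It lists any child elements found inside a <w:t>. The function name and the sample string are made up for illustration and are not part of Okapi.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the w: prefix in WordprocessingML documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_text_elements_with_children(xml_text):
    """Return the local names of elements found directly inside any <w:t>."""
    root = ET.fromstring(xml_text)
    offenders = []
    for text_element in root.iter(f"{{{W_NS}}}t"):
        for child in text_element:
            # Strip the "{namespace}" part so only the local name remains.
            offenders.append(child.tag.split("}")[-1])
    return offenders


# Hypothetical sample reduced from the structure quoted above.
SAMPLE = (
    f'<w:p xmlns:w="{W_NS}">'
    '<w:r><w:t xml:space="preserve">'
    '<w:r><w:t> légende</w:t></w:r>'
    '</w:t></w:r>'
    '</w:p>'
)

print(find_text_elements_with_children(SAMPLE))
```

An empty list would mean no <w:t> in the input contains child elements. This only checks that one pattern and is not a full schema validation.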
Also, it looks like document.xml has some tag mismatches further on (line 2, column 42916), based on trying to open it in an XML editor.
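A tag mismatch like this can also be located without an XML editor. The sketch below assumes the .docx is a regular zip archive with the main part at word/document.xml, and it reports where Python's parser first fails. The helper names are hypothetical, and the column the parser reports may differ by one from what Word or an editor shows.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_document_xml(docx_file):
    """Return None if word/document.xml parses, else (line, column) of the first error."""
    with zipfile.ZipFile(docx_file) as archive:
        data = archive.read("word/document.xml")
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple from the parser.
        return err.position
    return None


def make_docx(document_xml):
    """Build an in-memory archive holding only word/document.xml (for demonstration)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document_xml)
    buffer.seek(0)
    return buffer


# A deliberately mismatched tag pair stands in for the corrupted output.
print(check_document_xml(make_docx("<a><b></a>")))
```

On a real file, pass the path to the .docx (for example the output file from Tikal) instead of the in-memory archive.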
- changed status to resolved
This is fixed in the latest M28 snapshot.
Comment 1. originally posted by @ysavourel on 2014-01-29T18:29:38.000Z: