XLIFF2 Filter: crashes parsing any file containing a group
Pretty bad bug: we crash extracting any XLIFF2 file that contains a <group> element, which is most files.
I can reproduce this with the test01.xlf file, which is part of the unit test data for the filter (!) and which I've also attached. (The unit test that uses it just verifies that a particular TU is extracted, but does not complete parsing -- since the bug occurs on END_GROUP events, the test doesn't hit it.)
The issue is that the filter is trying to call into xliff-toolkit serialization methods to write out certain elements as skeleton, but those methods depend on internal state (e.g. the group stack) which isn't being tracked properly when called from the filter. xliff-toolkit's XLIFFReader class has no problem handling this file.
Stack trace looks like this:
java.util.EmptyStackException
at java.util.Stack.peek(Stack.java:102)
at java.util.Stack.pop(Stack.java:84)
at net.sf.okapi.lib.xliff2.writer.XLIFFWriter.writeEndGroup(XLIFFWriter.java:1058)
at net.sf.okapi.filters.xliff2.XLIFF2Filter.convEndGroup(XLIFF2Filter.java:338)
at net.sf.okapi.filters.xliff2.XLIFF2Filter.readNext(XLIFF2Filter.java:249)
at net.sf.okapi.filters.xliff2.XLIFF2Filter.next(XLIFF2Filter.java:185)
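To illustrate the failure mode in the trace, here is a minimal sketch using a toy writer. ToyWriter and its methods are hypothetical stand-ins, not the real xliff-toolkit XLIFFWriter; the only assumption is that the end-group call pops a stack that the matching start-group call was supposed to fill.

```java
import java.util.EmptyStackException;
import java.util.Stack;

public class GroupStackSketch {
    // Hypothetical stand-in for a stateful writer; NOT the real XLIFFWriter.
    static class ToyWriter {
        private final Stack<String> groups = new Stack<>();
        private final StringBuilder out = new StringBuilder();

        void writeStartGroup(String id) {
            groups.push(id); // record the open group
            out.append("<group id=\"").append(id).append("\">");
        }

        void writeEndGroup() {
            groups.pop(); // throws EmptyStackException if no start was recorded
            out.append("</group>");
        }

        String output() {
            return out.toString();
        }
    }

    // Calls only the end method, the way a caller that bypasses the
    // writer's own state tracking would.
    static boolean endWithoutStartThrows() {
        ToyWriter w = new ToyWriter();
        try {
            w.writeEndGroup();
            return false;
        } catch (EmptyStackException e) {
            return true;
        }
    }

    // Calls start and end in matching pairs, so the stack stays balanced.
    static String balanced() {
        ToyWriter w = new ToyWriter();
        w.writeStartGroup("g1");
        w.writeEndGroup();
        return w.output();
    }

    public static void main(String[] args) {
        System.out.println(endWithoutStartThrows());
        System.out.println(balanced());
    }
}
```

In this sketch the crash goes away as soon as every end call is preceded by the matching start call on the same writer instance. Whether the real fix takes that shape is not something this sketch claims.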
Comments (8)
-
-
reporter I have a kind of clunky fix that seems to work; however I'm also seeing a crash when I try to write this to XLIFF 1.2 using tikal that may indicate that my fix is bad. I'll post an update in a bit.
-
reporter The tikal crash (which can be reproduced in test by collecting XLIFF2Filter events and then dumping them into an XLIFFWriter) is due to the way the XLIFF2 filter maps group elements to subdocument events. The </group> is therefore handled as end_subdocument by the XLIFFWriter, which closes both the </group> and the current </file>. This causes the element stack to eventually bottom out when we go to end the file.
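A rough model of that second crash, assuming a toy writer that keeps a stack of open elements. The Event enum and ToyWriter below are hypothetical and only mirror the behaviour described in this comment; they are not Okapi's actual event types or its XLIFFWriter.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;

public class SubDocMappingSketch {
    // Hypothetical event names, loosely modelled on the description above.
    enum Event { START_FILE, START_SUBDOCUMENT, END_SUBDOCUMENT, END_FILE }

    // Hypothetical writer that tracks open elements on a stack.
    static class ToyWriter {
        private final Deque<String> open = new ArrayDeque<>();

        void handle(Event e) {
            switch (e) {
                case START_FILE:
                    open.push("file");
                    break;
                case START_SUBDOCUMENT:
                    open.push("group"); // a <group> arrives as a subdocument
                    break;
                case END_SUBDOCUMENT:
                    open.pop(); // closes </group>
                    open.pop(); // also closes the current </file>
                    break;
                case END_FILE:
                    open.pop(); // fails if the file was already closed
                    break;
            }
        }
    }

    // Replays the events and reports whether the element stack bottomed out.
    static boolean replayFails(Event... events) {
        ToyWriter w = new ToyWriter();
        try {
            for (Event e : events) {
                w.handle(e);
            }
            return false;
        } catch (NoSuchElementException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(replayFails(
                Event.START_FILE, Event.START_SUBDOCUMENT,
                Event.END_SUBDOCUMENT, Event.END_FILE));
    }
}
```

A file with no group (START_FILE followed directly by END_FILE) replays cleanly in this model, which matches the observation that only files containing groups hit the problem.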
-
reporter - changed status to resolved
Fix #697 - Handle XLIFF2 files that contain group data → <<cset 9391ca5d2ebd>>
-
reporter I think I've got the fixes here (if you can review the PR this weekend, that would be great). I was initially nervous about the way I fixed the first issue, as there was a very old line of similar code that was commented out for some reason. But I can't get it to break anything else.
-
Thanks a lot for doing the fixes. I'll review before Monday.
-
Merged in issue697 (pull request #220)
Fix #697 - Handle XLIFF2 files that contain group data
Approved-by: Yves Savourel yves@opentag.com
→ <<cset 378ea2e62e4c>>
-
reporter - changed milestone to M36