- attached fail.xlsx
Straight quote in Excel file crashes OmegaT project with java.lang.ClassCastException
Steps to reproduce
Translate an Excel file that contains the text “How well do you know this person's performance?” (for example).
Expected results
The filter extracts the text and project loads normally.
Actual results
The project crashes with the error “java.lang.ClassCastException: com.sun.xml.internal.stream.events.CharacterEvent cannot be cast to javax.xml.stream.events.EndElement”
See screenshot: https://imgur.com/2eAmPuA.png
Further info
The problem seems to be the straight quote. I get the expected results if I replace it with a curly apostrophe: “How well do you know this person’s performance?”
Files
I am attaching an OmegaT project that includes two files, one with a straight quote and one with a curly quote (apostrophe).
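To compare how the two attached files actually store the apostrophe, it can help to look at the raw XML inside each workbook. The following is a minimal sketch, assuming the cell text lives in the usual `xl/sharedStrings.xml` part (inline strings would be stored elsewhere); the class and method names are hypothetical and not part of OmegaT or Okapi.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: an .xlsx file is a ZIP archive, so its parts can be
// read with java.util.zip without any spreadsheet library.
class XlsxEntrySketch {

    // Returns the named entry of the archive as a UTF-8 string.
    static String readEntry(File archive, String entryName) {
        try (ZipFile zip = new ZipFile(archive)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IllegalArgumentException(
                        "No entry " + entryName + " in " + archive);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                // Manual copy loop keeps the sketch compatible with Java 8.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: args[0] is the path to one of the attached .xlsx files.
        System.out.println(readEntry(new File(args[0]), "xl/sharedStrings.xml"));
    }
}
```

Running this once per file and diffing the output should show whether the straight quote is stored literally or as an entity, which may narrow down where the parser behaves differently.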
Comments (6)
- attached okf_openxmlnoauthor.fprm
-
I attached the fail.xlsx file and the OmegaT fprm directly for convenience.

However, when I try to extract through just Okapi, I don’t see a crash (although I do see a couple of warnings that seem unrelated – they are present in both the good and bad versions of the file, and are related to the structure of the docx archive):
    $ tikal.sh -fc ../omegat/okf_openxml@noauthor.fprm -x fail.xlsx
    Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
    -------------------------------------------------------------------------------
    Okapi Tikal - Localization Toolset
    Version: 2.1.44.0-SNAPSHOT
    -------------------------------------------------------------------------------
    Extraction
    Source language: en
    Target language: fr
    Default input encoding: UTF-8
    Filter configuration: okf_openxml@noauthor
    Output: /home/tingley/Downloads/omegat/source/fail.xlsx.xlf
    Input: /home/tingley/Downloads/omegat/source/fail.xlsx
    Unable to resolve '../customXml/item1.xml' against path ''.
    Unable to resolve '../customXml/item2.xml' against path ''.
    Unable to resolve '../customXml/item3.xml' against path ''.
    Done in 0.645s
-
@Manuel Souto Pico What version of the plugin/okapi are you using?
-
This issue is definitely related to issue #38 and probably has the same root cause. The cell contents should be treated as a plain text element: they should not be parsed further as XML, and they should not cause an exception.
I added this test case (well, not really a test case, since there’s no assert) to the Okapi OmegaT plugin’s net.sf.okapi.lib.omegat.AbstractOkapiFilterTest, and it runs normally.
    @Test
    public void testXlsx() throws Exception {
        org.omegat.filters2.IFilter filter = new OpenXMLFilter();
        VirtualOmegaT omegat = new VirtualOmegaT();
        File inFile = new File(getClass().getResource("/fail.xlsx").toURI());
        // Parsing should complete without throwing; there is no assertion yet.
        filter.parseFile(inFile, null, new FilterContext(), omegat);
    }
The exception only happens when the plugin is used from OmegaT.
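The cast failure in the report (a CharacterEvent where an EndElement was expected) is the kind of error a StAX consumer hits when it assumes that the event following a text run must be the closing tag. The sketch below is not the plugin's or Okapi's actual code. It only illustrates, under that assumption, a read loop that checks the event type before using it. It also prints which `XMLInputFactory` implementation the classpath resolves to. That may be worth comparing between a standalone run and a run inside OmegaT, although whether the implementations differ there is unverified. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical sketch of a cast-free StAX read loop.
class StaxTextSketch {

    // Name of the StAX factory class that the current classpath resolves to.
    static String factoryImplementation() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    // Collects all text up to the first end tag. A parser may deliver the
    // text as one Characters event or as several, so the loop appends every
    // Characters event instead of casting the next event to EndElement.
    static String readElementText(String xml) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            StringBuilder text = new StringBuilder();
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isCharacters()) {
                    text.append(event.asCharacters().getData());
                } else if (event.isEndElement()) {
                    break; // closing tag reached; the text is complete
                }
            }
            reader.close();
            return text.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(factoryImplementation());
        System.out.println(readElementText("<t>this person's performance?</t>"));
    }
}
```

The loop gives the same result whether the apostrophe arrives as a literal character or as `&apos;`, and whether the parser splits the text into one event or several.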
It is difficult to debug because three separate projects are involved: OmegaT, the plugin, and Okapi. (Any suggestions on how to do this in IntelliJ?)
-
reporter Thank you guys for looking into this.
@Chase Tingley I was using a customized variant of the plugin (version okapiFiltersForOmegaT-1.12-1.44.0) based on commit c5fa867. I need this customization to make the plugin compatible with Java 8, which is what OmegaT (including JRE) supports at the moment.

I have just tested it with the latest binary available (version okapiFiltersForOmegaT-1.11-1.43.0.jar), running OmegaT with Java 11 from the command line, and I can reproduce it. The error message is not exactly the same, though:
Version: OmegaT-5.7.1_0_c3206253
Platform: Linux 5.16.0-5mx-amd64
Java: 11.0.12 amd64
Memory: 598MiB total / 482MiB free / 5960MiB max

java -version
openjdk version "11.0.12" 2021-07-20
OpenJDK Runtime Environment 18.9 (build 11.0.12+7)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.12+7, mixed mode)

I hope that helps.