OpenXML filter failed with String value {n} exceeds range of unsigned int
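The Okapi OpenXML filter fails while processing a file (one example is attached).
The cause is that word/_rels/document.xml.rels can contain a Relationship element whose Id attribute cannot be converted to a numeric value unless a larger integer type is used. In the example below, the value 9227137246 reported in the stack trace exceeds the unsigned int maximum of 4294967295.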
<Relationship Id="R922b713c7fd246ad" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.png"/>
Stacktrace:
java.lang.NumberFormatException: String value 9227137246 exceeds range of unsigned int.
    at java.base/java.lang.Integer.parseUnsignedInt(Integer.java:839)
    at java.base/java.lang.Integer.parseUnsignedInt(Integer.java:928)
    at net.sf.okapi.filters.openxml.RelationshipIdGeneration$Default.refineLastIndexWith(RelationshipIdGeneration.java:42)
    at net.sf.okapi.filters.openxml.Relationships$Default.readWith(Relationships.java:134)
    at net.sf.okapi.filters.openxml.Document$General.relationshipsFor(Document.java:445)
    at net.sf.okapi.filters.openxml.Document$General.open(Document.java:147)
    at net.sf.okapi.filters.openxml.OpenXMLFilter.openDocument(OpenXMLFilter.java:423)
    at net.sf.okapi.filters.openxml.OpenXMLFilter.next(OpenXMLFilter.java:250)
    at net.sf.okapi.steps.common.RawDocumentToFilterEventsStep.handleEvent(RawDocumentToFilterEventsStep.java:135)
    at net.sf.okapi.common.pipeline.Pipeline.execute(Pipeline.java:117)
    at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:227)
    at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:199)
    at net.sf.okapi.common.pipelinedriver.PipelineDriver.processBatch(PipelineDriver.java:182)
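The failure can probably be reproduced outside Okapi with a few lines of plain Java. The sketch below assumes that the filter keeps only the decimal digits of the Id before parsing; this is inferred from the fact that the digits of "R922b713c7fd246ad" spell exactly the 9227137246 shown in the exception, not from the Okapi source. The class name and the digitsOf helper are hypothetical.

```java
final class RelationshipIdOverflowDemo {
    // Hypothetical helper (assumption): drops every non-digit character of the id.
    static String digitsOf(String id) {
        return id.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        String id = "R922b713c7fd246ad";
        String digits = digitsOf(id);
        System.out.println("digits: " + digits);
        try {
            // The same JDK call that appears at the top of the stack trace.
            Integer.parseUnsignedInt(digits);
        } catch (NumberFormatException e) {
            // The value is larger than 4294967295, so the JDK rejects it.
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```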
Comments (11)
-
reporter -
- changed status to open
approve ticket
-
Okapi currently seems to assume that relationship ids look like “rId1”, “rId2”, and so on, which is how Office generates them but not something the OpenXML schema enforces. https://learn.microsoft.com/en-us/openspecs/office_standards/ms-oi29500/1f66e1f2-ad2a-4cee-8efc-e563dca56a03
Since net.sf.okapi.filters.openxml.RelationshipIdGeneration only impacts the relationships added via Okapi (it has no impact on reading), I am wondering if we can simply match the current id against “^rId(\d+)” and increase the nextId when the regex matches; otherwise, for such a non-standard id, just ignore it.
-
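The regex-based proposal above could look roughly like the following. This is only a sketch of the idea, not the code that was merged: the class RelationshipIdSketch and its methods observe and nextId are hypothetical names, the trailing "$" in the pattern is an added assumption, and the index is held in a long so that long digit runs do not overflow.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RelationshipIdSketch {
    // Office-style ids only: "rId" followed by digits (the "$" is an assumption).
    private static final Pattern OFFICE_STYLE_ID = Pattern.compile("^rId(\\d+)$");

    // Highest index seen so far among Office-style ids.
    private long lastIndex = 0;

    // Called for every id read from a .rels file (hypothetical method).
    void observe(String id) {
        Matcher matcher = OFFICE_STYLE_ID.matcher(id);
        if (!matcher.matches()) {
            return; // non-standard id such as "R922b713c7fd246ad": ignore it
        }
        try {
            lastIndex = Math.max(lastIndex, Long.parseLong(matcher.group(1)));
        } catch (NumberFormatException e) {
            // More digits than a long can hold: treat it as non-standard as well.
        }
    }

    // Returns an id for a relationship added by the filter (hypothetical method).
    String nextId() {
        lastIndex++;
        return "rId" + lastIndex;
    }

    public static void main(String[] args) {
        RelationshipIdSketch ids = new RelationshipIdSketch();
        ids.observe("rId1");
        ids.observe("R922b713c7fd246ad");
        ids.observe("rId7");
        System.out.println(ids.nextId());
    }
}
```

With this approach a non-standard id never reaches the numeric parser, so it cannot collide with generated ids either, because generated ids always start with "rId".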
@Aleksei Bogdanovich thanks for reporting this case!
@Jiacheng Sheng thank you for your proposal to solve this issue!
-
A related commit 4a6308edf7b4f5f7903f04a81b32aad2dc254c60 in the scope of the pull request #727 was merged.
-
The test document was added to the scope of pull request #734.
-
- changed milestone to 1.46.0
-
assigned issue to
-
- changed status to resolved
@Aleksei Bogdanovich yet another related commit cdf8a4838438e8dc249ca59a7ba24523b11e8515 with a test document made by @Alec Shashaty and the pull request #734 were merged. Could you please take a look and see whether this issue can be closed?
-
Hey, does “changed milestone to 1.46.0” mean that this change will be released in 1.46? If so, when can we expect the release? Thanks!!!
-
@Di Hu you are right: if an issue is resolved and a milestone is set, then the solution is going to be released under the specified milestone. In other words, the solution for this issue will be part of the next 1.46 release.
As for the release date, could you please ask about that in the okapi-devel Google group?
Thanks!
-
reporter - edited description