OpenXML filter failed with String value {n} exceeds range of unsigned int

Issue #1306 resolved
Aleksei Bogdanovich created an issue

The Okapi OpenXML filter fails while processing a file (one example is attached).

The reason is that the file word/_rels/document.xml.rels can contain a Relationship tag whose Id attribute yields a numeric value too large for the integer type being used (an unsigned int). For example:

<Relationship Id="R922b713c7fd246ad" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.png"/>

Stack trace:
java.lang.NumberFormatException: String value 9227137246 exceeds range of unsigned int.
    at java.base/java.lang.Integer.parseUnsignedInt(Integer.java:839)
    at java.base/java.lang.Integer.parseUnsignedInt(Integer.java:928)
    at net.sf.okapi.filters.openxml.RelationshipIdGeneration$Default.refineLastIndexWith(RelationshipIdGeneration.java:42)
    at net.sf.okapi.filters.openxml.Relationships$Default.readWith(Relationships.java:134)
    at net.sf.okapi.filters.openxml.Document$General.relationshipsFor(Document.java:445)
    at net.sf.okapi.filters.openxml.Document$General.open(Document.java:147)
    at net.sf.okapi.filters.openxml.OpenXMLFilter.openDocument(OpenXMLFilter.java:423)
    at net.sf.okapi.filters.openxml.OpenXMLFilter.next(OpenXMLFilter.java:250)
    at net.sf.okapi.steps.common.RawDocumentToFilterEventsStep.handleEvent(RawDocumentToFilterEventsStep.java:135)
    at net.sf.okapi.common.pipeline.Pipeline.execute(Pipeline.java:117)
    at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:227)
    at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:199)
    at net.sf.okapi.common.pipelinedriver.PipelineDriver.processBatch(PipelineDriver.java:182)
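
For anyone who wants to reproduce the exception without the attached file, here is a minimal sketch. It assumes the value in the exception message comes from keeping only the decimal digits of the Id; that is inferred from the Id and the message above, not taken from the Okapi source. The class and the digitsOf helper are hypothetical.

```java
// Hypothetical reproduction, not Okapi code.
final class RelationshipIdOverflowDemo {

    /** Keeps only the decimal digits of a relationship id (assumed behaviour). */
    static String digitsOf(String relationshipId) {
        return relationshipId.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        // Id taken from the Relationship element quoted in this issue.
        String digits = digitsOf("R922b713c7fd246ad");
        System.out.println(digits); // prints 9227137246

        try {
            // The maximum unsigned int is 4294967295, so this call throws.
            Integer.parseUnsignedInt(digits);
        } catch (NumberFormatException e) {
            System.out.println(e.getMessage());
        }
    }
}
```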

Comments (11)

  1. Jiacheng Sheng

    Okapi currently seems to assume that relationship IDs look like “rId1”, “rId2”, and so on. That is how Office generates relationships, but it is not something enforced by the OpenXML schema. https://learn.microsoft.com/en-us/openspecs/office_standards/ms-oi29500/1f66e1f2-ad2a-4cee-8efc-e563dca56a03

    Since net.sf.okapi.filters.openxml.RelationshipIdGeneration only impacts relationships added via Okapi (it has no impact on reading), I am wondering whether we can simply match the current ID against “^rId(\d+)” and increase nextId when the regex matches, and otherwise just ignore such non-standard IDs.
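
A rough sketch of that idea, to make the proposal concrete. The class RelationshipIdCounter and its fields are hypothetical and do not reflect the actual RelationshipIdGeneration code; only the method name refineLastIndexWith is borrowed from the stack trace. The sketch also assumes the whole ID has to match, which is slightly stricter than the “^rId(\d+)” pattern above.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the id generation logic, not Okapi code.
final class RelationshipIdCounter {
    // Office-style ids only: "rId" followed by decimal digits.
    private static final Pattern OFFICE_STYLE_ID = Pattern.compile("rId(\\d+)");

    private int lastIndex;

    /** Raises the last known index if the id is Office-style; ignores any other id. */
    void refineLastIndexWith(String id) {
        Matcher matcher = OFFICE_STYLE_ID.matcher(id);
        if (!matcher.matches()) {
            return; // non-standard id such as "R922b713c7fd246ad"
        }
        try {
            lastIndex = Math.max(lastIndex, Integer.parseInt(matcher.group(1)));
        } catch (NumberFormatException e) {
            // Too many digits for an int; a generated id cannot collide with it, so skip.
        }
    }

    /** Returns the next Office-style id for a relationship added by the filter. */
    String nextId() {
        return "rId" + (++lastIndex);
    }

    public static void main(String[] args) {
        RelationshipIdCounter counter = new RelationshipIdCounter();
        for (String id : List.of("rId1", "R922b713c7fd246ad", "rId7")) {
            counter.refineLastIndexWith(id);
        }
        System.out.println(counter.nextId()); // prints rId8
    }
}
```

Non-standard IDs never have the rIdN shape, so skipping them should not produce duplicates among the generated IDs.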

  2. Denis Konovalyenko

    @Aleksei Bogdanovich thanks for reporting this case!

    @Jiacheng Sheng thank you for your proposal to solve this issue!

  3. Di Hu

    Hey, does “changed milestone to 1.46.0” mean that this change will be released in 1.46? If so, when can we expect the release? Thanks!!!

  4. Denis Konovalyenko

    @Di Hu you are right: if an issue is resolved and has a milestone assigned, the fix will be released under that milestone. In other words, the fix for this issue will be included in the next 1.46 release.

    As for the release date, could you please ask about that in the okapi-devel Google group?

    Thanks!
