PPTX: exception resolving rel reference to '../customXml/item1.xml'

Issue #1046 resolved
Chase Tingley created an issue

Reported by Marc Mittag on the okapi-users list. He sent me a sample file that demonstrates this, which I can share but can’t post here. I’ve attached a somewhat hacky test case, made by grafting the offending rels line onto a clean file.

java.lang.IllegalStateException: Unable to resolve '../customXml/item1.xml' against path ''.
    at net.sf.okapi.filters.openxml.Relationships.normalizeTarget(Relationships.java:198)
    at net.sf.okapi.filters.openxml.Relationships.addRelationship(Relationships.java:104)
    at net.sf.okapi.filters.openxml.Relationships.parseFromXML(Relationships.java:156)
    at net.sf.okapi.filters.openxml.Document$General.getRelationships(Document.java:369)
    at net.sf.okapi.filters.openxml.Document$General.initializeMainPartNameAndDocumentRelationshipsNamespace(Document.java:157)
    at net.sf.okapi.filters.openxml.Document$General.open(Document.java:129)
    at net.sf.okapi.filters.openxml.OpenXMLFilter.openDocument(OpenXMLFilter.java:435)
    at net.sf.okapi.filters.openxml.OpenXMLFilter.next(OpenXMLFilter.java:264)
    at net.sf.okapi.steps.common.RawDocumentToFilterEventsStep.handleEvent(RawDocumentToFilterEventsStep.java:135)
    at net.sf.okapi.common.pipeline.Pipeline.execute(Pipeline.java:117)
    at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:227)
    at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:199)
    at net.sf.okapi.common.pipelinedriver.PipelineDriver.processBatch(PipelineDriver.java:182)
    at net.sf.okapi.applications.tikal.Main.extractFile(Main.java:1624)
    at net.sf.okapi.applications.tikal.Main.process(Main.java:1005)
    at net.sf.okapi.applications.tikal.Main.main(Main.java:604)

Comments (8)

  1. Chase Tingley reporter

    Cross-posting my response from the okapi-users list:

    The extraction is failing because in the original, the _rels/.rels part contains several references that seem invalid:

    <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/custom-properties" Target="docProps/custom.xml"/>

    <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/customXml" Target="../customXml/item1.xml"/>

    The first of these is fine and is provided for comparison. The second one (and several others like it) is the problem. It specifies a relative path that starts with '../', even though this relationships file describes relationships relative to the root of the archive: you can't "go up one level" because there's nothing to go back to. The code throws at this point because the whole thing looks suspicious (for example, a naive implementation that blindly followed those paths on an archive exploded onto a local file system would be vulnerable to path traversal).

    The fix is probably just to warn and skip this relationship when parsing that file.
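
    To illustrate the warn-and-skip idea, here is a minimal standalone sketch. It is not the actual Okapi code: the class RelTargetSketch and the methods resolve and resolveAll are hypothetical names invented for this example, and it assumes that the base path of _rels/.rels is the empty string (as the exception message suggests) and that a target beginning with '/' is resolved from the package root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.Optional;
import java.util.logging.Logger;

// Hypothetical sketch; names and structure are assumptions, not Okapi's API.
public class RelTargetSketch {
    private static final Logger LOG = Logger.getLogger(RelTargetSketch.class.getName());

    /**
     * Resolves a relationship target against the directory of the source part.
     * basePath is "" for the package root (assumed to be the case for _rels/.rels).
     * Returns Optional.empty() when the target tries to climb above the root.
     */
    public static Optional<String> resolve(String basePath, String target) {
        Deque<String> segments = new ArrayDeque<>();
        // A leading '/' is treated as "start from the package root".
        if (!target.startsWith("/")) {
            for (String s : basePath.split("/")) {
                if (!s.isEmpty()) {
                    segments.addLast(s);
                }
            }
        }
        for (String s : target.split("/")) {
            if (s.isEmpty() || s.equals(".")) {
                continue;
            }
            if (s.equals("..")) {
                if (segments.isEmpty()) {
                    return Optional.empty(); // nothing to go back to
                }
                segments.removeLast();
            } else {
                segments.addLast(s);
            }
        }
        return Optional.of(String.join("/", segments));
    }

    /** Resolves every target, logging a warning and skipping the ones that escape the root. */
    public static List<String> resolveAll(String basePath, List<String> targets) {
        List<String> resolved = new ArrayList<>();
        for (String target : targets) {
            Optional<String> result = resolve(basePath, target);
            if (result.isPresent()) {
                resolved.add(result.get());
            } else {
                LOG.warning("Skipping relationship: unable to resolve '" + target
                        + "' against path '" + basePath + "'");
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        List<String> targets = Arrays.asList("docProps/custom.xml", "../customXml/item1.xml");
        // The second target is skipped with a warning instead of aborting extraction.
        System.out.println(resolveAll("", targets)); // prints [docProps/custom.xml]
    }
}
```

    Returning an empty Optional rather than throwing lets the caller decide whether a bad target is fatal. In this sketch the parts under customXml would simply not be reachable through the skipped relationships.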

    If you are able to check with the client and find out how this document was created, I'd be interested in the info. The customXml directory does exist; it's just not being addressed properly.

    As a workaround, I was able to get this file to extract by opening it in LibreOffice and re-saving it.  However, that also strips the customXml data from the file entirely.
