OpenXML (docx without word/styles.xml)

Issue #453 resolved

Original issue 453 created by andriy.luts...@crowdin.com on 2015-04-07T12:20:37.000Z:

According to ECMA-376, 4th Edition
(see also https://msdn.microsoft.com/en-us/library/aa982683%28v=office.12%29.aspx),
the style definitions part (word/styles.xml) is optional.
Files generated by LO/OO/MSO usually contain it, but some applications do not create it.
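
To check whether a particular docx actually lacks the styles part, its zip entries can be listed. The following is only a minimal sketch using java.util.zip; the class DocxPartCheck and its methods are hypothetical names for illustration and are not part of Okapi. It builds a tiny in-memory package with placeholder content so that it runs without the attached file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Okapi: lists the parts of a zip-based package.
public class DocxPartCheck {

    /** Collects the entry names of a zip package read from the given stream. */
    public static Set<String> listParts(InputStream in) throws IOException {
        Set<String> names = new LinkedHashSet<>();
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Builds a small in-memory zip with the given entry names (placeholder content only). */
    public static byte[] buildPackage(String... entryNames) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            for (String name : entryNames) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write("<x/>".getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        // The zip stream is closed at this point, so the archive is complete.
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Simulates a package like the one described here: no word/styles.xml.
        byte[] pkg = buildPackage("[Content_Types].xml", "word/document.xml");
        Set<String> parts = listParts(new ByteArrayInputStream(pkg));
        System.out.println("has word/styles.xml:   " + parts.contains("word/styles.xml"));
        System.out.println("has word/document.xml: " + parts.contains("word/document.xml"));
    }
}
```

For a real file, a FileInputStream on the docx can be passed to listParts() instead of the in-memory package.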

When processing the attached docx file, which has no word/styles.xml, the result is a zip file containing only [Content_Types].xml.

Proposed fix: in the file
okapi/okapi/filters/openxml/src/main/java/net/sf/okapi/filters/openxml/OpenXMLFilter.java, in method nextInZipFile(), inside the condition if(nZipType==MSWORD):

line 694:

if (sEntryName.equals("word/styles.xml"))

change to:

if (sEntryName.equals("word/document.xml"))

and lines 704-705:

(sEntryName.equals("[Content_Types].xml") || // but don't do Content_Types
sEntryName.equals("word/styles.xml"))) // and styles a second time

change to:

(sEntryName.equals("[Content_Types].xml") || sEntryName.equals("word/document.xml")))

I can't judge the quality of my "hack", but after this change the attached document was processed successfully.

Tested on okapi's development branch.

Comments (5)

  1. Former user Account Deleted

    Comment 1. originally posted by @ysavourel on 2015-04-07T23:41:36.000Z:

    Thanks Andriy, I will take a look when I get back from vacation.

    In the future, it is easier if you submit patches in diff form. You can use the 'git diff' command in the git cli, or an equivalent tool in most git GUIs.

    (Or, even better, you can create a fork of the okapi repo, commit your changes on a custom branch, then point us to the branch.)

    Thanks!
