OpenXML: Embedded content not extracted

Issue #513 new
Sebastian Ebert created an issue

Word and PowerPoint allow embedding other OpenXML files; for example, an Excel table may be embedded within a Word document. The Excel table can then be edited directly within Word by double-clicking it. However, Rainbow does not extract the content of the Excel file. I attached a sample for reproduction.

It would be nice if such content were also extracted for translation. The best solution might be to add an option to the OpenXML filters that is enabled by default but can be disabled. Embedding Excel or Word content in PowerPoint raises the same issue.

Comments (4)

  1. Sebastian Ebert reporter

    We've encountered this problem at least 10 times within the last 5 months. The only workaround is to open the source document (e.g. PowerPoint), export the embedded files (there can be many!), and save them in their native format (e.g. Excel) as separate files. These are then translated and re-embedded in the source file.

    Depending on the number of files embedded, this can be exhausting.

    What is worse: you will probably not even notice that the source document contains embedded files. Often the first time you realize this is after you have done the translation work in your CAT tool (e.g. OmegaT) and then wonder why not everything has been translated (because the source contained embedded files).
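
    One way to catch this before translation starts is to inspect the document package itself. The following is only a minimal sketch, not part of Rainbow or Okapi: it assumes the embedded objects sit in the folders Office conventionally writes them to (`word/embeddings/`, `ppt/embeddings/`, `xl/embeddings/`), although the actual location is defined by the package relationships. The function name `list_embedded_parts` is made up for illustration.

```python
import zipfile

# Assumption: embedded objects are stored in the conventional "embeddings"
# folders. A package may place them elsewhere via its relationship files.
EMBEDDING_DIRS = ("word/embeddings/", "ppt/embeddings/", "xl/embeddings/")


def list_embedded_parts(path):
    """Return the names of package parts stored in an embeddings folder.

    `path` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(path) as package:
        return [
            name
            for name in package.namelist()
            # Skip directory entries, keep only real parts.
            if name.startswith(EMBEDDING_DIRS) and not name.endswith("/")
        ]
```

    Calling `list_embedded_parts("sample.docx")` (a hypothetical file name) returns an empty list when nothing is embedded. Entries ending in `.xlsx`, `.docx` or `.pptx` are embedded OpenXML files that need separate translation; `.bin` entries are usually legacy OLE objects.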

    I would be very happy about a solution to this problem.

    By the way: OmegaT has the same issue, as does Trados in older versions. MemoQ is able to handle this case. It would be great if Rainbow also supported embedded files.

  2. Jim Hargrave (OLD)
    • changed version to M33

    Added test files to integration tests. Confirmed embedded content is still not extracted, unless there is an option I am missing.
