XLIFF2 Filter: Okapi is not able to preserve indentation on merged file

Issue #1057 resolved
Handika D created an issue

Okapi should be able to preserve it. Is there a workaround for this?

Comments (6)

  1. ysavourel

    The behavior of white space within segments is driven by the xml:space attribute.

    As for the outer white space, I believe it is currently not preserved. It is not significant for the content itself, so it matters less than the inner white space. Note that if you need to compare original XLIFF2 files with translated ones, you probably want to use an XML-aware comparison tool anyway, because other things may change: for example, the order of the attributes, the way empty tags are output, escape sequences, etc.

  2. Handika D reporter

    I’ve just noticed that Okapi adds unnecessary content to the merged file, such as the XML header and the content of the <targets>. The result really differs from the original file. AFAIK, Okapi was created in the first place to translate while maintaining the structure of the original file, wasn't it?

  3. ysavourel

    The aim of the Okapi filters is to provide a way to read a file, translate it and write out the translation in the original format, while preserving the formatting of the content as much as possible.

    When the file format is bilingual (like XLIFF) or multilingual (like TMX), the translations go where they are expected, so in <target> in the case of XLIFF.

    That said, we usually try to preserve the original document as much as possible, but it’s not a requirement, especially for formats such as XML where there is no easy way to control some of the output. So in the case of the XLIFF2 filter, we currently don’t try to preserve the outer physical layout (indents, type of escapes, etc.), because the filter is focused on the payload of the file, which is the segment’s content. When you parse two XLIFF files, one indented and the other on a single line, there is no difference in content. It’s a lot like an HTML file in that respect.

    In short: preserving the indentation of the XLIFF2 file is a “nice-to-have” thing, but it’s not high on the priority list.

  4. Jim Hargrave

    I have enabled the xliff2 writer indent option. This should help a bit with making the formatting match the original.

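For anyone who needs to compare an original XLIFF2 file with a merged one, the XML-aware comparison suggested in comment 1 could be sketched roughly as below. This is only an illustration using Python's standard library, not part of Okapi: the names canonical and same_content and the sample documents are hypothetical. It assumes that whitespace-only text is layout unless xml:space="preserve" is in effect, which is a simplification of how XLIFF 2 treats white space inside inline content.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xml:space attribute as ElementTree exposes it.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def canonical(elem, preserve=False):
    """Return a layout-independent, comparable form of an element.

    Assumption for this sketch: whitespace-only text is treated as
    indentation and dropped unless xml:space="preserve" is in effect.
    """
    # xml:space is inherited; the nearest declaration wins.
    space = elem.get(XML_SPACE)
    if space is not None:
        preserve = space == "preserve"

    def norm(text):
        text = text or ""
        if not preserve and not text.strip():
            return ""
        return text

    children = tuple((canonical(child, preserve), norm(child.tail)) for child in elem)
    # Sorting the attributes makes their order in the file irrelevant.
    return (elem.tag, tuple(sorted(elem.attrib.items())), norm(elem.text), children)


def same_content(xml_a, xml_b):
    """Compare two XML strings while ignoring indentation and attribute order."""
    return canonical(ET.fromstring(xml_a)) == canonical(ET.fromstring(xml_b))


# Hypothetical sample: same payload, different physical layout.
INDENTED = """<?xml version="1.0"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en" trgLang="fr">
  <file id="f1">
    <unit id="u1">
      <segment>
        <source>Hello world</source>
        <target>Bonjour le monde</target>
      </segment>
    </unit>
  </file>
</xliff>"""

SINGLE_LINE = (
    '<xliff version="2.0" trgLang="fr" srcLang="en" '
    'xmlns="urn:oasis:names:tc:xliff:document:2.0">'
    '<file id="f1"><unit id="u1"><segment>'
    "<source>Hello world</source><target>Bonjour le monde</target>"
    "</segment></unit></file></xliff>"
)

# Same layout as SINGLE_LINE, but the target text itself has changed.
CHANGED = SINGLE_LINE.replace("Bonjour le monde", "Bonjour  le monde")

print(same_content(INDENTED, SINGLE_LINE))  # True: only the layout differs
print(same_content(INDENTED, CHANGED))      # False: the segment content differs
```

The XML header, indentation and attribute order all disappear at parse time, so only differences in the payload are reported. A dedicated XML diff tool or XML canonicalization would be more thorough than this sketch.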