XLIFF filter changes `<source>` when escaping/unescaping

Issue #564 new
Nikolai Vladimirov created an issue

Given the following file:

<?xml version="1.0" encoding="utf-8"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
    <file datatype="html" original="file.ext" source-language="en">
        <body>
            <trans-unit id="1">
              <source>&quot; &lt; &gt; ' " ></source>
            </trans-unit>
        </body>
    </file>
</xliff>

When processed with Rainbow's test tool, it produces the following output:

<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
    <file datatype="html" original="file.ext" source-language="en" target-language="fr">
        <body>
            <trans-unit id="1">
              <source>" &lt; > ' " ></source>
            <target xml:lang="fr">" &lt; > ' " ></target>
</trans-unit>
        </body> 
    </file>
</xliff>

The important part is that the content of the `<source>` element changed in the output file (the quote and the `>` were unescaped). Whatever the escaping rules in the filter configuration are, the source should always remain unchanged.
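
To make the difference concrete, here is a minimal sketch using only the Python standard library (it is not part of Okapi or Rainbow, and the helper name `parsed_text` is made up for illustration). It compares the two `<source>` elements from the files above, first as raw strings and then after XML parsing:

```python
import xml.etree.ElementTree as ET

# The <source> elements copied from the input file and from the output file.
original = """<source>&quot; &lt; &gt; ' " ></source>"""
rewritten = """<source>" &lt; > ' " ></source>"""


def parsed_text(fragment: str) -> str:
    """Return the text content of a single-element XML fragment (hypothetical helper)."""
    return ET.fromstring(fragment).text


# Raw comparison: the serialized forms differ, which is what this issue reports.
same_bytes = original == rewritten

# Parsed comparison: an XML parser resolves the entities to the same characters.
same_text = parsed_text(original) == parsed_text(rewritten)

print("identical serialization:", same_bytes)
print("identical parsed text:  ", same_text)
```

Under these assumptions the serialized strings differ while the parsed text is the same, so the output is equivalent for an XML parser but not for tools that compare or diff the file as plain text.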

Comments (2)

  1. Jim Hargrave (OLD)

    Leaving the source unchanged is the ideal, but getting this right increases complexity. Our goal is to reach XML equivalence.

    Interested in other opinions. Do any other XML-based localization tools write the source back out with exactly the same strings as in the input?
