- changed status to open
XLIFF: Set xml:space="preserve" for entries created using ITS rule with itsx:whiteSpaces="preserve"
Original issue 311 created by khagar... on 2013-02-03T08:42:36.000Z:
When configuring custom ITS rules with entries that use itsx:whiteSpaces="preserve", the extracted entries in the XLIFF do have their spaces preserved, but they do not have xml:space="preserve" set. Tools that do not treat XLIFF as xml:space="preserve" by default (e.g. OmegaT) therefore produce wrong translations. If an entry is created by an ITS rule with itsx:whiteSpaces="preserve", it should have xml:space="preserve" set.
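To illustrate why the missing attribute matters, here is a minimal Python sketch of a consumer that collapses whitespace runs unless xml:space="preserve" is set on the trans-unit or its source. The function name `read_source_text` and the collapsing behaviour are assumptions made for illustration; this is not OmegaT's or Okapi's actual code.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the predefined XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def read_source_text(trans_unit_xml: str) -> str:
    """Return the <source> text as a whitespace-collapsing tool might see it.

    Hypothetical helper: it only looks at the trans-unit and its <source>.
    """
    unit = ET.fromstring(trans_unit_xml)
    source = unit.find("source")
    text = "".join(source.itertext())
    if unit.get(XML_SPACE) == "preserve" or source.get(XML_SPACE) == "preserve":
        return text  # whitespace kept exactly as extracted
    # Assumed default handling: collapse every whitespace run to one space.
    return " ".join(text.split())


with_attr = (
    '<trans-unit id="3" xml:space="preserve">'
    "<source>Country: Italy\nTime zone: CET</source></trans-unit>"
)
without_attr = (
    '<trans-unit id="3">'
    "<source>Country: Italy\nTime zone: CET</source></trans-unit>"
)

print(repr(read_source_text(with_attr)))
print(repr(read_source_text(without_attr)))
```

Under that assumption, the line break in the second unit is lost before the translator ever sees it, which is the symptom described above.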
Comments (9)
-
Account Deleted Comment 2. originally posted by khagar... on 2013-02-03T15:02:23.000Z:
Try source XML without xml:space="preserve".
-
Account Deleted Comment 3. originally posted by @ysavourel on 2013-02-03T15:12:43.000Z:
Sorry: I've tried several cases and mis-copied the example in my previous post.
I did try without xml:space='preserve' in <literal>:
<?xml version="1.0" ?>
<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
<prolog>
<date>2013-02-13</date>
<its:rules version="2.0" xmlns:itsx="http://www.w3.org/2008/12/its-extensions">
<its:translateRule selector="/doc/prolog" translate="no"/>
<its:idValueRule selector="//para" idValue="@id"/>
<its:withinTextRule selector="//b" withinText="yes"/>
<its:translateRule selector="//literal" translate="yes" itsx:whiteSpaces="preserve"/>
</its:rules>
</prolog>
<body>
<para id="p1">Rome is the capital city of Italy.</para>
<para id="p2">It is also the country's largest and most populated comune and fourth-most populous city in the European Union by population within city limits.</para>
<literal>Country: Italy
Population: 2,777,979 (2011)
Time zone: CET</literal>
</body>
</doc>
and got the exact same result:
<trans-unit id="3" xml:space="preserve">
-
Account Deleted Comment 4. originally posted by khagar... on 2013-02-03T16:39:42.000Z:
Then I wonder why it doesn't work for me. Perhaps it's because the files I'm translating store the translatable text in attributes.
-
Account Deleted Comment 5. originally posted by @ysavourel on 2013-02-03T17:38:25.000Z:
The property should affect the translated attributes as well.
If you send me an example that reproduces the problem for you, I can try to debug it and fix it.
-ys
-
Account Deleted - attached example.zip
Comment 6. originally posted by khagar... on 2013-02-03T22:45:03.000Z:
Here is an example file and the ITS rule. I'm creating an OmegaT project using the translation kit creation.
-
Account Deleted Comment 7. originally posted by @ysavourel on 2013-02-04T12:52:46.000Z:
Thanks, I can reproduce the issue.
As you suggested, it looks like a difference in the way we process the extracted text when it comes from an attribute.
I'll work on it.
-
Account Deleted Comment 8. originally posted by @ysavourel on 2013-02-04T16:03:03.000Z:
The issue should be fixed now.
The fix is available in the latest manual snapshots (http://okapi.opentag.com/snapshots/).
Thanks for pointing out the problem.
-yves
-
Account Deleted - changed status to resolved
Comment 9. originally posted by @ysavourel on 2013-02-04T16:03:22.000Z:
Comment 1. originally posted by @ysavourel on 2013-02-03T12:07:00.000Z:
It seems to be working for me.
For example, if I process:
<?xml version="1.0" ?>
<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
<prolog>
<date>2013-02-13</date>
<its:rules version="2.0" xmlns:itsx="http://www.w3.org/2008/12/its-extensions">
<its:translateRule selector="/doc/prolog" translate="no"/>
<its:idValueRule selector="//para" idValue="@id"/>
<its:withinTextRule selector="//b" withinText="yes"/>
<its:translateRule selector="//literal" translate="yes" itsx:whiteSpaces="preserve"/>
</its:rules>
</prolog>
<body>
<para id="p1">Rome is the capital city of Italy.</para>
<para id="p2">It is also the country's largest and most populated comune and fourth-most populous city in the European Union by population within city limits.</para>
<literal xml:space='preserve'>Country: Italy
Population: 2,777,979 (2011)
Time zone: CET</literal>
</body>
</doc>
I get:
<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:okp="okapi-framework:xliff-extensions" xmlns:its="http://www.w3.org/2005/11/its">
<file original="/Example_XML.xml" source-language="en-us" target-language="fr-fr" datatype="xml">
<body>
<trans-unit id="1" resname="p1">
<source xml:lang="en-us">Rome is the capital city of <g id="1">Italy</g>.</source>
</trans-unit>
<trans-unit id="2" resname="p2">
<source xml:lang="en-us">It is also the country's largest and most populated comune and fourth-most populous city in the European Union by population within city limits.</source>
</trans-unit>
<trans-unit id="3" xml:space="preserve">
<source xml:lang="en-us">Country: Italy
Population: 2,777,979 (2011)
Time zone: CET</source>
</trans-unit>
</body>
</file>
</xliff>
As you can see, xml:space is set on the trans-unit. xml:space is inherited by all child elements (http://www.w3.org/TR/xml/#sec-white-space).
Do you have an example where it's not working?
Thanks.
-yves
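The inheritance rule mentioned in Comment 1 can be sketched in Python as a walk from an element up through its ancestors until an xml:space declaration is found. The helper name `effective_xml_space` is hypothetical, and returning "default" when nothing is declared is a convention of this sketch (the XML specification leaves that case to the application's default processing).

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the predefined XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def effective_xml_space(root, target):
    """Return the xml:space value in effect for `target` (hypothetical helper).

    The nearest declaration on the element itself or on an ancestor wins.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        value = node.get(XML_SPACE)
        if value is not None:
            return value
        node = parents.get(node)
    return "default"  # sketch convention: nothing declared anywhere


doc = ET.fromstring(
    "<body>"
    '<trans-unit id="2"><source>plain</source></trans-unit>'
    '<trans-unit id="3" xml:space="preserve"><source>kept</source></trans-unit>'
    "</body>"
)
sources = doc.findall("trans-unit/source")

for source in sources:
    print(source.text, "->", effective_xml_space(doc, source))
```

In this sample the second `<source>` carries no attribute of its own but still resolves to "preserve" through its parent trans-unit, which is why setting the attribute on the trans-unit is sufficient.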