Unwanted tags in XLIFF from transpdf

Issue #26 new
Anil Duggirala created an issue

hello,

I am using the Okapi XLIFF filter for OmegaT (not XLIFF 2) on XLIFF from transpdf.com. Within OmegaT I am getting many unnecessary tags. For example:

<g1652913756>Choose your option</g1652913756>

And then I get another segment like this one:

<g1651932575>Choose your option</g1651932575>

As a result, OmegaT does not recognize these as identical segments, so I can’t simply insert a match; I need to edit the tags. Other examples are much more painful, like this one:

128x128Ext (128 x 128 LCD matrix, detachable) <g1652794592>(4 </g1652794592><g1652794593>Nominal Digital Input voltage (voltage withstand) </g1652794593>

When I contacted transpdf, they mentioned unique identification of segments as the cause of this, and even suggested ignoring <g> tags. I am attaching my email exchange with their support team.

The XLIFF file is here: https://1drv.ms/u/s!AjPT5GYO0wRogYFwAVkmndvLfsQMzA
The original PDF being translated can be viewed here: https://1drv.ms/b/s!AjPT5GYO0wRogYFvxhuX0tv6zhuSaA?e=ztCc8w

thank you,

Comments (1)

  1. t_cordonnier

    This is exactly the same problem as ticket #20: the filter uses the id in the <g> markup instead of a paragraph-specific sequential number.
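
To illustrate what a paragraph-specific sequential number would change, here is a minimal Python sketch that renumbers the tags of one segment in order of first appearance. It is not the Okapi filter's code or any OmegaT API; the function name `renumber_tags` is hypothetical, and the `<g...>` pattern is only assumed from the tags quoted in this issue.

```python
import re

# Matches OmegaT-style paired tags as quoted in this issue,
# e.g. <g1652913756> and </g1652913756>. Assumption: only "g" tags occur.
TAG = re.compile(r"<(/?)g(\d+)>")


def renumber_tags(segment: str) -> str:
    """Replace each tag id with a number that restarts at 1 for every segment.

    Hypothetical helper for illustration; ids are numbered in the order
    in which they first appear inside the segment.
    """
    mapping = {}  # original id -> sequential number within this segment

    def repl(match):
        slash, original_id = match.group(1), match.group(2)
        if original_id not in mapping:
            mapping[original_id] = len(mapping) + 1
        return f"<{slash}g{mapping[original_id]}>"

    return TAG.sub(repl, segment)


first = renumber_tags("<g1652913756>Choose your option</g1652913756>")
second = renumber_tags("<g1651932575>Choose your option</g1651932575>")
print(first)            # <g1>Choose your option</g1>
print(first == second)  # True
```

With this kind of numbering, the two "Choose your option" segments from the report become identical, so a translation memory match could be inserted without editing tags.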
