Numbering of tags in XLIFF filter

Issue #20 new
t_cordonnier created an issue

With the Okapi filter for the XLIFF format, OmegaT displays tags whose numbers are taken from the contents of the file. For example, if the file contains <g id="10">, it will generate a tag named <g10>.

Some CAT tools produce XLIFF files in which the tag number is incremented at document level. This means that two segments from the same document that contain tags can never be considered identical (100% match), and as a consequence auto-propagation does not work.
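To illustrate the problem, here is a minimal Python sketch. It is not Okapi or OmegaT code: `display_form` is a hypothetical helper that only mimics the naming rule described above (`<g id="10">` shown as `<g10>`), and the two sample segments are invented for the example.

```python
import re

# Hypothetical helper, for illustration only: turns <g id="10"> into <g10>,
# mimicking how the tag name is derived from the id found in the file.
TAG_WITH_ID = re.compile(r'<(\w+) id="(\d+)"(/?)>')

def display_form(segment: str) -> str:
    return TAG_WITH_ID.sub(
        lambda m: f"<{m.group(1)}{m.group(2)}{m.group(3)}>", segment
    )

# Same text, but the ids come from a document-level counter.
seg_a = 'Click <g id="3">OK</g>.'
seg_b = 'Click <g id="10">OK</g>.'

print(display_form(seg_a))  # Click <g3>OK</g>.
print(display_form(seg_b))  # Click <g10>OK</g>.
print(display_form(seg_a) == display_form(seg_b))  # False
```

The two segments differ only in their tag numbers, which is enough to prevent an exact match.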

Would it be possible to use, possibly as an option for this filter, a segment-level sequence that resets to 0 for each new segment? Note that this is what the native XLIFF filter for OmegaT does, but yours has the advantage of being bilingual.
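A rough sketch of the requested behaviour, in Python for brevity. This is not the filter's implementation: `renumber_segment` is a hypothetical function, and it assumes numeric `id` attributes written exactly as `id="N"` directly after the element name.

```python
import re
from typing import Dict

# Assumption: ids are numeric and appear as the first attribute of the tag.
ID_ATTR = re.compile(r'(<\w+ id=")(\d+)(")')

def renumber_segment(segment: str, start: int = 0) -> str:
    """Renumber tag ids within one segment, in order of first appearance."""
    mapping: Dict[str, int] = {}  # original id -> segment-level id

    def repl(match: "re.Match[str]") -> str:
        old_id = match.group(2)
        if old_id not in mapping:
            mapping[old_id] = start + len(mapping)
        return f"{match.group(1)}{mapping[old_id]}{match.group(3)}"

    return ID_ATTR.sub(repl, segment)

seg_a = 'Press <g id="3">Save</g> then <x id="4"/>.'
seg_b = 'Press <g id="10">Save</g> then <x id="11"/>.'

print(renumber_segment(seg_a))  # Press <g id="0">Save</g> then <x id="1"/>.
print(renumber_segment(seg_a) == renumber_segment(seg_b))  # True
```

With the counter reset for each segment, both segments get the same tag numbers and can match at 100%. A real filter would also have to keep the mapping so that the original ids can be restored when the file is written back.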

Comments (6)

  1. ysavourel

    @tingley : I think you worked on something similar to this (re-starting the inline code numbering for each segment) at some point. Do we have a utility function somewhere that could be used or adapted for such a temporary renumbering?

  2. Chase Tingley

    @ysavourel Yes, that was in SegmentationStep. The utility code it called is in net.sf.okapi.common.RenumberingUtil.

  3. Anil Duggirala

    Hello, I am having this issue in OmegaT: I am not getting 100% matches. Could you please explain how to apply this solution? Do I modify the code in the filter files? Do I need to do something within OmegaT?

    Thank you,

  4. t_cordonnier reporter

    Sorry, I did not say that I had a solution. This ticket is still open, meaning that it is not yet solved. I only wanted to report that tickets #20 and #26 are most probably the same issue, which should help the developers, since you provided an example.
