Update XLIFFFilter#processIwsStatus to allow blocking segments with a tm_score of 100% while leaving segments that have the has_multiple_exact attribute unblocked

Issue #872 resolved
Clement Mouchet created an issue

Intro:

A common workflow when dealing with IWS files is reviewing content for which the IWS TM has multiple exact matches.

Currently, the filter can only block either all 100% matches (including any multiple exact matches) or the multiple exact matches alone.

Background:

The XLIFF filter can read iws:segment-metadata and protect trans-units with the translate="no" attribute, which allows partial translation or review of XLF files from IWS.

The initial feature was limited to translation_status; it has since been extended to tm_score, multiple_exact and lock_status.

However, the current multiple_exact feature in Okapi can only exclude those segments. If one tries to isolate them for review while blocking the segments that have tm_score="100.00" without multiple matches, they too are marked translate="no". The change I’m introducing in this PR simply allows users to leave segments that have multiple_exact="has_multiple_exact" exposed, while other segments that have tm_score="100.00" but no multiple exact matches are protected.
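The intended rule can be summarised in a few lines. The following is only a minimal sketch in Python, not the Java code of XLIFFFilter#processIwsStatus; the function name and both option names (block_exact_score, expose_multiple_exact) are hypothetical and chosen for illustration.

```python
def should_protect(tm_score, multiple_exact,
                   block_exact_score=True, expose_multiple_exact=True):
    """Return True if the trans-unit should be marked translate="no".

    tm_score and multiple_exact are the raw attribute values read from
    iws:segment-metadata and its iws:status child (None when absent).
    Both keyword options are hypothetical names, not Okapi parameters.
    """
    is_exact = tm_score == "100.00"
    has_multiple = multiple_exact == "has_multiple_exact"

    # Only 100% segments are candidates for blocking in this sketch.
    if not (block_exact_score and is_exact):
        return False

    # New behaviour: leave multiple exact matches exposed for review.
    if expose_multiple_exact and has_multiple:
        return False

    return True
```

With both options enabled, a segment with tm_score="100.00" and no multiple_exact value is protected, while a segment with tm_score="100.00" and multiple_exact="has_multiple_exact" stays translatable.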

Here’s a sample I’ve put together by obfuscating parts of an IWS file.

<trans-unit id="0000000026" datatype="x-text/xlf" restype="string">
  <source>Mint</source>
  <target>Mint</target>
  <note>Ingredient</note>
  <alt-trans match-quality="100.00" origin="client_tm">
    <source>Mint</source>
    <target>Hortelã</target>
    <iws:tm_entry_id tm_entry_id_value="167843558"/>
    <iws:is-reverse-leveraged reverse="false"/>
    <iws:is-repaired-match repaired="false"/>
    <iws:status translation_status="finished"/>
    <iws:asset-origin origin="3683028653991213327.tmx"/>
    <iws:attribute name="_tm_created_by">Bob</iws:attribute>
  </alt-trans>
  <alt-trans match-quality="100.00" origin="other_tm">
    <source>Mint</source>
    <target>Não usados (mint)</target>
    <iws:tm_entry_id tm_entry_id_value="167785700"/>
    <iws:is-reverse-leveraged reverse="false"/>
    <iws:is-repaired-match repaired="false"/>
    <iws:status translation_status="finished"/>
    <iws:asset-origin origin="2945621326041467552.tmx"/>
    <iws:attribute name="_tm_created_by">Fred</iws:attribute>
  </alt-trans>
  <alt-trans match-quality="100.00" origin="other_tm">
    <source>Mint</source>
    <target>Mint</target>
    <iws:tm_entry_id tm_entry_id_value="167785699"/>
    <iws:is-reverse-leveraged reverse="false"/>
    <iws:is-repaired-match repaired="false"/>
    <iws:status translation_status="finished"/>
    <iws:asset-origin origin="2945621326041467552"/>
    <iws:attribute name="_tm_created_by">Paul</iws:attribute>
  </alt-trans>
  <iws:segment-metadata tm_score="100.00" ws_word_count="1" sid="CAV_388690649" max_segment_length="0">
    <iws:status translation_status="pending" tm_origin="from_ws_tm" multiple_exact="has_multiple_exact"/>
  </iws:segment-metadata>
</trans-unit>
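To make it concrete which attributes of the sample matter, here is a rough Python sketch that reads tm_score and multiple_exact from a trimmed copy of the trans-unit. It is not how Okapi parses the file. The namespace URI bound to the iws prefix is an assumption (the sample above omits the declaration), and read_iws_flags is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the iws prefix; the sample does not declare it.
IWS_NS = "http://www.idiominc.com/ws/asset"

# Trimmed copy of the sample, with the namespace declared so it parses alone.
SAMPLE = f"""<trans-unit xmlns:iws="{IWS_NS}" id="0000000026">
  <source>Mint</source>
  <target>Mint</target>
  <iws:segment-metadata tm_score="100.00" ws_word_count="1">
    <iws:status translation_status="pending" tm_origin="from_ws_tm" multiple_exact="has_multiple_exact"/>
  </iws:segment-metadata>
</trans-unit>"""


def read_iws_flags(trans_unit_xml):
    """Return (tm_score, multiple_exact) for one trans-unit, None if absent."""
    unit = ET.fromstring(trans_unit_xml)
    meta = unit.find(f"{{{IWS_NS}}}segment-metadata")
    if meta is None:
        return None, None
    status = meta.find(f"{{{IWS_NS}}}status")
    multiple = status.get("multiple_exact") if status is not None else None
    return meta.get("tm_score"), multiple


tm_score, multiple_exact = read_iws_flags(SAMPLE)
print(tm_score, multiple_exact)
```

Note that tm_score sits on iws:segment-metadata itself, whereas multiple_exact sits on its iws:status child, so both elements have to be inspected.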

This issue updates the filter to allow blocking 100% matches except the ones that have multiple exact matches.
