-
assigned issue to
- edited description
TextFragmentUtil.alignAndCopyCodeMetadata inserts only an open tag, leaving the TextFragment unbalanced
This is against dev commit 45ecaf5 after PR has been merged.#503
When the BaseConnector finds a fuzzy match where a TM entry almost matches except that the TM entry lacks a code pair, it was found that in the target, just the open tag is inserted. This results in an invalid XLIFF file.
It seems TextFragmentUtil.alignAndCopyCodeMetadata is not working as expected. This can be demonstrated by applying the following patch to TextFragmentUtilTest.java.
diff --git a/okapi/core/src/test/java/net/sf/okapi/common/resource/TextFragmentUtilTest.java b/okapi/core/src/test/java/net/sf/okapi/common/resource/TextFragmentUtilTest.java index 38773d3317..ebe82d6878 100644 --- a/okapi/core/src/test/java/net/sf/okapi/common/resource/TextFragmentUtilTest.java +++ b/okapi/core/src/test/java/net/sf/okapi/common/resource/TextFragmentUtilTest.java @@ -281,6 +281,11 @@ public class TextFragmentUtilTest { TextFragment oriFrag; TextFragment trgFrag; + oriFrag = GenericContent.fromLetterCodedToFragment("xxx <g1>yyy</g1> zzz", null, false, true); + trgFrag = GenericContent.fromLetterCodedToFragment("xxx yyy zzz", null, false, true); + alignAndCopyCodeMetadata(oriFrag, trgFrag, true, false); + assertEquals("xxx yyy zzz<1></1>", fmt.setContent(trgFrag).toString()); + oriFrag = GenericContent.fromLetterCodedToFragment("src<x1/>", null, false, true); trgFrag = GenericContent.fromLetterCodedToFragment("trg", null, false, true); alignAndCopyCodeMetadata(oriFrag, trgFrag, true, true);
Comments (10)
-
reporter -
Thank you Kuro. I will take a look this Sunday.
Jim Hargrave
-
- marked as enhancement
-
@Kuro Kurosaka The original functionality of addMissingCodes was to only add codes that were guaranteed to have accurate placement. This meant only leading or trailing codes for the segment. We could enhance the functionality to add all missing codes - with the understanding that the placement will be incorrect and require post-editing.
I don’t see a problem adding this as it would allow you to merge leveraged documents which would be useful for context preview etc.
-
- marked as bug
-
@Kuro Kurosaka But you are right I did find a problem when the 'to” frgmant has zero codes. I will fix this.
What do you think about the enhancement for adding all codes?
-
reporter @Jim Hargrave (OLD) thank you for looking into it on Sunday! I thought this option was meant to add all extra codes found in the source to the end of of the target so that the human editor can reposition them later. If that is not the intention of this flag, I misunderstood its purpose. I am not sure what you mean by “the enhancement for adding all codes”. Could you have some example that shows what you the result of your fix might be without this enhancement, and what it would be with this enhancement?
-
@Kuro Kurosaka Actually you are right. I misread the original code. All missing codes are added- except the bug case which I will fix.
public static void addMissingCodes(TextFragment from, TextFragment to, CodeMatches codeMatches) { List<Code> fromCodes = from.codes; TextFragment leadingCodes = new TextFragment(); // iterate over all the mismatches in from and add if possible for (Integer index : codeMatches.getFromMismatchIndexIterator()) { Code fc = fromCodes.get(index); if (isLeadingCode(fc, from)) { leadingCodes.append(fc.clone()); } else { to.append(fc.clone()); } } // add missing leading codes to.insert(0, leadingCodes, true); }
-
- changed status to resolved
fix Issue
#1043and cleanup and consolidate code mismatch error logging→ <<cset ae060a9337f7>>
-
Fixed in DEV with commit 3057027dfea06b3881d107fb8b81146fc7fb578b
- Log in to comment