- changed title to Duplicate Segments in merged file
- removed version
- edited description
- attached files_and_comparison.zip
Duplicate Segments in merged file
I have been trying to find a solution for -ERR:REF-NOT-FOUND- being appended at random places in my translated file,
and I found that this issue was resolved in version 1.41.0-SNAPSHOT (https://bitbucket.org/okapiframework/okapi/pull-requests/451/996-core-reference-identifier-values).
I used the latest source code from the dev branch and built that version locally.
However, I noticed another issue: duplicate segments appear at random places in the merged file.
I have attached the original file, the translated file, and a comparison report.
Version: 1.41.0-SNAPSHOT
Comments (17)
-
reporter @Denis Konovalyenko
Could I get a little help here, please?
-
@Devesh Kumar, the round-trip of Artura _ McLaren Automotive.html has not produced any duplicates (for more information, please refer to the attached 1014.zip). Could you provide additional details about how the document is processed (the HTML filter parameters, at least)? It would also be really helpful if you could reduce the original file content as much as possible (ideally to 1-2 segments).
-
- attached 1014.zip
-
Can you also copy the duplicate segments here so I can search for them in the larger file?
-
reporter
Left side: original file content (Artura _ McLaren Automotive.html)
Right side: merged file, with duplicate segments highlighted in red (file name: merged_file_with_duplicate.html)
A few examples:
<a href="https://cars.mclaren.com/us-en/artura" class="language-link js-language-link cta" data-locale="en">
<a href="https://mclarencars.cn/cn-zh" class="language-link js-language-link cta" data-locale="zh">
Italian</a>
French</a>
The lines above appear twice in the merged file (see the attached files_and_comparison.zip).
Merged file name: merged_file_with_duplicate.html
-
reporter @Denis Konovalyenko
I am getting the following log messages during translation. What do they mean?
#Reference dp NOT FOUND and
the extra target code id='2' does not have corresponding data. (item id='tu105', name='')
2021-01-12 13:22:40.269 WARN 17446 --- [nio-8732-exec-1] c.matecat.converter.core.XliffProcessor : Missing producer version in input XLIFF
2021-01-12 13:22:42.141 INFO 17446 --- [nio-8732-exec-1] n.s.o.c.pipelinedriver.PipelineDriver : Input: /tmp/3640111358274881623/pack/manifest.rkm
2021-01-12 13:22:42.340 INFO 17446 --- [nio-8732-exec-1] n.s.o.s.rainbowkit.postprocess.Merger : Merging: mcleren.html
2021-01-12 13:22:43.483 WARN 17446 --- [nio-8732-exec-1] n.s.o.filters.html.HtmlSkeletonWriter : Reference 'dp45' not found.
2021-01-12 13:22:44.374 WARN 17446 --- [nio-8732-exec-1] n.s.o.filters.html.HtmlSkeletonWriter : Reference 'dp46' not found.
2021-01-12 13:22:45.532 WARN 17446 --- [nio-8732-exec-1] n.s.o.filters.html.HtmlSkeletonWriter : Reference 'dp70' not found.
2021-01-12 13:22:45.609 WARN 17446 --- [nio-8732-exec-1] n.sf.okapi.common.resource.TextUnitUtil : The extra target code id='2' does not have corresponding data. (item id='tu105', name='')
2021-01-12 13:22:46.725 WARN 17446 --- [nio-8732-exec-1] n.s.o.filters.html.HtmlSkeletonWriter : Reference 'dp189' not found.
2021-01-12 13:22:47.681 WARN 17446 --- [nio-8732-exec-1] n.s.o.filters.html.HtmlSkeletonWriter : Reference 'dp203' not found.
2021-01-12 13:22:48.632 WARN 17446 --- [nio-8732-exec-1] n.s.o.filters.html.HtmlSkeletonWriter : Reference 'dp246' not found.
2021-01-12 13:22:49.428 WARN 17446 --- [nio-8732-exec-1] n.s.o.filters.html.HtmlSkeletonWriter : Reference 'dp256' not found.
2021-01-12 13:22:49.435 WARN 17446 --- [nio-8732-exec-1] n.sf.okapi.common.resource.TextUnitUtil : The extra target code id='3' does not have corresponding data. (item id='tu171', name='')
2021-01-12 13:22:50.131 WARN 17446 --- [nio-8732-exec-1] n.s.o.filters.html.HtmlSkeletonWriter : Reference 'dp344' not found.
Item id = tu105
<trans-unit id="tu105" xml:space="preserve"> <source xml:lang="en"> <bx id="1" /> Configure <ex id="1" /> </source> <seg-source> <mrk mid="0" mtype="seg"> </mrk> <mrk mid="1" mtype="seg"> <bx id="1" /> </mrk> <mrk mid="2" mtype="seg"> Configure <ex id="1" /> </mrk> <mrk mid="3" mtype="seg"> </mrk> </seg-source> <target xml:lang="hi"> <mrk mid="0" mtype="seg"> <sid id="1"> </sid> </mrk> <mrk mid="1" mtype="seg"> <sid id="1"> <bx id="1" /> </sid> </mrk> <mrk mid="2" mtype="seg"> Configure <ex id="1" /> <sid id="1"> Configure <ex id="1" /> </sid> </mrk> <mrk mid="3" mtype="seg"> <sid id="1"> </sid> </mrk> </target>
Item id = tu171
<trans-unit id="tu171" xml:space="preserve"> <source xml:lang="en"> <ex id="1" /> <ex id="2" /> </source> <seg-source> <mrk mid="0" mtype="seg"> <ex id="1" /> </mrk> <mrk mid="1" mtype="seg"> <ex id="2" /> </mrk> </seg-source> <target xml:lang="hi"> <mrk mid="0" mtype="seg"> <sid id="1"> <ex id="1" /> </sid> </mrk> <mrk mid="1" mtype="seg"> <ex id="2" /> <sid id="1"> <ex id="2" /> </sid> </mrk> </target> </trans-unit>
-
I think I found the problem. You are using the RainbowKitStep. The merger used in that step is using very old code. I have updated RainbowKitStep and pushed to dev. Can you give it a try with the latest code? Thanks!
-
- changed status to open
-
reporter @Jim Hargrave (OLD) where can I find the specific updated RainbowKitStep code (any specific commits)? Do I need to change only this package?
-
Easiest way would be to re-pull the dev branch. You want commits
f58b737 https://bitbucket.org/okapiframework/okapi/commits/f58b7376e391c1a798c0bec144609abc8ecd31ee
and
b4f080a https://bitbucket.org/okapiframework/okapi/commits/b4f080a9508892d8d28f15e4e2cf065e2cd758d5
-
reporter @Jim Hargrave (OLD) thank you
-
reporter I was using Okapi version 0.35.
Also, I found a few issues in the top layer: when using parallelStream(), multiple threads accessing and manipulating the same document caused the ERROR:NO REF problem. Adding a synchronized block fixed it.
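For anyone hitting the same symptom, here is a minimal sketch of that kind of fix. It is not the actual application code: `SharedDocument` and `mergeSafely` are hypothetical names standing in for whatever shared, non-thread-safe document object the calling layer mutates from a parallel stream, and nothing here is an Okapi API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class SharedDocumentMerge {

    // Hypothetical stand-in for a document shared between worker threads.
    // ArrayList is not thread-safe, so unguarded concurrent appends can
    // lose or corrupt entries.
    static final class SharedDocument {
        private final List<String> segments = new ArrayList<>();

        void append(String segment) {
            segments.add(segment);
        }

        int size() {
            return segments.size();
        }
    }

    // Builds segments on a parallel stream, but guards every mutation of the
    // shared document with a synchronized block so only one thread writes at a time.
    static int mergeSafely(int count) {
        SharedDocument doc = new SharedDocument();
        IntStream.range(0, count).parallel().forEach(i -> {
            String segment = "segment-" + i; // independent per-item work, safe to run in parallel
            synchronized (doc) {             // critical section: one writer at a time
                doc.append(segment);
            }
        });
        return doc.size();
    }

    public static void main(String[] args) {
        // With the synchronized block in place, every segment is stored exactly once.
        System.out.println(mergeSafely(10_000));
    }
}
```

Note that the synchronized block only protects the shared state; it does not preserve order. If segments must be written in document order, a sequential stream or `forEachOrdered` is likely the simpler choice.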
@Denis Konovalyenko thanks for the suggestion to check for issues in the top layers.
-
@Devesh Kumar , thank you for getting back with the information on the root cause of the issue. Do you think it can be closed then?
-
reporter Yes. Thank you for helping me out, and apologies for taking so much of your time on this issue. I will try version 1.41 once it is released on Maven.
-
reporter - changed status to resolved
-
reporter - marked as minor