Tikal -seg option doesn't work
I tried the -seg option, but I can't see any difference compared with running without it:
./tikal.sh -x ~/me/sample-files/calibre.docx -seg ~/Documents/default.srx -tl id
The custom SRX file is attached.
Comments (7)
-
-
- attached test.docx.xlf
- attached test.docx
It seems to work OK for me.
See the attached .XLF produced from the attached .docx file with your SRX file and the same command line.
The segment markers properly delimit the few sentences of the paragraph.
Maybe the input file you tested doesn't have several sentences in the same paragraph?
-
The sample file I used, calibre.docx: https://drive.google.com/file/d/1XB_yFkv1XCgakbhdCvhqFsXtt923FT_U/view?usp=sharing
-
Oh, I see. So instead of creating a new source tag for each segment, the segments are delimited by `<mrk>` tags, right?
-
Yes. The segmented content goes into a separate `<seg-source>` element.
See http://docs.oasis-open.org/xliff/v1.2/pr03/xliff-core.html#Struct_Segmentation for details on the specification.
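For anyone who wants to check whether segmentation was applied without reading the raw XML, here is a minimal Python sketch that lists the `<mrk mtype="seg">` segments found under each `<seg-source>`. The sample document and the `list_segments` helper are made up for illustration; they are not part of Tikal, and the sketch assumes the file uses the standard XLIFF 1.2 namespace.

```python
import xml.etree.ElementTree as ET

# Standard XLIFF 1.2 namespace (assumed to be the one used in the file).
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical, hand-written sample; replace with the content of a real .xlf file.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
<file original="sample.docx" source-language="en" target-language="id" datatype="x-undefined">
<body>
<trans-unit id="1">
<source>First sentence. Second <g id="1">bold</g> sentence.</source>
<seg-source><mrk mid="0" mtype="seg">First sentence.</mrk><mrk mid="1" mtype="seg"> Second <g id="1">bold</g> sentence.</mrk></seg-source>
</trans-unit>
</body>
</file>
</xliff>"""


def list_segments(xliff_text):
    """Return (trans-unit id, mrk mid, plain text) for every segment marker."""
    root = ET.fromstring(xliff_text)
    segments = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        # Only <mrk> elements with mtype="seg" delimit segments.
        for mrk in unit.iterfind("x:seg-source/x:mrk[@mtype='seg']", XLIFF_NS):
            text = "".join(mrk.itertext())  # drops inline tags, keeps their text
            segments.append((unit.get("id"), mrk.get("mid"), text))
    return segments


if __name__ == "__main__":
    for unit_id, mid, text in list_segments(SAMPLE):
        print(unit_id, mid, repr(text))
```

If a trans-unit has no `<seg-source>`, or only a single `<mrk>`, the paragraph was either not segmented or contains just one sentence.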
- changed status to resolved
-
Then, a few hours later, I examined the XLIFF again and noticed something unwanted: the id value of <g> is not reset to 1 in each subsequent <mrk>.
<seg-source><mrk mid="0" mtype="seg">Here is some <g id="1">bold, </g><g id="2">italic, <g id="3">bold-italic, </g></g><g id="4">underlined </g>and <g id="5">struck out </g> text.</mrk><mrk mid="1" mtype="seg"> Then, we have a super<g id="6">script</g> and a sub<g id="7">script</g>.</mrk><mrk mid="2" mtype="seg"> Now we see some <g id="8">red</g>, <g id="9">green</g> and <g id="10">blue</g> text.</mrk><mrk mid="3" mtype="seg"> Some text with a <g id="11">yellow highlight</g>.</mrk><mrk mid="4" mtype="seg"> Some text in a <g id="12">box</g>.</mrk><mrk mid="5" mtype="seg"> Some text in <g id="13">inverse video</g>.</mrk></seg-source>
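To make the numbering easier to see, here is a small Python sketch that groups the `<g>` ids by segment. It uses the first two segments of the output above, copied without the XLIFF namespace declaration; the `g_ids_by_segment` helper is hypothetical and only inspects the ids, it does not change them.

```python
import xml.etree.ElementTree as ET

# First two segments from the output above (no namespace in this fragment).
FRAGMENT = (
    '<seg-source><mrk mid="0" mtype="seg">Here is some <g id="1">bold, </g>'
    '<g id="2">italic, <g id="3">bold-italic, </g></g><g id="4">underlined </g>'
    'and <g id="5">struck out </g> text.</mrk>'
    '<mrk mid="1" mtype="seg"> Then, we have a super<g id="6">script</g>'
    ' and a sub<g id="7">script</g>.</mrk></seg-source>'
)


def g_ids_by_segment(seg_source_xml):
    """Map each <mrk> mid to the list of <g> ids it contains, in document order."""
    root = ET.fromstring(seg_source_xml)
    result = {}
    for mrk in root.iter("mrk"):
        # iter("g") also visits nested <g> elements.
        result[mrk.get("mid")] = [g.get("id") for g in mrk.iter("g")]
    return result


if __name__ == "__main__":
    for mid, ids in g_ids_by_segment(FRAGMENT).items():
        print(mid, ids)
```

For this fragment the ids continue from one segment to the next (1 to 5 in the first, 6 and 7 in the second), so they are numbered across the whole paragraph rather than per segment.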
Anyway, this is my post.
I couldn't see any difference when I looked at the generated XLF file.