Xlsx-Merge fails when text followed by empty run with run properties

Issue #864 resolved
Cornelius Buschka created an issue

No description provided.

Comments (10)

  1. Cornelius Buschka reporter

    net.sf.okapi.common.exceptions.OkapiMergeException: Error merging from original file

    at net.sf.okapi.lib.merge.step.OriginalDocumentXliffMergerStep$1.produce(OriginalDocumentXliffMergerStep.java:124)
    at net.sf.okapi.lib.merge.step.OriginalDocumentXliffMergerStep$1.produce(OriginalDocumentXliffMergerStep.java:113)
    at net.sf.okapi.common.io.InputStreamFromOutputStream$DataProducer.call(InputStreamFromOutputStream.java:139)
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
    at java.util.concurrent.FutureTask.run(FutureTask.java)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    

    Caused by: net.sf.okapi.common.exceptions.OkapiNotImplementedException: no text name set
    at net.sf.okapi.filters.openxml.OpenXMLZipFilterWriter.handleEvent(OpenXMLZipFilterWriter.java:252)
    at net.sf.okapi.lib.merge.merge.SkeletonMergerWriter.processTextUnit(SkeletonMergerWriter.java:240)
    at net.sf.okapi.lib.merge.merge.SkeletonMergerWriter.handleEvent(SkeletonMergerWriter.java:103)
    at net.sf.okapi.lib.merge.step.OriginalDocumentXliffMergerStep$1.produce(OriginalDocumentXliffMergerStep.java:120)
    ... 7 more
    Caused by: java.lang.IllegalStateException: no text name set
    at net.sf.okapi.filters.openxml.BlockTextUnitWriter.writeRunText(BlockTextUnitWriter.java:219)
    at net.sf.okapi.filters.openxml.BlockTextUnitWriter.flushText(BlockTextUnitWriter.java:193)
    at net.sf.okapi.filters.openxml.BlockTextUnitWriter.writeCode(BlockTextUnitWriter.java:136)
    at net.sf.okapi.filters.openxml.BlockTextUnitWriter.writeSegment(BlockTextUnitWriter.java:100)
    at net.sf.okapi.filters.openxml.BlockTextUnitWriter.writeFirstSegment(BlockTextUnitWriter.java:88)
    at net.sf.okapi.filters.openxml.BlockTextUnitWriter.write(BlockTextUnitWriter.java:75)
    at net.sf.okapi.filters.openxml.StyledTextSkeletonWriter.processTextUnit(StyledTextSkeletonWriter.java:166)
    at net.sf.okapi.common.filterwriter.GenericFilterWriter.processTextUnit(GenericFilterWriter.java:259)
    at net.sf.okapi.common.filterwriter.GenericFilterWriter.handleEvent(GenericFilterWriter.java:195)
    at net.sf.okapi.filters.openxml.OpenXMLZipFilterWriter.handleEvent(OpenXMLZipFilterWriter.java:249)
    ... 10 more

  2. Denis Konovalyenko

    @Cornelius Buschka, I assume this might be related to adding text to the run builder here: net.sf.okapi.filters.openxml.RunBuilder#addText. By the way, a lot of adjustments were recently merged to provide initial support for Open XML Strict documents as part of pull request #338. Had you been experiencing this problem before d57beb9d?

  3. Cornelius Buschka reporter

    @Denis Konovalyenko We have experienced the problem with release 0.37.

    @Chase Tingley No need to thank us. We benefit from the work you all put into Okapi. Of course we contribute back at least some bug reports 😉 We are still investigating the issue at 24technology, but if you are willing to collaborate on this issue, I am open to it.

  4. Denis Konovalyenko

    @Cornelius Buschka, thanks for the confirmation! I am afraid I am quite busy at the moment, so I can only assume that the sharedStrings.xml part with the following content (taken from the file in the Okapi integration tests at the mentioned pull request):

    <?xml version="1.0" encoding="UTF-8"?>
    <sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="1" uniqueCount="1">
        <si>
            <r>
                <t>Text1</t>
            </r>
            <r>
                <rPr>
                    <sz val="10"/>
                    <rFont val="Arial"/>
                </rPr>
                <t>Text2</t>
            </r>
            <r>
                <rPr>
                    <b/>
                    <sz val="10"/>
                    <rFont val="Arial"/>
                    <family val="2"/>
                </rPr>
            </r>
        </si>
    </sst>
    

    should look like this after round-tripping (extracting and then merging):

    <?xml version="1.0" encoding="UTF-8"?>
    <sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="1" uniqueCount="1">
        <si>
            <r>
                <t>Text1</t>
            </r>
            <r>
                <rPr>
                    <sz val="10"/>
                    <rFont val="Arial"/>
                </rPr>
                <t>Text2</t>
            </r>
        </si>
    </sst>
    

    as the 3rd run does not even contain text (which is why the text name is absent and the exception is thrown when the part is written out)… I think the related code is in the net.sf.okapi.filters.openxml.RunBuilderSkipper#canSkip method:

            for (Chunk runBodyChunk : runBodyChunks) {
                if (runBodyChunk instanceof Run.RunText && ((Run.RunText) runBodyChunk).isEmpty()){
                    continue;
                }
                return false;
            }
    

    It would be nice to also handle the case where no Run.RunText chunk is present at all. Do you think you would be able to contribute a pull request for that?
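
    The suggested check could be sketched roughly as follows. This is only an illustration under stated assumptions: Chunk, RunText and OtherChunk are hypothetical stand-ins rather than the real Okapi classes, and the thread does not show how the actual canSkip method handles an empty run body, so the explicit isEmpty() guard is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for Okapi's Chunk and Run.RunText; NOT the real classes.
public class RunSkipSketch {
    public interface Chunk {}

    public static final class RunText implements Chunk {
        private final String text;
        public RunText(String text) { this.text = text; }
        public boolean isEmpty() { return text.isEmpty(); }
    }

    // Any non-text run content (assumed here to be something that must be kept).
    public static final class OtherChunk implements Chunk {}

    // A run is treated as skippable when its body has no chunks at all
    // (e.g. a run holding only run properties) or only empty text chunks.
    public static boolean canSkip(List<? extends Chunk> runBodyChunks) {
        if (runBodyChunks.isEmpty()) {
            return true; // assumption: no RunText chunk present means nothing to write
        }
        for (Chunk chunk : runBodyChunks) {
            if (chunk instanceof RunText && ((RunText) chunk).isEmpty()) {
                continue;
            }
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Third run from the sharedStrings.xml example: properties only, no body chunks.
        System.out.println(canSkip(Collections.<Chunk>emptyList()));
        // A run that carries real text must not be skipped.
        System.out.println(canSkip(Arrays.asList(new RunText("Text2"))));
    }
}
```

    With a guard like this, the third run in the example above would be dropped on merge, while runs carrying text or other content are kept.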

  5. Cornelius Buschka reporter

    Hi @Denis Konovalyenko, I think your guess is right. Thank you for pointing me to the code location. I found another location in RunMerger but have not found a good fix yet. I am going to take a look at RunBuilderSkipper and get back to you.
