Translation using an XLIFF 1.2 file as the bilingual file throws a validation exception in M42 (worked in M39)

Issue #1130 resolved
Mike Bryant created an issue

In M39 the command “tikal -t -sl en -tl de -bi translate.xlf extract.xlf” succeeds. In M42 it throws an exception:

net.sf.okapi.lib.xliff2.reader.XLIFFReaderException: Error lineNumber: 2; columnNumber: 220; cvc-elt.1: Cannot find the declaration of element 'xliff'.
at net.sf.okapi.lib.xliff2.reader.SchemaValidator.validate(SchemaValidator.java:86)
at net.sf.okapi.lib.xliff2.reader.XLIFFReader.validateAndGetInput(XLIFFReader.java:368)
at net.sf.okapi.lib.xliff2.reader.XLIFFReader.open(XLIFFReader.java:433)
at net.sf.okapi.lib.xliff2.reader.XLIFFReader.open(XLIFFReader.java:319)
at net.sf.okapi.filters.xliff2.XLIFF2Filter.open(XLIFF2Filter.java:127)
at net.sf.okapi.steps.common.RawDocumentToFilterEventsStep.handleEvent(RawDocumentToFilterEventsStep.java:132)
at net.sf.okapi.common.pipeline.Pipeline.execute(Pipeline.java:117)
at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:227)
at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:199)
at net.sf.okapi.common.pipelinedriver.PipelineDriver.processBatch(PipelineDriver.java:182)
at net.sf.okapi.connectors.bifile.BilingualFileConnector.makeTempTM(BilingualFileConnector.java:151)
at net.sf.okapi.connectors.bifile.BilingualFileConnector.init(BilingualFileConnector.java:91)
at net.sf.okapi.connectors.bifile.BilingualFileConnector.query(BilingualFileConnector.java:81)

Comments (12)

  1. Mike Bryant reporter

    I am seeing this issue with a custom program that I have written to use the framework, but it also reproduces using tikal. I am trying to upgrade from M37 to M42 and ran into this issue.

  2. Mike Bryant reporter

    The attached XLIFF files are version 1.2, so why does the stack trace show XLIFF2Filter?

  3. Mike Bryant reporter

    Hmm, maybe the root cause was the change from “.xlf” to “.xlf;” in XLIFF2Filter.java in M42?
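
    To illustrate why a trailing semicolon could matter, here is a minimal sketch. It assumes the lookup normalizes the file extension to ".ext;" and tests it as a substring of each filter's extension list, which is what the contains(tmp) call in the next comment's snippet suggests. The class and method names are hypothetical and this is not Okapi's actual code.

    ```java
    public class ExtensionMatchSketch {
        // Hypothetical helper: assumes the extension is normalized to ".ext;"
        // and then checked as a substring of the filter's extension list.
        static boolean matches(String extensionList, String ext) {
            String key = ext.toLowerCase() + ";";
            if (!key.startsWith(".")) {
                key = "." + key;
            }
            return extensionList.contains(key);
        }

        public static void main(String[] args) {
            // Without the trailing semicolon the list never matches ".xlf;".
            System.out.println(matches(".xlf", "xlf"));   // false
            // With it, the filter becomes a candidate for every .xlf file.
            System.out.println(matches(".xlf;", "xlf"));  // true
        }
    }
    ```

    Under that assumption, a filter registered with “.xlf” would never have been picked by extension, while one registered with “.xlf;” competes with every other filter that claims the same extension.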

  4. Mike Bryant reporter

    Not sure this is the correct fix, but it seems to work when I add this to getDefaultConfigurationFromExtension() in FilterConfigurationMapper, before the for() loop:

            FilterConfiguration autoXliff = getConfiguration("okf_autoxliff");
            if (autoXliff != null && autoXliff.extensions.contains(tmp)) {
                return autoXliff;
            }
    

  5. Mike Bryant reporter

    Many of the default mappings claim the “.xlf” file extension, so the search finds duplicates. The configMap in FilterConfigurationMapper is a linked map, so the order in which it is searched is predictable, but the order in which the default configurations are added to it from DefaultFilter.properties is not. This could be changed so that AutoXLIFFFilter comes before XLIFFFilter and XLIFF2Filter in the list; the special-casing of okf_autoxliff would then not be needed.
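
    A minimal sketch of the ordering problem, assuming a first-match lookup over an insertion-ordered map. The class name, method name, and extension lists are illustrative only; the configuration ids are borrowed from this thread but the code is not Okapi's.

    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class FirstMatchSketch {
        // Returns the id of the first configuration whose extension list
        // contains ".ext;", or null when nothing matches.
        static String firstMatch(Map<String, String> configs, String ext) {
            String key = "." + ext.toLowerCase() + ";";
            for (Map.Entry<String, String> e : configs.entrySet()) {
                if (e.getValue().contains(key)) {
                    return e.getKey();
                }
            }
            return null;
        }

        public static void main(String[] args) {
            // Same entries, different insertion order, different winner.
            Map<String, String> xliff2First = new LinkedHashMap<>();
            xliff2First.put("okf_xliff2", ".xlf;.xlf2;");
            xliff2First.put("okf_autoxliff", ".xlf;.xliff;");
            System.out.println(firstMatch(xliff2First, "xlf")); // okf_xliff2

            Map<String, String> autoFirst = new LinkedHashMap<>();
            autoFirst.put("okf_autoxliff", ".xlf;.xliff;");
            autoFirst.put("okf_xliff2", ".xlf;.xlf2;");
            System.out.println(firstMatch(autoFirst, "xlf")); // okf_autoxliff
        }
    }
    ```

    With a first-match lookup, whichever configuration is inserted first wins for a shared extension, so a fixed load order would have the same effect as special-casing okf_autoxliff.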

  6. Jim Hargrave

    @mike_bryant ".xlf;.xliff;.xlf2;.xliff2"

    I updated the extensions (a possible fix, as you suggested), which should give us somewhat better coverage, but there is still ambiguity.

    Personally I always set my filter config manually and don’t depend on autodetection. But @Chase Tingley or @ysavourel might have some suggestions for a fix.

  7. Mike Bryant reporter

    Is there a way to override the file mappings loaded by BilingualFileConnector:makeTempTM()? It uses DefaultFilters.setMappings().
