ITS standoff annotations should be inside file

Issue #363 on hold
Former user created an issue

Original issue 363 created by @ysavourel on 2013-08-22T05:42:09.000Z:

Currently the ITS standoff annotations for LQI and provenance are placed at the end of the document, just before </xliff>.

This causes issues when the document is split per <file> (e.g. with the XLIFF Splitter step): the annotations do not follow.

There is also an issue in the 1.2 specification and schema: they contradict each other as to where the extended elements can be placed within the <xliff> element. See https://lists.oasis-open.org/archives/xliff/201308/msg00062.html for details.
Because the schema wins, the ITS entries should really be placed before the first <file> if they are to stay outside <file>.
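
As a rough illustration of the "before the first <file>" placement, here is a minimal Python sketch. It is not Okapi code and does not use the XLIFFSkeletonWriter; the function name `move_standoff_before_first_file` is hypothetical, and the sketch assumes the standoff elements are `its:locQualityIssues` and `its:provenanceRecords` sitting as direct children of <xliff>. It simply post-processes an already written document.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ITS_NS = "http://www.w3.org/2005/11/its"

# Assumption: only these two ITS standoff containers need to be relocated.
STANDOFF_TAGS = {
    f"{{{ITS_NS}}}locQualityIssues",
    f"{{{ITS_NS}}}provenanceRecords",
}
FILE_TAG = f"{{{XLIFF_NS}}}file"


def move_standoff_before_first_file(xliff_text: str) -> str:
    """Hypothetical helper: relocate top-level ITS standoff before the first <file>."""
    # Keep readable prefixes when serializing.
    ET.register_namespace("", XLIFF_NS)
    ET.register_namespace("its", ITS_NS)

    root = ET.fromstring(xliff_text)

    # Detach the standoff elements, remembering their original order.
    standoff = [child for child in root if child.tag in STANDOFF_TAGS]
    for element in standoff:
        root.remove(element)

    # Find the first <file>; fall back to the end if there is none.
    first_file = next(
        (i for i, child in enumerate(root) if child.tag == FILE_TAG),
        len(root),
    )

    # Re-insert the standoff, in order, just before that <file>.
    for offset, element in enumerate(standoff):
        root.insert(first_file + offset, element)

    return ET.tostring(root, encoding="unicode")
```

This only addresses the schema ordering; it does nothing for the splitting problem, since the standoff still lives outside every <file>.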

Comments (6)

  1. Former user Account Deleted

    Comment 1. originally posted by @ysavourel on 2013-08-22T17:27:00.000Z:

    CC'ing Kevin. We talked about this a bunch when he was working on the code, and there are some ugly cases. XLIFF Splitters (which are common) are unlikely to be ITS-aware, since most of them are extremely simple. This means that if we want to support that use case, we will need to make sure that each <file> contains all the relevant ITS metadata for its own contents. This is at odds with the way ITS standoff works, where a single standoff element can be referenced from multiple locations in the file. To fully support XLIFF splitting, it is therefore sometimes necessary to rewrite existing ITS standoff so that there is one copy for each referencing location, and then to move each copy inside the relevant <file>.

    When we were initially working on it, we didn't want to touch any of that stuff, so we went the simple route at the risk of breaking the XLIFF splitter.

    But it looks like you're right about the schema; we would need to move it to be before the first <file>, at least.

  2. Former user Account Deleted

    Comment 2. originally posted by ke...@spartansoftwareinc.com on 2013-08-22T19:05:51.000Z:

    The reason the ITS entries were placed at the end of the file is that the list of ITS standoff to be written out is built by the XLIFFSkeletonWriter while it processes the document. The standoff isn't fully known until the end of the document, because we need to resolve the ITS references on the elements in the XLIFF file and detect any duplicate references in the standoff.

    We may need to parse the document twice and pass in the ITS standoff to be written in the processStartDocument method. I'm worried this would force an unusual usage pattern on the XLIFFSkeletonWriter, but I guess it would only be needed for handling ITS metadata.

  3. Former user Account Deleted

    Comment 3. originally posted by @ysavourel on 2013-08-22T19:21:07.000Z:

    Maybe that is not worth the trouble.
    The issue occurs only because the schema doesn't match the spec, so it only matters when one wants to validate the XLIFF document.

    I'll put this as a low-priority issue.

  4. Former user Account Deleted

    Comment 4. originally posted by @ysavourel on 2014-04-07T23:58:39.000Z:

    See also issue #396 - doing it per-file is not valid XLIFF.
