ITS standoff annotations should be inside file
Original issue 363 created by @ysavourel on 2013-08-22T05:42:09.000Z:
Currently the ITS standoff annotations for LQI and provenance are placed at the end of the document, just before </xliff>.
This causes issues when the document is split per <file> (e.g. with the XLIFF Splitter step): the annotations do not follow.
There is also a problem with the 1.2 specification and schema: they contradict each other about where extension elements can appear within the <xliff> element. See https://lists.oasis-open.org/archives/xliff/201308/msg00062.html for details.
Because the schema wins, the ITS entries should really be placed before the first <file> if they are to stay outside <file>.
Comments (6)
-
Account Deleted Comment 2. originally posted by ke...@spartansoftwareinc.com on 2013-08-22T19:05:51.000Z:
The ITS entries were placed at the end of the file because the list of ITS standoff to write out is built by the XLIFFSkeletonWriter while it processes the document. The standoff therefore isn't known until the end of the document: we first need to resolve the ITS references on the elements in the XLIFF file and detect any duplicate references in the standoff.
We may need to parse the document twice and pass the ITS standoff to be written into the processStartDocument method. I'm worried this would force unusual usage of the XLIFFSkeletonWriter, but I guess it would only be needed for handling ITS metadata.
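To make the two-pass idea concrete, here is a rough sketch in Python rather than the actual Java code. It is only an illustration under assumptions: the function names and the issues_by_id mapping are hypothetical and are not part of XLIFFSkeletonWriter, and only LQI references are handled (provenance would follow the same pattern). The first pass collects the referenced standoff ids without duplicates; the second pass writes the standoff before the first <file>.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
LQI_REF = "{%s}locQualityIssuesRef" % ITS_NS


def collect_lqi_refs(root):
    """Pass 1: list referenced standoff ids in first-seen order, without duplicates."""
    ids = []
    for el in root.iter():
        ref = el.get(LQI_REF)
        if ref and ref.startswith("#") and ref[1:] not in ids:
            ids.append(ref[1:])
    return ids


def write_standoff_first(root, ids, issues_by_id):
    """Pass 2: emit one <its:locQualityIssues> per id before the first <file>.

    issues_by_id is a hypothetical mapping: standoff id -> list of attribute
    dicts, one dict per <its:locQualityIssue> entry.
    """
    children = list(root)
    file_tag = "{%s}file" % XLIFF_NS
    # Fall back to appending at the end if the document has no <file> at all.
    pos = next((i for i, c in enumerate(children) if c.tag == file_tag), len(children))
    for offset, sid in enumerate(ids):
        standoff = ET.Element("{%s}locQualityIssues" % ITS_NS, {XML_ID: sid})
        for attrs in issues_by_id[sid]:
            ET.SubElement(standoff, "{%s}locQualityIssue" % ITS_NS, attrs)
        root.insert(pos + offset, standoff)
    return root
```

The cost is the same one mentioned above: the caller has to run the collection pass before any output is written, which is an unusual usage pattern for a streaming writer.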
-
Account Deleted Comment 3. originally posted by @ysavourel on 2013-08-22T19:21:07.000Z:
Maybe that is not worth the trouble.
The issue occurs only because the schema doesn't match the spec, so it only matters when one wants to validate the XLIFF document. I'll mark this as a low-priority issue.
-
Account Deleted -
- edited description
- changed version to M33
-
- changed status to on hold
Put in backlog
Comment 1. originally posted by @ysavourel on 2013-08-22T17:27:00.000Z:
CC'ing Kevin. We talked about this a bunch when he was working on the code, and there are some ugly cases. XLIFF splitters (which are common) are unlikely to be ITS-aware, since most of them are extremely simple. This means that if we want to support that use case, we need to make sure each <file> contains all the relevant ITS metadata for its own contents. That conflicts with the way ITS standoff works, where a single standoff entry can be referenced from multiple locations in the file. To fully support XLIFF splitting, it is therefore sometimes necessary to rewrite existing ITS standoff so that there is one copy for each referencing location, and then to move each copy inside the relevant <file>.
When we were initially working on it, we didn't want to touch any of that stuff, so we went the simple route at the risk of breaking the XLIFF splitter.
But it looks like you're right about the schema; we would need to move it to be before the first <file>, at least.
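For reference, the per-<file> rewrite described above could look roughly like the following Python sketch. It is not the Okapi implementation, and it makes several assumptions: the function name is hypothetical, copies are made once per <file> rather than once per referencing element, each copy gets a new xml:id with an invented "-fN" suffix so the ids stay unique, and the copies are placed in the <header> of the <file>, which is just one possible location.

```python
import copy
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
REF_ATTRS = tuple("{%s}%s" % (ITS_NS, n)
                  for n in ("locQualityIssuesRef", "provenanceRecordsRef"))
STANDOFF_TAGS = tuple("{%s}%s" % (ITS_NS, n)
                      for n in ("locQualityIssues", "provenanceRecords"))


def copy_standoff_into_files(root):
    # Index the document-level standoff (direct children of <xliff>) by xml:id.
    standoff = {el.get(XML_ID): el for el in root if el.tag in STANDOFF_TAGS}
    used = set()
    for index, file_el in enumerate(root.findall("{%s}file" % XLIFF_NS), start=1):
        header = file_el.find("{%s}header" % XLIFF_NS)
        new_ids = {}  # original id -> id of this <file>'s private copy
        # Snapshot the elements first, because the loop adds nodes to the tree.
        for el in list(file_el.iter()):
            for attr in REF_ATTRS:
                old_id = (el.get(attr) or "").lstrip("#")
                if old_id not in standoff:
                    continue
                if old_id not in new_ids:
                    if header is None:
                        # Assumption: a missing <header> may be created.
                        header = ET.Element("{%s}header" % XLIFF_NS)
                        file_el.insert(0, header)
                    clone = copy.deepcopy(standoff[old_id])
                    new_ids[old_id] = "%s-f%d" % (old_id, index)
                    clone.set(XML_ID, new_ids[old_id])
                    clone.tail = None
                    header.append(clone)
                # Point the reference at this <file>'s own copy.
                el.set(attr, "#" + new_ids[old_id])
                used.add(old_id)
    # Drop only the originals that now live inside a <file>.
    for old_id in used:
        root.remove(standoff[old_id])
    return root
```

After this rewrite, a splitter that knows nothing about ITS can cut the document at <file> boundaries and each piece still carries the standoff it references.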