- changed status to open
XLIFF2FilterWriter MetadataSkeleton compatibility with filters using GenericSkeleton
Hello,
My team has been aiming to upgrade our Okapi version to the latest and consolidate some of our existing homegrown XLIFF2 code with what is existing in Okapi.
However, we find some incompatibilities when reading a file using existing filters (e.g. JSON, HTML, etc) and then writing as XLIFF2. Specifically, we get the following error:
java.lang.ClassCastException: class net.sf.okapi.common.skeleton.GenericSkeleton cannot be cast to class net.sf.okapi.filters.xliff2.MetadataSkeleton (net.sf.okapi.common.skeleton.GenericSkeleton and net.sf.okapi.filters.xliff2.MetadataSkeleton are in unnamed module of loader 'app')
This is referenced in https://groups.google.com/g/okapi-users/c/gPJOBNHCib4/m/3jtvie0iAwAJ. However, in our case I believe we can’t use an XLIFF2-compatible skeleton, as GenericSkeleton is explicitly written in the various filters.
The following is a simplified version of our code:
    public IPipelineDriver buildDriver() {
        IPipelineDriver driver = new PipelineDriver();

        // Read the source document with a regular (non-XLIFF2) filter.
        RawDocumentToFilterEventsStep rawDocumentToFilterEventsStep = new RawDocumentToFilterEventsStep();
        rawDocumentToFilterEventsStep.setFilter(new JSONFilter()); // or most other filters, e.g. HTML5Filter()
        driver.addStep(rawDocumentToFilterEventsStep);

        // Write the resulting filter events out as XLIFF2.
        FilterEventsWriterStep filterEventsWriterStep = new FilterEventsWriterStep();
        filterEventsWriterStep.setFilterWriter(new XLIFF2FilterWriter());
        driver.addStep(filterEventsWriterStep);

        // rawDocument and outputPath are defined elsewhere in our code.
        driver.addBatchItem(rawDocument, new File(outputPath).toURI(), StandardCharsets.UTF_8.name());
        return driver;
    }
Note the following cases do work:
- I switch out JSONFilter for XLIFF2Filter (which uses MetadataSkeleton)
- I switch out XLIFF2FilterWriter for net.sf.okapi.common.filterwriter.XLIFFWriter
- I switch out XLIFF2FilterWriter for my homegrown XLIFF2 filter writer.
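The pattern in these cases is consistent with a writer that downcasts every skeleton to MetadataSkeleton. The following is a minimal sketch of that failure mode only. The types below are local stand-ins that borrow the names from the stack trace; they are not the real Okapi classes, and writeAssumingMetadata is a hypothetical method, not the actual XLIFF2FilterWriter code.

```java
// Stand-in types that only mimic the names from the stack trace.
// They are NOT the real net.sf.okapi classes.
public class SkeletonCastDemo {
    public interface ISkeleton {}
    public static class GenericSkeleton implements ISkeleton {}
    public static class MetadataSkeleton implements ISkeleton {}

    // Hypothetical writer logic that assumes every skeleton is a MetadataSkeleton.
    public static String writeAssumingMetadata(ISkeleton skeleton) {
        MetadataSkeleton metadata = (MetadataSkeleton) skeleton; // unguarded downcast
        return "wrote " + metadata.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        // Works: the skeleton is the type the writer expects.
        System.out.println(writeAssumingMetadata(new MetadataSkeleton()));

        // Fails: a skeleton from a different filter cannot be downcast.
        try {
            writeAssumingMetadata(new GenericSkeleton());
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

Under this assumption, swapping either side of the pipeline (the filter or the writer) removes the mismatch, which matches the three working cases listed above.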
I understand that XLIFF2 in Okapi is a WIP today and isn’t in its current state expected to work at the level of the existing XLIFF1.2 stuff. With that in mind, I ask the following questions:
- Is my understanding of the context of XLIFF2 support right (i.e. XLIFF2FilterWriter is supposed to be at some point usable for the purpose described above, I am referencing the relevant code and using it properly, etc)?
- Is this a (relatively) easily addressable bug?
- If not, what would it take to support this use case in XLIFF2? What context would be necessary to contribute to this?
Comments (8)
-
XLIFF2FilterWriter is normally meant to be used only by the Xliff2Filter during merge (post-translation). You are right that the current Xliff2FilterWriter is close to a general writer. I don't think it would take too much to enhance the current Xliff2FilterWriter to be used as a general writer.
I don’t think this is a bug. ISkeleton has many different implementations and all should be handled - never assume GenericSkeleton.
Rainbow does have a partial implementation of a general xliff2 writer (that's the one marked as beta). Not sure that is a good starting point.
In summary, if you want to have an IFilterWriter that can take events from any filter and output xliff2 - that is something that is on the priority list that we just haven't gotten to yet.
Sorry for the late edits, I was having issues with my keyboard making it hard to type a coherent response.
-
I think XLIFF2FilterWriter should be package private to drive home the fact it is not designed to be used as a general writer like TmxWriter etc.
-
reporter Thanks for the thorough response, Jim! The situation as you describe it makes sense.
- Is the partial beta implementation of the XLIFF2Writer okapi/libraries/lib-xliff2/src/main/java/net/sf/okapi/lib/xliff2/writer/XLIFFWriter.java? That is what my team has currently found that seems closest to this, although we aren’t the most familiar with the repo (please correct if wrong).
- What are the differences between that partial implementation and what would be a generic XLIFF2Writer?
- What would the timeline potentially look like for a generic XLIFF2Writer? Is there anything I could help with or contribute to in order for this to be implemented?
-
(1) Actually that is the raw xliff2 writer in the lib. It’s not beta. The one I refer to is in Rainbow: src/main/java/net/sf/okapi/steps/rainbowkit/xliff/XLIFF2PackageWriter.java. That’s the one referred to as beta in the Rainbow UI.
(2) I haven’t looked at the xliff2 writer in Rainbow. But I don’t think there would be much difference between the Xliff2FilterWriter and a generic one.
(3) I don’t think it would take much. A start would be to update the DOCUMENT type handler to work with more ISkeleton types like GenericSkeleton.
I’m booked this next week and have US taxes to file :-( But week after next I can take a closer look. If you don’t hear anything in two weeks give me another shout.
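To illustrate the kind of change described in (3), here is a rough sketch of a document-start handler that checks the skeleton type before using it. Everything in it is an assumption made for illustration: the types are local stand-ins rather than the real Okapi classes, and documentMetadata and startDocument are hypothetical names, not methods of Xliff2FilterWriter.

```java
import java.util.Optional;

// Stand-in types for illustration only; NOT the real net.sf.okapi classes.
public class SkeletonDispatchDemo {
    public interface ISkeleton {}
    public static class GenericSkeleton implements ISkeleton {}
    public static class MetadataSkeleton implements ISkeleton {
        final String metadata;
        public MetadataSkeleton(String metadata) { this.metadata = metadata; }
    }

    // Returns document-level metadata only when the skeleton actually carries it.
    public static Optional<String> documentMetadata(ISkeleton skeleton) {
        if (skeleton instanceof MetadataSkeleton) {
            return Optional.of(((MetadataSkeleton) skeleton).metadata);
        }
        // GenericSkeleton, null, or any other ISkeleton implementation.
        return Optional.empty();
    }

    // Hypothetical handler for the start-of-document event.
    public static String startDocument(ISkeleton skeleton) {
        return documentMetadata(skeleton)
                .map(m -> "document start with metadata: " + m)
                .orElse("document start with default metadata");
    }

    public static void main(String[] args) {
        System.out.println(startDocument(new MetadataSkeleton("from-xliff2")));
        System.out.println(startDocument(new GenericSkeleton()));
        System.out.println(startDocument(null));
    }
}
```

The point of the sketch is the fallback branch: skeletons from other filters take a default path instead of triggering a ClassCastException. What the default should contain in a real writer is left open here.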
-
reporter Hey Jim, thanks for the explanations there.
Just giving a shout here since it has been 2 weeks.
-
Hi Marco - I got distracted by another bug but back working on this. I managed to combine this issue with another project I am working on (full general xliff 2 writer). Should have more info by next week.
-
reporter Hi Jim, thanks a lot for working on this! Checking in to see if you have any info.