Receive TimeoutException in InputStreamFromOutputStream, which Okapi logs but does not rethrow.
I am working with Okapi 0.36.3 in an AWS ECS service, and we see the following exception a number of times when our ECS task runs out of memory:
2022-10-28T01:14:30.201Z ERROR [QuartzScheduler_Worker-19] (InputStreamFromOutputStream.java:292) - This timeout should never happen, the thread should terminate correctly. Please contact io-tools support.
2022-10-28T01:14:30.201Z java.util.concurrent.TimeoutException
2022-10-28T01:14:30.201Z at java.util.concurrent.FutureTask.get(FutureTask.java:205)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.io.InputStreamFromOutputStream.checkException(InputStreamFromOutputStream.java:275)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.io.InputStreamFromOutputStream.read(InputStreamFromOutputStream.java:392)
2022-10-28T01:14:30.201Z at java.io.PipedInputStream.read(PipedInputStream.java:377)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.io.InputStreamFromOutputStream.read(InputStreamFromOutputStream.java:402)
2022-10-28T01:14:30.201Z at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.StreamUtil.copy(StreamUtil.java:138)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.StreamUtil.copy(StreamUtil.java:121)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.StreamUtil.createResettableStream(StreamUtil.java:279)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.resource.RawDocument.createResettableStream(RawDocument.java:346)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.resource.RawDocument.create(RawDocument.java:255)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.resource.RawDocument.<init>(RawDocument.java:158)
2022-10-28T01:14:30.201Z at com.amazon.atms.docconverter.okapi.steps.AsinMergeStep.handleRawDocument(AsinMergeStep.java:115)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.pipeline.BasePipelineStep.handleEvent(BasePipelineStep.java:76)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.pipeline.Pipeline.execute(Pipeline.java:119)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:231)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.pipeline.Pipeline.process(Pipeline.java:201)
2022-10-28T01:14:30.201Z at net.sf.okapi.common.pipelinedriver.PipelineDriver.processBatch(PipelineDriver.java:182)
2022-10-28T01:14:30.201Z at mypackage.operation.MergeOperation.run(MergeOperation.java:74)
The corresponding block of code is:
/**
 * Builds the Okapi pipeline driver to perform this merge operation.
 */
public IPipelineDriver buildDriver() {
    IPipelineDriver driver = new PipelineDriver();
    driver.setFilterConfigurationMapper(filterConfigurationMapper);
    driver.addStep(documentMergeStep);
    driver.addStep(new RawDocumentWriterStep());
    BatchItemContext batchItemContext = new BatchItemContext(
            xliffDocument, new File(outputPath).toURI(),
            Charset.forName("UTF-8").name(), sourceDocument);
    driver.addBatchItem(batchItemContext);
    return driver;
}

/**
 * Builds the Okapi pipeline driver and runs the merge operation.
 */
@Override
public void run() throws DocConverterException {
    try {
        buildDriver().processBatch(); // ***line 74***
    } catch (OkapiException e) {
        throw new DocConverterException("Error running merge operation", e);
    }
}
The issue is that this exception is only logged, never rethrown: https://bitbucket.org/okapiframework/okapi/src/23d916f011d51d055115fd248d8ce890d41ccaae/okapi/core/src/main/java/net/sf/okapi/common/io/InputStreamFromOutputStream.java#lines-292:294
So once our service recovers from the OOM, it has no way of detecting that this error happened, and our converted file is silently truncated. For example, when converting from XLIFF back to JSON, the output would be truncated to:
{
  "Items": [
    {
      "item": 1,
      ...
    },
    ...
    {
      "item": n,
      ...
    }
// A number of items are missing at the end, and the JSON is not closed with "]}"
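Until the exception is surfaced by Okapi, one stopgap on the caller side could be to sanity-check the converted output after processBatch() returns, so a silently truncated file fails loudly. The sketch below is illustrative only (the class and method names are ours, not Okapi's); it merely checks for the truncation symptom shown above rather than fully validating the JSON, which a real implementation could do with a JSON parser instead:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class OutputSanityCheck {
    /**
     * Crude post-conversion guard: a truncated JSON conversion (as in the
     * example above) leaves the file without its closing "]}" characters.
     * This only catches the obvious truncation symptom; parsing the file
     * with a JSON library would be a stronger check.
     */
    public static void requireClosedJson(Path output) throws IOException {
        String content = Files.readString(output).trim();
        if (content.isEmpty()
                || !(content.endsWith("}") || content.endsWith("]"))) {
            throw new IOException(
                    "Converted output appears truncated: " + output);
        }
    }
}
```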
Questions for the devs:
- I understand the exception message says it should never happen and that io-tools support should be contacted, but could Okapi rethrow this exception? That would give us a way to catch it instead of it failing silently.
- Is there any other way to prevent this TimeoutException from truncating our conversions (e.g. a different way of invoking the pipeline)?
Thank you and please let me know if there is additional information I can provide.
Comments (4)
- Hi Marco,
We have seen this bug as well. If memory serves, there was some refactoring of OriginalDocumentXliffMergerStep to better recover from the TimeoutException.
The Okapi version you are using is quite old. Can you run a test with the latest version (1.44.0)? If you are using your own fork, you can try the latest OriginalDocumentXliffMergerStep, or copy and paste the relevant code into your merger step.
Note that I have updated the code to rethrow the TimeoutException, and we will test with that update, which will be available in 1.45.0.
Your best bet for a workaround is to recode OriginalDocumentXliffMergerStep so that it does not use InputStreamFromOutputStream. The code will be slower and require disk space to write out a temp file, but you will have full control over any exceptions.
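The temp-file workaround described above could be sketched roughly as follows. This is not Okapi's actual code (the class and method names here are illustrative): the idea is to copy the source stream to a temp file first, so any I/O failure surfaces synchronously on the calling thread instead of being swallowed by InputStreamFromOutputStream's worker thread, at the cost of extra disk space and an extra copy:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempFileResettableStream {
    /**
     * Copies the source stream to a temp file and returns a fresh
     * InputStream over that file. Any failure while copying throws a
     * plain IOException on the calling thread, so nothing is lost
     * silently; the caller can reopen the file as often as needed.
     */
    public static InputStream createResettableStream(InputStream source)
            throws IOException {
        Path tempFile = Files.createTempFile("okapi-merge-", ".tmp");
        tempFile.toFile().deleteOnExit();
        try (InputStream in = source) {
            Files.copy(in, tempFile, StandardCopyOption.REPLACE_EXISTING);
        }
        return Files.newInputStream(tempFile);
    }
}
```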
- reporter: Hi Jim!
Thanks for the quick response! We will investigate on our end whether it is easier to recode OriginalDocumentXliffMergerStep or to update to the latest version. Since our Okapi version is pretty old and the service it is in is used by various teams, testing against the latest version might take some effort.
- changed status to resolved
we now rethrow the TimeoutException so it can be caught