Rainbow ExtractionStep causes ClassNotFoundException when trying to instantiate writer object
Hi,
This relates to a message posted on the dev list: https://groups.google.com/forum/#!topic/okapi-devel/Gn9UOK-Wc0c
- Steps to reproduce: currently only reproducible in our system, where a pipeline is assembled programmatically
- Setup: simple 3-step pipeline with ICML filter, SRX segmentation and creation of package (ExtractionStep, with XLIFF 2.0); running M28-SNAPSHOT, Java 1.8.0_31_b13, Mac OS X
- Outcome: ClassNotFoundException at line 188 of ExtractionStep
- Expected results: Create XLIFF package from ICML.
Root cause analysis:
The issue seems to be that the ExtractionStep is using the default class loader when trying to instantiate the writer object:
protected Event handleStartBatch (Event event) {
    try {
        // Get the package format (class name)
        String writerClass = params.getWriterClass();
        writer = (IPackageWriter)Class.forName(writerClass).newInstance();
        writer.setParameters(params);
This is problematic because, when instantiating a pipeline step, it is possible to set the classloader attribute on the Step object. However, I presume the step should then also use that same classloader to create further objects, not the default one.
If this analysis is indeed correct, then it would require a number of changes:
- add a field "loader" to the abstract BasePipelineStep
- add a method setLoader() to IPipelineStep
- PipelineWrapper would need a field for a class loader, too (and possibly an additional constructor or setter where this loader could be set), so that the attribute "loader" on availableSteps can get set to this class loader
- when creating pipeline steps using a classloader, e.g. in PipelineWrapper.copyInfoStepsToPipeline(), set the same classloader as an attribute on the step
- in the specific step class, when instantiating objects, use the loader of the step instance
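To make the idea concrete, here is a minimal, self-contained sketch of those changes. The types below (IPipelineStep, BasePipelineStep, ExtractionStepSketch, createWriter) are simplified stand-ins for the real Okapi classes, not the actual library code:

```java
// Simplified stand-in for net.sf.okapi.common.pipeline.IPipelineStep,
// extended with the proposed setLoader() method.
interface IPipelineStep {
    void setLoader(ClassLoader loader);
}

// Simplified stand-in for BasePipelineStep holding the proposed "loader" field.
abstract class BasePipelineStep implements IPipelineStep {
    // Default to the loader that loaded the step class itself.
    protected ClassLoader loader = getClass().getClassLoader();

    @Override
    public void setLoader (ClassLoader loader) {
        this.loader = loader;
    }
}

// Sketch of how a step like ExtractionStep would then instantiate objects.
class ExtractionStepSketch extends BasePipelineStep {
    Object createWriter (String writerClass) {
        try {
            // Use the step's own loader rather than the default one.
            return Class.forName(writerClass, true, loader)
                        .getDeclaredConstructor().newInstance();
        }
        catch ( ReflectiveOperationException e ) {
            throw new IllegalStateException("Cannot create writer: " + writerClass, e);
        }
    }
}
```

The key point is only the third argument to Class.forName(): whoever builds the pipeline decides which loader the step resolves class names against.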
Note that this might affect several areas of the library: at least every Step would need to be examined, as well as every place where steps are created. I am not sure how far this really reaches.
Then again, this analysis might be missing something important, so feel free to correct me.
For us, this is a major issue, because it currently makes it impossible to use the library. Maybe a fix would be possible in the upcoming M28 release. Alternatively, perhaps a branch could be created for this issue so that we can submit the required changes and see if everything works with a patched version of Okapi (I'd rather do it through a branch in the repo than with a local build, to keep things in sync).
Cheers,
Martin
Comments (11)
-
reporter -
I still had no time to look closely at this, but you may want to look at example06: it does pretty much what you try here (except using Maven to pull the dependencies).
BTW: the pom in example06 needs to be updated to have the version:
<dependency>
  <groupId>net.sf.okapi.lib</groupId>
  <artifactId>okapi-lib-xliff2</artifactId>
</dependency>
should be:
<dependency>
  <groupId>net.sf.okapi.lib</groupId>
  <artifactId>okapi-lib-xliff2</artifactId>
  <version>1.0.1</version>
</dependency>
-
reporter And one more finding: if I create a simple static Main class from which to run the same code, I can execute the Okapi pipeline without any problems. It only fails when the pipeline is built and executed by a service module from within the running censhare server.
-
reporter Hi Yves et al.,
I have just discussed this with my boss, who is more knowledgeable about classloader issues than I am. The issue might be that the ExtractionStep (and any other step that tries to instantiate additional classes) is using the wrong class loader. Instead of the default, it should be using the context class loader of the current thread.
So, instead of this (in ExtractionStep.handleStartBatch()):
String writerClass = params.getWriterClass();
writer = (IPackageWriter) Class.forName(writerClass).newInstance();
It should be doing this:
Thread currThread = Thread.currentThread();
ClassLoader ccl = currThread.getContextClassLoader();
String writerClass = params.getWriterClass();
writer = (IPackageWriter) Class.forName(writerClass, true /* initialize */, ccl).newInstance();
Another option might be to do:
ClassLoader ccl = this.getClass().getClassLoader();
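Combining both options, a defensive resolution pattern could look like this (an illustrative sketch, not the actual Okapi code; WriterLoader and resolve are made-up names):

```java
final class WriterLoader {
    // Prefer the thread's context class loader; fall back to the loader
    // of the calling class when no context loader is set.
    static Class<?> resolve (String className, Class<?> caller) {
        ClassLoader ccl = Thread.currentThread().getContextClassLoader();
        if ( ccl == null ) {
            ccl = caller.getClassLoader();
        }
        try {
            return Class.forName(className, true /* initialize */, ccl);
        }
        catch ( ClassNotFoundException e ) {
            throw new IllegalArgumentException("Cannot load " + className, e);
        }
    }
}
```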
Does that make sense?
Cheers,
Martin
-
Just a thought. I have working extraction/merge pipelines that do not use ExtractionStep. However, my pipelines are hard-coded to produce XLIFF only, not any of the Rainbow kits. I don't know if that is an option for you.
This classloader magic always worries me. I think we should avoid it at all costs. If we are going to refactor I would like to push any classloader code into a single class. Just trying to avoid doing it across the framework. My concern is running in server environments, memory leaks and other nasties.
-
reporter Hi Jim,
Thanks a lot for the comment. Would those pipelines of yours be able to go from ICML to XLIFF 2.0 (via SRX segmentation and leveraging) and back again? If so, what are the specific pipeline steps you're using?
As for the classloader issues, I agree. There should be a single static utility class or something like that, which can be used to instantiate objects within the correct classloader context (assuming my root cause analysis is correct). There is an interesting approach described here using strategy pattern: http://www.javaworld.com/article/2077344/core-java/find-a-way-out-of-the-classloader-maze.html
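Such a utility could look roughly like the following sketch, along the lines of the approach in the linked article: run an action with a chosen context class loader and restore the previous one afterwards. ClassLoaderContext, runWith and newInstance are hypothetical names, not part of Okapi:

```java
import java.util.function.Supplier;

// One place that owns all class-loading policy, so the rest of the
// framework never touches class loaders directly.
final class ClassLoaderContext {
    // Run the action with the given loader as the thread's context class
    // loader, restoring the previous loader afterwards (important in
    // server environments to avoid leaking loaders across requests).
    static <T> T runWith (ClassLoader loader, Supplier<T> action) {
        Thread t = Thread.currentThread();
        ClassLoader previous = t.getContextClassLoader();
        t.setContextClassLoader(loader);
        try {
            return action.get();
        }
        finally {
            t.setContextClassLoader(previous);
        }
    }

    // Instantiate a class by name using the current context class loader.
    static Object newInstance (String className) {
        try {
            ClassLoader ccl = Thread.currentThread().getContextClassLoader();
            return Class.forName(className, true, ccl)
                        .getDeclaredConstructor().newInstance();
        }
        catch ( ReflectiveOperationException e ) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

The finally block is the part that addresses Jim's memory-leak concern: the caller's context loader is always put back, whatever the action does.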
Cheers,
Martin
-
I use XLIFFWriter with FilterEventsWriterStep. I'm not sure if we have an "Xliff2Writer" that implements IFilterWriter. If we do then you should be able to use my "lower level pipeline". I always try to create my pipelines at the lowest level possible using the core steps. Fewer problems and easier to debug issues.
I like the classloader link - this looks like the best solution for a library where you never know how it will be used.
-
reporter @jhargrave If I find the time today, I will try to create a local Okapi build that includes the change mentioned in my last post, so that I can test it. How would I go about creating a local build of the libraries?
-
Martin - I've got a high priority task over the next two weeks. I'll be actively ignoring emails during that time, but wanted to throw you a bone at least :-)
But basically, to build Okapi you need Java 1.7 (you can build with 1.8, but you won't be able to use any 1.8 language features), Maven and Ant. Pull the dev branch from the Git repository.
Then you can go to ../deployment/maven and execute one of the "update" scripts that match your OS (windows, Mac or Linux). That should give you a complete distribution. But if you are using maven in your project you can just use your local okapi snapshot artifacts - for that just do the normal "mvn clean install" from the okapi root folder.
If you get stuck maybe one of the other devs can help you out. I'll try to take a peek next week to see how you are doing.
cheers,
Jim
-
reporter OK, great, thanks for the hint, Jim, and for taking the time to reply. I will try and see if I can get it running using a local build with the line modified to use the context class loader.
Cheers,
Martin
-
reporter - changed status to resolved
In the end, it turned out that this problem was actually caused by erroneous whitespace characters in the pipeline configuration file, which led to the ClassNotFoundException. So it was not related to the class loader at all.
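For anyone hitting the same symptom: a stray space or newline around a class name read from a configuration file is enough to make Class.forName fail, so trimming the configured value is a cheap safeguard. An illustrative snippet (ConfigClassLoading and loadConfiguredClass are made-up names, not Okapi code):

```java
final class ConfigClassLoading {
    // Whitespace around a class name read from a configuration file makes
    // Class.forName throw ClassNotFoundException; trim before loading.
    static Class<?> loadConfiguredClass (String rawName) {
        try {
            return Class.forName(rawName.trim());
        }
        catch ( ClassNotFoundException e ) {
            throw new IllegalArgumentException("Unknown class: '" + rawName + "'", e);
        }
    }
}
```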
-
Just a few more observations on this:
- I forgot to mention that I was able to work around this issue by creating adjusted versions of PipelineWrapper and ExtractionStep and using these instead of the ones provided by the Okapi jars.
- Placing the Okapi jars in the JDK's "endorsed" directory did not fix the problem.
- When I examine the classloader of the ExtractionStep object at the point where it tries to instantiate the writer, there are some odd results. I can see that the classloader knows about the following classes from the package net.sf.okapi.steps.rainbowkit from okapi-lib-0.28-SNAPSHOT.jar (by examining the field "classes" in this.getClass().getClassLoader()):
However, when I examine the package contents of the Okapi jar, I get the following (using jar tvf okapi-lib-0.28-SNAPSHOT.jar | grep net/sf/okapi/steps/rainbowkit/.*class):
This seems odd, because it looks as if the package contents are only partially loaded, which would contradict what I wrote in the original issue description: it should be either all or nothing. Having only 5 classes from that package visible in the classloader at that stage doesn't make any sense to me, but maybe it helps in diagnosing the problem.
Cheers,
Martin