Check format-specific exceptions

Issue #50 resolved
Stephen Abrams created an issue

Make sure that all format instance-specific exceptions are caught and the processing of that instance is stopped, but the JHOVE2 invocation continues (if other instances are still available)
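
As a minimal sketch of the behavior requested above, the loop below stops work on a failing instance, records a message, and continues with the remaining instances. All names here (InstanceIsolationSketch, InstanceParser, processAll) are hypothetical and are not part of the JHOVE2 code base; plain strings stand in for Source and Message objects.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; this is not the JHOVE2 API.
class InstanceIsolationSketch {

    /** Stand-in for a format module's parse step. */
    interface InstanceParser {
        void parse(String instance) throws IOException;
    }

    /**
     * Processes every instance. A failure stops work on that instance only
     * and is recorded as a message; the loop then moves on to the next one.
     */
    static List<String> processAll(List<String> instances, InstanceParser parser) {
        List<String> messages = new ArrayList<>();
        for (String instance : instances) {
            try {
                parser.parse(instance);
            } catch (IOException e) {
                // Instance-level problem: report it, do not abort the invocation.
                messages.add(instance + ": " + e.getMessage());
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        List<String> parsed = new ArrayList<>();
        InstanceParser parser = name -> {
            if (name.startsWith("bad")) {
                throw new IOException("truncated file");
            }
            parsed.add(name);
        };
        List<String> messages =
                processAll(Arrays.asList("a.tif", "bad.tif", "c.tif"), parser);
        System.out.println("parsed:   " + parsed);
        System.out.println("messages: " + messages);
    }
}
```

The catch block sits inside the loop on purpose: placing it around the whole loop would end processing at the first bad instance.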

Comments (19)

  1. Richard Anderson

    I searched the code under org.jhove2.module for any exceptions thrown by methods in or under this package. The list below itemizes those exceptions, and the comments that follow deal with the exception types in the order listed:

    • org.antlr.runtime.RecognitionException
    • org.xml.sax.SAXException
    • java.security.NoSuchAlgorithmException
    • java.lang.Exception
    • java.io.IOException
    • java.io.EOFException extends IOException
    • java.io.FileNotFoundException extends IOException
    • java.io.UnsupportedEncodingException extends IOException
    • org.jhove2.core.JHOVE2Exception
  2. Richard Anderson

    org.antlr.runtime.RecognitionException

    This exception is the root of the ANTLR exception hierarchy. It is raised by ANTLR lexers in the SGML module. It appears to be caught by most of the ANTLR parser methods, but many of those same methods also declare that they throw it (I assume this covers a re-throw during error handling). Any such exceptions not previously trapped are caught within format.sgml.OpenSpWrapper.parseFile(SgmlModule) or its callees, such as format.sgml.EsisParser.parseEsisFile(), which create Message objects that are attached to the Source unit.

  3. Richard Anderson

    org.xml.sax.SAXException

    This exception type encapsulates a general SAX error or warning.

    If an exception of this type occurs during creation of the SAX parser object, then the exception is caught in the format.xml.SaxParser.createXmlReader() method and translated into a JHOVE2Exception, which is appropriate.

    If an exception of this type is raised by one of the SAX handlers during parsing of an XML source item, then the exception is caught in the format.xml.SaxParser.parse() method and translated into a Message object that is added to the XmlModule instance for the Source item.
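
    The two cases described in this comment can be sketched roughly as follows, using only the standard JAXP/SAX classes. SaxErrorHandlingSketch and SetupException are hypothetical names (SetupException merely stands in for JHOVE2Exception), and plain strings stand in for Message objects; this is not the actual SaxParser code.

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    class SaxErrorHandlingSketch {

        /** Hypothetical stand-in for JHOVE2Exception. */
        static class SetupException extends Exception {
            SetupException(String message, Throwable cause) {
                super(message, cause);
            }
        }

        /** Case 1: failure to create the parser is an application-level error. */
        static SAXParser createParser() throws SetupException {
            try {
                return SAXParserFactory.newInstance().newSAXParser();
            } catch (ParserConfigurationException | SAXException e) {
                throw new SetupException("could not create SAX parser", e);
            }
        }

        /** Case 2: a SAXException raised while parsing becomes a message. */
        static List<String> parse(String xml) throws SetupException, IOException {
            List<String> messages = new ArrayList<>();
            SAXParser parser = createParser();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            try {
                // DefaultHandler rethrows fatal errors as SAXParseException.
                parser.parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            } catch (SAXException e) {
                messages.add("XML parse error: " + e.getMessage());
            }
            return messages;
        }

        public static void main(String[] args) throws Exception {
            System.out.println(parse("<a/>"));
            System.out.println(parse("<a><b></a>"));
        }
    }
    ```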

  4. Richard Anderson

    java.security.NoSuchAlgorithmException

    This exception type is thrown when a particular cryptographic algorithm is requested but is not available in the environment. It is thrown by constructors of the subclasses of org.jhove2.module.digest.AbstractBufferDigester, such as MD5Digester. I do not see any place in the code where these exceptions get trapped. I assume that they would only be raised during creation of the objects using Spring.
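
    To illustrate where this exception surfaces, here is a small sketch of a digester whose constructor declares NoSuchAlgorithmException, as described above. SimpleDigester and canConstruct are hypothetical names, not the real AbstractBufferDigester hierarchy; the only library call used is java.security.MessageDigest.getInstance.

    ```java
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    class DigesterConstructionSketch {

        /** Hypothetical digester; the constructor is where the exception arises. */
        static class SimpleDigester {
            private final MessageDigest digest;

            SimpleDigester(String algorithm) throws NoSuchAlgorithmException {
                this.digest = MessageDigest.getInstance(algorithm);
            }

            String algorithm() {
                return digest.getAlgorithm();
            }
        }

        /** Reports whether a digester for the named algorithm can be built here. */
        static boolean canConstruct(String algorithm) {
            try {
                new SimpleDigester(algorithm);
                return true;
            } catch (NoSuchAlgorithmException e) {
                return false;
            }
        }

        public static void main(String[] args) {
            System.out.println("MD5 available: " + canConstruct("MD5"));
            System.out.println("NO-SUCH-ALGORITHM available: "
                    + canConstruct("NO-SUCH-ALGORITHM"));
        }
    }
    ```

    Because the failure happens at construction, it would show up while the objects are being wired together rather than while an individual source is being processed.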

  5. Richard Anderson

    UnsupportedEncodingException (extends IOException)

    is only thrown in the display.AbstractDisplayer.display(Reportable reportable) method. It would occur if a call to create a PrintStream were made with a bad character encoding. The display method appears to be invoked only from the app.JHOVE2CommandLine.main() method, and the exception will cause a program exit (but only after all the previous processing has occurred). It might be prudent for us to test the encoding at the time we create the displayer.
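
    One possible way to test the encoding early, sketched with the standard java.nio.charset.Charset class. EncodingCheckSketch and isUsableEncoding are hypothetical names; where such a check would actually be called when the displayer is created is left open.

    ```java
    import java.nio.charset.Charset;
    import java.nio.charset.IllegalCharsetNameException;

    class EncodingCheckSketch {

        /**
         * Returns true only if the name is a legal charset name that this
         * JVM supports, so a later PrintStream creation would not fail with
         * UnsupportedEncodingException.
         */
        static boolean isUsableEncoding(String name) {
            if (name == null) {
                return false;
            }
            try {
                return Charset.isSupported(name);
            } catch (IllegalCharsetNameException e) {
                // The name contains characters that are not allowed at all.
                return false;
            }
        }

        public static void main(String[] args) {
            System.out.println(isUsableEncoding("UTF-8"));
            System.out.println(isUsableEncoding("no-such-encoding"));
            System.out.println(isUsableEncoding("bad name!"));
        }
    }
    ```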

    FileNotFoundException (extends IOException)

    is thrown as well in the display.AbstractDisplayer.display(Reportable reportable) method if there is a problem creating a file-based PrintStream.

  6. Richard Anderson

    FileNotFoundException (extends IOException)

    is thrown in format.tiff.TiffIFD.validate(JHOVE2, Source) but I cannot see the need for that declaration, other than the signature imposed by the IFD interface. This validate method is invoked by the format.tiff.TiffModule.parse(JHOVE2, Source, Input) method, which does not trap the exception, but passes it up the stack to BaseFormatModule.invoke where it is trapped.

  7. Richard Anderson

    EOFException and IOException are required to be thrown by the parse method of any class that implements the format.Parser interface. I question the need to declare both, since EOFException is a subclass of IOException. All calls to any format module's parse method are made via format.BaseFormatModule.invoke(JHOVE2, Source, Input), which has traps to catch either type of exception and convert it to a Message object. These calls originate from format.DispatcherCommand.execute(JHOVE2, Source, Input).
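
    A small sketch of the subclass point: a method that declares only IOException can still throw EOFException, and a caller can still distinguish the two by ordering the catch blocks. EofCatchSketch, readInt, and describe are hypothetical names; the returned strings stand in for Message objects.

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.IOException;

    class EofCatchSketch {

        /** Declares only IOException; an EOFException is covered by it. */
        static int readInt(byte[] data) throws IOException {
            try (DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(data))) {
                return in.readInt(); // throws EOFException on fewer than 4 bytes
            }
        }

        /** The more specific catch must come first, or it will not compile. */
        static String describe(byte[] data) {
            try {
                return "value " + readInt(data);
            } catch (EOFException e) {
                return "unexpected end of file";
            } catch (IOException e) {
                return "I/O error: " + e.getMessage();
            }
        }

        public static void main(String[] args) {
            System.out.println(describe(new byte[] {0, 0, 0, 7}));
            System.out.println(describe(new byte[] {0, 1}));
        }
    }
    ```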

    EOFException and IOException

    are also thrown in various parse methods of format.icc, format.riff, format.tiff, format.wave, and format.zip. Ultimately all these parses originate in a format module's parse method (see above).

    IOException

    is thrown by aggrefy.AggrefierModule.identify(JHOVE2, Source), whose signature is specified by the Aggrefier interface. This method is called from AggrifierCommand.execute(JHOVE2, Source, Input), which traps the IOException and creates a Message.

    IOException

    is thrown by aggrefy.GlobPathRecognizer.recognize(JHOVE2, Source), whose signature is specified by the Recognizer interface. It, too, is called from AggrifierCommand.execute(JHOVE2, Source, Input), which traps the IOException and creates a Message.

    IOException

    is thrown by digest.DigesterModule.digest(JHOVE2, Source, Input), whose signature is specified by the Digester interface. This method is called from DigesterCommand.execute(JHOVE2, Source, Input), which catches the IOException and creates a Message.

    IOException

    is thrown by identify.IdentifierModule.identify(JHOVE2, Source, Input), whose signature is specified by the Identifier interface. This method is called from IdentifierCommand.execute(JHOVE2, Source, Input), which catches the IOException and creates a Message.

    IOException

    is thrown by identify.DROIDIdentifier.identify(JHOVE2, Source, Input), whose signature is specified by the SourceIdentifier interface. This method is also called from IdentifierCommand.execute(JHOVE2, Source, Input), which catches the IOException and creates a Message.

  8. Richard Anderson
    • changed status to new

    java.lang.Exception

    Is thrown by many of the methods provided by the DROID framework. If any of these exceptions occurs, it is trapped by the identify.DROIDIdentifier.identify(JHOVE2, Source, Input) method and translated into a JHOVE2Exception.

    DROIDIdentifier.identify

    • calls DROIDIdentifier.getCachedConfigFile(String)
    • ~ calls DROIDWrapper.parseConfigFile(String)
    • ~~~ calls droid.JHOVE2AnalysisControllerUtil.loadConfigFile(String)
    • calls DROIDIdentifier.getCachedSignatureFile(ConfigFile, String)
    • ~ calls DROIDWrapper.parseSignatureFile(ConfigFile, String)
    • ~~~ calls droid.JHOVE2AnalysisControllerUtil.loadSigFile(ConfigFile, String)
  9. Richard Anderson
    • changed status to new

    Finally I did a search for "throw new JHOVE2Exception" to locate the places within modules where this type of exception is being created.

    The following locations look OK to me:

    aggrefy.GlobPathRecognizer.groupSources(Source)

    throws a JHOVE2Exception if there is an IllegalStateException or IndexOutOfBoundsException, either of which would be attributable to a coding defect instead of a file-level problem.

    aggrefy.GlobPathRecognizer.compilePatterns()

    throws a JHOVE2Exception if there is a PatternSyntaxException, which would occur if there is an error compiling a fileGroupingToken, mustHaveToken, or mayHaveToken.

    display.AbstractDisplayer.display(...)

    throws a JHOVE2Exception if there is an IllegalArgumentException, IllegalAccessException, or InvocationTargetException.

    identify.DROIDIdentifier.identify(JHOVE2, Source, Input)

    throws a JHOVE2Exception when it catches any java.lang.Exceptions that originate in the DROID code relating to loading and parsing config and signature files.

    format.tiff.TiffTag.getTag(int)

    throws a JHOVE2Exception if the TreeSet of TiffTags has not been initialized.

    format.xml.SaxParser.createXmlReader()

    throws a JHOVE2Exception if a SAXException occurs because the application could not create a SAX parser.

  10. Richard Anderson
    • changed status to open

    The following uses of throw new JHOVE2Exception do NOT look OK:

    format.tiff.TiffModule.parseIFDList(JHOVE2, Source, Input)

    throws a JHOVE2Exception if there is an IOException reading the offset to the first IFD

    format.tiff.IFD.parse(JHOVE2, Source, Input, Map<Integer, Format>)

    throws a JHOVE2Exception if there is an EOFException or an IOException
