Assessment of TiffModule is complicated

Issue #103 resolved
Richard Anderson created an issue

From our iPRES 2010 slides, the simplified logic for assessment of a TIFF module is:

  • If ANY_OF: validity == true; ((ifh.messages contains 'offsetNotByteAligned') or (ifd.messages contains 'offsetNotByteAligned') or (ifd.messages contains 'dateNotWellFormed'))
  • Then Acceptable
  • Else Not acceptable
  • End If

The actual message texts we are considering acceptable are:

  • "Byte Offset is not word aligned at byte offset {0, number, integer}"
  • "DateTime (306) tag is not properly formatted as YYYY:MM:DD HH:MM:SS"
  • "DateTime (306) tag is not a valid date or time"

Devising an assessment ruleset that implements this logic has turned out to be much more complicated than I initially (and naively) expected.

It would be quite helpful for assessment purposes if the TiffModule's design were enhanced so that either the message store became more centralized, or we added TiffModule methods that could be used to query the count of all messages and the count of messages having a given type or types.
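
As a rough illustration of the second option, the query methods could look something like the sketch below. The class name MessageStore and the methods add, countAll, and countOf are hypothetical and are not part of the current TiffModule; message types are represented as plain strings for brevity.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

/** Hypothetical central message store; not an existing TiffModule class. */
final class MessageStore {
    // Number of messages recorded so far, keyed by message type.
    private final Map<String, Integer> countsByType = new HashMap<>();
    private int total;

    /** Records one message of the given type. */
    void add(String type) {
        countsByType.merge(type, 1, Integer::sum);
        total++;
    }

    /** Returns the count of all messages, regardless of type. */
    int countAll() {
        return total;
    }

    /** Returns the count of messages having any of the given types. */
    int countOf(String... types) {
        int sum = 0;
        // De-duplicate so a type listed twice is not counted twice.
        for (String type : new LinkedHashSet<>(Arrays.asList(types))) {
            sum += countsByType.getOrDefault(type, 0);
        }
        return sum;
    }
}
```

With something like this in place, an assessment rule would only need to compare countAll() against countOf(...) for the types we consider acceptable.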

Comments (8)

  1. Richard Anderson reporter

    As I began to review the task, I soon realized that the simplified logic above ignores other types of feature problems that might cause the validity for the TiffModule to be set to Validity.False. For example, if there is an 'invalidFirstTwoBytes' message, then the TIFF file should be considered "Not acceptable", regardless of whether an 'offsetNotByteAligned' message exists. The "contains" logic would ideally be expressed as "the message store contains ONLY 'offsetNotByteAligned' or 'dateNotWellFormed' messages". Another way to say this is that "the count of all messages must be equal to the sum of the counts of the messages whose type we consider acceptable".
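
    The counting rule can be sketched in Java as follows. This is only an illustration: TiffAcceptability and isAcceptable are hypothetical names, the message types are plain strings taken from the simplified logic, and the list of types is assumed to have already been gathered from wherever the messages live.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of the TiffModule API. */
final class TiffAcceptability {

    /** Message types treated as tolerable (names from the simplified logic). */
    static final Set<String> ACCEPTABLE_TYPES =
            Set.of("offsetNotByteAligned", "dateNotWellFormed");

    /**
     * Acceptable when the module is valid, or when the count of all
     * messages equals the count of messages of an acceptable type.
     */
    static boolean isAcceptable(boolean valid, List<String> messageTypes) {
        if (valid) {
            return true;
        }
        long acceptableCount = messageTypes.stream()
                .filter(ACCEPTABLE_TYPES::contains)
                .count();
        // Note: an invalid module with no messages at all also satisfies
        // this equality; the rule as stated does not address that case.
        return acceptableCount == messageTypes.size();
    }
}
```

    For example, a list holding 'offsetNotByteAligned' and 'invalidFirstTwoBytes' has two messages but only one acceptable one, so the file is rejected.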

    Implementing that type of logic would be challenging enough, but it turns out that there is no central message store for the TiffModule. Instead, the messages are distributed among the TiffModule class, the IFD class, the TiffIFD class, and the IFDEntry class. Furthermore, each of those classes has a separate Java field for each type of message. (I have inventoried those message types below.) The distributed nature of these messages makes sense in terms of making the displayer output more context-rational to the human eye, but it creates a nightmare scenario for assessment.

    The assessment logic would need to traverse the entire IFD hierarchy and be aware of all the various fields that may contain a message. To illustrate this difficulty, below is some MVEL logic I have successfully tested against the TIFF file Marisa sent, which contains a ByteOffsetNotWordAligned problem.

    foreach (ifd : getIFDs()) {
      foreach (entry : ifd.getIFDEntries()) {
        if (entry.getByteOffsetNotWordAlignedMessage() != null) {
          return true;
        }
      }
    }
  2. Richard Anderson reporter

    The attached file Tiff-rule-predicates.txt contains a rule for the acceptability of a TIFF file. The rule meets the objective of testing either the validity of the file or the absence of messages other than:

    • byteOffsetNotWordAlignedMessage
    • invalidDateTimeFormatMessage
    • invalidDateTimeMessage