Issue #3 resolved

bug with sector size greater than 512 (ZVI, OIB files)

Philippe Lagadec
repo owner created an issue

Bug reported by Forrest:

I am trying to use this plugin for reading in a ZVI file format for Zeiss Microscopy products, which is based upon OLE2.

In the process I discovered what I think is a bug based upon the assumption that the sector size is always 512 bytes.

line 1274 was

self.directory_fp = self._open(sect)

now I have it

self.directory_fp = self._open(sect,sectorsize=self.SectorSize)

line 1330 was

def _open(self, start, size = 0x7FFFFFFF, force_FAT=False):

now I have it

def _open(self, start, size = 0x7FFFFFFF, force_FAT=False,sectorsize=512):

lines 1359-1360 were

return _OleStream(self.fp, start, size, 512, self.sectorsize, self.fat)

now I have

return _OleStream(self.fp, start, size, sectorsize, self.sectorsize, self.fat)

This made the basic test program given above go from failing to working on a test ZVI file that has a 4096-byte sector size.

I'm still playing around with using it further, but I hope that the success of reading the directory structure means the rest will work as designed.
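For context, a minimal sketch of why a hard-coded 512 breaks on 4096-byte sectors. It is not taken from OleFileIO_PL: the helper names read_sector_size and sector_offset are hypothetical, and the sketch only relies on the OLE2 header layout, where the sector shift is a 16-bit little-endian value at byte offset 30 and the header occupies one sector-sized block at the start of the file.

```python
import struct


def read_sector_size(header: bytes) -> int:
    """Return the sector size declared in an OLE2 header (hypothetical helper)."""
    # The sector shift is a little-endian uint16 at byte offset 30.
    # The sector size is 2 ** shift: 9 gives 512 bytes, 12 gives 4096 bytes.
    (sector_shift,) = struct.unpack_from("<H", header, 30)
    return 1 << sector_shift


def sector_offset(sect: int, sector_size: int) -> int:
    """File offset of sector number `sect` (hypothetical helper).

    The header takes up the first sector-sized block, so sector 0 starts
    at `sector_size`, not at a fixed offset of 512.
    """
    return (sect + 1) * sector_size


# Fake header for illustration only: all zeros except a sector shift of 12.
header = bytearray(512)
struct.pack_into("<H", header, 30, 12)

size = read_sector_size(bytes(header))
print(size, sector_offset(0, size))
```

With 4096-byte sectors, sector 0 starts at offset 4096, so code that assumes an offset of 512 reads from inside the header padding.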

Comments (17)

  1. Philippe Lagadec reporter

    After merging pull request #5, there are still several parsing errors with the ZVI samples mentioned above. I simply run "OleFileIO_PL.py <file>":

    • Zeiss-1-Merged.zvi: 512B sectors, parsing OK.
    • Zeiss-1-Stacked.zvi: 4KB sectors, manages to read FAT and entries, but fails when parsing properties.
    • Zeiss-2-Merged.zvi: 512B sectors, parsing OK.
    • Zeiss-2-Stacked.zvi: 4KB sectors, manages to read FAT and entries, but fails when parsing properties.
    • Zeiss-3-Mosaic.zvi: 4KB sectors, fails when parsing FAT ("Incorrect DIFAT" error).
    • Zeiss-4-Mosaic.zvi: 4KB sectors, fails when parsing FAT ("Incorrect DIFAT" error).

    So I will keep this issue open until we find what is missing.

    Niko, do you have Olympus FluoView OIB file samples you could share with me to test? Thank you.

  2. Philippe Lagadec reporter

    For now the only Olympus OIB sample I found is here: http://bisque.ece.ucsb.edu/client_service/view?resource=http://bisque.ece.ucsb.edu/data_service/dataset/1930977

    If I run "OleFileIO_PL.py Image0013.oib" it can now parse it without errors. But there are no property streams, contrary to ZVI files.

    However, when I check if all streams can be read by running "OleFileIO_PL.py -c Image0013.oib", it fails on Stream00954, which seems to be a small stream stored in the miniFAT. I guess this is another part of the code to be fixed to support variable sector sizes.
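    A minimal sketch of how the regular sector size could affect small streams, assuming the usual 64-byte mini sectors. The function locate_mini_sector is hypothetical and is not part of OleFileIO_PL; whether this is the actual cause of the Stream00954 failure is not confirmed here.

```python
def locate_mini_sector(mini_sect, sector_size, mini_sector_size=64):
    """Map a mini sector number to a position inside the mini stream.

    The mini stream is itself stored in a chain of regular sectors, so the
    position is returned as (index of the regular sector within that chain,
    byte offset inside that regular sector). Hypothetical helper.
    """
    byte_pos = mini_sect * mini_sector_size
    return divmod(byte_pos, sector_size)


# The same mini sector lands in a different place depending on the
# regular sector size of the file.
print(locate_mini_sector(10, 512))
print(locate_mini_sector(10, 4096))
```

    Any code path that walks the mini stream with a fixed 512-byte sector size would therefore pick the wrong regular sector on a 4K-sector file.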

  3. Dave Jones

    The "Incorrect DIFAT" errors are the ones I ran into when trying to find a library to deal with some CT scans (mentioned in our e-mail conversation the other day). Sorry I don't have time right now to file a full bug report, but I do recall the error in that case was at https://bitbucket.org/decalage/olefileio_pl/src/0be2626f51fc6949b42d8e00b987e5316d51796b/OleFileIO_PL/OleFileIO_PL.py?at=default#cl-1438 - there's an assumption there that DIFAT sectors contain 127 pointers + 1 pointer to the next in the chain, but that's only true with 512 byte sectors (e.g. it's 1023 pointers + 1 in the case of 4096 byte sectors, and sector-size/4 entries in total in the more general case). The assumption is repeated a couple of times in the subsequent lines (the 127 indexing on lines 1449 and 1451 is the most obvious instance). I vaguely recall tweaking those bits to try and load the CT scans but ran into more errors after that (possibly the property parsing errors mentioned in the other files).
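    The arithmetic can be sketched as follows. The function difat_sectors_needed is a hypothetical helper, not the code in OleFileIO_PL; it assumes the standard layout in which the header holds the first 109 FAT sector pointers and each DIFAT sector holds sector-size/4 entries, the last of which points to the next DIFAT sector.

```python
def difat_sectors_needed(num_fat_sectors, sector_size):
    """Number of DIFAT sectors needed to list all FAT sectors (hypothetical helper)."""
    header_slots = 109                 # FAT sector pointers stored in the header
    per_sector = sector_size // 4 - 1  # 127 for 512-byte sectors, 1023 for 4096
    if num_fat_sectors <= header_slots:
        return 0
    remaining = num_fat_sectors - header_slots
    # Ceiling division: a partially filled DIFAT sector still counts.
    return (remaining + per_sector - 1) // per_sector


# 1000 FAT sectors beyond the header's 109:
print(difat_sectors_needed(109 + 1000, 512))
print(difat_sectors_needed(109 + 1000, 4096))
```

    A consistency check that hard-codes 127 would expect 8 DIFAT sectors in this example, while a file with 4096-byte sectors declares only 1.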

    I think the approach I took to this in compoundfiles is a little more elegant: construct a Struct for the normal and mini sector formats (just use size/4), then defer to that Struct's size when reading, and use the Struct to unpack any bytes read.

    It should work for whatever sector size the source file has (I note the standard doesn't actually limit itself to 512 or 4096 byte sectors - it simply mentions that these are the defaults in certain versions of the reference implementation).
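    A minimal sketch of that Struct-based idea, written for illustration and not copied from compoundfiles. The names make_sector_struct and parse_difat_sector are hypothetical; the only library calls used are from the standard struct module.

```python
import struct

ENDOFCHAIN = 0xFFFFFFFE  # marker for the end of a sector chain


def make_sector_struct(sector_size):
    """Struct describing one sector as little-endian 32-bit entries."""
    return struct.Struct("<%dL" % (sector_size // 4))


def parse_difat_sector(data, sector_size):
    """Split a DIFAT sector into (FAT sector pointers, next DIFAT sector)."""
    fmt = make_sector_struct(sector_size)
    # fmt.size equals sector_size, so no sector length is hard-coded here.
    entries = fmt.unpack(data[:fmt.size])
    return list(entries[:-1]), entries[-1]


# Fake 4096-byte DIFAT sector: 1023 pointers followed by an end-of-chain marker.
sector_size = 4096
raw = struct.pack("<1024L", *(list(range(1023)) + [ENDOFCHAIN]))

pointers, next_sect = parse_difat_sector(raw, sector_size)
print(len(pointers), hex(next_sect))
```

    The same two functions handle 512-byte sectors without changes, since the pointer count is derived from the sector size.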

  4. Martijn Berger

    I had to change this to be able to parse a 4096 byte sector file generated by 3ds Max:

    --- a/OleFileIO_PL/OleFileIO_PL.py      Sun Apr 13 19:16:03 2014 +0300
    +++ b/OleFileIO_PL/OleFileIO_PL.py      Tue Jul 08 21:46:33 2014 +0200
    @@ -1537,7 +1537,7 @@
    
             # open directory stream as a read-only file:
             # (stream size is not known in advance)
    -        self.directory_fp = self._open(sect, sectorsize=self.SectorSize)
    +        self.directory_fp = self._open(sect)
    
             #[PL] to detect malformed documents and avoid DoS attacks, the maximum
             # number of directory entries can be calculated:
    @@ -1593,7 +1593,7 @@
             self.root.dump()
    
    
    -    def _open(self, start, size = 0x7FFFFFFF, force_FAT=False, sectorsize=512):
    +    def _open(self, start, size = 0x7FFFFFFF, force_FAT=False):
             """
             Open a stream, either in FAT or MiniFAT according to its size.
             (openstream helper)
    @@ -1623,7 +1623,7 @@
                                   self.ministream.size)
             else:
                 # standard stream
    -            return _OleStream(self.fp, start, size, sectorsize,
    +            return _OleStream(self.fp, start, size, self.sectorsize,
                                   self.sectorsize, self.fat, self._filesize)
    
  5. Philippe Lagadec reporter

    Thanks a lot Martijn and Niko, we are progressing on this issue. I merged the pull request and tested with the OIB sample and the ZVI samples from http://openslide.cs.cmu.edu/download/openslide-testdata/Zeiss/. When I run "OleFileIO_PL -c" with the ZVI samples, the "stacked" ones with 4K sectors are now read without error :-). However, the parsing still fails on the "mosaic" files due to a bug in the DIFAT handling:

    • Zeiss-1-Merged.zvi: 512B sectors, parsing OK.
    • Zeiss-1-Stacked.zvi: 4KB sectors, parsing OK.
    • Zeiss-2-Merged.zvi: 512B sectors, parsing OK.
    • Zeiss-2-Stacked.zvi: 4KB sectors, parsing OK.
    • Zeiss-3-Mosaic.zvi: 4KB sectors, fails when parsing FAT ("Incorrect DIFAT" error).
    • Zeiss-4-Mosaic.zvi: 4KB sectors, fails when parsing FAT ("Incorrect DIFAT" error).

    I hope to spend some time on this issue soon to fix the code.

  6. Philippe Lagadec reporter

    I fixed the code in OleFileIO.loadfat, following the hints from Dave Jones (see message above). Thanks Dave, it now works fine with all the large ZVI files mentioned previously. And thank you Martijn and Niko for the other fixes. This issue is now closed: OleFileIO_PL handles 4K sectors and large files correctly.
