Reduce skin map pickle sizes

Issue #402 resolved
Jonathan Cole
created an issue

Is it possible to reduce the precision stored in the skin map pickles and use gzip compression?

I'm contemplating adding each row so you can get an animated dose but don't want 16MB pickle files per case.
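A minimal sketch of the two ideas in the request, rounding before pickling and writing the pickle through gzip. It uses only the standard library; the helper names (`round_map`, `save_gzipped_pickle`, `load_gzipped_pickle`, `demo_map`) and the random 100 x 100 data are hypothetical stand-ins and are not taken from OpenREM or openSkin.

```python
import gzip
import os
import pickle
import random
import tempfile


def round_map(dose_map, decimals=3):
    """Return a copy of a 2D list of doses rounded to `decimals` places."""
    return [[round(value, decimals) for value in row] for row in dose_map]


def save_gzipped_pickle(obj, path):
    """Pickle `obj` straight into a gzip-compressed file."""
    with gzip.open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_gzipped_pickle(path):
    """Load an object written by save_gzipped_pickle."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)


def demo_map(rows=100, cols=100, seed=0):
    """Hypothetical stand-in for a skin dose map: random values in [0, 1)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rounded = round_map(demo_map())
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "skin_map.p.gz")
        save_gzipped_pickle(rounded, path)
        raw_size = len(pickle.dumps(rounded, protocol=pickle.HIGHEST_PROTOCOL))
        print(raw_size, "bytes uncompressed")
        print(os.path.getsize(path), "bytes gzipped")
        # The compressed file must load back to the same rounded values.
        assert load_gzipped_pickle(path) == rounded
```

With a binary pickle protocol each Python float takes a fixed 8 bytes however many decimals it has, so in this sketch the rounding pays off mainly by giving gzip repeated values to compress.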

Comments (12)

  1. Jonathan Cole reporter

    Is it possible to reduce the precision of the stored numbers as well? At the moment they go to many, many decimal places but probably only the first few are required.

  2. David Platten

    I'll take a look. We may need a consensus on how many decimal places to store. If we reduce it too much then things like pacing procedures may appear blank, which would be a shame.
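The concern about low-dose procedures appearing blank can be illustrated with a short sketch. It compares rounding to a fixed number of decimal places with rounding to a fixed number of significant figures. The dose values and both function names are hypothetical, chosen only to show the effect.

```python
def round_to_decimals(dose, places):
    """Round to a fixed number of decimal places."""
    return round(dose, places)


def round_to_sig_figs(dose, figures):
    """Round to a fixed number of significant figures."""
    if dose == 0:
        return 0.0
    # The 'g' format keeps `figures` significant digits.
    return float(f"{dose:.{figures}g}")


low_dose = 0.00043721   # hypothetical low-dose value
high_dose = 2.34567     # hypothetical high-dose value

print(round_to_decimals(low_dose, 3))   # 0.0, the low dose disappears
print(round_to_sig_figs(low_dose, 2))   # 0.00044, the low dose survives
print(round_to_decimals(high_dose, 3))
print(round_to_sig_figs(high_dose, 2))
```

Significant-figure rounding keeps small values visible, at the cost of a less uniform representation than a fixed number of decimal places.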

  3. David Platten

    @Jonathan Cole, the skin dose map data that is saved in the pickle files is just the data contained in openSkin's geomclass.SkinDose.totalDose. Would it be more sensible to address the decimal places in openSkin rather than in OpenREM?

  4. David Platten

    No functional changes. Renamed variable f in commented out code that can be used to create a pickle file for every RF study in the database. Now named pickle_file so that if the code is uncommented it works OK (f was already in use elsewhere in the view). References issue #402

    → <<cset f841d3a78145>>

  5. David Platten

    I may be teaching my grandmother to suck eggs, but if you were to use Decimal in openSkin as the datatype for the skin dose map values then it's easy to control the number of decimal places, and therefore how much file space the pickle files take up:

    import pickle
    from decimal import Decimal

    # Many decimal places (a Decimal built from a float keeps the float's full expansion)
    print(Decimal(1.0155))
    pickle.dump(Decimal(1.0155), open("c:\\temp\\pickleLarge.p", "wb"))
    print(pickle.load(open("c:\\temp\\pickleLarge.p", "rb")))
    # Fewer decimal places (quantize to three decimal places)
    print(Decimal(1.0155).quantize(Decimal('0.001')))
    pickle.dump(Decimal(1.0155).quantize(Decimal('0.001')), open("c:\\temp\\pickleSmall.p", "wb"))
    print(pickle.load(open("c:\\temp\\pickleSmall.p", "rb")))

    The pickleSmall.p file is 36 bytes vs. 85 bytes for pickleLarge.p. Note that both of these files occupy 4096 bytes on disk on my Windows PC, so the space saving only becomes tangible when many values are stored in a single pickle file.
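To see how the saving scales when many values share one pickle, here is a rough sketch that compares the pickled size of a list of full-precision Decimal values with the same list quantized to three decimal places. The 1000 input values and the helper names `quantize_all` and `pickled_size` are assumptions for illustration. Sizes are measured in memory with `pickle.dumps` rather than on disk, and the exact byte counts will depend on the Python version and pickle protocol.

```python
import pickle
from decimal import Decimal


def quantize_all(values, step="0.001"):
    """Convert floats to Decimal and quantize each one to the given step."""
    quantum = Decimal(step)
    return [Decimal(value).quantize(quantum) for value in values]


def pickled_size(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Number of bytes the object occupies once pickled."""
    return len(pickle.dumps(obj, protocol=protocol))


# Hypothetical dose values; a real map would come from openSkin.
values = [0.1 * i + 0.0155 for i in range(1000)]

full = [Decimal(value) for value in values]   # full float expansion
small = quantize_all(values)                  # three decimal places

print("full precision:", pickled_size(full), "bytes")
print("quantized:     ", pickled_size(small), "bytes")
```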

  6. Jonathan Cole reporter

    I did initially look at Decimal but didn't use it. I can't remember why now; I think maybe it didn't work with something from numpy.

    I will try using it just for the final dose maps and see.
