Reduce skin map pickle sizes

Issue #402 resolved
Jonathan Cole created an issue

Is it possible to reduce the precision stored in the skin map pickles and use gzip compression?

I'm contemplating adding each row so you can get an animated dose but don't want 16MB pickle files per case.

Comments (12)

  1. David Platten

    Added gzip compression for skin dose map pickle files. Users of beta code who already have skin dose map pickle files will need to delete them, as there is no provision to cope with uncompressed pickle files. Fixes issue #402.

    → <<cset 4457e994a863>>
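For readers following along, gzip-compressing a pickle can be sketched as below. This is a minimal illustration only, not the code from the changeset: the helper names `save_compressed` and `load_compressed`, the file names, and the stand-in dose grid are all assumptions made for the example.

```python
import gzip
import os
import pickle
import tempfile


def save_compressed(obj, path):
    """Pickle obj to path through a gzip stream (hypothetical helper)."""
    with gzip.open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_compressed(path):
    """Read back an object written by save_compressed (hypothetical helper)."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in for a skin dose map: a 2-D grid of floats, mostly zero.
dose_map = [[0.0] * 90 for _ in range(70)]
dose_map[35][45] = 1.0155

tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "skin_map.p")
gzip_path = os.path.join(tmp_dir, "skin_map.p.gz")

# Write the same data once uncompressed and once through gzip.
with open(plain_path, "wb") as handle:
    pickle.dump(dose_map, handle, protocol=pickle.HIGHEST_PROTOCOL)
save_compressed(dose_map, gzip_path)

restored = load_compressed(gzip_path)
plain_size = os.path.getsize(plain_path)
gzip_size = os.path.getsize(gzip_path)
print(plain_size, gzip_size)
```

A file written this way cannot be read with a plain `open()` plus `pickle.load()`, which is consistent with the note above that existing uncompressed pickle files need to be deleted.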

  2. Jonathan Cole reporter

    Is it possible to reduce the precision of the stored numbers as well? At the moment they go to many, many decimal places but probably only the first few are required.

  3. David Platten

    I'll take a look. We may need a consensus on how many decimal places to store. If we reduce it too much then things like pacing procedures may appear blank, which would be a shame.

  4. David Platten

    @jacole, the skin dose map data that is saved in the pickle files is just the data contained in openSkin's geomclass.SkinDose.totalDose. Would it be more sensible to address the decimal places in openSkin rather than in OpenREM?

  5. Jonathan Cole reporter

    1 mGy?

    Anything less than that is likely to be nonsense anyway. To be honest, even 10 or 100 mGy might be within the uncertainty.
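The effect of the 1 mGy suggestion above on a gzip-compressed pickle can be estimated with a rough sketch like the following. The dose values are randomly generated placeholders, not real skin dose data, and the helper `compressed_size` is a name invented for this example. Note that with the binary pickle protocols a rounded float still occupies eight bytes, so any saving here comes from the rounded values compressing better, not from the pickle itself shrinking.

```python
import gzip
import pickle
import random

random.seed(402)  # fixed seed so the sketch is repeatable

# Hypothetical dose values in mGy, stored with full float precision.
full_precision = [random.uniform(0.0, 100.0) for _ in range(5000)]

# The same values rounded to the nearest 1 mGy.
rounded = [float(round(value)) for value in full_precision]


def compressed_size(values):
    """Return the size in bytes of the gzip-compressed pickle of values."""
    payload = pickle.dumps(values, protocol=pickle.HIGHEST_PROTOCOL)
    return len(gzip.compress(payload))


full_size = compressed_size(full_precision)
rounded_size = compressed_size(rounded)
print(full_size, rounded_size)
```

How large the saving is on a real dose map depends on the data, so this only indicates the direction of the effect.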

  6. David Platten

    No functional changes. Renamed variable f in commented-out code that can be used to create a pickle file for every RF study in the database. It is now named pickle_file so that the code works if it is uncommented (f was already in use elsewhere in the view). References issue #402

    → <<cset f841d3a78145>>

  7. David Platten

    I may be teaching my grandmother to suck eggs, but if you were to use Decimal in openSkin as the datatype for the skin dose map values then it's easy to control the number of decimal places, and therefore how much file space the pickle files take up:

    import pickle
    from decimal import Decimal

    # Many decimal places
    print(Decimal(1.0155))
    # 1.0155000000000000692779167366097681224346160888671875

    with open("c:\\temp\\pickleLarge.p", "wb") as pickle_file:
        pickle.dump(Decimal(1.0155), pickle_file)
    with open("c:\\temp\\pickleLarge.p", "rb") as pickle_file:
        print(pickle.load(pickle_file))
    # 1.0155000000000000692779167366097681224346160888671875

    # Fewer decimal places
    print(Decimal(1.0155).quantize(Decimal('0.001')))
    # 1.016

    with open("c:\\temp\\pickleSmall.p", "wb") as pickle_file:
        pickle.dump(Decimal(1.0155).quantize(Decimal('0.001')), pickle_file)
    with open("c:\\temp\\pickleSmall.p", "rb") as pickle_file:
        print(pickle.load(pickle_file))
    # 1.016
    

    The pickleSmall.p file is 36 bytes vs. 85 bytes for pickleLarge.p. Note that both of these files occupy 4096 bytes on disk on my Windows PC, so there is no real saving for a single value, but there will be a tangible space saving when many values are stored in a single pickle file.
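The point about many values in a single pickle can be checked in memory with a sketch like this one. The list of dose values is made up for illustration, and three decimal places is simply the precision used in the comment above, not an agreed setting.

```python
import pickle
from decimal import Decimal

# Hypothetical dose values; real ones would come from the skin dose map.
raw_values = [1.0155, 0.2071, 3.4999, 12.0004] * 250

# A Decimal built from a float keeps every digit of that float's binary value.
exact = [Decimal(value) for value in raw_values]

# quantize() rounds each value to three decimal places.
step = Decimal("0.001")
quantized = [Decimal(value).quantize(step) for value in raw_values]

# Compare the pickled sizes without touching the disk.
exact_size = len(pickle.dumps(exact))
quantized_size = len(pickle.dumps(quantized))
print(exact_size, quantized_size)
```

Because a Decimal is pickled via its string representation, fewer stored digits translate directly into fewer bytes per value.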

  8. Jonathan Cole reporter

    I did initially look at Decimal but didn't use it. I can't remember why now; I think maybe it didn't work with something from numpy?

    I will try using it just for the final dose maps and see.
