When writing Flac files can get OutOfMemoryError: Direct buffer memory

Issue #150 closed
Former user created an issue

Since the change in FlacTagWriter to use a direct ByteBuffer I've seen a number of OOMEs, especially on resource constrained devices (e.g. NAS devices with 256MB memory). However I have also seen it on my own home server with 8GB RAM!

Example stack trace:

java.lang.OutOfMemoryError: Direct buffer memory at java.nio.Bits.reserveMemory(Bits.java:658) at java.nio.DirectByteBuffer.(DirectByteBuffer.java:123) 
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
    at org.jaudiotagger.audio.flac.FlacTagWriter.write(FlacTagWriter.java:195) 
    at org.jaudiotagger.audio.flac.FlacFileWriter.writeTag(FlacFileWriter.java:43) 
    at org.jaudiotagger.audio.generic.AudioFileWriter2.write(AudioFileWriter2.java:71) 
    at org.jaudiotagger.audio.AudioFileIO.writeFile(AudioFileIO.java:488) 
    at org.jaudiotagger.audio.AudioFileIO.write(AudioFileIO.java:211)
    at org.jaudiotagger.audio.AudioFile.commit(AudioFile.java:132) 

There are two possible causes I can see:

(1) it's an enormous FLAC file being read and there simply isn't enough memory. (2) the DBB are not being freed after use

Given I have seen this on an 8GB server I think (2) may be the cause in at least some of these cases.

There are some interesting questions and answers on StackOverflow about this: http://stackoverflow.com/a/26777380/28190 . This suggests that DBB are only freed when GC kicks in - which might not occur if the software is otherwise operating within its heap allocation.

I'll see if I can isolate a test, but might be difficult if this is dependent on the surrounding environment.

Comments (18)

  1. Dan Gravell

    D'oh, that was my issue, didn't realise you could add anonymously...

    One example I found of this was in a 1GB system with a 350MB FLAC file...

  2. IJabz repo owner

    Hmm, that kind of crazy I thought one of the advantages of direct buffers is that you are not tied up to the amount of memory you have allocated on heap so you can make use of large amount of memory without having to reconfigure the heap size which is awkward for a user application.

  3. IJabz repo owner

    Yes but I always thought that direct memory was handled by the OS so rather suprised that it is tied to gc memory in the way it is

  4. Dan Gravell

    There's no concept of GC in mainstream OSes as far as I am aware. Memory has to be de-allocated, just like in the good old days.

    Looking at the algorithm, to support memory constrained devices I don't see any other workaround other than going back to the old temporary file approach.

    The algorithm depends on reading in the audio data, writing the tag, then writing the audio data back out. We couldn't do anything clever with mapped memory files because if the entire audio data wasn't read, the writing of the tag will change the file underneath (AIUI). Essentially that audioData needs to be read whole.

    So one way of doing this without consuming a lot of memory (in the case of large FLAC files) is to use the temporary file approach, I think.

    Could an alternative be (and this could also replace the current direct BB approach to keep the code simpler):

    1. Work out the final size of the file (can we do this?)
    2. Memory map the file with the target final size
    3. Shift data after the tag (the audio data) to the end of the file
    4. Overwrite the tag

    It also looks like we might have to do something kooky to make sure the MBB is de-allocated, as there can be similar issues to DBB there.

  5. IJabz repo owner

    But how its not clear how the memory needs to be deallocated there is nothing to indicate that we are meant to do that but if there is a workaround that works ... I certainly don't want to go back to temporary file.

    I quite don't get why would the memory mapped method be better, does it use less memory than DBB ?

  6. Dan Gravell

    To be clear: two strands of conversation here:

    1) Are we clearing up DBB allocations correctly? 2) Should we move to MBB?

    (2) also fixes the problem where there simply isn't enough memory to read in an entire FLAC file. Maybe this is a rare occurrence. But it might not fix the issue of allocations, because AIUI MBB uses a DBB under the covers (although the size of the DBB in that case may be smaller, so it might take longer for the problem to occur).

    I need to do some more investigation...

  7. Dan Gravell

    Another question on this - when we had the old temporary file approach, the changes made to the original FLAC file were atomic with respect to the target file itself - atomic at least in the sense that the only real file operation was a move at the end to the target file name.

    Are you worried about the DBB approach potentially leaving files in an inconsistent state?

  8. IJabz repo owner

    I was at one point but I am not now, we only ever used the temporary file approach for certain formats and only in certain circumstances (there wasn't enough room for the metadata in existing file structure) anyway. The temporary file approached caused a number of problems often related to permissions meaning that either they could modify a file but they couldn't create a new file, or that temporary files were not cleaned up properly and if you look at the code of AudioFileWriter versus AudioFileWriter2 you can see how messy the original approach was although I'm sure it could be done better now)

  9. Dan Gravell

    I've re-written the DBB bit to use a memory mapped file.

    For "normal" size FLACs of around 50MB there's no difference in write speed.

    For large FLACs, I had a 389MB one, as well as using less system memory the new code shows a performance improvement - taking less than half the time of the DBB version.

    I've some more work to do on it, I'll commit into a branch in my fork later.

    Regardless, would you consider merging this into the project?

  10. Dan Gravell

    Today I cleaned up the code and implemented the clean up code. This uses reflection to clean up direct ByteBuffers when explicitly released without waiting for GC. This is to avoid the problem documented above. It does a no-op with a WARNING if it couldn't release.

    I ran a large scale test embedding large album art in lots of FLACs to force the code to run. I did get one file which appears corrupted, so I'll look into that tomorrow, then send you the PR.

  11. Log in to comment