Issue #7 new

Hachoir leaks memory

Robert Xiao
created an issue

I wrote a simple program that uses the BPList parser to parse and extract data from several thousand files in BPList format (specifically, an iPhone backup).

The problem is that this simple implementation leaks memory very badly: after just 100 files it is using hundreds of MB of memory, so it cannot finish the job.

Inspecting gc.garbage shows that the field generators are leaking, for a reason I have not been able to identify. Adding a __del__ method to GenericFieldSet exacerbates the problem, because it makes all field sets uncollectable too. So I can't think of a clean solution.
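One plausible mechanism for a leak like this is a reference cycle between a field set and the generator it keeps suspended. The following is only a minimal sketch under that assumption: ToyFieldSet and still_alive_before_and_after_gc are hypothetical names, not Hachoir's classes, and the real cause in Hachoir may differ. On current CPython 3 the cycle collector reclaims such a cycle; on the older interpreters this report concerns, objects in a cycle that defined __del__ were left in gc.garbage instead.

```python
import gc
import weakref


class ToyFieldSet:
    """Hypothetical stand-in for a lazily parsed field set (not Hachoir code)."""

    def __init__(self, count):
        self.count = count
        self.fields = []
        # The generator's frame holds `self`, and `self` holds the generator,
        # so the two objects form a reference cycle.
        self._gen = self._create_fields()

    def _create_fields(self):
        for i in range(self.count):
            yield "field[%d]" % i

    def read_first(self):
        # Parse only one field, leaving the generator suspended.
        self.fields.append(next(self._gen))
        return self.fields[-1]


def still_alive_before_and_after_gc():
    """Return (alive after del, alive after gc.collect()) for one toy field set."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector from running mid-experiment
    try:
        fs = ToyFieldSet(3)
        fs.read_first()
        ref = weakref.ref(fs)
        del fs  # reference counting alone cannot free a cycle
        alive_after_del = ref() is not None
        gc.collect()  # the cycle collector can
        alive_after_collect = ref() is not None
        return alive_after_del, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(still_alive_before_and_after_gc())
```

The weak reference lets the sketch observe the object's lifetime without keeping it alive.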

The only workaround I have so far is to call "list(p)" on the parser, which exhausts it and thus lets the generator finish execution.

Of course, this implies a performance penalty, but at least the program is now able to run to completion (and memory usage stays nearly constant at ~16MB). I am filing this issue in the hope that someone knows how to resolve the problem.
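Why exhausting the parser could help can be sketched in the same toy setting. This assumes, as a guess rather than a confirmed diagnosis, that the leak comes from a suspended generator whose frame still references its owner. ToyParser and freed_without_gc are hypothetical names, not part of Hachoir. Once a generator has run to completion, CPython releases its frame, so the cycle disappears and plain reference counting can free the object.

```python
import gc
import weakref


class ToyParser:
    """Hypothetical parser that yields its fields lazily (not Hachoir code)."""

    def __init__(self, count):
        self.count = count
        self._cache = []
        # Suspended generator whose frame references `self`.
        self._gen = self._create_fields()

    def _create_fields(self):
        for i in range(self.count):
            yield i

    def __iter__(self):
        # Drain whatever the generator has not produced yet, then
        # iterate over the cached fields.
        self._cache.extend(self._gen)
        return iter(self._cache)


def freed_without_gc(exhaust):
    """Return True if the parser is freed by reference counting alone."""
    was_enabled = gc.isenabled()
    gc.disable()  # rule out the cycle collector
    try:
        p = ToyParser(5)
        if exhaust:
            list(p)  # the workaround: run the generator to completion
        ref = weakref.ref(p)
        del p
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print("exhausted:", freed_without_gc(True))
    print("not exhausted:", freed_without_gc(False))
```

The cost is the one noted above: every field is parsed even if only a few are needed.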
