Memory leak

Issue #110 new
Vadim Zeitlin created an issue

I'm trying out hgsubversion under Debian; here is the version information:
{{{
% uname -msr
Linux 2.6.29-acct x86_64
% hg svn version
hg: 1.3
svn bindings: 1.5.1
hgsubversion: ff69f1855086
}}}

I want to import a large (~60000 revisions) remote svn repository and the import seems to run out of memory, at least this is what I see:
{{{
% time hg clone hg
...
[1] 30423 killed hg clone hg
hg clone hg [~12 hours total running time, ~30000 revisions pulled]
% time hg pull -u
...
[r47525] FM: fix an issue with GTK filedialog (already fixed in the trunk)
[2] 31608 killed hg pull -u
hg pull -u 13459.60s user 424.80s system 41% cpu 9:19:03.82 total
}}}

It seems that this happens because it runs out of memory: the munin statistics for the machine it's running on show that its RAM consumption topped out at 8GB shortly before it crashed.

Looking at the {{{%MEM}}} column in top, I see that it starts at ~1% and grows quite quickly: it's already at 6% after importing just a couple of dozen revisions.
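For anyone who wants to record the growth instead of watching top, here is a minimal sketch, assuming a Linux system with /proc mounted. The function name `rss_kib`, the `HG_PID` placeholder and the one-minute interval are illustrative choices, not part of hg or hgsubversion.

```shell
#!/bin/sh
# Print the resident set size (VmRSS, in kB) of the process with the given PID.
# Assumption: Linux, where /proc/<pid>/status contains a "VmRSS:" line.
rss_kib() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Example: log the memory use of a running process once a minute until it exits.
# HG_PID is a placeholder for the PID of the hg process as shown by top.
#   while kill -0 "$HG_PID" 2>/dev/null; do
#       echo "$(date +%T) $(rss_kib "$HG_PID") kB"
#       sleep 60
#   done
```

A log like this should make it easier to tell steady per-revision growth from a spike caused by one large revision.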

This might not be a huge problem if there are not many revisions to clone/pull, but it literally brings my machine (4 CPUs, 4GB of physical RAM) to its knees :-(

Comments (17)

  1. Former user Account Deleted

    This is a known problem. You could just clone and kill it every time it grows too large, then pull from inside the clone to resume the cloning. This is a proven workaround.

  2. Vadim Zeitlin reporter

    Thanks for confirming it. I should probably have updated the report myself to indicate that I was indeed able to finish cloning by redoing the pull a few times (4 IIRC). I guess this can't be trivial to fix, but maybe it's worth mentioning this issue in the README and, ideally, there should be an option telling it to stop after cloning/pulling a certain number of revisions (10000 seems to be about the maximum, realistically).

    Also, just as a comparison point, I used git-svn to check if this was an issue with repository or not and it finished in 10 hours IIRC (compared to 24 for hgsubversion) and never consumed any noticeable amount of RAM.

  3. rockhymas

    Thanks for the workaround. Since this is a known problem, has anyone investigated? I can start looking into it, if I have some idea what to start looking at.

  4. Augie Fackler repo owner

    The new workaround is to do something like this:

    hg init foo
    cd foo
    hg pull <svn url>
    # wait for OOM
    # (repeat hg pull as necessary)

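The "repeat as necessary" step can be automated. This is a minimal sketch, assuming a POSIX shell; `retry_until_ok` is a hypothetical helper name and the attempt limit is arbitrary. It relies on each interrupted pull leaving the repository in a state from which the next pull can resume.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until it exits with status 0
# or the given number of attempts is used up.
retry_until_ok() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        echo "attempt $attempt: $*"
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage, from inside the repository created with "hg init foo":
#   retry_until_ok 20 hg pull "<svn url>"
```

The attempt limit is there so that a revision which always fails does not make the loop run forever.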
  5. Chadwick McNab

    I'm having this issue too and tried the workaround, which got me a little farther, but unfortunately there is a single revision that still aborts with out of memory and that I can't get past. The revision changed 3 XML data files, one of them quite large, with sizes of 246 MB, 20 MB, and 1.3 MB. I see the hg process jump up to about 700 MB before quitting, although on earlier pulls I saw it get much larger, around 1200 MB, before quitting.

    Any thoughts or suggestions? Thanks.

    This is using hg version 1.6.3 on Windows with hgsubversion as of a few days ago.

  6. Dan Villiom Podlaski Christiansen

    cmcnab, if I understand you correctly, your issue is that one specific Subversion revision causes Mercurial to use an awful lot of memory? If so, it seems to be a separate bug from the one in this issue, which deals with long-running conversions steadily leaking memory. Could you please file a separate issue for this bug? As usual, it would be awfully nice if the repository were publicly available or you could produce a test case to cause the same behaviour!

  7. Chadwick McNab

    OK, sure, I'll do that. I posted here because I was having the long-running leak on this same repo: on an initial clone it would get to the 225th revision or so and then quit. I used the workaround listed here, which worked for pulling in 100 or so more revisions past that, but eventually it got stuck on that one revision. I wasn't sure if the leak could apply to processing a single revision pull as well.

    It's not a public repo but I will attempt to create a test one with the change causing my problem and submit that as a separate issue. Thanks for the response!

  8. Ivan Melnychuk

    I have not seen cmcnab posting a separate issue. And I am having a very similar one:

    • one Subversion revision adds a rather large file (230 MB) - this is handled
    • one of the subsequent revisions renames it (moves it to another directory), and on this revision the *pull* operation fails with "abort: out of memory" while topping out at about 850 MB of RAM used. Strangely enough, if I re-create the same changesets separately in a small SVN repository, the *pull* works just fine.

    Any ideas?

  9. Former user Account Deleted

    The workaround is not working for me. There is a revision where the Subversion root has been renamed. It always results in an out of memory exception after a few minutes.

    Any ideas?

  10. Nassere Besseghir

    The workaround is not working for me. There is a revision where the Subversion root has been renamed. It always results in an out of memory exception after a few minutes.

    Any ideas?

  11. Brian DeVries

    Are there any hints on where the memory leak resides? (Moreover, are we sure it's in hgsubversion, and not in hg itself?)

    If I have some time, I'm up for looking into it, but I'm balking at trying to grok the whole codebase before I can start trying to figure it out.

    Also, if it helps, I'm having the same problem, and I'm stuck using the SWIG bindings for now, as my distro (Ubuntu 10.10) only has version 0.7.2 in its repository.

  12. Jiri Necas

    Here is the workaround adapted for Windows:

    hg init foo
    cd foo
    hg recover
    hg pull <svn url>

    Sometimes when "hg pull" runs out of memory, it happens in the middle of a transaction. Afterwards you need to run "hg recover" to clean it up. If there's no broken transaction, "hg recover" just does nothing.
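The recover-then-pull sequence can also be wrapped in a loop. This is a sketch only, assuming a POSIX-style shell (on Windows that would mean something like Cygwin or Git Bash, which is an assumption on top of the commands above). The function name `pull_with_recover`, the default of 20 attempts and the `HG` override variable are hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: run "hg recover" before every "hg pull" attempt and
# stop as soon as a pull succeeds or the attempt limit is reached.
# HG can be set to point at a different hg executable; it defaults to "hg".
pull_with_recover() {
    url=$1
    max_attempts=${2:-20}
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Clean up an interrupted transaction, if any. A non-zero exit
        # status (nothing to recover) is deliberately ignored.
        ${HG:-hg} recover || true
        if ${HG:-hg} pull "$url"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage, from inside the repository created with "hg init foo":
#   pull_with_recover "<svn url>"
```

Running the recover step unconditionally keeps the loop simple, since it has nothing to do when the previous pull did not leave a broken transaction behind.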

    If the svn source needs authentication, put the credentials in <HOMEDIR>\mercurial.ini (valid for TortoiseHG) as:

    username = <username>
    password = <password>

    Additionally you can edit foo\.hg\hgrc and put the svn url inside:

    default = <svn url>  for example svn+

    Then there is no need to type the url in the hg pull command.

    If your svn repository uses encryption and the certificate is not valid, add --insecure to the hg pull command:

    hg pull --insecure <svn url>