BDSIM runs out of memory on Component-Teleporter test

Issue #181 resolved
Jochem Snuverink created an issue

The Component-Teleporter test is failing by running out of memory on the first event.

This happened after the merge with the 'padding' branch:

http://abp-cdash.web.cern.ch/abp-cdash/index.php?project=BDSIM&date=2017-02-09
http://abp-cdash.web.cern.ch/abp-cdash/viewUpdate.php?buildid=32580

It seems to have nothing to do with the teleporter though as the same happens also with nturns=1 and/or the circular option off.

There are no overlaps reported.

I am running Coverity now to see if it finds any new memory leaks.

Comments (8)

  1. Laurie Nevay

    Yes, I saw it was failing.

    We have a new mode of failure recently with infinite tracking loops. These lead to trajectory point information being accumulated for the primary (albeit a small amount) for each step. If the particle never stops tracking (at least in a sensible time), the trajectory point information will continue to grow until likely the available memory is exceeded.

    This is likely happening due to a boundary falling halfway through the teleporter that forces multiple steps to be taken through it. The highly unrealistic tracking in the teleporter can catch Geant4 out in this case - Geant4 sees the particle going to a different z than it should and reruns the step - it never gets to that z though, as the teleporter z is hard-coded.

    This will be fixed naturally soon by removing the curvilinear world from affecting tracking - currently under testing. I've been working on it over the last couple of days.

    Still, this is my educated guess. I will investigate once I've finished the curvilinear world changes.

  2. Jochem Snuverink reporter

    Thanks for the update and good to hear it is somewhat understood. Still this happened without the circular option too, and also with --output=none. Is the trajectory point information accumulated also in the latter case?

  3. Laurie Nevay

    Even without the circular flag, the Component-teleporter test surely must have the teleporter built?

    All output structures are prepared even if output == none, i.e. histograms, trajectory information and sampler hits are all prepared. We could perhaps optimise this, but if there's no output, is there much point in optimising, as what would you use it for? If it were to profile CPU or memory you'd want it to be similar to the regular scenario. I use output==none all the time when using the visualiser to check something, but the visualiser is an order of magnitude slower than batch mode purely because of rendering, so I doubt it'd make too much difference. We can optimise if we see use of it though.

    Note also that the trajectory point information is required to visualise the track in the visualiser. We, by default, only store this for primary particles, but when using the visualiser store for all particles.

    The trajectory point information is very small per point (I'm guessing < 100 bytes). My feeling is we should never run out of memory because of this - we should control our tracking better (infinite loops shouldn't occur and there should be safeguards against overly long-running primaries). Certainly for the primary trajectory only.
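
    A minimal sketch of the kind of safeguard described above, assuming a simple per-track step budget. The names `StepBudget`, `RecordStep` and `TrackStuckParticle` are hypothetical and this is not BDSIM's implementation; in a Geant4 application a check like this would typically live in a user stepping action that stops the track once the budget is used up.

    ```cpp
    #include <cassert>
    #include <cstddef>

    // Hypothetical guard: counts the steps of one track and reports when a
    // configurable budget is exhausted so the caller can stop the track.
    class StepBudget {
    public:
      explicit StepBudget(std::size_t maxSteps) : maxSteps_(maxSteps), steps_(0) {
        assert(maxSteps > 0 && "a zero budget would stop every track at once");
      }

      // Call once per tracking step. Returns true while the track may continue
      // and false on the step that uses up the budget.
      bool RecordStep() {
        ++steps_;
        return steps_ < maxSteps_;
      }

      bool Exhausted() const { return steps_ >= maxSteps_; }
      std::size_t Steps() const { return steps_; }

      // Call at the start of each new track.
      void Reset() { steps_ = 0; }

    private:
      std::size_t maxSteps_;
      std::size_t steps_;
    };

    // Toy stand-in for a tracking loop in which the particle never reaches its
    // target z (the pathological case discussed in this thread). Returns the
    // number of steps taken before the budget stopped the track.
    std::size_t TrackStuckParticle(StepBudget& budget) {
      const double targetZ = 1.0;
      double z = 0.0;
      while (z < targetZ) {
        z += 0.0;  // pathological step: no progress along z
        if (!budget.RecordStep()) {
          break;   // budget exhausted: stop the track instead of looping forever
        }
      }
      return budget.Steps();
    }
    ```

    With a budget of 1000, `TrackStuckParticle` returns after exactly 1000 steps, so the memory used by per-step trajectory points is bounded by the budget rather than by how long the particle keeps tracking.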

  4. Jochem Snuverink reporter

    A teleporter is only built with the circular flag. There is no explicit element in the test. Without the flag the teleporter is not built. I also just checked this in the debugger to be completely sure.

    Then likely it has to do with tracking through strong sbends. Still it is an interesting failing test since I haven't seen it failing in this mode (for a while), e.g. issue #178 does not see an increase in memory consumption.

    I asked about output=none to understand better where the increased memory consumption comes from, not in order to optimise it.

  5. Laurie Nevay

    Ok, good to find out! Looks like the test probably isn't doing what it's meant to.

    Yes, sorry, that was just general discussion as it cropped up.

    I'll investigate over the next day or so.

  6. Laurie Nevay

    This was due to accumulated trajectory points. In turn this was due to bad tracking in the dipole and overlaps with the teleporter given recent changes to the curvilinear world. This should now be fixed.

    We could consider putting a limit on the trajectory size for the primary. At around 5 million trajectory points, the usage is ~1GB.
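
    A minimal sketch of such a limit, assuming a plain container with a hard cap on the number of stored points. `TrajectoryPoint`, `BoundedTrajectory` and `BytesPerPoint` are hypothetical names for illustration, not BDSIM or Geant4 classes. The helper also reproduces the arithmetic behind the figures quoted above: taking 1 GB as 10^9 bytes, 5 million points correspond to roughly 200 bytes per point.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Hypothetical, simplified trajectory point (position and time only).
    struct TrajectoryPoint {
      double x, y, z, t;
    };

    // Stores trajectory points up to a fixed cap and counts the rest as dropped,
    // so memory use stays bounded even if tracking never terminates.
    class BoundedTrajectory {
    public:
      explicit BoundedTrajectory(std::size_t maxPoints) : maxPoints_(maxPoints) {}

      // Returns true if the point was stored, false if the cap was reached.
      bool Append(const TrajectoryPoint& p) {
        if (points_.size() >= maxPoints_) {
          ++dropped_;
          return false;
        }
        points_.push_back(p);
        return true;
      }

      std::size_t Size() const { return points_.size(); }
      std::size_t Dropped() const { return dropped_; }

    private:
      std::size_t maxPoints_;
      std::size_t dropped_ = 0;
      std::vector<TrajectoryPoint> points_;
    };

    // Average memory per point implied by a total usage and a point count.
    double BytesPerPoint(double totalBytes, double nPoints) {
      assert(nPoints > 0.0);
      return totalBytes / nPoints;
    }
    ```

    For example, `BytesPerPoint(1e9, 5e6)` gives 200, so a cap of 1 million points would correspond to roughly 200 MB under the same assumption. Dropping points past the cap keeps the early part of the track; whether that or stopping the track is preferable is a design choice left open here.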
