Assertion `tl>=0 and tl<timelevels' failed error

Issue #163 closed
anonymous created an issue

I'm currently running a Cactus simulation based off the ET mclachlan parameter file. I have some custom thorns which force checkpointing every iteration, and then write new parameter files. The new parameter files are used to run specific functions from the host simulation ("spawning"), but I'm having some problems resuming simulations.

Currently, I get this error when resuming:

INFO (Carpet): GF: rhs: 818k active, 1440k owned (+76%), 1896k total (+32%), 328 steps/time
cactus_sim: /home/azebrowski/Cactus/arrangements/Carpet/CarpetLib/src/th.hh:79: double th::get_time(int, int, int) const: Assertion `tl>=0 and tl<timelevels' failed.
[cyder:32759] Process received signal
[cyder:32759] Signal: Aborted (6)

I'm guessing there's a parameter I'm not setting properly in my child simulation. Could anyone give me a pointer to where I should be looking? I looked for things relating to timelevels in the host/spawned parameter files, but didn't see anything that stood out. The parameter files I used and the full output are attached to this email, with the disclaimer that I modified the spawned parameter file to run every function instead of skipping some, in an attempt to bypass any problems that could be caused by accidentally skipping a Carpet function.

I've made a bzipped tarball containing the checkpointed data from the simulation. It contains several parameter files. The parameter file of interest here is spawn.par, as it doesn't use any of my custom code but still causes Cactus to abort with an error. I left the other parameter files in on the off chance that I might need to refer to them later.

Here is the source parameter file, which creates the spawned simulation:
http://www.cct.lsu.edu/~azebrowski/ml-ahfinder-spawn.par

Here is the spawned simulation's parameter file:
http://www.cct.lsu.edu/~azebrowski/spawn.par

Here is the full checkpointed data and another copy of the spawned parameter file:
http://www.cct.lsu.edu/~azebrowski/data.tar.bz2

Other information:
I ran the simulation using OpenMP with 12 cores to generate the checkpointed data. I've also tried MPI, but that didn't seem to make a difference.

Thornlist:
http://www.cct.lsu.edu/~azebrowski/ThornList

gcc:
azebrowski@cyder:~/Cactus$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.3-4ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)

Fortran is gfortran-4.4

I'm using the Mercurial version of Carpet, and the ET development thorns.

If you need more information, please let me know.

Comments (14)

  1. Erik Schnetter

    The number of active time levels for grid functions must be Carpet::prolongation_order_time + 1. That is, one must ensure that all grid functions have the same number of active time levels, and this must correspond to the prolongation order setting. In unigrid simulations, the prolongation order can be set arbitrarily since it is not actually used for interpolation.

  2. Frank Löffler
    • changed status to resolved

    The problem was a grid function being defined with three timelevels while Carpet's prolongation_order_time was set to 1, which translates to a global number of timelevels of 2 in Carpet; no GF can have more timelevels in that case. The thorn now uses a workaround to specify the number of timelevels depending on parameter settings.

  3. Erik Schnetter
    • changed status to open

    I am reopening this ticket, because there needs to be a better error message. That is, this problem needs to be detected much earlier and at a higher level.
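The kind of early check requested here can be sketched as follows. This is a minimal illustration in Python, not Carpet code: the function names and the grid function names in the usage note are hypothetical, and the rule it encodes (no grid function may have more than prolongation_order_time + 1 timelevels) is taken from the comments above.

```python
def find_timelevel_violations(prolongation_order_time, gf_timelevels):
    """Return {name: timelevels} for grid functions that exceed the limit.

    Assumption (from the discussion above): the largest number of
    timelevels any grid function may have is prolongation_order_time + 1.
    """
    max_timelevels = prolongation_order_time + 1
    return {
        name: tl
        for name, tl in gf_timelevels.items()
        if tl > max_timelevels
    }


def check_timelevels(prolongation_order_time, gf_timelevels):
    """Raise a descriptive error at setup time instead of a late assertion."""
    violations = find_timelevel_violations(prolongation_order_time, gf_timelevels)
    if violations:
        max_timelevels = prolongation_order_time + 1
        details = ", ".join(
            f"{name} has {tl}" for name, tl in sorted(violations.items())
        )
        raise ValueError(
            f"prolongation_order_time={prolongation_order_time} allows at most "
            f"{max_timelevels} timelevels, but: {details}"
        )
```

For example, calling check_timelevels(1, {"MyThorn::rhs": 3}) with the hypothetical grid function MyThorn::rhs would raise a ValueError that names the offending grid function, whereas the same call with 2 timelevels passes silently.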

  4. Ian Hinder

    Wouldn't it be better for Carpet to not care about a prolongation parameter if it's not doing any prolongation? I am doing a unigrid run and ran into the same problem. I only learned on recovery that I could not recover.

  5. Erik Schnetter

    At the point where the error occurs, something tries to access the time associated with a particular time level, and this time is not defined.

    The parameter "prolongation_order_time" is a misnomer, as it has two functions. It determines the prolongation order, as well as the number of past time levels for which metadata need to be kept.
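The second role of the parameter can be illustrated with a toy model. This is only a sketch under stated assumptions: the class ToyTimeHierarchy and its fields are hypothetical and do not reproduce Carpet's th class; it merely assumes that time metadata is kept for prolongation_order_time + 1 levels, as described above, and that asking for any other level fails.

```python
class ToyTimeHierarchy:
    """Toy stand-in for per-timelevel time metadata (not Carpet's th class)."""

    def __init__(self, prolongation_order_time, current_time, delta_time):
        # Assumption: metadata is kept for prolongation_order_time + 1 levels.
        self.timelevels = prolongation_order_time + 1
        # Level 0 is the current time; higher levels lie further in the past.
        self.times = [
            current_time - tl * delta_time for tl in range(self.timelevels)
        ]

    def get_time(self, tl):
        # Mirrors the condition in the failed assertion from the report.
        if not (0 <= tl < self.timelevels):
            raise AssertionError("tl>=0 and tl<timelevels")
        return self.times[tl]
```

With prolongation_order_time = 1 the model keeps two levels, so get_time(0) and get_time(1) succeed, while get_time(2), which a grid function with three timelevels would need, fails in the same way as the reported assertion.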

  6. Roland Haas

    I recently played with a multistep time integration scheme (just the plain Adams-Bashforth-Moulton scheme, which one can copy from Numerical Recipes) for a postprocessing code that needs some time evolution but cannot use an RK scheme, since all data comes from saved timeslices.

    In that case, I would naturally want to use Carpet's previous timelevels for the evolution scheme, but might require more levels than I would need for the desired time interpolation order. The problem here stems from the fact that Carpet assumes (correctly right now when using MoL) that timelevels are only for its own use during prolongation, but the Adams-Bashforth-Moulton scheme actually needs past timelevels for purposes of its own.

    I believe Carpet should ideally be more forgiving if there are more than enough timelevels to satisfy its needs for prolongation.

  7. Ian Hinder

    If I attempt to set Carpet::prolongation_order_time = 2 on recovery, I get std::out_of_range exceptions, so it looks like there is no workaround apart from repeating the entire run. The parameter is marked as steerable on recovery; I wonder if that is really true.

  8. Erik Schnetter

    There are certain circumstances under which you can steer this parameter. For example, you can probably increase its value, if you ensure the additional levels are initialised after recovery.

  9. Roland Haas

    I believe a working workaround was to reduce the number of timelevels of the grid functions that have too many. Of course, if you cannot steer the number of timelevels in the thorn that has too many of them, you would have to modify that thorn and recompile.
