qc0-mclachlan fails with Assertion `not reg.processors' failed
I am running qc0-mclachlan.par from the ET trunk on Datura. I get this error message during startup:
cactus_sim: /home/ianhin/Cactus/EinsteinToolkit/arrangements/Carpet/CarpetLib/src/region.cc:166: void combine_regions(const std::vector<region_t, std::allocator<region_t>> &, std::vector<region_t, std::allocator<region_t>> &): Assertion `not reg.processors' failed.
I get the same message whether I run on 1 MPI process or 2, and also on another machine. Since qc0-mclachlan is a standard BBH simulation, I assume that a similar error can occur for any binary simulation with moving-boxes mesh refinement. Setting priority to Critical as a result.
Comments (7)
reporter - removed comment
git bisect tells me that the problem was introduced in this commit:
commit 744af16b61a3bbbcb752af1ed11ed02831049179
Author: Erik Schnetter <schnetter@gmail.com>
Date: Mon Sep 10 22:02:40 2012 -0400
CarpetLib: Ensure that split/combined regions don't have a tree structure attached
http://www.carpetcode.org/hg/carpet/index.cgi/rev/dc343eecda5f
I don't know whether this commit detects an already-existing problem or actually introduces a new one.
-
- removed comment
The carpet testsuites, among others, fail with the same errors since that patch. It seems Ian's daily trunk testsuite page has stopped updating since the day that patch was applied, otherwise this might've been noticed.
-
reporter - removed comment
That coincided with the login node the tests are launched from being replaced with another one, and the cron jobs not being transferred. I reinstated the cron job this morning, and the system is churning through commits now.
Erik: would reverting the patch be the right solution? This potentially affects production runs as well as test cases.
-
- removed comment
I introduced the assert because the process tree is not handled correctly when regions are split.
While we work on the correct solution, I suggest replacing the assert with a comment.
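A minimal sketch of what replacing the assert with a comment could look like. Everything here is a simplified, hypothetical stand-in: the real region_t and combine_regions in CarpetLib's region.cc are more involved, and the fields lo, hi, and the void* type of processors are assumptions made only to keep the example self-contained.

```cpp
#include <vector>

// Hypothetical, heavily simplified stand-in for CarpetLib's region_t.
struct region_t {
  int lo, hi;        // illustrative 1D extent (assumption, not the real layout)
  void *processors;  // non-null means a process tree is attached (assumption)
};

// Simplified stand-in for combine_regions: it only copies the regions
// through, to show where the disabled check would sit.
void combine_regions(const std::vector<region_t> &oldregs,
                     std::vector<region_t> &newregs) {
  newregs.clear();
  for (const region_t &reg : oldregs) {
    // The process tree is not handled correctly when regions are split
    // or combined. Until that is fixed, the check is left as a comment
    // instead of aborting the run:
    // assert(not reg.processors);
    newregs.push_back(reg);
  }
}
```

With the check commented out, a region that still carries a process tree passes through instead of aborting at startup; the underlying handling of the tree remains unsolved.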
-
- changed status to resolved
- removed comment
Corrected.
-
- edited description
- changed status to closed
Backtrace is
The oldregs and superregss seem to cause problems for the debugger, with many traceback errors saying it can't access certain memory.