NaNChecker should not use integer division
Issue #2035
new
I profiled the NaNChecker, and it appears to spend two thirds of its time performing integer division. I assume these are the integer divisions where the code recalculates the (i,j,k) triple from a linear index. This part of the code could easily be rewritten.
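The kind of rewrite meant here can be sketched as follows. This is only an illustration, not the actual NaNChecker code: the names `NanReport`, `check_with_division`, and `check_with_nested_loops` are hypothetical, and the layout with `i` as the fastest-varying index is an assumption. The first function recovers (i,j,k) from the linear index with a division and modulo per point; the second carries (i,j,k) in nested loops and only increments the linear index.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type: number of NaNs found and the (i,j,k) of the first one.
struct NanReport {
  std::size_t count = 0;
  int i = -1, j = -1, k = -1;
};

// Division-based variant: one flat loop that recovers (i,j,k) from the
// linear index with integer division and modulo at every grid point.
NanReport check_with_division(const std::vector<double>& data,
                              int nx, int ny, int nz) {
  assert(data.size() == static_cast<std::size_t>(nx) * ny * nz);
  NanReport r;
  const std::size_t plane = static_cast<std::size_t>(nx) * ny;
  for (std::size_t idx = 0; idx < data.size(); ++idx) {
    const int i = static_cast<int>(idx % nx);
    const int j = static_cast<int>((idx / nx) % ny);
    const int k = static_cast<int>(idx / plane);
    if (std::isnan(data[idx])) {
      if (r.count == 0) { r.i = i; r.j = j; r.k = k; }
      ++r.count;
    }
  }
  return r;
}

// Division-free variant: nested loops carry (i,j,k) directly, and the
// linear index advances by one per point (assumes i varies fastest).
NanReport check_with_nested_loops(const std::vector<double>& data,
                                  int nx, int ny, int nz) {
  assert(data.size() == static_cast<std::size_t>(nx) * ny * nz);
  NanReport r;
  std::size_t idx = 0;
  for (int k = 0; k < nz; ++k) {
    for (int j = 0; j < ny; ++j) {
      for (int i = 0; i < nx; ++i, ++idx) {
        if (std::isnan(data[idx])) {
          if (r.count == 0) { r.i = i; r.j = j; r.k = k; }
          ++r.count;
        }
      }
    }
  }
  return r;
}
```

Both functions visit the points in the same order and so report the same first NaN location. Another option, if the flat loop has to stay, would be to compute (i,j,k) only for the points where a NaN is actually found, since those are presumably rare.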
Comments (2)
reporter - removed comment
Here is an excerpt of the most expensive routines of a unigrid Cowling benchmark run:
+ 18.10% 17.83% cactus_sim cactus_sim   [.] ML_ADMConstraints::ML_ADMConstraints_evaluate_Body
+ 13.39%  0.35% cactus_sim cactus_sim   [.] void HydroToyOpenMP::tiled_task_loop
+  6.40%  6.21% cactus_sim cactus_sim   [.] NaNChecker::CHECK_DATA<double>
+  6.34%  2.08% cactus_sim libc-2.17.so [.] __memset_sse2
- ADMConstraints is much more expensive than it should be. I don't yet know why, but I also see that it is not being vectorized.
- The call to memset comes mostly from within Carpet, and is likely due to poisoning that I activated.
- I don't show I/O here, although it also takes significant time. That is fine, since the benchmark run lasts only ten iterations, so output is relatively more expensive. Ditto for setting up initial conditions.
- The second column shows how much time is spent in the particular routine itself. Since the hydro implementation calls subroutines, its own time is very small.
Do you have numbers to show how much time is spent in NaNChecker during a typical run (compared to, e.g., the McLachlan RHS)? Not that spending 2/3 of the time doing integer division is a good thing, of course.