- marked as minor
- edited description
- changed title to Timer output does not add up
Timer output does not add up
I have just received the following timer output:
Eigensolver 0.00 min 0.0 %
Initialization 22.38 min 5.0 %
Advance steps 383.57 min 85.9 %
(redistribute 85.425 min 19.1 %)
(field solve 5.128 min 1.1 %)
(diagnostics 20.817 min 4.7 %)
Re-initialize 322.85 min 72.3 %
Finishing 0.00 min 0.0 %
total from timer is: 446.45 min
As I understand it, the items in brackets are a breakdown of the item before the brackets (Advance steps
). Therefore, I would expect the sum of the items in brackets to equal the item before the brackets. However, this is not the case:
85.425 + 5.128 + 20.817 = 111.370 which is not equal to 383.57
Furthermore, I would expect the sum of the items not in brackets to equal the total. However, this is not the case either:
22.38 + 383.57 + 322.85 = 728.80 which is not equal to 446.45
If one were to include Re-initialize
in Advance steps
(i.e. assume there should be brackets around Re-initialize
), then we get closer but the numbers still do not match:
85.425 + 5.128 + 20.817 + 322.85 = 434.220 which is not equal to 383.57
and:
22.38 + 383.57 = 405.95 which is not equal to 446.45
On the plus side, the total is correct (or nearly correct) as I have the time taken from an external measurement as 07:26:39, while 446.45 mins = 07:26:27 so that agrees within ~10 seconds, which is fine.
Comments (13)
-
reporter -
reporter - edited description
-
It's just a partial breakdown -- the left over is what is spent in doing things like the linear advance, collisions, the non-linear term etc.
-
Just to add, the current implementation stores timings for a given operation in a length-2 vector. These vectors are either global variables (e.g.
time_reinit
) or members of thetimers
class, which is itself a member ofstate
(e.g.state%timers%init
). The latter approach is probably more desirable. However, this is a pain to implement as the subroutine calls must be changed to passstate
through to the functions where lower level timers would be called. For example, if I want to time collisions, I have to call timing functions either side ofvspace_derivative
, so I need to passstate
toadvance
,advance_<fields_implementation>
andtimeadv
, so it's available to the timing subroutine.Moreover, it's questionable whether
timers
orstate
are stored in the right module, as trying to passstate
down results in circular dependencies. Maybestate
could stay where it is, buttimers
could be moved togs2_time
? -
reporter Even if it's a partial breakdown it's wrong because the sum is more than the total!
Timings are potentially useful information but the current output is confusing and does not inspire much trust in the numbers it's telling. I think this should be rectified. It doesn't need to be perfect. It doesn't need to add up exactly. But I think it should be made clearer what things sum to make something else and the sum should agree (at least approximately) with the (sub-)total. And if there's other stuff that's not included, this should be captured in a category called 'other' or similar so that the sum and total match to make it clear that there is other stuff going on.
Moving
timers
togs2_time
sounds sensible. -
reporter Also, I just got this from a run with nstep = 0:
Initialization 26.32 min 99.6 % Advance steps 0.00 min 0.0 % (redistribute 6.610 min 25.0 %) (field solve 0.000 min 0.0 %) (diagnostics 0.000 min 0.0 %) Re-initialize 0.00 min 0.0 % Finishing 0.00 min 0.0 % total from timer is: 26.42 min
Notice re-distribute, which is listed as a contribution to advance steps, is non-zero even though advance steps is zero (because nstep = 0). This is because there are some re-distributes during initialisation. I find this misleading too.
-
Thanks Steve. You're right, these times are reported as if they are logically nested, but actually they are all independent timers. Timers are both started and stopped by calling
time_message
. There's no check that timers don't overlap, or that timers have been started without being stopped.I've sketched out a possible improvement in the branch features/timings. The idea is to make a
timers
class ings2_time
. These would be for the big steps, like "Initialization" and "Advance Step". Each class would also contain asubtimers
class, which would be for things like "redistribute", "field solve" and "diagnostics". This would let us know whether the redistribute is being called in initialization or time advance.This sort of works, but it looks like it'll be a pain to set it up with functions in
utils
(like redistributions) without breaking non-gs2 code. Also I still need to look at making sure timers are started and stopped at sensible times. -
reporter Brilliant! Sounds like a good fix so long as we can avoid breaking related projects. Thanks :)
-
The branch features/timing should be functional now, if you'd like to try it out.
-
reporter OK, thanks.
I've tried it out, as discussed below. NB: I have not looked at the implementation at all - only the output I get from running it.
I get the following output:
Initialization 0:00:44.68 100.0 % advance 0:00:00.00 0.0 % redistribute 0:00:04.27 9.6 % field solve 0:00:00.00 0.0 % collisions 0:00:00.00 0.0 % nonlinear term 0:00:00.00 0.0 % diagnostics 0:00:00.82 1.8 % reinitialize 0:00:00.00 0.0 % finishing 0:00:00.00 0.0 % Advance 0:01:09.89 100.0 % advance 0:00:35.83 51.3 % redistribute 0:00:04.27 6.1 % field solve 0:00:08.00 11.5 % collisions 0:00:00.01 0.0 % nonlinear term 0:00:13.56 19.4 % diagnostics 0:00:09.26 13.2 % reinitialize 0:00:00.00 0.0 % finishing 0:00:34.02 48.7 % Finishing 0:00:00.00 NaN % advance 0:00:00.00 NaN % redistribute 0:00:04.27 Inf % field solve 0:00:00.00 NaN % collisions 0:00:00.00 NaN % nonlinear term 0:00:00.00 NaN % diagnostics 0:00:00.00 NaN % reinitialize 0:00:00.00 NaN % finishing 0:00:00.00 NaN % Total 0:00:43.86 100.0 % advance 0:00:00.00 0.0 % redistribute 0:00:04.27 9.7 % field solve 0:00:00.00 0.0 % collisions 0:00:00.00 0.0 % nonlinear term 0:00:00.00 0.0 % diagnostics 0:00:00.00 0.0 % reinitialize 0:00:00.00 0.0 % finishing 0:00:00.00 0.0 %
I like the layout as it indicates more clearly what is nested within something else.
However, there are some issues with this:
-
I would expect all of the top level times (initialization, advance, and finishing) to be nested within the total so that their percentage is the percentage of the total (currently they are all showing 100%)
-
Furthermore, I would expect the sub-items under total to be sums of the corresponding sub-items from each of the other top level timers (e.g. total redist = init redist + advance redist + finishing re-dist) - this is currently not the case.
-
The total time does not agree with an external timer nor does the sum of the top level timers agree with an external timer (this was the case previously).
-
I would expect the sum of the times of the sub-items to be the time of the top level item. I would expect the sum of the percentages of the sub-items to be 100%. Neither of these is currently the case. Do we need to add other timers, e.g. to time communication? Or just a catch-all category called 'other' or similar to mop up any un-timed computation?
-
Finishing has a time of zero so all the sub-item percentages are NaN or Inf. However, the redistribute sub-item is non-zero so I think the root of the problem is that the top level item has not correctly summed the sub-items, similar to the above issue.
On the plus side, the sum of the sub-item percentages multiplied by the top level item time is equal to the sum of the sub-item times so there is at least some agreement there.
Finally, writing out these issues makes me realise that we are effectively hard-coding our own implementation of profiling into gs2; this makes me worry that we could be slowing gs2 down unnecessarily for users who are not interested in this timing information. This has got me thinking...
Would we be better off testing whether we get any speed-up by removing the timers altogether? If not, then fine, we carry on fixing the timers. But if we do get speed-up, then maybe we're best off removing the timers and saying that anyone interested in profiling gs2 should use an external profiling tool. How about that?
-
-
Thanks for the feedback Steve. Like the idiot I am, I forgot to push the final commit... The current head of the branch should fix "finishing" problem, and mostly fix the sub-levels. You're right about how the totals should be displayed - I'll fix that.
I'll also look at the performance without timers. I'd be surprised if it's a big deal, and I think that people would want some indication of timing without doing external profiling... but we'll see how big an effect it is.
-
- changed milestone to 8.2
-
-
assigned issue to
-
assigned issue to
- Log in to comment