'Total time for simulation' wrong for Carpet

Issue #1115 closed
Frank Löffler created an issue

For some reason, the 'Total time for simulation' printed by Cactus is wrong when using Carpet (I say this because a similar parameter file for PUGH doesn't show this effect). It is roughly twice what I would expect.

Reproduce: take par/static_tov.par, reduce the final_time to something more manageable, e.g. 2.0, and add Cactus timer output at the end (cactus::cctk_timer_output = "full"). Run under the Unix command 'time' with one process and one thread.

What I see is, e.g., 'time' reporting 2m29.615s (about 150 s), 'Total time for CCTK_EVOL' (by far the largest chunk) at 121.2, but 'Total time for simulation' at 296.0. I don't see any noticeable startup or shutdown delays. I would expect the 'total' time to be about half of what is shown. I confirmed (using 'top') that this is indeed only using one thread.

Comments (9)

  1. Ian Hinder

    This is because the corresponding timer ("CCTK total time") is stopped twice. Carpet (in Shutdown.cc) stops this timer before shutdown, because it caused problems with PAPI (ask Erik for more details). When Cactus then prints the values of the timers, including this one, with cctk_timer_output = "full" on shutdown, it assumes that the timer is running, stops it, then restarts it. It appears that stopping a Cactus timer twice causes its value to double. Instead, it should be an error. I propose the following changes:

    1. Stopping a Cactus timer which is not running should be an error (maybe a level 1 warning).
    2. The logic in Cactus ScheduleInterface.c which prints the timers at the end of a run should check to see if the timers are running; shutdown is a special case, so special-case logic is OK.
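
The doubling described above can be reproduced with a toy model. This is a minimal sketch, not the Cactus timer code: it assumes a timer that adds the interval since its last start on every stop and keeps no record of whether it is running. The names `toy_timer`, `toy_start`, `toy_stop`, and `toy_reported_total` are hypothetical, and times are passed in explicitly so the behaviour is deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Toy timer, NOT the Cactus implementation: it accumulates elapsed
   seconds on every stop and does not track whether it is running. */
typedef struct {
  double start; /* time of the most recent start */
  double total; /* accumulated seconds */
} toy_timer;

void toy_start(toy_timer *t, double now) {
  assert(t != NULL);
  t->start = now;
}

/* Adds the interval since the last start, even if the timer was
   already stopped; this is the assumed source of the double count. */
void toy_stop(toy_timer *t, double now) {
  assert(t != NULL);
  t->total += now - t->start;
}

/* Simulates a run of `duration` seconds whose timer is stopped
   `stops` times at the end, and returns the reported total. */
double toy_reported_total(double duration, int stops) {
  toy_timer t = {0.0, 0.0};
  toy_start(&t, 0.0);
  for (int i = 0; i < stops; i++)
    toy_stop(&t, duration);
  return t.total;
}
```

In this model a 150 s run stopped once reports 150 s, while the same run stopped twice reports 300 s, which is at least consistent with the roughly doubled total in the report.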
  2. Frank Löffler reporter

    The attached patch adds a flag indicating whether a timer is running. Starting a timer that is already running, or stopping one that is not running, will produce a level 1 warning and will otherwise do nothing (the timer is not actually started or stopped twice). This fixes the issue at hand, although it will now produce that warning for the total simulation time timer, which is stopped multiple times within Carpet.
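
The idea of the running flag can be sketched as follows. This is not the attached patch: the type `flag_timer` and the functions `flag_timer_start` and `flag_timer_stop` are hypothetical, the `fprintf` call merely stands in for a level 1 warning, and times are passed in explicitly to keep the example deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical timer with a running flag (illustration only). */
typedef struct {
  double start; /* time of the most recent successful start */
  double total; /* accumulated seconds */
  int running;  /* 1 while the timer is running, 0 otherwise */
} flag_timer;

/* Returns 0 if the timer was started, -1 if the call was ignored. */
int flag_timer_start(flag_timer *t, double now) {
  assert(t != NULL);
  if (t->running) {
    /* Stand-in for a level 1 warning. */
    fprintf(stderr, "WARNING: timer already running; start ignored\n");
    return -1;
  }
  t->start = now;
  t->running = 1;
  return 0;
}

/* Returns 0 if the timer was stopped, -1 if the call was ignored. */
int flag_timer_stop(flag_timer *t, double now) {
  assert(t != NULL);
  if (!t->running) {
    /* Stand-in for a level 1 warning. */
    fprintf(stderr, "WARNING: timer not running; stop ignored\n");
    return -1;
  }
  t->total += now - t->start;
  t->running = 0;
  return 0;
}
```

With this guard, a second stop leaves the accumulated total unchanged and only produces the warning.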

  3. Erik Schnetter

    I notice that the logic that keeps track of which timers are running is handled at a lower level than the logic that checks this and emits warnings. Why?

  4. Erik Schnetter

    It usually doesn't work to output timers that are currently running. Is there a check for this? (I can't see this in the diff.)

  5. Frank Löffler reporter

    There are checks at the same level. These checks prevent double starting/stopping, but don't emit a warning. The reason for that is that at this level there isn't enough information readily available to create a meaningful error message. There are additional checks one level up (cctk_FunI instead of cctkI_Fun) which emit the warnings (but still call the lower level).
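
    The split between the two levels might look roughly like this. It is a sketch under assumptions, not the code in the patch: the lower level (`low_timer`, `low_stop`) knows nothing about timer names and guards silently, while the upper level (`named_timer`, `named_stop`) knows the name, emits the warning, and still calls the lower level. All of these names are hypothetical, and `fprintf` stands in for a level 1 warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Lower level: no name available, so no meaningful message possible. */
typedef struct {
  double start;
  double total;
  int running;
} low_timer;

/* Silently ignores a stop on a timer that is not running. */
void low_stop(low_timer *t, double now) {
  assert(t != NULL);
  if (!t->running)
    return;
  t->total += now - t->start;
  t->running = 0;
}

/* Upper level: carries the information needed for a useful warning. */
typedef struct {
  const char *name;
  low_timer core;
} named_timer;

/* Warns if the timer is not running, then calls the lower level
   anyway. Returns 1 if a warning was emitted, 0 otherwise. */
int named_stop(named_timer *t, double now) {
  assert(t != NULL);
  int warned = 0;
  if (!t->core.running) {
    fprintf(stderr, "WARNING: timer '%s' stopped while not running\n",
            t->name);
    warned = 1;
  }
  low_stop(&t->core, now); /* harmless: the lower level guards itself */
  return warned;
}
```

    Keeping the guard in the lower level means that a caller which bypasses the upper level still cannot double count; the upper level only adds the diagnostic.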

    There is no change to output of running timers. We could do this, but it would be another issue with a separate patch.

  6. Erik Schnetter
    • changed status to open

    Starting/stopping timers gives wrong results, outputting running timers gives wrong results... Sure, we can open another ticket for this.

    Please apply.
