WatchDog: new thorn to automatically terminate jobs that hang
David Radice
Branch: dradice/cactusutils:watchdog
Branch: cactuscode/cactusutils:master
Declined pull request
Already applied in git hash 9d7cc1a.
Closed by: Roland Haas·2015-05-01
WatchDog is a thorn that terminates jobs that do not make progress over a user-defined time frame. WatchDog updates an internal timer at CCTK_ANALYSIS and uses the pthread library to spawn a watcher thread that periodically checks whether the timer has been updated. If the timer has not been updated for longer than the user-defined time frame, the thread calls "exit()" to terminate the process (and the job).
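For readers unfamiliar with the pattern, the mechanism described above can be sketched in plain C roughly as follows. This is a minimal standalone illustration, not the thorn's actual code: every name here (watchdog_t, watchdog_start, watchdog_beat, watchdog_stop, watchdog_abort_job, and the millisecond parameters) is hypothetical, and the Cactus scheduling glue is left out. In a thorn, the heartbeat call would presumably sit in a routine scheduled at CCTK_ANALYSIS.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical watchdog state; none of these names come from the thorn. */
typedef struct {
    atomic_llong last_beat_ms;  /* monotonic time of the last heartbeat */
    long long timeout_ms;       /* allowed time without a heartbeat */
    long poll_ms;               /* how often the watcher thread checks */
    void (*on_hang)(void);      /* action on timeout; NULL = record only */
    atomic_int fired;           /* set to 1 once a timeout was detected */
    atomic_int stop;            /* set to 1 to ask the watcher to quit */
    pthread_t thread;
} watchdog_t;

static long long now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000LL + ts.tv_nsec / 1000000L;
}

static void sleep_ms(long ms) {
    struct timespec t = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&t, NULL);
}

/* Watcher thread: wake up periodically and compare the timer to the limit. */
static void *watchdog_watch(void *arg) {
    watchdog_t *wd = arg;
    for (;;) {
        sleep_ms(wd->poll_ms);
        if (atomic_load(&wd->stop))
            break;
        if (now_ms() - atomic_load(&wd->last_beat_ms) > wd->timeout_ms) {
            atomic_store(&wd->fired, 1);
            if (wd->on_hang)
                wd->on_hang();
            break;
        }
    }
    return NULL;
}

/* Timeout action matching the description: terminate the whole process. */
void watchdog_abort_job(void) {
    fprintf(stderr, "watchdog: no progress within the time limit, exiting\n");
    exit(EXIT_FAILURE);
}

/* Start the watcher; returns 0 on success (the pthread_create result). */
int watchdog_start(watchdog_t *wd, long long timeout_ms, long poll_ms,
                   void (*on_hang)(void)) {
    assert(timeout_ms > 0 && poll_ms > 0);
    wd->timeout_ms = timeout_ms;
    wd->poll_ms = poll_ms;
    wd->on_hang = on_hang;
    atomic_store(&wd->fired, 0);
    atomic_store(&wd->stop, 0);
    atomic_store(&wd->last_beat_ms, now_ms());
    return pthread_create(&wd->thread, NULL, watchdog_watch, wd);
}

/* Heartbeat: call this whenever the simulation has made progress. */
void watchdog_beat(watchdog_t *wd) {
    atomic_store(&wd->last_beat_ms, now_ms());
}

/* Ask the watcher to quit and wait for it (e.g. at normal shutdown). */
void watchdog_stop(watchdog_t *wd) {
    atomic_store(&wd->stop, 1);
    pthread_join(wd->thread, NULL);
}
```

A caller would start the watcher once with something like watchdog_start(&wd, 3600 * 1000LL, 1000, watchdog_abort_job), call watchdog_beat(&wd) after each completed iteration, and call watchdog_stop(&wd) on normal termination. Passing NULL as the last argument makes the sketch only record the timeout in wd.fired, which is convenient for testing without killing the process. The timer is stored as an atomic because it is written by the main thread and read by the watcher.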