WatchDog: new thorn to automatically terminate jobs that hang

Declined
#1

Declined pull request

Already applied in git hash 9d7cc1a.

Closed by: ·2015-05-01

Description

WatchDog is a thorn that terminates jobs that make no progress within a user-defined time frame. It updates an internal timer at CCTK_ANALYSIS and uses the pthread library to spawn a watcher thread that periodically checks whether the timer has been updated. If the timer has not been updated for longer than the user-defined time frame, the thread calls "exit()" to terminate the process (and the job).
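The mechanism described above can be sketched in plain C as follows. This is only an illustration of the heartbeat-plus-watcher-thread pattern, not the code from the thorn: every name here (wd_touch, wd_start, wd_is_stale, the polling interval parameter) is hypothetical, and the use of wall-clock time() is an assumption. In the thorn itself, the heartbeat call would presumably sit in a routine scheduled at CCTK_ANALYSIS.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* All identifiers below are hypothetical; they are not taken from WatchDog. */
static pthread_mutex_t wd_lock = PTHREAD_MUTEX_INITIALIZER;
static time_t wd_last_update;        /* time of the most recent heartbeat */
static double wd_timeout_sec;        /* user-defined time frame */
static unsigned wd_check_every_sec;  /* how often the watcher polls */

/* Pure helper: nonzero if more than timeout_sec elapsed since the last update. */
int wd_is_stale(time_t last, time_t now, double timeout_sec)
{
    return difftime(now, last) > timeout_sec;
}

/* Heartbeat: record that the simulation has made progress. */
void wd_touch(void)
{
    pthread_mutex_lock(&wd_lock);
    wd_last_update = time(NULL);
    pthread_mutex_unlock(&wd_lock);
}

/* Watcher thread: poll the timer and terminate the process if it is stale. */
static void *wd_watch(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(wd_check_every_sec);

        pthread_mutex_lock(&wd_lock);
        time_t last = wd_last_update;
        pthread_mutex_unlock(&wd_lock);

        if (wd_is_stale(last, time(NULL), wd_timeout_sec)) {
            fprintf(stderr, "watchdog: no progress for more than %g s, exiting\n",
                    wd_timeout_sec);
            exit(EXIT_FAILURE);
        }
    }
    return NULL; /* not reached */
}

/* Start the watcher. Returns 0 on success, or a pthread error code. */
int wd_start(double timeout_sec, unsigned check_every_sec)
{
    pthread_t tid;
    int err;

    assert(timeout_sec > 0.0 && check_every_sec > 0);
    wd_timeout_sec = timeout_sec;
    wd_check_every_sec = check_every_sec;
    wd_touch(); /* initialise the timer before the watcher first reads it */

    err = pthread_create(&tid, NULL, wd_watch, NULL);
    if (err != 0)
        return err;
    return pthread_detach(tid);
}
```

A caller would invoke wd_start(timeout, interval) once at startup and wd_touch() on every analysis step. The timer is guarded by a mutex because it is written by the main thread and read by the watcher; a monotonic clock might be preferable to time() on systems where the wall clock can jump.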
