monit got stuck on "check file"

Issue #565 closed
ethaniel 1 created an issue

I use this configuration to remount NFS if it becomes stale/broken:

check file nfs_alive with path /www/do_not_remove_nfscheck.txt
if does not exist then exec "/bin/bash -c '/bin/umount -fl /www/sessions; /bin/umount -fl /www; /bin/mount /www ; /bin/mount /www/sessions'" 

Today, monit got stuck in a loop checking if a file exists without actually performing the exec:

[MSK Feb 25 00:13:25] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 00:13:25] info     : 'nfs_alive' exec: /bin/bash
[MSK Feb 25 00:13:29] info     : 'nfs_alive' file exists
[MSK Feb 25 00:45:01] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 00:45:01] info     : 'nfs_alive' exec: /bin/bash
[MSK Feb 25 00:45:04] info     : 'nfs_alive' file exists
[MSK Feb 25 00:48:00] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 00:48:00] info     : 'nfs_alive' exec: /bin/bash
[MSK Feb 25 00:48:04] info     : 'nfs_alive' file exists
[MSK Feb 25 17:48:38] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:48:38] info     : 'nfs_alive' exec: /bin/bash
[MSK Feb 25 17:48:42] info     : 'nfs_alive' file exists
[MSK Feb 25 17:48:45] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:48:45] info     : 'nfs_alive' exec: /bin/bash
[MSK Feb 25 17:48:48] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:48:51] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:48:54] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:48:57] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:00] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:03] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:06] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:09] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:12] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:15] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:18] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:21] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:25] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:28] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:31] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:34] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:37] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:40] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:43] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:46] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:49] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:52] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:55] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:49:58] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:01] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:04] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:07] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:10] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:13] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:16] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:19] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:22] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:25] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:28] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:31] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:34] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:37] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:40] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:43] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:46] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:49] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:52] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:56] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:50:59] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:51:02] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:51:05] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:51:08] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:51:11] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:51:14] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:51:17] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:51:20] error    : 'nfs_alive' file doesn't exist
[MSK Feb 25 17:51:23] error    : 'nfs_alive' file doesn't exist

Comments (6)

  1. Tildeslash repo owner

    The behaviour is correct: Monit 5.16.0 and later executes the action only once (on state change). Changelog excerpt:

    New: The exec action is now executed only once, on state change,
    same way as the alert action. The new "repeat" option allows to
    repeat the exec action after given number of cycles if the error persists.
    Syntax:
        if <test> then exec <script> repeat every <x> cycles
    If you want to get the old behaviour, use "repeat every 1 cycle". Example:
        if failed port 1234 then exec "/usr/bin/myscript.sh" repeat every 5 cycles
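
    Applied to the configuration from the report, the "repeat" option could look like the sketch below. It assumes the same paths and commands as the original check and has not been tested against a stale NFS mount; the comment line is an illustrative addition.

        # Re-run the remount on every cycle for as long as the file is missing.
        check file nfs_alive with path /www/do_not_remove_nfscheck.txt
            if does not exist then exec "/bin/bash -c '/bin/umount -fl /www/sessions; /bin/umount -fl /www; /bin/mount /www ; /bin/mount /www/sessions'"
                repeat every 1 cycle

    A larger value, such as "repeat every 5 cycles", would give the remount more time to complete before the command is run again.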
    
  2. ethaniel 1 reporter

    Thank you! However, I've noticed that this check doesn't seem to have a timeout, which leads to monit giving all its resources to checking for the file's existence (notice the loop) instead of checking the well-being of other processes. Or am I wrong?

  3. Tildeslash repo owner

    Monit just checks the file once per cycle and then continues with the other service checks. A message is normally logged only in the case of an error, so the service checks that were successful are silent.

    You can run monit in debug mode (-v option) to see progress of all checks.
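
    As a minimal sketch, debug mode can be started from a shell as shown below. The -I flag (stay in the foreground) is an addition not mentioned above, and the command assumes that no other Monit instance is already running:

        monit -vI

    With this, each check should be reported on the terminal as it runs, which makes it easier to confirm that the other services are still being tested.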
