Status failed - monitor pending

Issue #296 closed
Matjaz Skerjanec created an issue

I have configured monit to monitor a few processes on a production system. It works fine, but from time to time I get "Status failed":

Program 'check_system4'
  status                            Status failed
  monitoring status                 Monitored
  last started                      Thu, 10 Dec 2015 08:58:57
  last exit value                   1
  data collected                    Thu, 10 Dec 2015 08:58:57

When I try to reset it, it goes to "Status failed - monitor pending":

Program 'check_system4'
  status                            Status failed - monitor pending
  monitoring status                 Monitored
  last started                      Thu, 10 Dec 2015 08:58:57
  last exit value                   1
  data collected                    Thu, 10 Dec 2015 08:58:57

I'm trying to reset it with these commands:

/opt/monit/bin/monit -c /opt/monit/conf/monitrc unmonitor check_system4
/opt/monit/bin/monit -c /opt/monit/conf/monitrc monitor check_system4

The only way to get it working is to stop and start the monit daemon, which is not ideal since everyone gets a notification in that case.

What is the proper way to get the correct status of a monitored instance?
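
A minimal sketch for reading the status from a script after the two commands, so the "monitor pending" flag can be watched without restarting the daemon. The service_status helper is hypothetical (it is not part of monit) and assumes the "monit status" layout shown above: an unindented header line ending in the quoted service name, followed by indented fields.

```shell
#!/bin/bash
# Print the "status" field of one service from `monit status` output on stdin.
# $1 = service name as it appears between the quotes in the header line.
# Hypothetical helper; assumes the output layout quoted in this issue.
service_status() {
  awk -v want="'$1'" '
    # An unindented line starts a new service block (or is the daemon banner).
    /^[^ \t]/ { in_block = ($NF == want) }
    # Inside the wanted block, print the value of the "status" field and stop.
    in_block && $1 == "status" { $1 = ""; sub(/^ +/, ""); print; exit }
  '
}

# Example, using the paths from this report:
# /opt/monit/bin/monit -c /opt/monit/conf/monitrc status | service_status check_system4
```

The function prints nothing if the service is not found, so an empty result can be treated as "unknown" by the caller.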

Thank you, Matjaz

Comments (17)

  1. Tildeslash repo owner

    Which monit version is it? ("monit -V")

    Can you please provide the configuration of the "check_system4" service and the output from the monit log?

  2. Matjaz Skerjanec reporter

    Monit version 5.15. This is the part for system4 from ./conf.d/; the check is running on system3:

    check program check_system4 with path "/opt/monit/sbin/check_system4.sh"
      if status != 0 for 5 cycles then alert

    The shell script sends a request with curl and expects a specific result.
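
    The script itself was not posted. A minimal sketch of what such a check could look like, assuming a placeholder URL and expected text; the function names are illustrative only, and the failure message mirrors the one monit logged for this check.

    ```shell
    #!/bin/bash
    # Compare a response body against the expected text.
    # $1 = body, $2 = expected substring. Returns 0 if found, 1 otherwise.
    evaluate_response() {
      case "$1" in
        *"$2"*) echo "OK - system4 test passed"; return 0 ;;
        *)      echo "CRITICAL - system4 test failed!"; return 1 ;;
      esac
    }

    # Fetch the URL with a hard time limit, then evaluate the body.
    # A failed or timed-out request is treated as an empty body.
    # $1 = URL, $2 = expected substring.
    check_url() {
      local body
      body=$(curl --silent --fail --max-time 10 "$1") || body=""
      evaluate_response "$body" "$2"
    }

    # In the real script (placeholder URL and expected text):
    # check_url "http://system4.example/health" "OK"; exit $?
    ```

    Monit only looks at the exit status and the printed message, so keeping the comparison in its own function makes it easy to test without a network connection.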

  3. Matjaz Skerjanec reporter

    In the log file I first get an alarm:

    Dec 10 08:57:57 system3 monit[24182]: 'check_system4' '/opt/monit/sbin/check_system4.sh' failed with exit status (1) -- CRITICAL - system4 test failed!

    and later on, when I try to wake it up:

    Dec 11 14:34:36 system3 monit[24182]: 'check_system4' monitor on user request
    Dec 11 14:34:36 system3 monit[24182]: Monit daemon with PID 24182 awakened

    But the status is still "Status failed - monitor pending".

  4. Matjaz Skerjanec reporter

    I wonder whether the service is monitored now, or do I have to restart monit?

    Program 'check_system4'
      status                            Status failed - monitor pending
      monitoring status                 Monitored

  5. Tildeslash repo owner

    The service is still monitored; only the monitor action flag remains set, and you can ignore it.

    The problem could be related to issue #283, which is already fixed in the development version. We'll investigate it.

  6. Matjaz Skerjanec reporter

    Hello, can you tell me when the development (beta) version is expected to be GA? We also purchased a licence for MMONIT and would like to use it in production asap if possible.

    Thanks, mates

  7. Tildeslash repo owner

    Hello, we cannot replicate the issue (using monit 5.15 and the development version too).

    When you request an action, monit sets a flag and handles that action as soon as possible. In your case it seems that monit hung on some other service test, so it cannot handle the requested action (the timestamp doesn't change: "Thu, 10 Dec 2015 08:58:57"). The problem is not related to the "check_system4" configuration, but rather to some other service check (maybe one with a long timeout).
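
    If a slow check is indeed the cause, one way to keep it from stalling the others is to bound its runtime inside the check script. A sketch, assuming GNU coreutils "timeout" is available; run_bounded and the message text are hypothetical names used only for illustration.

    ```shell
    #!/bin/bash
    # Run a check command with an upper bound on its runtime.
    # $1 = limit in seconds, remaining arguments = command to run.
    # Relies on coreutils `timeout`, which exits with 124 when the limit is hit.
    run_bounded() {
      local limit=$1
      shift
      timeout "$limit" "$@"
      local rc=$?
      if [ "$rc" -eq 124 ]; then
        echo "CRITICAL - check timed out after ${limit}s"
      fi
      return "$rc"
    }

    # Example with the script path from this issue:
    # run_bounded 30 /opt/monit/sbin/check_system4.sh
    ```

    The wrapped command's own exit status is passed through unchanged, so an existing "if status != 0" rule keeps working.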

    Can you please start monit in debug mode and provide the output when it hangs?

    monit -vI
    

    You can get the development version snapshot from bitbucket (https://bitbucket.org/tildeslash/monit)

  8. Matjaz Skerjanec reporter

    OK, thank you for the information. I will run it in debug mode the next time it hangs and let you know.

  9. Tildeslash repo owner

    Debug mode would have to be enabled before the hang, since in that mode monit logs every operation. If the hang cannot be easily reproduced, it would be better to take a core dump or at least get a stack trace:

    gdb <path to monit> <pid of monit>
    (gdb) thread apply all bt
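
    The same stack trace can also be captured non-interactively, which is easier to script when the hang is rare. A sketch, assuming gdb is installed and the caller is allowed to attach to the process; capture_stacks and the GDB override variable are hypothetical names used only for illustration.

    ```shell
    #!/bin/bash
    # Attach to a running process, write the backtrace of every thread to a
    # file, then let gdb exit (batch mode detaches from an attached process).
    # $1 = PID, $2 = output file.
    # GDB may be set to point at a different gdb binary (illustrative only).
    capture_stacks() {
      local pid=$1 out=$2
      "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
    }

    # Example, using the PID from the log lines in this issue:
    # capture_stacks 24182 /tmp/monit-stacks.txt
    ```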
    
  10. Matjaz Skerjanec reporter

    The hang has not recurred so far. I have your instructions and will switch on debug mode if the hangs continue to appear.

  11. oniseijin

    I am seeing this as well with a simple wrapper script around docker (replace {{service_name}} with any docker container name):

    #!/bin/bash
    # Pass the exit status of "docker top" through to monit.
    sudo docker top {{service_name}}
    exit $?
    

    config:

    check program {{service_name}} with path /etc/monit/monit_{{service_name}}_docker_top.sh
      if status != 0 then alert
    

    Output:

    status                            Status failed
    monitoring status                 Monitored
    not yet started
    data collected                    Sun, 25 Sep 2016 04:47:40

    Ubuntu 14 LTS, monit simply installed with apt-get:

    monit:
      Installed: 1:5.6-2
      Candidate: 1:5.6-2
      Version table:
     *** 1:5.6-2 0
            500 http://us.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages
            100 /var/lib/dpkg/status
    
  12. Tildeslash repo owner
    • changed status to closed
    • edited description

    Unable to reproduce the problem and no data is available, so closing the issue.

    The latest monit version is 5.24.0; there have been a lot of changes since then, including fixes to the service error flags.
