Monit service status shows Timeout although the service started successfully and is running

Issue #92 resolved
chiremat created an issue

When I stop the service, Monit detects the service restart but fails to show the correct status in the Monit CLI/UI. When I check the status of that particular service directly, it is actually in the running state.

[root@copper-eventbroker-01 ~]# monit summary
The Monit daemon 5.5 uptime: 11m

Process 'sensu-client'                  Execution failed
Process 'redis'                         Running
Process 'lumberjack'                    Running
Process 'ls-lumberjack'                 Running
Process 'ls-huron-event'                Running
Process 'logstash-central-huron-event'  Running
Process 'proxy'                         Running
Process 'collectd'                      Timeout
System 'copper-eventbroker-01'          Running
[root@copper-eventbroker-01 ~]# service collectd status
collectdmon (pid 10156) is running...
[root@copper-eventbroker-01 ~]# monit -V
This is Monit version 5.5
Copyright (C) 2001-2012 Tildeslash Ltd. All Rights Reserved.
[root@copper-eventbroker-01 ~]#
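
For anyone comparing Monit's view with the real service state, it can help to list only the entries that Monit does not report as Running. The following is a minimal sketch, assuming the one-entry-per-line layout of `monit summary` in 5.5 (type, quoted name, status); the function name `not_running` is hypothetical and not part of Monit.

```shell
#!/bin/sh
# not_running: read `monit summary` output on stdin and print the name and
# status of every Process entry whose status is not "Running".
# Assumed entry layout:  Process 'name'   Status text
not_running() {
    # Split on single quotes: $1 = type, $2 = service name, $3 = status.
    awk -F"'" '$1 ~ /^Process[ \t]*$/ {
        status = $3
        sub(/^[ \t]+/, "", status)   # trim leading padding
        sub(/[ \t]+$/, "", status)   # trim trailing padding
        if (status != "Running") print $2 ": " status
    }'
}
```

It would be used as `monit summary | not_running`. Each name it prints can then be checked by hand, for example with `service collectd status`.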

Comments (5)

  1. Tildeslash repo owner

    Please check the Monit logs; it may just be a configuration issue.

    On some platforms (RHEL/CentOS) there was a problem with program execution; the upcoming 5.9 release fixes it.

    You can test the development branch if you want to. A snapshot is available here: https://bitbucket.org/tildeslash/monit/get/master.tar.gz

    To compile:

    tar -xzf master.tar.gz
    cd tildeslash*
    ./bootstrap
    ./configure
    make
    
  2. chiremat Account Deactivated reporter

    Thanks for the information.

    I would like to know more about the issue.

    My config is as below:

    check process collectd with pidfile /var/run/collectd/collectdmon.pid
      start program = "/etc/init.d/collectd start"
      stop program = "/etc/init.d/collectd stop"
      if changed pid then exec "/etc/sensu/plugins/monit-trigger.sh 'collectd PID changed' '' 1"
      if 1 restarts within 1 cycles then exec "/etc/sensu/plugins/monit-trigger.sh 'collectd restarted' '' 1"

    Steps I took to hit the issue:

    1. I stopped the service manually with "service collectd stop".
    2. After that, "monit summary" shows Timeout, but "service collectd status" shows that it is running.

    What could be the cause? Please help me understand this issue.
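
Because the check above is tied to a pidfile, one way to narrow the problem down is to verify that the PID recorded in that file belongs to a live process. This is a minimal sketch under that assumption; the function name `pidfile_alive` is hypothetical, and it says nothing about how Monit itself performs the check.

```shell
#!/bin/sh
# pidfile_alive PIDFILE: succeed if PIDFILE holds a numeric PID that refers
# to a live process. Note that kill -0 also fails when the caller lacks
# permission to signal the process, so run this as root.
pidfile_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1          # missing or unreadable pidfile
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;           # empty or not a plain number
    esac
    kill -0 "$pid" 2>/dev/null             # signal 0 only tests for existence
}
```

For the configuration in this report it could be called as `pidfile_alive /var/run/collectd/collectdmon.pid && echo alive`.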

  3. Tildeslash repo owner

    A few questions:

    1.) Can you please check the Monit log mentioned in the previous post? If there was a problem, it will provide more details.
    2.) What platform is it (for example CentOS 6.3)?
    3.) What Monit version is it?
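
When Monit logs through syslog, the entries for a single service can be pulled out with a small filter. This is a sketch only: the function name `monit_log_for` is hypothetical, and `/var/log/messages` is an assumed default; the actual location depends on the `set logfile` line in monitrc.

```shell
#!/bin/sh
# monit_log_for SERVICE [LOGFILE]: print the syslog lines that the Monit
# daemon wrote about SERVICE. The default log path is an assumption.
monit_log_for() {
    svc=$1
    logfile=${2:-/var/log/messages}
    # Matches lines such as:  monit[3722]: 'collectd' process is not running
    grep "monit\[[0-9]*]: '$svc' " "$logfile"
}
```

For example, `monit_log_for collectd` would list what Monit recorded around the restart.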

  4. chiremat Account Deactivated reporter

    Monit logs:

    monit[3722]: 'collectd' process is not running
    Sep 17 18:13:42 host-192-168-60-13 monit[3722]: 'collectd' trying to restart
    Sep 17 18:13:42 host-192-168-60-13 monit[3722]: 'collectd' start: /etc/init.d/collectd : monit[3722]: 'collectd' service restarted 1 times within 1 cycles(s) - exec
    Sep 17 18:13:57 host-192-168-60-13 monit[3722]: 'collectd' exec: /opt/monit/plugin/monit_syslog_plugin.sh
    Sep 17 18:13:57 host-192-168-60-13 monit[3722]: 'collectd' process is running with pid 11117
    Sep 17 18:13:57 host-192-168-60-13 collectd: SAEVENT { "ts": "Wed, 17 Sep 2014 18:13:57", "eventID": "MonitServiceStatusEvent", "eventLevel": "2", "category": "system_event", "eventSource" : [{ "hostname" : "copper-eventbroker-01"}, {"serviceID" : "MONIT" }] , "dataParam" : [ {"serviceName" : "collectd"}, {"serviceStatus": "Timeout"}, {"description": "collectd restarted" } , {"eventOriginator":"monit"} ] }

    2. The platform is CentOS 6.5.
    3. The Monit version is 5.5.
