Option to delay check failures due to long process spinup

Issue #284 closed
Former user created an issue

Right now we are testing a setup in which Monit monitors Logstash and triggers Keepalived to fail over should Logstash crash or lock up. Crashes are easy to catch with Monit. To detect lockups, we use an HTTP check whose requests Logstash is configured to drop, since they exist only for health-check purposes. The problem is that Logstash (due to Java and Ruby) takes a long time to spin up. The PID is online immediately, but our HTTP check fails for upwards of 30 seconds.

What I propose is a keyword: "SPINUP DELAY FOR x"

The reason is so that Monit can handle starting Logstash (using Monit to invoke it at boot, as advertised) but then wait out the spinup delay before running any "IF FAILED" checks. Example:

check process logstash with pidfile /var/run/logstash.pid
  start program = "/etc/init.d/logstash start"
  stop program = "/etc/init.d/logstash stop"
  SPINUP DELAY FOR 60
  if failed
    host 127.0.0.1 port 58888 protocol http
    request "/"
    status = 200
  then restart
  if 3 restarts within 10 cycles then exec "/opt/keepalived/force_fault_state.sh"
  if 4 restarts within 10 cycles then timeout

This runs the start program when the PID check fails, but holds off the HTTP test (and any other tests within this CHECK block) for 60 seconds.

Thanks!

Comments (1)

  1. Tildeslash repo owner

    This feature is already implemented: Monit delays the connection tests for the duration of the start program's timeout. For example, the following postpones the connection test by 60 seconds after a process restart:

    check process logstash with pidfile /var/run/logstash.pid
      start program = "/etc/init.d/logstash start" with timeout 60 seconds
      stop program = "/etc/init.d/logstash stop"
      if failed
        host 127.0.0.1 port 58888 protocol http
      then restart
    

    If it doesn't work for you, please check the monit version (monit -V) and upgrade monit.
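
Applied to the check from the original report, the suggestion above might look like the following sketch. It only merges the two configurations quoted in this thread and is untested. The restart rules are written in the "within N cycles" form, and whether the start timeout actually delays the HTTP test depends on the Monit version, as noted above.

```
check process logstash with pidfile /var/run/logstash.pid
  # The start timeout doubles as the spinup grace period:
  # per the comment above, connection tests are postponed until it elapses.
  start program = "/etc/init.d/logstash start" with timeout 60 seconds
  stop program = "/etc/init.d/logstash stop"
  # HTTP health check from the original report, unchanged.
  if failed
    host 127.0.0.1 port 58888 protocol http
    request "/"
    status = 200
  then restart
  # Escalation rules from the original report.
  if 3 restarts within 10 cycles then exec "/opt/keepalived/force_fault_state.sh"
  if 4 restarts within 10 cycles then timeout
```

The 60-second value is the one proposed in the report. Since the HTTP check was observed to fail for upwards of 30 seconds, a shorter timeout may also be sufficient.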
