"failed to stop" always after 60 seconds

Issue #109 resolved
Manuel Meurer created an issue

I'm using Monit 5.8.1 to monitor a background job queue (Sidekiq) with the following config:

check process sidekiq
  with pidfile /var/run/sidekiq.pid
  start program = "/bin/su - chief -c 'cd /var/www/myapp && RAILS_ENV=production bundle exec sidekiq -C /var/www/myapp/config/sidekiq.yml'" with timeout 60 seconds
  stop program = "/bin/su - chief -c 'cd /var/www/myapp && bundle exec sidekiqctl shutdown /var/www/myapp/tmp/sidekiq.pid 70'" with timeout 80 seconds
  if totalmem > 1000 MB for 5 cycles then alert
  if cpu > 50% for 5 cycles then alert
  if cpu > 90% for 5 cycles then restart
  if 3 restarts within 5 cycles then timeout

The problem appears when I try to stop the process. After exactly 60 seconds, Monit always reports "failed to stop" and immediately afterwards "stop action done":

[EST Nov 12 08:56:12] info     : 'sidekiq' stop on user request
[EST Nov 12 08:56:12] info     : monit daemon with PID 22405 awakened
[EST Nov 12 08:56:12] info     : Awakened by User defined signal 1
[EST Nov 12 08:56:12] info     : 'sidekiq' stop: /bin/su
[EST Nov 12 08:57:12] error    : 'sidekiq' failed to stop
[EST Nov 12 08:57:13] info     : 'sidekiq' stop action done

I would expect Monit to report "failed to stop" only after 80 seconds, the timeout configured for the stop program, if the process cannot be stopped.

I changed the daemon check interval (set daemon), which was initially set to 60, to other values such as 120 and 20, but the behaviour didn't change.

Comments (3)

  1. Tildeslash repo owner

    The problem was fixed in Monit 5.9 (the timeout was inherited from the start command). We recommend upgrading to the latest version (5.10).

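
For installations that cannot upgrade right away, the maintainer's explanation suggests a possible workaround on Monit 5.8.1: if the stop action inherits its timeout from the start command, raising the start timeout to the value wanted for the stop program should give the stop action the full 80 seconds. This is an untested sketch based only on that explanation, not a documented fix. The only change from the configuration in the report is the start timeout (60 to 80 seconds).

```
check process sidekiq
  with pidfile /var/run/sidekiq.pid
  # Assumption: on 5.8.1 the stop timeout is inherited from this start timeout,
  # so it is raised from 60 to 80 seconds to match the intended stop timeout.
  start program = "/bin/su - chief -c 'cd /var/www/myapp && RAILS_ENV=production bundle exec sidekiq -C /var/www/myapp/config/sidekiq.yml'" with timeout 80 seconds
  stop program = "/bin/su - chief -c 'cd /var/www/myapp && bundle exec sidekiqctl shutdown /var/www/myapp/tmp/sidekiq.pid 70'" with timeout 80 seconds
```

On Monit 5.9 or later the original configuration should not need this change, since the stop program's own timeout is honoured there according to the reply above.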