Stop monit command kills my processes
I couldn't find any information on this, but it looks like when I change the configuration via puppet, the module restarts monit, i.e.
/sbin/stop monit
/sbin/start monit
However crude it might be, I think it produces an undesirable effect on my system that needs to be avoided.
Monit kills some monitored processes during exit and starts them again. This is of course not intended, especially when it happens during production hours.
I noticed that if those processes are started while monit runs in the foreground, they are not killed when a simple kill is issued to the monit process. A subsequent restart through the upstart init has no effect on them either.
I monitor rsyslogd, sshd, mysql, ntpd, winbindd, and our application x. Only rsyslogd and application x are affected, i.e. killed on /sbin/stop monit.
Thanks,
Comments (18)
-
repo owner -
repo owner One more note: also check the configuration of your service framework (upstart?) for Monit. The configuration behind "/sbin/stop monit" may also contain some service-related actions, but that is third-party and not Monit itself.
You can see examples for upstart and systemd setup here:
-
reporter Monit doesn't tell much in logs
Oct 22 02:03:56 node1cl2-chi monit[4403]: Shutting down monit HTTP server
Oct 22 02:03:56 node1cl2-chi rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="4452" x-info="http://www.rsyslog.com"] exiting on signal 15.
Oct 22 02:05:29 node1cl2-chi rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="6296" x-info="http://www.rsyslog.com"] start
Oct 22 02:05:44 node1cl2-chi monit[6257]: 'xbroker' process is running with pid 6282
Oct 22 02:05:44 node1cl2-chi monit[6257]: 'rsyslogd' process is running with pid 6296
I have CentOS 6.5
This is my upstart file
# File managed by puppet on node1cl2-chi.siptalk.com.
# Changes made to this file outside of puppet will be lost on the next puppet run!
# xcast::roles::xbroker::inittab
#
# To install disable the old way of doing things:
#
# /etc/init.d/monit stop && update-rc.d -f monit remove
#
# then put this script here:
#
# /etc/init/monit.conf
#
# and reload upstart configuration:
#
# initctl reload-configuration
#
# You can manually start and stop monit like this:
#
# start monit
# stop monit
#
description "Monit service manager"
limit core unlimited unlimited
limit nofile 131072 196608
start on runlevel [2345]
stop on runlevel [!2345]
expect daemon
respawn
exec /usr/bin/monit -c /etc/monitrc
pre-stop exec /usr/bin/monit -c /etc/monitrc quit
I can reproduce the issue just by running stop monit / start monit from the command line. I don't need puppet to trigger it.
-
reporter It looks like if monit started the processes below, they are killed with signal 15 at the moment monit is stopped.
Oct 22 02:54:43 node1cl2-chi monit[12202]: Shutting down monit HTTP server
Oct 22 02:54:43 node1cl2-chi rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="12241" x-info="http://www.rsyslog.com"] exiting on signal 15.
Oct 22 02:56:19 node1cl2-chi rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="12554" x-info="http://www.rsyslog.com"] start
Oct 22 02:56:34 node1cl2-chi monit[12515]: 'xbroker' process is running with pid 12540
Oct 22 02:56:34 node1cl2-chi monit[12515]: 'rsyslogd' process is running with pid 12554
I ran strace -ff on stop monit, but I don't see anything unusual or related to external processes. However, notice how rsyslogd exits on signal 15 as soon as monit shuts down (xbroker does the same thing, it just doesn't write to the local syslog).
Unfortunately, monit -v doesn't produce more detailed logs.
-
repo owner Can you please provide output of the following command:
ps -ef | egrep "(rsyslogd|monit)"
Which Monit version is it?
-
reporter
ps -ef | egrep "(rsyslogd|monit)"
root 15433 1 0 03:19 ? 00:00:40 /usr/bin/monit -vvv -c /etc/monitrc
root 15465 1 0 03:19 ? 00:00:02 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
monit -V
This is Monit version 5.9
-
reporter If you have a debugging version of monit I can run it to reproduce the issue and send you logs.
-
reporter I updated monit to the latest version, but the issue remains. If a process was started outside of monit (from a shell via the command line), then stopping and starting monit has no effect on it. However, if the process was started by monit, stopping monit causes the process to exit via signal 15.
-
reporter I looked at my scripts and upstart config and I see nothing suspicious. Can monit's saved state have anything to do with the issue? I am providing my configs just in case.
/etc/monitrc
set daemon 15 # check services at 15-sec intervals
#
set logfile syslog facility log_daemon
set idfile /var/run/.monit.id
set mailserver localhost
set eventqueue
    basedir /var/monit # set the base directory where events will be stored
    slots 100 # optionally limit the queue size
set alert xxxxxxxx not on { instance, action }
## Monit has an embedded web server which can be used to view status of
## services monitored and manage services from a web interface. See the
## Monit Wiki if you want to enable SSL for the web server.
#
set httpd port 2812 and
    #use address localhost # only accept connection from localhost
    allow localhost # allow localhost to connect to the server and
    allow x.x.x.x/24 # allow x.x.x.x/24 to connect to the server and
    allow w.w.w.w/24 # allow w.w.w.w/24 to connect to the server and
    allow y.y.y.y/24 # allow y.y.y.y/24 to connect to the server and
    allow admin:monit # require user 'admin' with password xxxxx
    allow @xxxx # allow users of group 'monit' to connect (rw)
    allow @users readonly # allow users of group 'users' to connect readonly
## Check general system resources such as load average, cpu and memory
## usage. Each test specifies a resource, conditions and the action to be
## performed should a test fail.
#
check system sbc11n2-la.siptalk.com
    if loadavg (1min) > 12 then alert
    if loadavg (5min) > 10 then alert
    if memory usage > 89% then alert
    if swap usage > 0% then alert
    # if cpu usage (user) > 70% then alert
    # if cpu usage (system) > 30% then alert
    # if cpu usage (wait) > 20% then alert
    if cpu usage (user) > 60% then alert
    if cpu usage (system) > 60% then alert
# Includes
include /etc/monit.d/*
/etc/monit.d/xbroker
check process xbroker matching "/usr/local/registrator/bin/xbroker -c"
    start program = "/usr/local/registrator/bin/xbroker_ctl start" as uid xcast and gid xcast
    stop program = "/usr/local/registrator/bin/xbroker_ctl stop" as uid xcast and gid xcast
    alert xxxxxx not on { instance, action }
    alert xxxxxz not on { instance, action }
    if failed host hostxxxx port 5060 type udp protocol sip
        with target test@hostxxxx and maxforward 0
        with timeout 5 seconds retry 2

/etc/monit.d/system
check process ntpd with pidfile "/var/run/ntpd.pid"
    every 4 cycles
    start program = "/etc/init.d/ntpd start"
    stop program = "/etc/init.d/ntpd stop"
    if 5 restarts within 20 cycles then timeout
check process rsyslogd with pidfile "/var/run/syslogd.pid"
    start program = "/etc/init.d/rsyslog start"
    stop program = "/etc/init.d/rsyslog stop"
    if 5 restarts within 5 cycles then timeout
check process sshd with pidfile "/var/run/sshd.pid"
    start program = "/etc/init.d/sshd start"
    stop program = "/etc/init.d/sshd stop" sync
    #restart program = "/etc/init.d/sshd restart"
    if failed port 22 protocol ssh then restart
    if 5 restarts within 5 cycles then timeout
check process mysql with pidfile "/var/lib/mysql/hostnamexxxx.pid"
    every 4 cycles
    start program = "/etc/init.d/mysql start"
    stop program = "/etc/init.d/mysql stop"
    if failed host 127.0.0.1 port 3306 protocol mysql then alert
    if 5 restarts within 20 cycles then timeout
check process winbindd with pidfile "/var/run/samba/winbindd.pid"
    start program = "/etc/init.d/winbind start"
    stop program = "/etc/init.d/winbind stop"
    if 5 restarts within 5 cycles then timeout
-
reporter And upstart
/etc/init/monit.conf
# To install disable the old way of doing things:
#
# /etc/init.d/monit stop && update-rc.d -f monit remove
#
# then put this script here:
#
# /etc/init/monit.conf
#
# and reload upstart configuration:
#
# initctl reload-configuration
#
# You can manually start and stop monit like this:
#
# start monit
# stop monit
#
description "Monit service manager"
limit core unlimited unlimited
limit nofile 131072 196608
start on runlevel [2345]
stop on runlevel [!2345]
expect daemon
respawn
exec /usr/bin/monit -c /etc/monitrc
pre-stop script
    exec /usr/bin/monit -c /etc/monitrc quit
end script
-
repo owner It is very strange ... the "ps" output shows that monit and rsyslogd (started via monit) are independent (PPID is 1/init in both cases). If monit stops, it doesn't send any signals to the monitored processes itself and as it's not parent to rsyslogd (rsyslogd runs as daemon too), it won't trigger any signal to rsyslogd.
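The ps -ef output above shows only the parent PID. As a rough diagnostic sketch (not something Monit provides), the helper below also prints the process group and session of a PID by reading /proc, which may help show whether the two processes share anything a service manager could act on. The function name show_lineage is made up for this example, and whether group or session membership matters for upstart here is an assumption this thread does not confirm.

```shell
#!/bin/sh
# show_lineage PID...: print parent PID, process group and session for each
# PID, read from /proc/PID/stat (Linux only). Hypothetical helper name.
show_lineage() {
    for pid in "$@"; do
        [ -r "/proc/$pid/stat" ] || { echo "pid=$pid not running"; continue; }
        stat=$(cat "/proc/$pid/stat")
        # The command name is wrapped in parentheses and may contain spaces,
        # so keep only the part after the last ") ".
        rest=${stat##*) }
        # The remaining fields begin with: state ppid pgrp session ...
        echo "$rest" | {
            read -r _state ppid pgrp sid _others
            echo "pid=$pid ppid=$ppid pgid=$pgrp sid=$sid"
        }
    done
}
```

For the case in this thread it could be called as show_lineage "$(cat /var/run/syslogd.pid)" $(pgrep -x monit), once before stopping monit and once after.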
I have tried to replicate the issue on CentOS 6.5 - added monit to upstart using simple configuration, tried to restart rsyslogd via monit and then stopped monit ... works fine, rsyslogd keeps running.
set daemon 5
set httpd port 2812
    allow monit:monit
set logfile syslog
check process rsyslogd with pidfile "/var/run/syslogd.pid"
    start program = "/etc/init.d/rsyslog start"
    stop program = "/etc/init.d/rsyslog stop"
    if 5 restarts within 5 cycles then timeout
The stop seems to be driven externally - by some 3rd party SW (puppet?)
Can you start Monit outside of upstart control and run through the following steps:
- /sbin/stop monit #stop monit via upstart
- /usr/bin/monit #start monit manually
- /usr/bin/monit restart rsyslogd #restart rsyslogd process via monit
- /usr/bin/monit quit #stop monit
- check if rsyslogd was restarted at the same time or after monit stopped
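The last step above can be made less error-prone with a small helper that compares the PID in a pidfile before and after a command runs. This is only a sketch: the function name pid_survives and the SETTLE_SECS variable are invented for illustration, and it assumes the service rewrites its pidfile on restart, as rsyslogd does with /var/run/syslogd.pid in the configs above.

```shell
#!/bin/sh
# pid_survives PIDFILE COMMAND...: run COMMAND and report whether the process
# recorded in PIDFILE is still the same one afterwards.
# Hypothetical helper; run it as root so kill -0 can probe any process.
pid_survives() {
    pidfile=$1; shift
    before=$(cat "$pidfile" 2>/dev/null)
    "$@"                        # the action under test, e.g. monit quit
    sleep "${SETTLE_SECS:-5}"   # give any signal time to take effect
    after=$(cat "$pidfile" 2>/dev/null)
    # kill -0 sends no signal; it only checks that the PID still exists
    if [ -n "$before" ] && [ "$before" = "$after" ] && kill -0 "$before" 2>/dev/null; then
        echo "unchanged pid=$before"
    else
        echo "changed before=$before after=$after"
    fi
}

# The steps from the list, followed by the check:
#   /sbin/stop monit
#   /usr/bin/monit
#   /usr/bin/monit restart rsyslogd
#   pid_survives /var/run/syslogd.pid /usr/bin/monit quit
```

An "unchanged" result means rsyslogd kept running across monit quit; "changed" means it was stopped or restarted within the settle window.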
-
repo owner Any news on this?
-
reporter Sorry for not updating this. It is still a mystery. I have tried everything you suggested and couldn't see anything abnormal. The problem was intermittent, to say the least. Since I have not seen it recur lately, I suggest you close the issue. I can revisit it when it happens again, hopefully with better information next time.
-
repo owner - changed status to closed
Closed as per the reporter's suggestion.
-
Hi,
I had this issue when using Monit on CentOS 7. When I stop or restart Monit with the command "service monit stop|restart", my monitored applications are killed.
-
I've experienced this behavior on Fedora Server 21. In my case, I've tracked this down to use of the default systemd KillMode, which is control-group.
http://www.freedesktop.org/software/systemd/man/systemd.kill.html
Adding a unit file to set KillMode=process for monit.service ensures that systemctl stop monit only kills the monit daemon and not the other processes in its control group.
-
Thanks baraabasata, you saved my life :)
-
repo owner - changed status to resolved
Updated the systemd template to include KillMode=process (thanks to @baraabasata).
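For installations that still run an older unit file, the same setting can be applied locally with a systemd drop-in. The sketch below is one way to do it, not the project's own template: the function name write_killmode_dropin and the file name killmode.conf are arbitrary choices for this example, and the unit is assumed to be named monit.service as in the comment above.

```shell
#!/bin/sh
# write_killmode_dropin DIR: create DIR/killmode.conf overriding KillMode.
# Hypothetical helper; the drop-in file name is arbitrary.
write_killmode_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/killmode.conf" <<'EOF'
[Service]
# On stop, signal only the main monit process, not its whole control group.
KillMode=process
EOF
}

# Typical use as root:
#   write_killmode_dropin /etc/systemd/system/monit.service.d
#   systemctl daemon-reload
#   systemctl show -p KillMode monit    # check which KillMode is in effect
```

A drop-in leaves the packaged unit file untouched, so the override should survive package upgrades that replace the unit.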
This doesn't look like a Monit issue: Monit doesn't stop any processes when it quits. To stop the processes, Monit must be explicitly instructed to do so (for example by "monit stop all"), so it seems that the service stop is externally driven (by puppet too?).
Please check your Monit logs (you can enable logging using the "set logfile" statement).