Stop monit command kills my processes

Issue #106 resolved

Alexander Litvak created an issue 2014-10-22

I couldn't find any information on this, but it looks like when I change the configuration via puppet, the module restarts monit, i.e.

/sbin/stop monit
/sbin/start monit

However crude it might be, I think it produces an undesirable effect on my system which needs to be avoided.

Monit kills some monitored processes during exit and starts them again. This is of course not intended, especially when it happens during production time.

I noticed that somehow, if those processes are started while monit is running in the foreground, they are not killed when a simple kill is issued to the monit process. A subsequent restart via upstart init has no effect on them either.

I monitor rsyslogd, sshd, mysql, ntpd, winbindd, and our application x. Only rsyslogd and application x are affected, i.e. killed on /sbin/stop monit

Thanks,

Comments (18)

Tildeslash repo owner
This doesn't look like a Monit issue: Monit doesn't stop any processes when it quits. To stop processes, Monit must be explicitly instructed to do so (for example by "monit stop all"), so it seems that the service stop is driven externally (by puppet too?).

Please check your Monit logs (you can enable logging with the "set logfile" statement).
- 2014-10-22T09:00:42+00:00
Tildeslash repo owner
One more note: also check the configuration of your service framework (upstart?) for Monit ... the configuration for "/sbin/stop monit" may also contain some service-related actions, but that is 3rd party stuff and not Monit itself.

You can see examples for upstart and systemd setup here:
1. http://mmonit.com/wiki/Monit/Upstart
2. http://mmonit.com/wiki/Monit/Systemd
- 2014-10-22T09:04:21+00:00

Alexander Litvak reporter

Monit doesn't say much in its logs:

Oct 22 02:03:56 node1cl2-chi monit[4403]: Shutting down monit HTTP server
Oct 22 02:03:56 node1cl2-chi rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="4452" x-info="http://www.rsyslog.com"] exiting on signal 15.
Oct 22 02:05:29 node1cl2-chi rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="6296" x-info="http://www.rsyslog.com"] start
Oct 22 02:05:44 node1cl2-chi monit[6257]: 'xbroker' process is running with pid 6282
Oct 22 02:05:44 node1cl2-chi monit[6257]: 'rsyslogd' process is running with pid 6296

I have CentOS 6.5

This is my upstart file

# File managed by puppet on node1cl2-chi.siptalk.com.
# Changes made to this file outside of puppet will be lost on the next puppet run!
# xcast::roles::xbroker::inittab
#


# To install disable the old way of doing things:
#
#   /etc/init.d/monit stop && update-rc.d -f monit remove
#
# then put this script here:
#
#   /etc/init/monit.conf
#
# and reload upstart configuration:
#
#   initctl reload-configuration
#
# You can manually start and stop monit like this:
# 
# start monit
# stop monit
#

description "Monit service manager"

limit core unlimited unlimited
limit nofile 131072 196608

start on runlevel [2345]
stop on runlevel [!2345]

expect daemon
respawn

exec /usr/bin/monit -c /etc/monitrc

pre-stop exec /usr/bin/monit -c /etc/monitrc quit

I can reproduce the issue just by running stop monit / start monit from the command line. I don't need puppet to execute it.

2014-10-22T09:12:20+00:00

Alexander Litvak reporter

It looks like if monit starts the processes below, they are killed with signal 15 when monit is stopped.

Oct 22 02:54:43 node1cl2-chi monit[12202]: Shutting down monit HTTP server
Oct 22 02:54:43 node1cl2-chi rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="12241" x-info="http://www.rsyslog.com"] exiting on signal 15.
Oct 22 02:56:19 node1cl2-chi rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="12554" x-info="http://www.rsyslog.com"] start
Oct 22 02:56:34 node1cl2-chi monit[12515]: 'xbroker' process is running with pid 12540
Oct 22 02:56:34 node1cl2-chi monit[12515]: 'rsyslogd' process is running with pid 12554

I ran strace -ff on the stop monit command, but I don't see anything special or related to external processes. However, notice how upon shutting down monit, rsyslog exits on signal 15 (xbroker does the same thing, it just doesn't write to the local syslog).

Unfortunately monit -v doesn't produce more detailed logs.

2014-10-22T10:15:33+00:00

Tildeslash repo owner
Can you please provide output of the following command:
```
 ps -ef | egrep "(rsyslogd|monit)"
```
Which Monit version is it?
- 2014-10-22T11:44:05+00:00

Alexander Litvak reporter

ps -ef | egrep "(rsyslogd|monit)"
root     15433     1  0 03:19 ?        00:00:40 /usr/bin/monit -vvv -c /etc/monitrc
root     15465     1  0 03:19 ?        00:00:02 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5

monit -V
This is Monit version 5.9

2014-10-22T16:26:41+00:00

Alexander Litvak reporter
If you have a debugging version of monit I can run it to reproduce the issue and send you logs.
- 2014-10-22T17:26:34+00:00
Alexander Litvak reporter
I updated monit to the latest version, but the issue remains. If a process was started outside of monit (from a shell via the command line), then stopping / starting monit has no effect on the process. However, if the process was started by monit, stopping monit causes the process to exit via signal 15.
- 2014-10-23T08:46:52+00:00

Alexander Litvak reporter

I looked at my scripts and upstart config and I see nothing suspicious. Can the state of monit have anything to do with the issue? I am providing the configs just in case.

/etc/monitrc

set daemon  15              # check services at 15-sec intervals
#
set logfile syslog facility log_daemon
set idfile /var/run/.monit.id
set mailserver  localhost
set eventqueue
     basedir /var/monit  # set the base directory where events will be stored
     slots 100           # optionally limit the queue size
set alert xxxxxxxx not on { instance, action }
## Monit has an embedded web server which can be used to view status of 
## services monitored and manage services from a web interface. See the
## Monit Wiki if you want to enable SSL for the web server. 
#
set httpd port 2812 and
    #use address localhost  # only accept connection from localhost
    allow localhost        # allow localhost to connect to the server and
    allow x.x.x.x/24       # allow x.x.x.x/24 to connect to the server and
    allow w.w.w.w/24        # allow w.w.w.w/24 to connect to the server and
    allow y.y.y.y/24        # allow y.y.y.y/24 to connect to the server and
    allow admin:monit      # require user 'admin' with password xxxxx
    allow @xxxx           # allow users of group 'monit' to connect (rw)
    allow @users readonly  # allow users of group 'users' to connect readonly
## Check general system resources such as load average, cpu and memory
## usage. Each test specifies a resource, conditions and the action to be
## performed should a test fail.
#
check system sbc11n2-la.siptalk.com
    if loadavg (1min) > 12 then alert
    if loadavg (5min) > 10 then alert
    if memory usage > 89% then alert
    if swap usage > 0% then alert
#    if cpu usage (user) > 70% then alert
#    if cpu usage (system) > 30% then alert
#    if cpu usage (wait) > 20% then alert
    if cpu usage (user) > 60% then alert
    if cpu usage (system) > 60% then alert
# Includes
include /etc/monit.d/*

/etc/monit.d/xbroker

check process xbroker matching "/usr/local/registrator/bin/xbroker -c"
    start program = "/usr/local/registrator/bin/xbroker_ctl start"
        as uid xcast and gid xcast
    stop program = "/usr/local/registrator/bin/xbroker_ctl stop"
        as uid xcast and gid xcast
    alert xxxxxx not on { instance, action }
    alert xxxxxz not on { instance, action }
    if failed
                host hostxxxx port 5060 type udp protocol sip
                with target test@hostxxxx and maxforward 0
                with timeout 5 seconds
                retry 2

/etc/monit.d/system

check process ntpd with pidfile "/var/run/ntpd.pid"
    every 4 cycles
   start program = "/etc/init.d/ntpd start"
   stop  program = "/etc/init.d/ntpd stop"
   if 5 restarts within 20 cycles then timeout
check process rsyslogd with pidfile "/var/run/syslogd.pid"
   start program = "/etc/init.d/rsyslog start"
   stop  program = "/etc/init.d/rsyslog stop"
   if 5 restarts within 5 cycles then timeout
check process sshd with pidfile "/var/run/sshd.pid"
   start program = "/etc/init.d/sshd start"
   stop  program = "/etc/init.d/sshd stop" sync
   #restart program = "/etc/init.d/sshd restart"
   if failed port 22 protocol ssh then restart
   if 5 restarts within 5 cycles then timeout
check process mysql with pidfile "/var/lib/mysql/hostnamexxxx.pid"
    every 4 cycles
   start program = "/etc/init.d/mysql start"
   stop  program = "/etc/init.d/mysql stop"
   if failed host 127.0.0.1 port 3306 protocol mysql then alert
   if 5 restarts within 20 cycles then timeout
check process winbindd with pidfile "/var/run/samba/winbindd.pid"
   start program = "/etc/init.d/winbind start"
   stop  program = "/etc/init.d/winbind stop"
   if 5 restarts within 5 cycles then timeout

2014-10-23T09:29:56+00:00

Alexander Litvak reporter

And upstart

/etc/init/monit.conf

# To install disable the old way of doing things:
#
#   /etc/init.d/monit stop && update-rc.d -f monit remove
#
# then put this script here:
#
#   /etc/init/monit.conf
#
# and reload upstart configuration:
#
#   initctl reload-configuration
#
# You can manually start and stop monit like this:
# 
# start monit
# stop monit
#

description "Monit service manager"

limit core unlimited unlimited
limit nofile 131072 196608

start on runlevel [2345]
stop on runlevel [!2345]

expect daemon
respawn

exec /usr/bin/monit -c /etc/monitrc

pre-stop script 
    exec /usr/bin/monit -c /etc/monitrc quit
end script

2014-10-23T09:39:21+00:00

Tildeslash repo owner
It is very strange ... the "ps" output shows that monit and rsyslogd (started via monit) are independent (PPID is 1/init in both cases). When monit stops, it doesn't send any signals to the monitored processes itself, and as it's not the parent of rsyslogd (rsyslogd runs as a daemon too), it won't trigger any signal to rsyslogd.

I have tried to replicate the issue on CentOS 6.5 - added monit to upstart using simple configuration, tried to restart rsyslogd via monit and then stopped monit ... works fine, rsyslogd keeps running.
```
set daemon 5
set httpd port 2812 allow monit:monit
set logfile syslog

check process rsyslogd with pidfile "/var/run/syslogd.pid"
   start program = "/etc/init.d/rsyslog start"
   stop  program = "/etc/init.d/rsyslog stop"
   if 5 restarts within 5 cycles then timeout
```
The stop seems to be driven externally - by some 3rd party SW (puppet?)

Can you start Monit outside of upstart control?
1. /sbin/stop monit #stop monit via upstart
2. /usr/bin/monit #start monit manually
3. /usr/bin/monit restart rsyslogd #restart rsyslogd process via monit
4. /usr/bin/monit quit #stop monit
5. check if rsyslogd was restarted at the same time or after monit stopped
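
The steps above can be sketched as a small shell helper that records the rsyslogd PID before and after Monit quits. This is only a rough sketch under some assumptions: the pidfile path is the one from the rsyslogd check in this thread, the function names (`read_pid`, `survived`, `run_check`) are made up for illustration, and the sleep durations are arbitrary guesses that may need adjusting to the configured poll interval.

```shell
#!/bin/sh
# Sketch: compare a daemon's PID before and after stopping Monit.
# PIDFILE defaults to the pidfile used by the rsyslogd check in this thread.
PIDFILE="${PIDFILE:-/var/run/syslogd.pid}"

# Print the PID stored in a pidfile (whitespace stripped), or nothing if unreadable.
read_pid() {
    [ -r "$1" ] && tr -d '[:space:]' < "$1"
}

# Succeed only if both PIDs are non-empty, equal, and the process still exists.
survived() {
    [ -n "$1" ] && [ "$1" = "$2" ] && kill -0 "$2" 2>/dev/null
}

# Steps 1-5 from the list above. Nothing runs until this is called explicitly.
run_check() {
    /sbin/stop monit                   # 1. stop monit via upstart
    /usr/bin/monit                     # 2. start monit manually
    sleep 5                            #    (assumed) give the daemon time to come up
    /usr/bin/monit restart rsyslogd    # 3. restart rsyslogd via monit
    sleep 30                           #    (assumed) wait for the restart to finish
    before=$(read_pid "$PIDFILE")
    /usr/bin/monit quit                # 4. stop monit
    sleep 10
    after=$(read_pid "$PIDFILE")       # 5. did rsyslogd survive?
    if survived "$before" "$after"; then
        echo "rsyslogd survived (pid $before)"
    else
        echo "rsyslogd restarted or gone: before=$before after=$after"
    fi
}
```

To use it, source the file as root and call `run_check`. The `kill -0` test is there because a stale pidfile would otherwise make a killed daemon look alive.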
- 2014-10-23T21:55:08+00:00
Tildeslash repo owner
Any news on this?
- 2014-12-19T13:19:56+00:00
Alexander Litvak reporter
Sorry for not updating this. It is still a mystery. I have tried all the things you suggested and couldn't see anything abnormal. The problem was intermittent, to say the least. Since I have not seen it recur lately, I suggest you close the issue. I can revisit it when it happens again, hopefully with better information next time.
- 2014-12-19T16:13:57+00:00
Tildeslash repo owner
- changed status to closed
Closed as per reporter's suggestion.
- 2014-12-19T18:15:26+00:00
chelskyboy@gmail.com
Hi,

I had this issue when using Monit on CentOS 7. When I stop or restart Monit with the command "service monit stop|restart", my monitored applications are killed.
- 2016-01-10T18:32:21+00:00
Baraa Basata
I've experienced this behavior on Fedora Server 21. In my case, I've tracked this down to use of the default systemd KillMode, which is control-group.

http://www.freedesktop.org/software/systemd/man/systemd.kill.html

Adding a unit file to set KillMode=process for monit.service ensures that systemctl stop monit only kills the monit daemon and not the other processes in its control group.
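
One way to apply the KillMode=process setting described above is a systemd drop-in, sketched below. Treat it as an illustration rather than the exact change made here: the drop-in directory follows the usual `/etc/systemd/system/<unit>.d/` convention, while the file name `killmode.conf` and the helper name `write_killmode_dropin` are arbitrary choices for this example.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that sets KillMode=process for monit.service.
# The file name "killmode.conf" is an arbitrary choice for this example.
DROPIN_DIR="${DROPIN_DIR:-/etc/systemd/system/monit.service.d}"

# Create the drop-in directory given as $1 and write the override into it.
write_killmode_dropin() {
    mkdir -p "$1" || return 1
    cat > "$1/killmode.conf" <<'EOF'
[Service]
KillMode=process
EOF
}
```

As root, this would be followed by reloading systemd so the override is picked up, for example `write_killmode_dropin "$DROPIN_DIR" && systemctl daemon-reload`. Running `systemctl show -p KillMode monit` afterwards is a quick way to check which kill mode is in effect.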
- 2016-01-12T21:41:56+00:00
chelskyboy@gmail.com
Thanks baraabasata, you saved my life :)
- 2016-01-14T10:08:11+00:00
Tildeslash repo owner
- changed status to resolved
Updated the systemd template to set KillMode=process (thanks to @baraabasata).
- 2016-01-18T20:53:17+00:00

Assignee: –

Type: bug

Priority: major

Status: resolved

Component: –

Version: –

Votes: 0

Watchers: 3