Running shell script with "check program" results in zombie child
Issue #242
on hold
Hi,
In many cases, when we implemented our own checks as simple shell scripts, Monit ended up leaving defunct (zombie) child processes behind after running them. For example, I have a script that checks that a directory does not contain any regular files:
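#!/bin/bash
# Checks that a directory does not contain regular files.
# Exit status: 0 if there are none, 1 if there are some, 100 on invalid usage.
if [[ ! -d "$1" ]]; then
    echo "Invalid usage" >&2
    exit 100
fi
# Count regular files below the directory; succeed only if the count is zero.
(( $(find "$1" -type f | wc -l) == 0 ))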
This shell script is invoked from Monit like this:
...
check program "monit-include-files-exist-copera" with path "/bin/bash -c '! /etc/scripts/directory_is_empty.sh some_dir'"
every 5 cycles
if status != 0 then alert
...
Then ps aux | grep defunc shows:
root 23223 0.0 0.0 0 0 ? Zs 15:06 0:00 [bash] <defunct>
root 23930 0.0 0.0 0 0 ? Zs 15:10 0:00 [bash] <defunct>
root 23932 0.0 0.0 0 0 ? Zs 15:10 0:00 [monit_check_pro] <defunct>
root 23934 0.0 0.0 0 0 ? Zs 15:10 0:00 [bash] <defunct>
root 23938 0.0 0.0 0 0 ? Zs 15:10 0:00 [monit_check_pro] <defunct>
root 23946 0.0 0.0 0 0 ? Zs 15:10 0:00 [monit_check_pro] <defunct>
root 23951 0.0 0.0 0 0 ? Zs 15:10 0:00 [bash] <defunct>
root 23958 0.0 0.0 0 0 ? Zs 15:10 0:00 [monit_check_pro] <defunct>
root 23968 0.0 0.0 0 0 ? Zs 15:10 0:00 [bash] <defunct>
root 23974 0.0 0.0 0 0 ? Zs 15:10 0:00 [monit_check_pro] <defunct>
root 24015 0.0 0.1 6388 684 pts/0 S+ 15:11 0:00 grep defunc
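To see which process is leaving these entries behind, it can help to list the zombies together with their parent PID. This is a minimal sketch, assuming a ps that accepts -eo pid,ppid,stat,comm (procps does); the function name filter_zombies is made up for illustration:

```shell
#!/bin/bash
# Keep only zombie rows from 'ps -eo pid,ppid,stat,comm' style input on stdin.
# Prints: PID, parent PID, state, command name (first word only).
filter_zombies() {
    # NR > 1 skips the header row; a STAT value starting with Z marks a zombie.
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $3, $4 }'
}

ps -eo pid,ppid,stat,comm | filter_zombies
```

If the second column matches the PID of the running monit daemon, the entries are children that Monit has started and not yet collected.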
The same happens with some of our other shell scripts.
Any ideas?
Thanks a lot.
Cheers, Gyuri
Comments (7)
-
repo owner - changed status to on hold
Will be fixed in the next major version of Monit
-
repo owner - edited description
-
repo owner - edited description
-
repo owner - removed version
Removing version: 5.12 (automated comment)
-
repo owner - Issue #450 was marked as a duplicate of this issue.
-
repo owner - Issue #697 was marked as a duplicate of this issue.
TL;DR: Monit runs in cycles, and sub-processes started in one cycle are cleaned up in the next cycle. This is explained in more detail in the Monit manual:
*The asynchronous nature of the program check allows for non-blocking behavior in the current Monit design, but it comes with a side-effect: when a program has finished executing and is waiting for Monit to collect the result, it becomes a so-called "zombie" process. A zombie process does not consume any system resources (only the PID remains in use) and it is under Monit's control. The zombie process is removed from the system as soon as Monit collects the exit status. This means that every "check program" will be associated with either a running process or a temporary zombie. This unwanted zombie side-effect will be removed in a later release of Monit.*
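The effect described in the manual can be reproduced without Monit. The following is only a sketch under stated assumptions: it is Linux-only (it reads /proc), and the inner bash process is a hypothetical stand-in for Monit, not Monit's actual code. The stand-in starts a child and then replaces itself with sleep, which never collects the child's exit status, so the finished child stays visible as a zombie:

```shell
#!/bin/bash
# Linux-only sketch: process state is read from /proc/<pid>/status.

pid_file=$(mktemp)

# Stand-in parent (hypothetical, not Monit itself): start a child, record its
# PID, then exec 'sleep 3', which never calls wait() for that child.
bash -c 'sleep 1 & echo "$!" > "$1"; exec sleep 3' _ "$pid_file" &
parent_pid=$!

sleep 2   # the child finished about a second ago; the parent is still running
child_pid=$(cat "$pid_file")

# The second field of the "State:" line is a one-letter code; Z means zombie.
state_before=$(awk '/^State:/ { print $2 }' "/proc/$child_pid/status")
echo "child $child_pid state while uncollected: $state_before"

# Once the parent exits, the orphan is re-parented (usually to init), which
# normally reaps it. Monit instead collects the status itself in a later cycle.
wait "$parent_pid"
rm -f "$pid_file"
```

The Z state here is the same state that ps reports as <defunct> in the listing from the report.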