Suggestion: add process name in description of Data access error
I think it would be really helpful if the description of the "Data access error" event included the process name as well, not just the PID.
Background: I sometimes have to use the "CHECK PROCESS ... MATCHING <regex>" form, because the monitored process is restarted regularly after updates. If I used the "... PIDFILE <pidfile>" form instead, Monit would keep sending unnecessary warnings that the PID has changed. I was using the following config:
CHECK PROCESS spamd MATCHING "spamd"
  START PROGRAM = "/etc/rc.d/rc.spamd start"
  STOP PROGRAM = "/etc/rc.d/rc.spamd stop"
  IF NOT EXIST FOR 5 CYCLES THEN RESTART

CHECK PROGRAM spamd-update WITH PATH "/etc/monit.d/spamd-update"
  EVERY 60 CYCLES
  IF STATUS = 1 THEN ALERT
With the above config, I kept receiving email warnings with the following description:
Data access error Service spamd
Description: process with pid 7739 is a zombie
It took me quite a while to realise that Monit was occasionally matching the "spamd-update" process, not "spamd". That process was a zombie because of how Monit works: scripts are run on one cycle and Monit checks their output on the next cycle, so they are left as zombies for one cycle in between. Renaming the "spamd-update" script to something else, so that the regex above matches only spamd, solved the problem in my case. But if the error description had included the name of the process, not just the PID, I would have realised much sooner what I was doing wrong.
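For anyone hitting the same thing, the overlap can be reproduced outside Monit with a small sketch. It is only an illustration: the two command lines are made-up examples of what the process table might contain, and Python's `re` module stands in for the regex matching Monit does, which should behave the same for patterns this simple. The stricter pattern `spamd( |$)` is one possible alternative to renaming the script, not something tested against Monit itself.

```python
import re

# Hypothetical command lines, as they might appear in the process table.
# The first is the daemon to monitor, the second is the CHECK PROGRAM script.
COMMAND_LINES = [
    "/usr/bin/spamd -d -c -m 5",
    "/etc/monit.d/spamd-update",
]


def matching(pattern, command_lines):
    """Return the command lines in which the pattern occurs anywhere."""
    regex = re.compile(pattern)
    return [line for line in command_lines if regex.search(line)]


# The pattern from the config above: an unanchored substring match.
loose = matching("spamd", COMMAND_LINES)

# A stricter pattern (an assumption): "spamd" must be followed by a space
# or by the end of the command line, which excludes "spamd-update".
strict = matching(r"spamd( |$)", COMMAND_LINES)

print("loose :", loose)
print("strict:", strict)
```

The loose pattern selects both entries, while the stricter one selects only the daemon. On a live system, `monit procmatch <regex>` lists the running processes that a pattern matches, which is probably the quicker way to check a MATCHING expression before putting it in the config.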