Understanding "restarts within" behavior

Issue #1073 new
和田竜郎 created an issue

Hello,

I have a question about the behavior of the "restarts within" condition in a Monit service configuration.

Here's my configuration:

check process program with pidfile /var/run/program.pid
  start program = "/bin/systemctl start program.service" with timeout 10 seconds
  stop program = "/bin/systemctl stop program.service"
  if does not exist then restart
    else if succeeded then exec "/etc/monit.d/program_trap.sh"
  if 2 restarts within 2 cycles then exec "/etc/monit.d/reboot.sh"

According to my understanding, if the service stops and Monit restarts it twice within two cycles, the system should reboot. However, in practice, it seems that the system only reboots after the third consecutive cycle where the service is stopped and restarted.

Could you please explain why this is the case, and how exactly the "restarts within" condition works?

Thank you in advance for your help.

Comments (3)

  1. 和田竜郎 reporter

    Hello again,

    After further investigation and testing, I've come to a better understanding of the "restarts within" condition in Monit.

    It appears that the count for "restarts within" is evaluated at the end of the specified monitoring cycles, not immediately after the second restart occurs. Even if there are two consecutive restarts, the condition is not evaluated at that moment.

    The evaluation only happens at the start of the next monitoring cycle. At that point, if the number of restarts recorded in the preceding cycles matches the "restarts within" condition, the specified action (in my case, a system reboot) is triggered.

    Furthermore, the state of the program (active or inactive) in the next cycle does not influence the "restarts within" condition. It purely depends on the number of restarts that occurred within the specified monitoring cycles.

    Thank you for your support and I hope this insight might be helpful to others as well.

    Best regards,

    Tatsuro Wada
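
The timing described in this comment can be sketched as a small Python model. This is only an illustration built from the behaviour reported in this thread, not Monit's actual code: the function name `reboot_cycle`, its parameters, and the rule for when the counter expires are all assumptions made for the sketch.

```python
def reboot_cycle(failed_cycles, count=2, window=2):
    """Return the 1-based cycle in which the 'restarts within' action
    fires in this simplified model, or None if it never fires."""
    restarts = 0  # restarts counted so far
    elapsed = 0   # cycles elapsed since the first counted restart
    for cycle, failed in enumerate(failed_cycles, start=1):
        # Step 1: the rate condition is evaluated at the start of the
        # cycle, using only restarts recorded in earlier cycles.
        if restarts > 0:
            elapsed += 1
        if restarts >= count and elapsed <= window:
            return cycle
        if elapsed > window:  # assumed: counter expires after the window
            restarts = elapsed = 0
        # Step 2: the existence check runs; a missing process is restarted.
        if failed:
            restarts += 1
    return None


print(reboot_cycle([True, True, True]))   # 3
print(reboot_cycle([True, True, False]))  # 3
print(reboot_cycle([True, True]))         # None
```

In this model the second restart happens in cycle 2, but the action only fires in cycle 3, and it fires whether or not the process is running in that third cycle.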

  2. Lutz Mader

    Sorry for the late answer, Tatsuro Wada.
    In short, you are right; this is how Monit works.

    The action of the "restarts within" rule is executed once the number of restarts exceeds the given count. The counter is not reset when the program status is OK.

    A configuration reload, or an unmonitor/monitor of the service, resets the counter.

    Lutz

    BTW

    Something like
        if 2 restarts within 2 cycles then exec "/etc/monit.d/reboot.sh"
    is not the best idea.

    The maximum number of restarts in two cycles is two, so a cycle span of three or four is more useful, from my point of view.
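
To illustrate why a wider cycle span may be more useful, here is a rough Python sketch that compares two window sizes on an intermittent failure pattern. It is a simplified model inferred from the behaviour discussed in this thread, not Monit's implementation; the function `fires`, the `flaky` pattern, and the assumption that the counter expires once the window has passed are all hypothetical.

```python
def fires(pattern, count, window):
    """Return True if the 'restarts within' action would fire for the
    given per-cycle failure pattern in this simplified model."""
    restarts = elapsed = 0
    for failed in pattern:
        # The condition is checked at the start of each cycle.
        if restarts > 0:
            elapsed += 1
        if restarts >= count and elapsed <= window:
            return True
        if elapsed > window:  # assumed: counter expires after the window
            restarts = elapsed = 0
        # One restart at most per cycle, when the process is missing.
        if failed:
            restarts += 1
    return False


flaky = [True, False, True, False]  # process missing in cycles 1 and 3 only

print(fires(flaky, count=2, window=2))  # False
print(fires(flaky, count=2, window=4))  # True
```

Under these assumptions, a window equal to the restart count only catches back-to-back failures, while a span of four cycles also catches a process that keeps dying every other cycle.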
