Regression: failed link check generates superfluous alerts

Issue #840 resolved
Former user created an issue

With a config containing:

check network External
  interface eth0
  if failed link then alert

and then removing the Ethernet cable from eth0, I now receive many emails …

Link down Service External 
    Date:        Thu, 25 Jul 2019 09:10:53
    Action:      alert
    Host:        pbx4
    Description: link down

Link up Service External 
    Date:        Thu, 25 Jul 2019 09:11:53
    Action:      alert
    Host:        pbx4
    Description: link data collection succeeded

Link down Service External 
    Date:        Thu, 25 Jul 2019 09:11:54
    Action:      alert
    Host:        pbx4
    Description: link down

Link up Service External 
    Date:        Thu, 25 Jul 2019 09:12:54
    Action:      alert
    Host:        pbx4
    Description: link data collection succeeded

(each entry a separate email alert) all while the eth0 interface has no cable connected.

BTW, I tried “changed” instead of “failed” and got the same result.

I’m currently using 5.26.0; 5.25.3 has the same issue, but I’m not sure when the regression was introduced.

Since I compile from source, I can easily test a patch fix.

Comments (3)

  1. Former user Account Deleted reporter

    I tested this patch, and it fixes this issue.

    --- monit-5.26.0/src/validate.c.orig    2019-07-25 14:34:01.725453914 -0500
    +++ monit-5.26.0/src/validate.c 2019-07-25 14:34:54.548704707 -0500
    @@ -1762,9 +1762,6 @@
             END_TRY;
             if (! havedata)
                     return State_Failed; // Terminate test if no data are available
    -        for (LinkStatus_T link = s->linkstatuslist; link; link = link->next) {
    -                Event_post(s, Event_Link, State_Succeeded, link->action, "link data collection succeeded");
    -        }
             // State
             if (! Link_getState(s->inf.net->stats)) {
                     for (LinkStatus_T link = s->linkstatuslist; link; link = link->next)
    

    Before commit 5dc268139, this for loop generated an invalid event, so "fixing" the event type is what actually causes the problem.

    Possibly this code was added for debugging at one time but was never removed?
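
To illustrate why the removed loop produces the alternating "Link up" / "Link down" emails, here is a minimal simulation. It is a sketch only, not Monit's code: `Service`, `post_event`, `poll_cycle`, and `count_alerts` are hypothetical names, and it assumes that an alert is sent each time the posted state differs from the previously recorded one.

```c
#include <stdbool.h>

typedef enum { State_Succeeded, State_Failed } State_T;

/* Hypothetical, simplified service record (not Monit's Service_T). */
typedef struct {
    State_T last;   /* last state recorded for the link event */
    int alerts;     /* alerts sent so far */
} Service;

/* Hypothetical stand-in for Event_post: assumed to alert only on a state change. */
static void post_event(Service *s, State_T state) {
    if (s->last != state) {
        s->last = state;
        s->alerts++;
    }
}

/* One poll cycle with the cable unplugged: statistics are collected, but the link is down. */
static void poll_cycle(Service *s, bool post_collection_success) {
    if (post_collection_success)
        post_event(s, State_Succeeded);  /* the loop removed by the patch */
    post_event(s, State_Failed);         /* "link down" */
}

/* Count the alerts produced over a number of consecutive poll cycles. */
int count_alerts(bool post_collection_success, int cycles) {
    Service s = { State_Succeeded, 0 };
    for (int i = 0; i < cycles; i++)
        poll_cycle(&s, post_collection_success);
    return s.alerts;
}
```

Under these assumptions, `count_alerts(true, 3)` returns 5 (one "down" in the first cycle, then an "up"/"down" pair in each later cycle), while `count_alerts(false, 3)` returns 1, a single "link down" alert for the whole outage.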
