New "if link down" test is unclear/question
Hello Tildeslash,
based on your release notes, I do not understand the definitions of "if failed link", "if link down" and "if link up". The tests seem to work, and the results and behavior match my understanding, but I think the recovery definition for "if link down" does not fit.
The definition to check the link (now deprecated):
check network en1 with interface en1
    if failed link then alert
    else if succeeded then alert
And the new definition is:
check network en1 with interface en1
    if link down then alert
    else if failed then alert
From my point of view, the recovery definition "failed" used by the new test does not fit; "succeeded" should be used here as well.
The new opposite test is:
check network en1 with interface en1
    if link up then alert
    else if succeeded then alert
This fits well with the other definitions; "succeeded" is the right recovery definition.
The recovery definition "failed" is useful only with the new "if succeeded …" definitions.
It is nice to see that a positive test can now result in an alert. This behaviour is sometimes very useful, and it fits the existence test that became available with 5.21.0 for some service types.
With regards,
Lutz
p.s.
See "src/p.y", compare "if failed link" to "if link down".
linkstatus : IF FAILED LINK rate1 THEN action1 recovery_success { /* Deprecated */
                 addeventaction(&(linkstatusset).action, $<number>6, $<number>7);
                 addlinkstatus(current, &linkstatusset);
             }
           | IF LINK DOWN rate1 THEN action1 recovery_failure {
                 linkstatusset.check_invers = false;
                 addeventaction(&(linkstatusset).action, $<number>6, $<number>7);
                 addlinkstatus(current, &linkstatusset);
             }
           | IF LINK UP rate1 THEN action1 recovery_success {
                 linkstatusset.check_invers = true;
                 addeventaction(&(linkstatusset).action, $<number>6, $<number>7);
                 addlinkstatus(current, &linkstatusset);
             }
           ;
Comments (6)
-
repo owner -
reporter Hello Tildeslash,
this test
    | IF LINK DOWN rate1 THEN action1 recovery_success {
now matches the old definition:
    : IF FAILED LINK rate1 THEN action1 recovery_success {
But this one was correct and should not have been changed, I think:
    | IF LINK UP rate1 THEN action1 recovery_success
It fits well with the "NOT EXIST" and "EXIST" test pair:
    : IF NOT EXIST rate1 THEN action1 recovery_success {
    | IF EXIST rate1 THEN action1 recovery_success {
The recovery definition "failed" is useful with the new "IF SUCCEEDED …" definitions only.
With regards,
Lutz -
reporter - changed status to open
Are you sure? I think this is not the solution. See my comment. Lutz
-
repo owner In my book, “IF LINK UP THEN <action1> ELSE IF FAILED <action2>” sounds OK. The LINK UP event is fired when the link becomes usable (i.e. the link is functional => OK). When the link is DOWN, it is not functional => kind of failed.
The LINK UP test can be used for example to start VPN when the link was enabled, so one cannot say that “LINK UP” is failure (it may be in certain context, if you want to check that the link is always down, but that is just one use case).
There is a new short format of the else, which doesn’t require the “succeeded/failed” keyword. The test could be written like this:
IF LINK UP THEN <action1> ELSE <action2>
I think this will be the usual use case with “IF LINK UP”. The “else if succeeded/failed” form is the verbose format, which usually makes sense only when combined with an event rate (such as “else if failed 3 times within 5 cycles then …”).
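As a rough illustration of the two forms described above, here is a minimal sketch of a control-file fragment. It assumes the pairing of LINK UP with ELSE IF FAILED discussed in this comment; in 5.28.0 itself, the grammar quoted in the report pairs the keywords the other way round. The interface name en2 and the script path /usr/local/bin/start-vpn.sh are hypothetical placeholders, not taken from the report.

```
# Short form: the else branch needs no succeeded/failed keyword.
# The script path below is a hypothetical placeholder.
check network en1 with interface en1
    if link up then exec "/usr/local/bin/start-vpn.sh"
    else alert

# Verbose form: worthwhile mainly when the else branch carries an event rate.
# The interface name en2 is a hypothetical placeholder.
check network en2 with interface en2
    if link up then alert
    else if failed 3 times within 5 cycles then alert
```

Running `monit -t` against the control file is a quick way to check whether the installed Monit version accepts these forms.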
-
repo owner - changed status to resolved
-
reporter Hello Tildeslash,
the meaning of failed/succeeded depends on the point of view. You wrote:
"There is a new short format of the else, which doesn’t require the “succeeded/failed” keyword. The test could be written like this:
IF LINK UP THEN <action1> ELSE <action2>
I think this will be the usual use case with “IF LINK UP”. The “else if succeeded/failed” is verbose format, which usually will make sense only if used with the event rate (such as “else if failed 3 times within 5 cycles then …”)"
You are right, but I use the event rate to make my rules more fault tolerant (see https://mmonit.com/monit/documentation/monit.html#FAULT-TOLERANCE). The simple status model in use seems to have reached its limits; see my earlier post "Status and state handling improvement" (#923) and your list of similar requests in this category.
Thanks for your explanation/help,
Lutz
Fixed: Issue #971 (5.28.0 typo): The optional "ELSE" part of the "IF LINK UP THEN <action> ELSE IF FAILED THEN <action>" and "IF LINK DOWN THEN <action> ELSE IF SUCCEEDED THEN <action>". → <<cset 7fe9a0c8f66e>>