Incorrect (far too low) filesystem service time when NFS failed
I had configured a “check filesystem” with “if service time > 50 milliseconds for 3 times within 5 cycles then alert”.
By chance, we had an NFS failure this morning that lasted over 1000 seconds (that is, there was no response for more than 1000 seconds).
However, the monit log recorded “… service time 1.380 s/operation matches resource limit [service time > 50 ms/operation]”
I think there is a big difference between 1.3 seconds and 1000 seconds. Also, more importantly, the issue was logged when the NFS server responded again, not when it had begun failing.
Comments (3)
-
Hello Ulrich,
The output is sometimes formatted with "%.3f ms/operation". So the "." in 1.380 may also be a thousands separator (I think). See the resource service time format handling in src/validate.c ("Convert_time2str(serviceTime") and, in addition, "Convert_time2str" in libmonit/src/util/Convert.c, which is used to produce a readable time format.
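To make the formatting question concrete, here is a minimal sketch of a unit-switching formatter. The function name format_service_time and the 1000 ms threshold are assumptions for illustration only; this is not the code of Convert_time2str. One thing that is certain about standard C: "%.3f" without the "'" flag never inserts a thousands separator, so in the default "C" locale the "." it prints is a decimal point.

```c
#include <stdio.h>

/* Hypothetical helper (NOT monit's Convert_time2str): formats a duration
 * given in milliseconds and switches to seconds from 1000 ms upwards.
 * The threshold and the unit strings are assumptions for this sketch. */
char *format_service_time(double ms, char *buf, size_t len) {
    if (ms >= 1000.0)
        snprintf(buf, len, "%.3f s", ms / 1000.0);   /* e.g. 1380 ms is shown in seconds */
    else
        snprintf(buf, len, "%.3f ms", ms);           /* below one second, keep milliseconds */
    return buf;
}
```

With a formatter of this shape, 1380 ms is printed in seconds with three decimals, while a real 1000-second stall would be printed as a four-digit number of seconds, so the two cases would be distinguishable in the log.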
I think,
Lutz