- attached monit-0200-linux-net-statsV2.patch
[patch] Linux 32-bit NetStatistics.c fix
Hi again, :-)
Since the values of /proc/net/dev on 32-bit Linux systems are 32-bit, the values easily overflow and rollover... about 40 seconds for a 1 GB interface.
The attached patch monitors the delta of the /proc/net/dev entries and adds 2**32 if it has rolled over and sums the difference to the "now" value. 64-bit values should continue to work.
As long as a daemon delay of 30 is used a 1 GB interfaces can be monitored.
If there was a way to call "ethtool -S eth0" instead then it could be used to obtain all of the Linux net statistics values, but given this is in libmonit that probably isn't an option.
While this may not be an ideal solution, is there a better fix ?
Comments (10)
-
Account Deleted reporter -
Account Deleted reporter The updated patch simplifies things, and adds an argument to whether a 32-bit adjustment is needed.
Additionally the "now" value is guaranteed to be increasing, which is needed if the history to have meaning. So if the system counters got reset the next sample would be 0 change and further samples would continue normally.
I think it is important the key Monit counter "now" is incremented rather than copied from the system counters.
Also if it were ever desired for the Monit counter "now" to start at 0LL whenever Monit is started, this is now possible, but not currently implemented.
This works in my testing.
BTW, the patch only implements _updateRawStats() for "os/linux", but could be applied to the others.
-
repo owner Hi Lonnie,
the problem was fixed in the development version, thanks for pointing out the problem and for the patch.
-
repo owner - changed status to resolved
-
Account Deleted reporter Thanks for making this happen.
But, looking over your implementation I see a couple issues...
1) In libmonit / src / system / Link.c, _updateValue(), the added overflow constant should be 2^32 not 2^32 - 1
value = data->now + 0xFFFFFFFFLL - data->raw + raw; // Counter wrapped
Change to:
value = data->now + 0x100000000LL - data->raw + raw; // Counter wrapped
2) Also in ibmonit / src / system / Link.c, _reset() the value of data->raw needs to be initialized to 0LL since _updateValue() uses data->raw before setting it
add to _reset()
#ifndef __LP64__ T->ibytes.raw = 0LL; T->ipackets.raw = 0LL; T->ierrors.raw = 0LL; T->obytes.raw = 0LL; T->opackets.raw = 0LL; T->oerrors.raw = 0LL; #endif
3) Also, not 100% sure but it appears to my eye that moving
data->last = data->now = 0LL;
to _resetData() is incorrect, it should be done only once via _reset().
-
Account Deleted reporter Bump... Your [ libmonit / src / system / Link.c ] code looks good, but still is missing my above 2) comment from the previous note, initializing the .raw values to 0ULL in _reset()
-
repo owner Hi Lonnie,
thanks for report, yes, the raw value should be reset too - fixed in _resetData(), which is called with 0 from _reset() (will be NOOP when called from first _updateHistory() call)
-
Account Deleted reporter Thanks, but are you certain that setting raw = now via _updateHistory() is proper ?
-
repo owner yes, it is called on first update where 'now' equals 'raw'
-
repo owner - removed version
Removing version: 5.11 (automated comment)
- Log in to comment
Updated version of the the patch.
Also available here