[patch] Linux 32-bit NetStatistics.c fix

Issue #138 resolved
Lonnie Abelbeck created an issue

Hi again, :-)

Since the values of /proc/net/dev on 32-bit Linux systems are 32-bit, the values easily overflow and rollover... about 40 seconds for a 1 GB interface.

The attached patch monitors the delta of the /proc/net/dev entries and adds 2**32 if it has rolled over and sums the difference to the "now" value. 64-bit values should continue to work.

As long as a daemon delay of 30 is used a 1 GB interfaces can be monitored.

If there was a way to call "ethtool -S eth0" instead then it could be used to obtain all of the Linux net statistics values, but given this is in libmonit that probably isn't an option.

While this may not be an ideal solution, is there a better fix ?

Comments (10)

  1. Lonnie Abelbeck reporter

    The updated patch simplifies things, and adds an argument to whether a 32-bit adjustment is needed.

    Additionally the "now" value is guaranteed to be increasing, which is needed if the history to have meaning. So if the system counters got reset the next sample would be 0 change and further samples would continue normally.

    I think it is important the key Monit counter "now" is incremented rather than copied from the system counters.

    Also if it were ever desired for the Monit counter "now" to start at 0LL whenever Monit is started, this is now possible, but not currently implemented.

    This works in my testing.

    BTW, the patch only implements _updateRawStats() for "os/linux", but could be applied to the others.

  2. Tildeslash repo owner

    Hi Lonnie,

    the problem was fixed in the development version, thanks for pointing out the problem and for the patch.

  3. Lonnie Abelbeck reporter

    Thanks for making this happen.

    But, looking over your implementation I see a couple issues...

    1) In libmonit / src / system / Link.c, _updateValue(), the added overflow constant should be 2^32 not 2^32 - 1

    value = data->now + 0xFFFFFFFFLL - data->raw + raw; // Counter wrapped
    

    Change to:

    value = data->now + 0x100000000LL - data->raw + raw; // Counter wrapped
    

    2) Also in ibmonit / src / system / Link.c, _reset() the value of data->raw needs to be initialized to 0LL since _updateValue() uses data->raw before setting it

    add to _reset()

    #ifndef __LP64__
      T->ibytes.raw = 0LL;
      T->ipackets.raw = 0LL;
      T->ierrors.raw = 0LL;
      T->obytes.raw = 0LL;
      T->opackets.raw = 0LL;
      T->oerrors.raw = 0LL;
    #endif
    

    3) Also, not 100% sure but it appears to my eye that moving

    data->last = data->now = 0LL;
    

    to _resetData() is incorrect, it should be done only once via _reset().

  4. Lonnie Abelbeck reporter

    Bump... Your [ libmonit / src / system / Link.c ] code looks good, but still is missing my above 2) comment from the previous note, initializing the .raw values to 0ULL in _reset()

  5. Tildeslash repo owner

    Hi Lonnie,

    thanks for report, yes, the raw value should be reset too - fixed in _resetData(), which is called with 0 from _reset() (will be NOOP when called from first _updateHistory() call)

  6. Lonnie Abelbeck reporter

    Thanks, but are you certain that setting raw = now via _updateHistory() is proper ?

  7. Log in to comment