Monit 5.8.1 dies

Issue #74 closed
Artem Russakovskii created an issue

Hi,

Monit 5.8.1 on 32bit OpenSuse 12.3 seems to die every few days with this error:

critical : AssertException: nbytes > 0
 raised in Mem_alloc at src/system/Mem.c:51

The box has 4GB RAM and 2GB swap and is usually not loaded. Monit also seems to be the only process that dies.

Comments (19)

  1. Tildeslash repo owner

    Hello,

    could you please send the Monit log (enabled with the "set logfile" statement) and your Monit configuration file to support@mmonit.com?

  2. Artem Russakovskii reporter

    Hi,

    Thanks for the quick response.

    Sending the log now, with some auth information replaced with * characters.

  3. Tildeslash repo owner

    Hi,

    we're unable to replicate the issue in our lab (testing on OpenSUSE 12.3 32-bit with the same configuration).

    Could you please enable core dumps (run "ulimit -c unlimited" before starting monit) and then restart monit?

    You can also set global coredump like this:

    sysctl -w kernel.core_pattern="/var/crash/core.%e.%t.%p"
    sysctl -w kernel.core_uses_pid=1
    

    When Monit terminates, please send the stack trace:

    gdb <path to monit> <path to monit's coredump>
    (gdb) thr apply all bt
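
The same backtrace can probably be captured without an interactive session. The following is a minimal sketch, assuming gdb is installed and that a core file exists; `save_backtrace`, its arguments, and the default output file name are hypothetical names chosen for illustration and are not part of Monit.

```shell
#!/bin/sh
# Sketch: save a backtrace of all threads from a core file in one command.
# save_backtrace and its arguments are hypothetical names used for illustration.
save_backtrace() {
    binary=$1                        # path to the monit executable that crashed
    core=$2                          # path to the core file it left behind
    out=${3:-monit-backtrace.txt}    # assumed default output file name

    # Refuse to run if either input is missing, so no empty report is produced.
    if [ ! -f "$binary" ] || [ ! -f "$core" ]; then
        echo "usage: save_backtrace <monit binary> <core file> [output file]" >&2
        return 2
    fi

    # -batch makes gdb exit after running its commands; -ex runs one gdb command.
    gdb -batch -ex "thread apply all bt" "$binary" "$core" > "$out" 2>&1
}
```

Calling `save_backtrace <path to monit> <path to monit's coredump>` would then leave the trace in a text file that can be attached to an email.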
    
  4. Artem Russakovskii reporter

    Wow, just checked the 64-bit OpenSUSE 13.1 install, and it crashes there too.

    [PDT Sep 6 10:54:57] critical : AssertException: nbytes > 0 raised in Mem_alloc at src/system/Mem.c:51

    I've just run your commands and restarted monit. Let's see if it generates a core dump next time.

  5. Tildeslash repo owner

    Any news on this? We are ready with Monit 5.9, but we do not want to release the new version before this is fixed. Unfortunately we need your help, since we are not able to reproduce the crash with our config test.

  6. Artem Russakovskii reporter

    @tildeslash Unfortunately, no. It's been running without a crash ever since my last comment.

    I don't think you should delay the release, as the issue was there for me on the 5.8 and 5.8.1 releases, and I think it may be very limited. Either way, 5.9 wouldn't introduce it as a new bug, so I think you're good to go.

  7. Artem Russakovskii reporter

    Well, it crashed, but no core. I've always had trouble getting cores generated in OpenSUSE. There are too many things that might prevent it: it's like the perfect storm of 5 different variables that need to be set in order for the core to be produced.

  8. Tildeslash repo owner

    Unfortunately, we need the core to pinpoint the problem, and so far you are the only one who has seen this crash.

  9. Tildeslash repo owner

    It may help to start Monit from the console (not via the system's start/stop scripts):

    1.) Open a terminal (or use the "screen" utility so that you can return to the session).
    2.) Run "ulimit -c unlimited" (enables core dumps with no size limit).
    3.) Run monit from that terminal (the "-I" option runs it in the foreground; don't close this terminal): monit -I

    The core will be dumped either to the central location (if the kernel.core_pattern mentioned earlier is set) or to the current directory.
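
Before waiting days for the next crash, it may be worth confirming where a core would actually land. The following is a minimal sketch for Linux, meant to be run from the same terminal that will start monit; `core_target` is a hypothetical helper, and the script only reports settings rather than changing them. It does not cover every condition that can suppress a core file.

```shell
#!/bin/sh
# Sketch: report the settings that commonly decide whether a core file is written.
# core_target is a hypothetical helper, not part of Monit.

# Print where a core would land for a given core_pattern and working directory.
core_target() {
    pattern=$1
    cwd=$2
    case "$pattern" in
        '|'*) echo "piped to a helper program" ;;  # kernel hands the core to a program
        /*)   dirname "$pattern" ;;                # absolute path: its directory
        *)    echo "$cwd" ;;                       # relative name: resolved from the working directory
    esac
}

# A limit of 0 means no core file is written for processes started from this shell.
echo "core size limit: $(ulimit -c)"

if [ -r /proc/sys/kernel/core_pattern ]; then
    pattern=$(cat /proc/sys/kernel/core_pattern)
    echo "core_pattern: $pattern"
    target=$(core_target "$pattern" "$PWD")
    echo "core destination: $target"
    # The user running monit must be able to write to the destination directory.
    if [ -d "$target" ] && [ ! -w "$target" ]; then
        echo "warning: $target is not writable by $(id -un)"
    fi
fi
```

If the reported limit is 0 or the destination is not writable, a crash would leave no core even with the steps above followed.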

  10. Artem Russakovskii reporter

    Unfortunately, no. The core dump wasn't generated even given everything above. It also hasn't crashed much lately, though that may be attributed to the load of the system which has been lower lately.

  11. Tildeslash repo owner

    We're closing this task for now. Please reopen if the problem should reoccur and you have more info.

  12. grigorye

    Just in case, the coredump would take something like 769MB (I tried to kill monit for the test). I can try to lldb it as necessary though (it looks like it works, at least I can see stacktraces).
