*** Error in `/usr/bin/monit': malloc(): memory corruption: 0x0000000002229970 ***

Issue #1005 resolved
Tom Hodder created an issue

I tried deploying Monit to monitor a Ceph cluster, and I’m seeing a crash as it starts up.

In particular, it seems to be failing in check_remote_host:

Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit(Socket_test+0x132)[0x426012]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit[0x42e315]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit(check_remote_host+0x1c3)[0x432693]

This is odd, because I don’t have any of those defined in my config (i.e. no check host) at all.

Oct 18 12:14:50 mon-node1 monit[7638]: Starting Monit 5.26.0 daemon with http interface at [0.0.0.0]:2812
Oct 18 12:14:50 mon-node1 monit[7638]: Monit will delay for 30s on first start after reboot ...
Oct 18 12:15:20 mon-node1 monit[7638]: 'mon-node1' Monit 5.26.0 started
Oct 18 12:15:20 mon-node1 monit[7638]: 'ceph_mon_proc' service restarted 3 times within 3 cycles(s) - unmonitor
Oct 18 12:15:20 mon-node1 monit[7638]: *** Error in `/usr/bin/monit': malloc(): memory corruption: 0x0000000002229970 ***
Oct 18 12:15:20 mon-node1 monit[7638]: ======= Backtrace: =========
Oct 18 12:15:20 mon-node1 monit[7638]: /lib64/libc.so.6(+0x82b36)[0x7fedc3ca3b36]
Oct 18 12:15:20 mon-node1 monit[7638]: /lib64/libc.so.6(__libc_malloc+0x4c)[0x7fedc3ca678c]
Oct 18 12:15:20 mon-node1 monit[7638]: /lib64/libc.so.6(+0x218bd)[0x7fedc3c428bd]
Oct 18 12:15:20 mon-node1 monit[7638]: /lib64/libc.so.6(+0xdc0ef)[0x7fedc3cfd0ef]
Oct 18 12:15:20 mon-node1 monit[7638]: /lib64/libc.so.6(regexec+0xc5)[0x7fedc3d02815]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit(check_generic+0x19c)[0x44700c]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit[0x425dac]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit(Socket_test+0x132)[0x426012]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit[0x42e315]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit(check_remote_host+0x1c3)[0x432693]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit(validate+0x1b6)[0x42f9b6]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit(main+0x4f1)[0x40d0a1]
Oct 18 12:15:20 mon-node1 monit[7638]: /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fedc3c43555]
Oct 18 12:15:20 mon-node1 monit[7638]: /usr/bin/monit[0x40d6e3]
Oct 18 12:15:20 mon-node1 monit[7638]: ======= Memory map: ========
Oct 18 12:15:20 mon-node1 monit[7638]: 00400000-004b2000 r-xp 00000000 fd:00 85422 /usr/bin/monit
Oct 18 12:15:20 mon-node1 monit[7638]: 006b1000-006b2000 r--p 000b1000 fd:00 85422 /usr/bin/monit
Oct 18 12:15:20 mon-node1 monit[7638]: 006b2000-006b4000 rw-p 000b2000 fd:00 85422 /usr/bin/monit
Oct 18 12:15:20 mon-node1 monit[7638]: 006b4000-006b8000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 021f7000-02255000 rw-p 00000000 00:00 0 [heap]
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedb4000000-7fedb4021000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedb4021000-7fedb8000000 ---p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedbc000000-7fedbc021000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedbc021000-7fedc0000000 ---p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc15b7000-7fedc15cc000 r-xp 00000000 fd:00 33576299                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc15cc000-7fedc17cb000 ---p 00015000 fd:00 33576299                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc17cb000-7fedc17cc000 r--p 00014000 fd:00 33576299                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc17cc000-7fedc17cd000 rw-p 00015000 fd:00 33576299                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc17cd000-7fedc17ce000 ---p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc17ce000-7fedc1fce000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc1fce000-7fedc1fda000 r-xp 00000000 fd:00 34214705                   /usr/lib64/libnss_files-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc1fda000-7fedc21d9000 ---p 0000c000 fd:00 34214705                   /usr/lib64/libnss_files-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc21d9000-7fedc21da000 r--p 0000b000 fd:00 34214705                   /usr/lib64/libnss_files-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc21da000-7fedc21db000 rw-p 0000c000 fd:00 34214705                   /usr/lib64/libnss_files-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc21db000-7fedc21e1000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc21e1000-7fedc2241000 r-xp 00000000 fd:00 33566932                   /usr/lib64/libpcre.so.1.2.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2241000-7fedc2441000 ---p 00060000 fd:00 33566932                   /usr/lib64/libpcre.so.1.2.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2441000-7fedc2442000 r--p 00060000 fd:00 33566932                   /usr/lib64/libpcre.so.1.2.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2442000-7fedc2443000 rw-p 00061000 fd:00 33566932                   /usr/lib64/libpcre.so.1.2.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2443000-7fedc2467000 r-xp 00000000 fd:00 34122070                   /usr/lib64/libselinux.so.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2467000-7fedc2666000 ---p 00024000 fd:00 34122070                   /usr/lib64/libselinux.so.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2666000-7fedc2667000 r--p 00023000 fd:00 34122070                   /usr/lib64/libselinux.so.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2667000-7fedc2668000 rw-p 00024000 fd:00 34122070                   /usr/lib64/libselinux.so.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2668000-7fedc266a000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc266a000-7fedc266d000 r-xp 00000000 fd:00 33567120                   /usr/lib64/libkeyutils.so.1.5
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc266d000-7fedc286c000 ---p 00003000 fd:00 33567120                   /usr/lib64/libkeyutils.so.1.5
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc286c000-7fedc286d000 r--p 00002000 fd:00 33567120                   /usr/lib64/libkeyutils.so.1.5
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc286d000-7fedc286e000 rw-p 00003000 fd:00 33567120                   /usr/lib64/libkeyutils.so.1.5
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc286e000-7fedc287c000 r-xp 00000000 fd:00 33615309                   /usr/lib64/libkrb5support.so.0.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc287c000-7fedc2a7c000 ---p 0000e000 fd:00 33615309                   /usr/lib64/libkrb5support.so.0.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2a7c000-7fedc2a7d000 r--p 0000e000 fd:00 33615309                   /usr/lib64/libkrb5support.so.0.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2a7d000-7fedc2a7e000 rw-p 0000f000 fd:00 33615309                   /usr/lib64/libkrb5support.so.0.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2a7e000-7fedc2a82000 r-xp 00000000 fd:00 33567095                   /usr/lib64/libcap-ng.so.0.0.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2a82000-7fedc2c82000 ---p 00004000 fd:00 33567095                   /usr/lib64/libcap-ng.so.0.0.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2c82000-7fedc2c83000 r--p 00004000 fd:00 33567095                   /usr/lib64/libcap-ng.so.0.0.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2c83000-7fedc2c84000 rw-p 00005000 fd:00 33567095                   /usr/lib64/libcap-ng.so.0.0.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2c84000-7fedc2cb5000 r-xp 00000000 fd:00 33580020                   /usr/lib64/libk5crypto.so.3.1
Oct 18 12:15:20 mon-node1 systemd[1]: monit.service: main process exited, code=killed, status=6/ABRT
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2cb5000-7fedc2eb4000 ---p 00031000 fd:00 33580020                   /usr/lib64/libk5crypto.so.3.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2eb4000-7fedc2eb6000 r--p 00030000 fd:00 33580020                   /usr/lib64/libk5crypto.so.3.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2eb6000-7fedc2eb7000 rw-p 00032000 fd:00 33580020                   /usr/lib64/libk5crypto.so.3.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2eb7000-7fedc2eba000 r-xp 00000000 fd:00 34214736                   /usr/lib64/libcom_err.so.2.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc2eba000-7fedc30b9000 ---p 00003000 fd:00 34214736                   /usr/lib64/libcom_err.so.2.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc30b9000-7fedc30ba000 r--p 00002000 fd:00 34214736                   /usr/lib64/libcom_err.so.2.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc30ba000-7fedc30bb000 rw-p 00003000 fd:00 34214736                   /usr/lib64/libcom_err.so.2.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc30bb000-7fedc3194000 r-xp 00000000 fd:00 33580028                   /usr/lib64/libkrb5.so.3.3
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3194000-7fedc3393000 ---p 000d9000 fd:00 33580028                   /usr/lib64/libkrb5.so.3.3
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3393000-7fedc33a1000 r--p 000d8000 fd:00 33580028                   /usr/lib64/libkrb5.so.3.3
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc33a1000-7fedc33a4000 rw-p 000e6000 fd:00 33580028                   /usr/lib64/libkrb5.so.3.3
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc33a4000-7fedc33ee000 r-xp 00000000 fd:00 33615305                   /usr/lib64/libgssapi_krb5.so.2.2
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc33ee000-7fedc35ee000 ---p 0004a000 fd:00 33615305                   /usr/lib64/libgssapi_krb5.so.2.2
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc35ee000-7fedc35ef000 r--p 0004a000 fd:00 33615305                   /usr/lib64/libgssapi_krb5.so.2.2
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc35ef000-7fedc35f1000 rw-p 0004b000 fd:00 33615305                   /usr/lib64/libgssapi_krb5.so.2.2
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc35f1000-7fedc35f3000 r-xp 00000000 fd:00 33562483                   /usr/lib64/libfreebl3.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc35f3000-7fedc37f2000 ---p 00002000 fd:00 33562483                   /usr/lib64/libfreebl3.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc37f2000-7fedc37f3000 r--p 00001000 fd:00 33562483                   /usr/lib64/libfreebl3.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc37f3000-7fedc37f4000 rw-p 00002000 fd:00 33562483                   /usr/lib64/libfreebl3.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc37f4000-7fedc37f6000 r-xp 00000000 fd:00 34214693                   /usr/lib64/libdl-2.17.so
Oct 18 12:15:20 mon-node1 systemd[1]: Unit monit.service entered failed state.
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc37f6000-7fedc39f6000 ---p 00002000 fd:00 34214693                   /usr/lib64/libdl-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc39f6000-7fedc39f7000 r--p 00002000 fd:00 34214693                   /usr/lib64/libdl-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc39f7000-7fedc39f8000 rw-p 00003000 fd:00 34214693                   /usr/lib64/libdl-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc39f8000-7fedc3a16000 r-xp 00000000 fd:00 34214741                   /usr/lib64/libaudit.so.1.0.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3a16000-7fedc3c15000 ---p 0001e000 fd:00 34214741                   /usr/lib64/libaudit.so.1.0.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3c15000-7fedc3c16000 r--p 0001d000 fd:00 34214741                   /usr/lib64/libaudit.so.1.0.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3c16000-7fedc3c17000 rw-p 0001e000 fd:00 34214741                   /usr/lib64/libaudit.so.1.0.0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3c17000-7fedc3c21000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3c21000-7fedc3de5000 r-xp 00000000 fd:00 33562772                   /usr/lib64/libc-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3de5000-7fedc3fe4000 ---p 001c4000 fd:00 33562772                   /usr/lib64/libc-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3fe4000-7fedc3fe8000 r--p 001c3000 fd:00 33562772                   /usr/lib64/libc-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3fe8000-7fedc3fea000 rw-p 001c7000 fd:00 33562772                   /usr/lib64/libc-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3fea000-7fedc3fef000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc3fef000-7fedc4226000 r-xp 00000000 fd:00 33610125                   /usr/lib64/libcrypto.so.1.0.2k
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4226000-7fedc4425000 ---p 00237000 fd:00 33610125                   /usr/lib64/libcrypto.so.1.0.2k
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4425000-7fedc4441000 r--p 00236000 fd:00 33610125                   /usr/lib64/libcrypto.so.1.0.2k
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4441000-7fedc444e000 rw-p 00252000 fd:00 33610125                   /usr/lib64/libcrypto.so.1.0.2k
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc444e000-7fedc4452000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4452000-7fedc44b9000 r-xp 00000000 fd:00 33580036                   /usr/lib64/libssl.so.1.0.2k
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc44b9000-7fedc46b9000 ---p 00067000 fd:00 33580036                   /usr/lib64/libssl.so.1.0.2k
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc46b9000-7fedc46bd000 r--p 00067000 fd:00 33580036                   /usr/lib64/libssl.so.1.0.2k
Oct 18 12:15:20 mon-node1 systemd[1]: monit.service failed.
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc46bd000-7fedc46c4000 rw-p 0006b000 fd:00 33580036                   /usr/lib64/libssl.so.1.0.2k
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc46c4000-7fedc46db000 r-xp 00000000 fd:00 34214697                   /usr/lib64/libnsl-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc46db000-7fedc48da000 ---p 00017000 fd:00 34214697                   /usr/lib64/libnsl-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc48da000-7fedc48db000 r--p 00016000 fd:00 34214697                   /usr/lib64/libnsl-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc48db000-7fedc48dc000 rw-p 00017000 fd:00 34214697                   /usr/lib64/libnsl-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc48dc000-7fedc48de000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc48de000-7fedc48f4000 r-xp 00000000 fd:00 34214715                   /usr/lib64/libresolv-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc48f4000-7fedc4af4000 ---p 00016000 fd:00 34214715                   /usr/lib64/libresolv-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4af4000-7fedc4af5000 r--p 00016000 fd:00 34214715                   /usr/lib64/libresolv-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4af5000-7fedc4af6000 rw-p 00017000 fd:00 34214715                   /usr/lib64/libresolv-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4af6000-7fedc4af8000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4af8000-7fedc4b00000 r-xp 00000000 fd:00 33562777                   /usr/lib64/libcrypt-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4b00000-7fedc4cff000 ---p 00008000 fd:00 33562777                   /usr/lib64/libcrypt-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4cff000-7fedc4d00000 r--p 00007000 fd:00 33562777                   /usr/lib64/libcrypt-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4d00000-7fedc4d01000 rw-p 00008000 fd:00 33562777                   /usr/lib64/libcrypt-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4d01000-7fedc4d2f000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4d2f000-7fedc4d46000 r-xp 00000000 fd:00 34214713                   /usr/lib64/libpthread-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4d46000-7fedc4f45000 ---p 00017000 fd:00 34214713                   /usr/lib64/libpthread-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4f45000-7fedc4f46000 r--p 00016000 fd:00 34214713                   /usr/lib64/libpthread-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4f46000-7fedc4f47000 rw-p 00017000 fd:00 34214713                   /usr/lib64/libpthread-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4f47000-7fedc4f4b000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4f4b000-7fedc4f60000 r-xp 00000000 fd:00 34214734                   /usr/lib64/libz.so.1.2.7
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc4f60000-7fedc515f000 ---p 00015000 fd:00 34214734                   /usr/lib64/libz.so.1.2.7
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc515f000-7fedc5160000 r--p 00014000 fd:00 34214734                   /usr/lib64/libz.so.1.2.7
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5160000-7fedc5161000 rw-p 00015000 fd:00 34214734                   /usr/lib64/libz.so.1.2.7
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5161000-7fedc516e000 r-xp 00000000 fd:00 33615332                   /usr/lib64/libpam.so.0.83.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc516e000-7fedc536e000 ---p 0000d000 fd:00 33615332                   /usr/lib64/libpam.so.0.83.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc536e000-7fedc536f000 r--p 0000d000 fd:00 33615332                   /usr/lib64/libpam.so.0.83.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc536f000-7fedc5370000 rw-p 0000e000 fd:00 33615332                   /usr/lib64/libpam.so.0.83.1
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5370000-7fedc5471000 r-xp 00000000 fd:00 34214695                   /usr/lib64/libm-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5471000-7fedc5670000 ---p 00101000 fd:00 34214695                   /usr/lib64/libm-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5670000-7fedc5671000 r--p 00100000 fd:00 34214695                   /usr/lib64/libm-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5671000-7fedc5672000 rw-p 00101000 fd:00 34214695                   /usr/lib64/libm-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5672000-7fedc5694000 r-xp 00000000 fd:00 34214690                   /usr/lib64/ld-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5880000-7fedc588b000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5890000-7fedc5893000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5893000-7fedc5894000 r--p 00021000 fd:00 34214690                   /usr/lib64/ld-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5894000-7fedc5895000 rw-p 00022000 fd:00 34214690                   /usr/lib64/ld-2.17.so
Oct 18 12:15:20 mon-node1 monit[7638]: 7fedc5895000-7fedc5896000 rw-p 00000000 00:00 0
Oct 18 12:15:20 mon-node1 monit[7638]: 7fffa04dd000-7fffa04fe000 rw-p 00000000 00:00 0                          [stack]
Oct 18 12:15:20 mon-node1 monit[7638]: 7fffa058d000-7fffa058f000 r-xp 00000000 00:00 0                          [vdso]
Oct 18 12:15:20 mon-node1 monit[7638]: ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

These are the last entries in the log:

[root@mon-node1 ~]# tail -f --lines=0 /var/log/monit.log
[UTC Oct 18 12:10:41] error    : Monit: the monit daemon is not running
[UTC Oct 18 12:12:58] info     : Starting Monit 5.26.0 daemon with http interface at [0.0.0.0]:2812
[UTC Oct 18 12:12:58] info     : Monit will delay for 30s on first start after reboot ...
[UTC Oct 18 12:13:03] error    : Cannot create socket to [0.0.0.0]:2812 -- Connection refused
[UTC Oct 18 12:13:07] error    : Cannot create socket to [0.0.0.0]:2812 -- Connection refused
[UTC Oct 18 12:13:19] error    : Cannot create socket to [0.0.0.0]:2812 -- Connection refused
[UTC Oct 18 12:13:28] info     : 'mon-node1' Monit 5.26.0 started
[UTC Oct 18 12:13:28] error    : 'ceph_mon_proc' service restarted 3 times within 3 cycles(s) - unmonitor
[UTC Oct 18 12:14:50] info     : Starting Monit 5.26.0 daemon with http interface at [0.0.0.0]:2812
[UTC Oct 18 12:14:50] info     : Monit will delay for 30s on first start after reboot ...
[UTC Oct 18 12:15:20] info     : 'mon-node1' Monit 5.26.0 started
[UTC Oct 18 12:15:20] error    : 'ceph_mon_proc' service restarted 3 times within 3 cycles(s) - unmonitor

Comments (8)

  1. Tom Hodder reporter

    [root@mon-node1 ~]# cat /etc/monitrc
    ###############################################################################
    ## Monit control file
    ###############################################################################

    set daemon 30
        with start delay 30
    set log syslog
    set mailserver localhost
    set eventqueue
        basedir /var/monit  # set the base directory where events will be stored

    set mail-format {
        from: monit@mon-node1
        subject: $SERVICE $EVENT  ($HOST)
        message: message: $EVENT Service $SERVICE
        Date:        $DATE
        Action:      $ACTION
        Host:        $HOST
        Description: $DESCRIPTION
        Yours sincerely,
        monit
        http://$HOST:2812/

    }

    set alert root@localhost
    # NOT ON { action, instance, pid, ppid }

    set httpd port 2812 and
        use address 127.0.0.1
        allow admin:"changeme"

    include /etc/monit.d/*

    And this is my main ceph.conf:

    [root@mon-node1 ~]# cat /etc/monit.d/ceph.conf

    check file ceph_mon_bin with path /usr/bin/ceph-mon
        mode passive
        group ceph

        if failed checksum then unmonitor
        if failed permission 755 then unmonitor
        if failed uid root then unmonitor
        if failed gid root then unmonitor
        # if 3 restarts within 5 cycles then unmonitor

    check process ceph_mon_proc matching "/bin/ceph-mon"
        # mode passive
        group ceph
        # if not exist then exec "/usr/bin/echo test"
        # if disk write activity > 500 operations/s then alert
        if cpu > 60% for 2 cycles then alert
        if cpu > 80% for 5 cycles then restart
        if totalmem > 200.0 MB for 5 cycles then restart
        if children > 250 then restart
        if disk read > 500 kb/s for 10 cycles then alert
        if disk write > 500 kb/s for 10 cycles then alert
        if 3 restarts within 5 cycles then unmonitor

    check file ceph_mds_bin with path /bin/ceph-mds
        mode passive
        group ceph
        # start program = "/usr/bin/systemctl start ceph-mds@mon-node1"
        # stop program  = "/usr/bin/systemctl stop  ceph-mds@mon-node1"

        if failed checksum then unmonitor
        if failed permission 755 then unmonitor
        if failed uid root then unmonitor
        if failed gid root then unmonitor

    check process ceph_mds_proc matching "/bin/ceph-mds"
        if not exist then exec "/usr/bin/echo test"

    check file ceph_mgr_bin with path /bin/ceph-mgr
        mode passive
        group ceph
        start program = "/usr/bin/systemctl start ceph-mgr@mon-node1"
        stop program  = "/usr/bin/systemctl stop  ceph-mgr@mon-node1"

        if failed checksum then unmonitor
        if failed permission 755 then unmonitor
        if failed uid root then unmonitor
        if failed gid root then unmonitor

    check process ceph_mgr_proc matching "ceph-mgr"
        if not exist then exec "/usr/bin/echo test"

    check program ceph_health with path "/usr/bin/ceph health"
        if status != 0 then alert
        # if content != "HEALTH_OK" then alert

  2. Tildeslash repo owner

    check_remote_host() probably shows up because of the stack corruption; in such a case the backtrace may not correspond to the actual program flow.

    Is the problem easily reproducible? If so, could you please send your Monit configuration to support@mmonit.com?

    Note that Monit 5.26.0 is more than 2 years old, so it is possible the problem has already been fixed. Please test with Monit 5.29.0 if you can.

  3. Tom Hodder reporter

    Ah, I see. I had left an old config file in there.

    This is the one; if I remove it, I don’t get that error:

    Remote Host Name      = ceph_mon_host
    Address              = 192.168.0.2
    Monitoring mode      = active
    On reboot            = start
    Ping                 = if failed [count 3 size 64 with timeout 5 s] then alert
    Port                 = if failed [192.168.0.2]:6789 type TCP/IP protocol generic with timeout 15 s then alert

    The machine is:

    [root@mon-node1 ~]# cat /etc/os-release
    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    PRETTY_NAME="CentOS Linux 7 (Core)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:centos:centos:7"
    HOME_URL="https://www.centos.org/"
    BUG_REPORT_URL="https://bugs.centos.org/"

    Monit is:

    [root@mon-node1 ~]# rpm -qi monit
    Name        : monit
    Version     : 5.26.0
    Release     : 1.el7
    Architecture: x86_64
    Install Date: Sun 17 Oct 2021 10:51:36 PM UTC
    Group       : Applications/Internet
    Size        : 842200
    License     : AGPLv3
    Signature   : RSA/SHA256, Wed 04 Mar 2020 06:07:33 AM UTC, Key ID 6a2faea2352c64e5
    Source RPM  : monit-5.26.0-1.el7.src.rpm
    Build Date  : Wed 04 Mar 2020 06:03:52 AM UTC
    Build Host  : buildvm-04.phx2.fedoraproject.org
    Relocations : (not relocatable)
    Packager    : Fedora Project
    Vendor      : Fedora Project
    URL         : http://mmonit.com/monit/
    Bug URL     : https://bugz.fedoraproject.org/monit
    Summary     : Manages and monitors processes, files, directories and devices
    Description :
    monit is a utility for managing and monitoring, processes, files, directories
    and devices on a UNIX system. Monit conducts automatic maintenance and repair
    and can execute meaningful causal actions in error situations.

  4. Tom Hodder reporter

    Ah OK, so the culprit is the expect statement:

    expect "^ceph"

    This fails:

    check host ceph_mon_host with address 192.168.0.2
        if failed ping then alert
        if failed
            port 6789
            expect "^ceph"
        then alert

    This doesn’t:

    check host ceph_mon_host with address 192.168.0.2
        if failed ping then alert
        if failed
            port 6789
        then alert

  5. Tildeslash repo owner

    Fixed: Issue #1005: When the port statement is used with the generic protocol and the server returned zeros in the response, Monit may crash. The problem was introduced in Monit 5.20.0.

    → <<cset 1924bfc8e63d>>
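
For readers wondering how zeros in a response relate to the regexec() frame in the backtrace: POSIX regexec() takes a NUL-terminated string, so a receive buffer that may hold binary data needs a terminator before it is matched. The C sketch below shows that general precaution only. It is not the code from the changeset referenced above, and response_matches() with its parameters is a hypothetical helper invented for illustration.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not Monit's code): match `length` bytes received from
 * a socket against a POSIX extended regular expression.
 * Returns 1 on match, 0 on no match, -1 on error.
 */
int response_matches(const char *response, size_t length, const char *pattern) {
    regex_t regex;
    if (regcomp(&regex, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* Copy into a buffer with room for a terminator, so regexec() can never
       read past the bytes that were actually received. */
    char *text = malloc(length + 1);
    if (text == NULL) {
        regfree(&regex);
        return -1;
    }
    memcpy(text, response, length);
    text[length] = '\0';

    /* regexec() treats its input as a C string: matching stops at the first
       zero byte, so data after an embedded zero is not examined. */
    int rc = regexec(&regex, text, 0, NULL, 0);

    free(text);
    regfree(&regex);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

With this helper, response_matches("ceph v027", 9, "^ceph") reports a match for that sample string, while a six-byte response that begins with two zero bytes does not match "^ceph", because regexec() sees an empty string.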
