sshguard dies with EPIPE on log rotation by metalog
Issue #183
open
- OS: Source Mage GNU/Linux
- SSHGuard version: 2.4.3
- SSHGuard invocation: /usr/sbin/sshguard -i /var/run/sshguard.pid
- Firewall backend: /usr/libexec/sshg-fw-ipset
I run metalog on a couple of servers, and every 5-6 days sshguard randomly dies. I tracked it down to this EPIPE error:
01:56:23 read(4, "\1\0\0\0\200\0\0\0p\10\0\0 \0\0\0log-2023-09-21-01:56:23\0\0\0\0\0\0\0\0\0\2\0\0\0\0\10\0\0\0\0\0\0\0\0\0\0", 76) = 64
01:56:23 open("/var/log/sshd/current", O_RDONLY|O_NONBLOCK) = 5
01:56:23 lstat("/var/log/sshd/current", {st_mode=S_IFREG|0644, st_size=105, ...}) = 0
01:56:23 fstat(5, {st_mode=S_IFREG|0644, st_size=105, ...}) = 0
01:56:23 fstatfs(5, {f_type=0x58465342, f_bsize=4096, f_blocks=3668731, f_bfree=3323354, f_bavail=3323354, f_files=14691264, f_ffree=14626964, f_fsid={64774, 0}, f_namelen=255, f_frsize=4096}) = 0
01:56:23 close(3) = 0
01:56:23 open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
01:56:23 fstat(3, {st_mode=S_IFREG|0644, st_size=2997, ...}) = 0
01:56:23 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdc9b3f3000
…
01:56:23 write(2, "tail: ", 6) = -1 EPIPE (Broken pipe)
01:56:23 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6128, si_uid=0} ---
01:56:23 +++ killed by SIGPIPE +++
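For readers unfamiliar with the failure mode in the last three lines of the trace: EPIPE on write(2, ...) means that file descriptor 2 of the traced process refers to a pipe (or socket) whose reading end has already been closed. The following is only a minimal sketch of that mechanism, not SSHGuard code; the function name write_to_closed_pipe is made up for illustration. Python ignores SIGPIPE by default, so the error surfaces as an exception here instead of killing the process the way it did in the trace.

```python
import errno
import os


def write_to_closed_pipe() -> int:
    """Write to a pipe whose read end is closed and return the errno seen.

    Hypothetical helper for illustration only; it is not part of SSHGuard.
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader goes away before the writer writes
    try:
        os.write(write_fd, b"tail: ")  # same payload as in the trace
    except BrokenPipeError as exc:
        # With SIGPIPE ignored, the kernel reports EPIPE instead of
        # terminating the writer.
        return exc.errno
    finally:
        os.close(write_fd)
    return 0


if __name__ == "__main__":
    print(errno.errorcode[write_to_closed_pipe()])
```

With the default SIGPIPE disposition, as in the traced process, the same write terminates the writer, which matches the "killed by SIGPIPE" line above.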
I’m assuming this happens when metalog rotates logs:
# ls -lh /var/log/sshd/
total 108K
-rw-r--r-- 1 root root 252 Sep 21 03:52 current
-rw-r--r-- 1 root root 87K Aug 23 21:50 log-2023-08-24-00:24:14.xz
-rw-r--r-- 1 root root 1.8K Aug 30 21:00 log-2023-08-31-00:14:59.xz
-rw-r--r-- 1 root root 1.9K Sep 6 21:45 log-2023-09-07-03:23:56.xz
-rw-r--r-- 1 root root 1.3K Sep 13 23:30 log-2023-09-14-00:33:33.xz
-rw-r--r-- 1 root root 1.3K Sep 20 18:34 log-2023-09-21-01:56:23.xz
These are the relevant parts of /etc/metalog.conf:
maxsize = 104857600 # size in bytes (1048576 = 1 megabyte)
maxtime = 604800 # time in seconds (86400 = 1 day)
maxfiles = 5 # num files per directory
postrotate_cmd = "/usr/bin/xz"
…
SSH Server:
program = "sshd"
logdir = "/var/log/sshd"
#break = 1
SSH Guard:
program = "sshguard"
logdir = "/var/log/sshguard"
maxtime = 86400
maxfiles = 31
break = 1
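As a rough cross-check of the rotation assumption, the timestamps embedded in the archive file names from the ls output can be compared with the global maxtime value. This is just a sketch using the values quoted in this report; the helper name rotation_gaps is made up for illustration.

```python
from datetime import datetime, timedelta

MAXTIME_SECONDS = 604800  # global maxtime from the metalog.conf excerpt

# Rotation timestamps taken from the archive file names in /var/log/sshd/.
ROTATIONS = [
    "2023-08-24-00:24:14",
    "2023-08-31-00:14:59",
    "2023-09-07-03:23:56",
    "2023-09-14-00:33:33",
    "2023-09-21-01:56:23",
]


def rotation_gaps(stamps):
    """Return the time elapsed between each pair of consecutive rotations."""
    times = [datetime.strptime(s, "%Y-%m-%d-%H:%M:%S") for s in stamps]
    return [later - earlier for earlier, later in zip(times, times[1:])]


period = timedelta(seconds=MAXTIME_SECONDS)
gaps = rotation_gaps(ROTATIONS)
for gap in gaps:
    print(f"gap between rotations: {gap} (maxtime: {period})")
```

Each gap comes out within a few hours of the configured maxtime, and the last rotation timestamp (01:56:23 on 2023-09-21) is the same second as the SIGPIPE in the trace, which fits the theory that the process dies at rotation time.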
Is there a way to handle this situation gracefully? Or is it already being handled, and this is just a very specific case that somehow breaks regular operation?
Comments (1)
Can you clarify how you're launching SSHGuard? Specifically, are you setting FILES in sshguard.conf so that SSHGuard tracks files?

Your system trace also shows a process that is being killed by SIGPIPE. Can you find out which process in the SSHGuard pipeline that is? Specifically, is it the shell script sshg-logtail (that may show up as sh or bash) or something else that is receiving this signal?