sshguard dies with EPIPE on log rotation by metalog

Issue #183 open
Val V created an issue
  • OS: Source Mage GNU/Linux
  • SSHGuard version: 2.4.3
  • SSHGuard invocation: /usr/sbin/sshguard -i /var/run/sshguard.pid
  • Firewall backend: /usr/libexec/sshg-fw-ipset

I run metalog on a couple of servers, and every 5-6 days sshguard dies, seemingly at random. I traced it to this EPIPE error (strace output):

01:56:23 read(4, "\1\0\0\0\200\0\0\0p\10\0\0 \0\0\0log-2023-09-21-01:56:23\0\0\0\0\0\0\0\0\0\2\0\0\0\0\10\0\0\0\0\0\0\0\0\0\0", 76) = 64
01:56:23 open("/var/log/sshd/current", O_RDONLY|O_NONBLOCK) = 5
01:56:23 lstat("/var/log/sshd/current", {st_mode=S_IFREG|0644, st_size=105, ...}) = 0
01:56:23 fstat(5, {st_mode=S_IFREG|0644, st_size=105, ...}) = 0
01:56:23 fstatfs(5, {f_type=0x58465342, f_bsize=4096, f_blocks=3668731, f_bfree=3323354, f_bavail=3323354, f_files=14691264, f_ffree=14626964, f_fsid={64774, 0}, f_namelen=255, f_frsize=4096}) = 0
01:56:23 close(3)                       = 0
01:56:23 open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
01:56:23 fstat(3, {st_mode=S_IFREG|0644, st_size=2997, ...}) = 0
01:56:23 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdc9b3f3000

01:56:23 write(2, "tail: ", 6)          = -1 EPIPE (Broken pipe)
01:56:23 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6128, si_uid=0} ---
01:56:23 +++ killed by SIGPIPE +++
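
The 64-byte buffer returned by read(4, ...) has the shape of two Linux inotify events, so decoding it may help tie the crash to a rename. The sketch below is only an illustration: it assumes fd 4 is an inotify descriptor (the trace does not show how fd 4 was opened) and a little-endian machine. The buffer is transcribed from the trace, the mask constants come from <sys/inotify.h>, and the function name is made up for this example.

```python
import struct

# Mask values from <sys/inotify.h>.
IN_MOVED_TO = 0x00000080   # a file was moved into a watched directory
IN_MOVE_SELF = 0x00000800  # the watched file itself was moved

# struct inotify_event header: int wd; uint32 mask, cookie, len (little-endian assumed).
HEADER = struct.Struct("<iIII")

# Buffer transcribed from the read(4, ...) line of the trace.
RAW = (
    b"\1\0\0\0" + b"\200\0\0\0" + b"p\10\0\0" + b" \0\0\0"
    + b"log-2023-09-21-01:56:23" + b"\0" * 9
    + b"\2\0\0\0" + b"\0\10\0\0" + b"\0\0\0\0" + b"\0\0\0\0"
)

def parse_inotify_events(buf):
    """Split a raw inotify read buffer into (wd, mask, cookie, name) tuples."""
    events = []
    offset = 0
    while offset + HEADER.size <= len(buf):
        wd, mask, cookie, name_len = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        # The name field is NUL-padded up to name_len bytes.
        name = buf[offset:offset + name_len].rstrip(b"\0").decode("utf-8", "replace")
        offset += name_len
        events.append((wd, mask, cookie, name))
    return events

if __name__ == "__main__":
    for wd, mask, cookie, name in parse_inotify_events(RAW):
        print(f"wd={wd} mask={mask:#x} cookie={cookie} name={name!r}")
```

Read this way, the first event is an IN_MOVED_TO for log-2023-09-21-01:56:23 and the second is an IN_MOVE_SELF on another watch, which would be consistent with the followed file being renamed at 01:56:23. Which path each watch descriptor refers to is not visible in the trace.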

I’m assuming this happens when metalog rotates logs:

# ls -lh /var/log/sshd/
total 108K
-rw-r--r-- 1 root root  252 Sep 21 03:52 current
-rw-r--r-- 1 root root  87K Aug 23 21:50 log-2023-08-24-00:24:14.xz
-rw-r--r-- 1 root root 1.8K Aug 30 21:00 log-2023-08-31-00:14:59.xz
-rw-r--r-- 1 root root 1.9K Sep  6 21:45 log-2023-09-07-03:23:56.xz
-rw-r--r-- 1 root root 1.3K Sep 13 23:30 log-2023-09-14-00:33:33.xz
-rw-r--r-- 1 root root 1.3K Sep 20 18:34 log-2023-09-21-01:56:23.xz

These are the relevant parts of /etc/metalog.conf:

maxsize  = 104857600  # size in bytes (1048576 = 1 megabyte)
maxtime  = 604800     # time in seconds (86400 = 1 day)
maxfiles  = 5          # num files per directory

postrotate_cmd = "/usr/bin/xz"



SSH Server:
        program         = "sshd"
        logdir          = "/var/log/sshd"
        #break          = 1

SSH Guard:
        program         = "sshguard"
        logdir          = "/var/log/sshguard"
        maxtime         = 86400
        maxfiles         = 31
        break           = 1
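
As a quick sanity check on the timing, the spacing between the rotated files in /var/log/sshd/ can be compared with the global maxtime. This is a rough sketch: the file names are copied from the listing above with the .xz suffix dropped, it assumes the timestamp in each name is the moment of rotation, and the helper name is made up for this example.

```python
from datetime import datetime

MAXTIME = 604800  # seconds, the global maxtime from metalog.conf

# Rotated file names from the directory listing (without the .xz suffix).
ROTATED = [
    "log-2023-08-24-00:24:14",
    "log-2023-08-31-00:14:59",
    "log-2023-09-07-03:23:56",
    "log-2023-09-14-00:33:33",
    "log-2023-09-21-01:56:23",
]

def intervals_seconds(names):
    """Return the gaps, in seconds, between consecutive rotation timestamps."""
    times = [datetime.strptime(n, "log-%Y-%m-%d-%H:%M:%S") for n in names]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

if __name__ == "__main__":
    print(f"maxtime = {MAXTIME} s = {MAXTIME / 86400:g} days")
    for seconds in intervals_seconds(ROTATED):
        print(f"gap = {seconds} s = {seconds / 86400:.2f} days")
```

Each gap comes out within a few hours of seven days, so the sshd log appears to rotate on the maxtime schedule rather than on size. The last rotation, at 01:56:23 on 2023-09-21, carries the same timestamp as the SIGPIPE in the trace.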

Is there a way to handle this situation gracefully? Or is it already handled, and this is just a very specific case that somehow breaks normal operation?

Comments (1)

  1. Kevin Zheng
    • changed status to open

    Can you clarify how you're launching SSHGuard? Specifically:

    • Are you piping logs from metalog to SSHGuard?
    • Or are you setting FILES in sshguard.conf so that SSHGuard tracks the files itself?

    Your system call trace also shows a process being killed by SIGPIPE. Can you find out which process in the SSHGuard pipeline that is? Specifically, is it the shell script sshg-logtail (which may show up as sh or bash), or is something else receiving this signal?
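
For reference, the failure mode in the trace can be reproduced in isolation: a process that writes to a pipe with no reader gets EPIPE, and under default signal handling it is killed by SIGPIPE. The Python sketch below is a stand-in for illustration only and is not SSHGuard or tail code. The child program and the function name are hypothetical, and the child resets SIGPIPE to its default action because CPython ignores that signal at startup.

```python
import os
import signal
import subprocess
import sys

# Hypothetical child: restore default SIGPIPE handling, then write to stderr
# the way the traced process did.
CHILD_CODE = (
    "import os, signal\n"
    "signal.signal(signal.SIGPIPE, signal.SIG_DFL)\n"
    "os.write(2, b'tail: ')\n"
)

def run_child_with_broken_stderr():
    """Run a child whose stderr is a pipe with no reader; return its wait status."""
    read_end, write_end = os.pipe()
    os.close(read_end)  # nobody will ever read from this pipe
    try:
        child = subprocess.Popen([sys.executable, "-c", CHILD_CODE], stderr=write_end)
    finally:
        os.close(write_end)  # the child keeps its own copy as fd 2
    return child.wait()

if __name__ == "__main__":
    status = run_child_with_broken_stderr()
    if status < 0:
        # subprocess reports death by signal N as a return code of -N.
        print("child killed by", signal.Signals(-status).name)
    else:
        print("child exited with status", status)
```

A parent that collects the wait status of each process in a pipeline can tell which one died this way, since a negative return code identifies the signal.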
