The problem with checking Unix-socket - Monit

Issue #580 duplicate
Alexandr created an issue

Hello. We have found a problem in Monit. Since this commit ( https://bitbucket.org/tildeslash/monit/commits/5aa93749973e632371cb5632c6c435f246331dcc ), made between versions 5.8 and 5.8.1, Monit checks the uptime of the process before it tests socket availability. The relevant source code in src/validate.c looks like this:

 if (s->portlist)
                 /* skip further tests during startup timeout */
                 if (s->start)
                         if (s->inf->priv.process.uptime < s->start->timeout) return TRUE;
                 for (pp = s->portlist; pp; pp = pp->next)
                         check_connection(s, pp);

The idea of this test is that the socket checks are skipped while the process start timeout has not yet elapsed, so that sockets are not checked too early. However, for unknown reasons, inside an LXC container (presumably only 2.x) Monit cannot get the uptime of the process. The output of monit status looks like this:

root@debian-8:~/monit-5.9/src# monit status
The Monit daemon 5.9 uptime: 20m 

Process 'dockerd'
 status                            Running
 monitoring status                 Monitored
 pid                               224
 parent pid                        1
 uid                               0
 effective uid                     0
 gid                               0
 uptime                               <<< No uptime info
 children                          1
 memory kilobytes                  15.9 MB
 memory kilobytes total            20.7 MB
 memory percent                    1.5%
 memory percent total              2.0%
 cpu percent                       0.0%
 cpu percent total                 0.0%
 unix socket response time         0.000s to /var/run/docker.sock [HTTP]
 data collected                    Mon, 27 Feb 2017 22:43:07

We used a test configuration:

check process dockerd with pidfile /var/run/docker.pid
start program = "/bin/systemctl start docker" 
stop program = "/bin/systemctl stop docker" 
if failed unixsocket /var/run/docker.sock protocol HTTP request "/version" then alert

Running «monit -c /etc/monit/monitrc -vv -I» shows how Monit parses the config:

Process Name          = dockerd
Pid file             = /var/run/docker.pid
Monitoring mode      = active
Start program        = '/bin/systemctl start docker' timeout 30 second(s)
Stop program         = '/bin/systemctl stop docker' timeout 30 second(s)
Existence            = if does not exist then restart
Pid                  = if changed then alert
PPid                 = if changed then alert
Unix Socket          = if failed [/var/run/docker.sock [protocol HTTP] with timeout 5000 seconds] then restart

That is, by default Monit allows 30 seconds for the process to start. But because the uptime is not obtained, it is always less than those 30 seconds, and check_connection() is never called.
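The effect of the guard can be reproduced in isolation. The following is a minimal sketch, not Monit's actual code: the `ServiceSketch` type and the `skips_port_checks` function are hypothetical names, and it assumes that an uptime which cannot be read is stored as a value below the timeout (0 here).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields read by the guard in src/validate.c. */
typedef struct {
    long uptime;        /* process uptime in seconds; assumed to be 0 when unavailable */
    long start_timeout; /* start program timeout in seconds (30 in the parsed config) */
    bool has_start;     /* true when a start program is configured */
} ServiceSketch;

/* Mirrors the quoted condition: the port tests are skipped
 * while the reported uptime is below the start timeout. */
static bool skips_port_checks(const ServiceSketch *s) {
    return s->has_start && s->uptime < s->start_timeout;
}
```

With `uptime` stuck at 0 and a 30 second timeout, `skips_port_checks` returns true on every cycle, which matches the observed behaviour. A process reporting 1200 seconds of uptime would pass the guard and reach the socket test.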

The problem was found on the following systems:

The issue does not occur on Ubuntu 14.04 with Monit 5.6 or on Debian 7 with Monit 5.4. CentOS 6, however, has Monit 5.14, and the bug is present there too. By trying successive versions we found that Monit 5.18 is already able to get the uptime (5.17 did not work in our case), but we discovered another problem with this check:

if failed unixsocket /var/run/docker.sock protocol HTTP request "/version" then alert

Version 5.18 sends a HEAD request (instead of the usual GET). The Docker daemon API does not support HEAD for this path (it does understand GET /version), so it returns 404.

Comments (4)

  1. Tildeslash repo owner

    Hello,

    could you please test with the latest Monit version (5.21.0)? A lot of things have changed since Monit 5.8.1.

    Regarding the HEAD method: it's true that Monit now prefers HEAD to save bandwidth. It switches to GET if either the response content test or the checksum test is enabled. For example:

    #note: replace the "Version" string below with any string you expect to be present in the response
    if failed unixsocket /var/run/docker.sock protocol HTTP request "/version" with content "Version" then alert
    

    We will add support for changing the request method in the future.

  2. Tildeslash repo owner

    For the next Monit release we have implemented a fallback to the GET method if HEAD fails, and also support for overriding the automatically selected method. For example:

    if failed
        unixsocket /var/run/docker.sock
        protocol HTTP
        method GET #note: this is new for monit 5.22.0 or later, defaults to HEAD if not used (and no content/checksum test is enabled)
        request "/version"
    then alert
    

    The next Monit release should work even without the "method GET" option due to the automatic fallback.

    If you want to test it, you can get a development snapshot:

    wget https://bitbucket.org/tildeslash/monit/get/master.tar.gz
    tar -xzf master.tar.gz
    cd tildeslash*
    ./bootstrap
    ./configure
    make
    
  3. Tildeslash repo owner

    Update in cset 8584ce1f0a2a: we dropped the automatic HEAD->GET fallback that was added in cset e1c01a39af2e. It could break some scenarios, such as a test for the HEAD method that is expected to fail (because the user wants to make sure HEAD is not supported). Example of setting the method explicitly:

    if failed
        unixsocket /var/run/docker.sock
        protocol HTTP
        method GET
        request "/version"
    then alert
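
The method selection described across the comments above can be summarised in a few lines. This is only a sketch of the stated rules, not Monit's implementation: the `HttpMethod` and `HttpTestSketch` types and the `effective_method` function are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of an HTTP test; these are not Monit's real types. */
typedef enum { METHOD_AUTO, METHOD_HEAD, METHOD_GET } HttpMethod;

typedef struct {
    HttpMethod method;      /* METHOD_AUTO when no "method" option is given */
    bool has_content_test;  /* a "with content ..." test is configured */
    bool has_checksum_test; /* a checksum test is configured */
} HttpTestSketch;

/* An explicit "method" option wins. Otherwise GET is used when the
 * response body is needed, and HEAD in all other cases. There is no
 * automatic HEAD->GET fallback, as stated in the last comment. */
static const char *effective_method(const HttpTestSketch *t) {
    if (t->method == METHOD_GET) return "GET";
    if (t->method == METHOD_HEAD) return "HEAD";
    if (t->has_content_test || t->has_checksum_test) return "GET";
    return "HEAD";
}
```

Under these rules the original test from the issue (no method option, no content or checksum test) resolves to HEAD, which the Docker API answers with 404. Adding either `method GET` or a content test makes it resolve to GET.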
    