MySQL test might produce "HOST blocked because of too many connection errors"

Issue #98 resolved
Tildeslash
repo owner created an issue

The new MySQL test in Monit 5.9 connects, reads the handshake packet from MySQL, and then disconnects. One CentOS 6 user reports that the new protocol test causes MySQL to report that the host is blocked because of many connection errors.

According to the MySQL documentation, this error and the subsequent blocking of the host happen if mysqld has received many connection requests from the given host that were interrupted in the middle. It is unclear whether Monit's MySQL test, which just connects and then disconnects, is considered a connection attempt interrupted in the middle, especially because our tests with MySQL 5.6 on OS X and with MySQL 5.5 on Ubuntu do not generate this error.

If you see this error, the workaround until this is clarified is:

  • Log in to MySQL and issue FLUSH HOSTS;
  • Issue SET GLOBAL max_connect_errors=4294967295;

This should buy you at least 135 years before the error is triggered again.
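
The 135-year figure is consistent with simple arithmetic if one assumes a single counted connection error per second. That rate is an assumption made here for illustration; it is not stated in the issue or in the MySQL documentation. A minimal Python sketch of the calculation:

```python
# Assumption: one connection error is counted per second (hypothetical rate).
MAX_CONNECT_ERRORS = 4294967295  # value from the workaround above (2**32 - 1)
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_until_blocked(max_errors: int, errors_per_second: float = 1.0) -> float:
    """Return how many years it takes to accumulate max_errors errors."""
    return max_errors / errors_per_second / SECONDS_PER_YEAR


print(f"{years_until_blocked(MAX_CONNECT_ERRORS):.1f} years")
```

With a slower error rate, for example one error per Monit poll cycle of 30 seconds, the time scales up proportionally.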

Comments (22)

  1. Simon Rycroft

    I have been having this issue on one of my MySQL servers (Debian 6.0.10/MySQL 5.5.38), but oddly, not another one. The two machines should be identical, although they're not quite the same (same OS/MySQL). I'm currently using monit 5.8.1, although I originally experienced this in an earlier version. I fixed it by adding a call to "mysqladmin flush-hosts" on cron, but the max_connect_errors fix above looks much better.

  2. Tildeslash reporter

    @Simon Rycroft The MySQL protocol test was changed in Monit 5.9. Previous versions tried to log in using a guest or anonymous account. In this new version there is no login attempt, and in theory the test should not produce any errors. But as noted above, it is unclear whether it does. If you don't mind testing with the latest 5.9 version and reporting your findings, it would be much appreciated.

  3. Simon Rycroft

    No joy already I'm afraid. Just got the following error from m/monit:

    failed protocol test [MYSQL] at INET[msq-scratchpads.nhm.ac.uk:3306] via TCP -- Server returned error code 1129 -- msq-scratchpads.nhm.ac.uk' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'
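
For readers unfamiliar with where "error code 1129" comes from: a blocked host receives a MySQL ERR packet in place of the usual handshake. The sketch below shows one way such a payload could be decoded in Python. It assumes the standard ERR layout (a 0xFF marker, a 2-byte little-endian error code, an optional '#' plus 5-character SQL state, then the message). The names `MySQLError` and `parse_err_packet` and the sample host name are hypothetical, and this is not Monit's own code.

```python
import struct
from typing import NamedTuple, Optional


class MySQLError(NamedTuple):
    code: int
    sql_state: Optional[str]
    message: str


def parse_err_packet(payload: bytes) -> MySQLError:
    """Parse the payload of a MySQL ERR packet (4-byte packet header already removed)."""
    if len(payload) < 3 or payload[0] != 0xFF:
        raise ValueError("not an ERR packet")
    (code,) = struct.unpack_from("<H", payload, 1)  # little-endian error code
    rest = payload[3:]
    sql_state = None
    # The SQL state marker is only present once protocol 4.1 has been negotiated.
    if rest[:1] == b"#" and len(rest) >= 6:
        sql_state = rest[1:6].decode("ascii")
        rest = rest[6:]
    return MySQLError(code, sql_state, rest.decode("utf-8", errors="replace"))


# Hypothetical payload resembling the report above (the host name is made up).
sample = (
    b"\xff"
    + struct.pack("<H", 1129)
    + b"Host 'db.example.org' is blocked because of many connection errors"
)
err = parse_err_packet(sample)
print(err.code, err.message)
```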

  4. Tildeslash reporter

    Thanks for trying. So the test works fine with one MySQL server and not the other, and the OS and MySQL versions are the same. Hmm, tricky. Could it be that an old Monit which was not stopped is still running in the background on the machine with the problems, or that some other program connects to MySQL and produces this error? In the meantime, until we figure out what the problem is, the workaround above should do the trick (though it is not satisfactory).

  5. Tildeslash reporter

    We're still unable to reproduce the issue.

    According to the MySQL documentation, the connection error counter is incremented when a connection is interrupted. It is very likely that the MySQL protocol test failed 10 times for some other reason before your host was blocked; that earlier error will point to the root cause.

    Please can you enable Monit logging (add a "set logfile <path>|syslog" statement if it's missing and reload monit), unblock the host (for example using "mysqladmin flush-hosts"), and, when the host is locked out again, send the whole MySQL-related monit log and the MySQL error log. The errors preceding the blocked-host error message are what trigger the issue.

  6. Eric Pot

    I found the same problem in Monit 5.13 with mysql Ver 15.1 Distrib 10.0.16-MariaDB, for Linux (x86_64) using readline 5.1

    failed protocol test [MYSQL] at [*.*.*.*]:3306 [TCP/IP] -- Server returned error code 1129 -- *.*.*.*' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'
    
  7. Tildeslash reporter

    @Eric Pot Please can you check the monit log (enabled by "set logfile" statement) to see if there are any errors preceding the "blocked because of many connection error" message?

    Note that it is necessary to create a user so that Monit is allowed to connect to the MySQL database (no grants are needed). If an existing user has unrestricted host access ("%") or access from the given Monit host, it will work fine. Otherwise, add a user for the Monit host, for example like this (replace the IP address and password with the Monit host's IP/hostname and your own credentials):

    CREATE USER someuser@192.168.0.10 IDENTIFIED BY 'mypassword';
    
  8. Tildeslash reporter

    @Eric Pot please can you check the monit log again to see if any other errors were reported before the host was blocked?

    If logging was not enabled, please add the "set logfile <path>|syslog" statement and reload monit, then unblock the host ("mysqladmin flush-hosts") and check for errors preceding the "blocked" message.

  9. Eric Pot

    Log was enabled see below

    [CEST May 24 20:57:02] info     : M/Monit heartbeat started
    [CEST May 24 21:02:03] error    : 'mysql' failed protocol test [MYSQL] at [*.*.*.*]:3306 [TCP/IP] -- Server returned error code 1129 -- *.*.*.*' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'
    
  10. Tildeslash reporter

    @Eric Pot please can you stop monit, unblock the host ("mysqladmin flush-hosts") and run monit in debug mode?:

    monit -v
    

    We need to see what preceded the error. If all connections from monit were successful without any error and then the host is suddenly blocked, we'll need a packet capture (tcpdump/wireshark) of the communication between Monit and MySQL.

  11. Tildeslash reporter

    We were finally able to reproduce the issue using MariaDB 10.0.17 (we had tested with MySQL 5.1.x and 5.5.x before and, although identical problems were reported with those versions, we were never able to reproduce the issue with them).

    Will update as soon as the root cause is uncovered, no need for more data at this point.

  12. Tildeslash reporter

    Fix Issue #98: MySQL test might block the host because of too many connection errors.

    We have to respond with a handshake response packet, otherwise the client may be blocked. The restriction does not seem to be applied to the loopback device, and it was not even possible to reproduce the issue in the lab when testing a remote host, but when testing via the public interface on the same machine where MySQL runs, the problem occurred.
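
To illustrate what a handshake response packet looks like on the wire, here is a minimal Python sketch of the protocol 4.1 layout with an empty authentication response. It is an illustration only and not the C code from the changeset referenced below. The function name `build_handshake_response`, the user name "monit", and the chosen capability set are assumptions.

```python
import struct

# Capability flags from the MySQL client/server protocol.
CLIENT_LONG_PASSWORD = 0x00000001
CLIENT_PROTOCOL_41 = 0x00000200
CLIENT_SECURE_CONNECTION = 0x00008000


def build_handshake_response(username: str, sequence_id: int = 1) -> bytes:
    """Build a protocol 4.1 handshake response with an empty auth response."""
    caps = CLIENT_LONG_PASSWORD | CLIENT_PROTOCOL_41 | CLIENT_SECURE_CONNECTION
    # Capability flags, max packet size (16 MiB), character set 8 (latin1).
    payload = struct.pack("<IIB", caps, 0x01000000, 8)
    payload += b"\x00" * 23                          # reserved filler
    payload += username.encode("utf-8") + b"\x00"    # null-terminated user name
    payload += b"\x00"                               # auth response length = 0
    # Packet header: 3-byte little-endian payload length + 1-byte sequence id.
    header = struct.pack("<I", len(payload))[:3] + bytes([sequence_id])
    return header + payload


packet = build_handshake_response("monit")  # "monit" is a hypothetical user name
print(len(packet), packet.hex())
```

The sequence id is 1 because the server's initial handshake packet carries sequence id 0 and the client's reply continues the sequence.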

    → <<cset 79cc6015c2a9>>

  13. Simon Rycroft

    Great to hear that you've managed to get to the bottom of this issue. It wasn't a killer issue for me, but it's certainly nice to know that my new Chef server setup won't require a small monit work around when monitoring MySQL servers.
