Strange behavior when clamav-daemon runs with "Debug false" (pyClamd v 0.3.16)

Issue #1 resolved
Charles Hamilton created an issue

I'm working on a project that uses pyclamd+clamd to scan file uploads (specific file can be found here) for malicious content and I've noticed that when the "Debug" directive in clamd.conf is configured to "false", I receive errors when I believe I shouldn't. However if Debug is set to "true", everything works correctly. The steps to reproduce are as follows:

  1. Set the "Debug" directive in clamd.conf to "false"
  2. Restart clamd
  3. Execute the following code:
import pyclamd
cd = pyclamd.ClamdUnixSocket()

# Create a 1,024,000-byte test file filled with "0" characters
with open('/tmp/TESTFILE', 'w') as f:
    f.write("0" * 1024000)

def read_in_chunks(file_object, chunk_size=65536):
    # Scan the file one chunk at a time; each chunk is sent as its own stream
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        print("Virus: {0} Length: {1}".format(cd.scan_stream(data), len(data)))

# First attempt, might not get an error
with open('/tmp/TESTFILE', 'r') as f:
    read_in_chunks(f)

# Second attempt, probably will get an error
with open('/tmp/TESTFILE', 'r') as f:
    read_in_chunks(f)

# Third attempt, definitely will get an error
with open('/tmp/TESTFILE', 'r') as f:
    read_in_chunks(f)

When "Debug" is set to true, the snippet above will simply output the result of cd.scan_stream (None if no Virus) and the length of the input chunk. However, if "Debug" is set to false, a ConnectionError will eventually (or immediately) be raised. I've traced the source of the error to lines 501~505 in pyclamd.py, and I've confirmed it by modifying the lines, thus:

            except socket.error as e:
                print(result, e)
                pass
                #raise ConnectionError('Unable to scan stream')

With this modification in place, if I reconfigure "Debug" back to "false" and then restart clamd, the problem goes away completely. This leads me to wonder if the exception raised here (in my experience it's always been "104 Connection reset by peer") is normal because when I use my upload handler with this modification, the md5sum of the uploaded file matches that of the original file -- regardless of how many exceptions are triggered (and subsequently printed to stdout.) So, I was wondering if you could shed some light on this behavior. This is an issue I can live with by simply catching the exception and ignoring it, but I fear that might not be the appropriate way to handle this situation. Thanks in advance!
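If ignoring the error turns out to be acceptable, a narrower variant would swallow only "Connection reset by peer" and let every other socket error propagate. The following is a minimal sketch of that idea, not pyClamd code: `is_connection_reset` and `recv_ignoring_reset` are hypothetical helpers, and treating a reset as "no data" is an assumption that may not be safe in every setup.

```python
import errno
import socket


def is_connection_reset(exc):
    """Return True only for "Connection reset by peer" (errno 104 on Linux)."""
    return isinstance(exc, OSError) and exc.errno == errno.ECONNRESET


def recv_ignoring_reset(sock, bufsize=4096):
    """Hypothetical helper: read from sock, treating a reset as "no data".

    Any other socket error is re-raised so real failures stay visible.
    """
    try:
        return sock.recv(bufsize)
    except socket.error as e:
        if is_connection_reset(e):
            return b""
        raise
```

Compared with a bare `pass`, this keeps unrelated failures such as timeouts or a missing socket file from being silently hidden.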

Comments (8)

  1. Alexandre Norman repo owner

    Thanks for your extensive bug report! You're right, I could reproduce it.

    In fact, if the read happens too quickly after sending data to the stream socket, we get this connection error.

    In scan_stream, the result loop works when you bypass raising the exception. However, if there is a good reason to raise this exception (for example, the clamd server is stopped while reading), we would end up in an infinite loop.

    So I changed _recv_response and _recv_response_multiline to retry a few times before failing:

        def _recv_response(self):
            """
            receive response from clamd and strip all whitespace characters
            """
            # If we read too quickly after connecting,
            # we sometimes get a connection error,
            # so we retry a limited number of times
            failed_count = 5
            while True:
                try:
                    data = self.clamd_socket.recv(4096)
                except socket.error:
                    time.sleep(0.01)
                    failed_count -= 1
                    if failed_count == 0:
                        raise
                else:
                    break
    

    This way, it works (at least for me). Please try the latest version (0.3.17 - http://xael.org/pages/pyClamd-0.3.17.tar.gz) and tell me if it works for you as well.
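The bounded-retry idea can also be tried in isolation, without a running clamd. This is only a sketch under stated assumptions: `recv_with_retry` and `FlakySocket` are hypothetical names that are not part of pyClamd, and the fake socket merely simulates a few resets before returning data.

```python
import socket
import time


def recv_with_retry(sock, bufsize=4096, attempts=5, delay=0.01):
    """Call sock.recv, retrying on socket.error up to `attempts` times."""
    remaining = attempts
    while True:
        try:
            return sock.recv(bufsize)
        except socket.error:
            remaining -= 1
            if remaining == 0:
                raise  # out of attempts: propagate the last error
            time.sleep(delay)


class FlakySocket:
    """Stand-in for a real socket: fails `failures` times, then returns data."""

    def __init__(self, failures, payload=b"stream: OK\n"):
        self.failures = failures
        self.payload = payload

    def recv(self, bufsize):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionResetError(104, "Connection reset by peer")
        return self.payload[:bufsize]


# Two simulated resets are absorbed; the third call returns the payload.
reply = recv_with_retry(FlakySocket(failures=2))
```

Because the number of attempts is capped, a persistent failure (for example, clamd being stopped) still surfaces as an exception instead of looping forever.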

  2. Charles Hamilton reporter

    @Xael: That appears to have resolved the problem. I'll do some more testing over the next few days; if I find anything I'll let ya know. Thanks!
