dugong.StateError: Previous response not read completely

Issue #140 open
Maxim Kostrikin
created an issue

I have a s3ql fail with exception:

2015-04-26 08:30:47.275 46932:Metadata-Upload-Thread s3ql.mount.run: Dumping metadata...
2015-04-26 08:30:47.304 46932:Metadata-Upload-Thread s3ql.metadata.dump_metadata: ..objects..
2015-04-26 08:30:59.616 46932:Metadata-Upload-Thread s3ql.metadata.dump_metadata: ..blocks..
2015-04-26 08:32:59.210 46932:Metadata-Upload-Thread s3ql.metadata.dump_metadata: ..inodes..
2015-04-26 08:34:10.187 46932:Metadata-Upload-Thread s3ql.metadata.dump_metadata: ..inode_blocks..
2015-04-26 08:34:56.328 46932:Metadata-Upload-Thread s3ql.metadata.dump_metadata: ..symlink_targets..
2015-04-26 08:34:57.267 46932:Metadata-Upload-Thread s3ql.metadata.dump_metadata: ..names..
2015-04-26 08:36:31.958 46932:Metadata-Upload-Thread s3ql.metadata.dump_metadata: ..contents..
2015-04-26 08:39:17.831 46932:Metadata-Upload-Thread s3ql.metadata.dump_metadata: ..ext_attributes..
2015-04-26 08:39:18.887 46932:Metadata-Upload-Thread s3ql.mount.run: Compressing and uploading metadata...
2015-04-26 08:48:13.959 46932:Metadata-Upload-Thread s3ql.mount.run: Wrote 1.10 GiB of compressed metadata.
2015-04-26 08:48:13.965 46932:Metadata-Upload-Thread s3ql.mount.run: Cycling metadata backups...
2015-04-26 08:48:13.975 46932:Metadata-Upload-Thread s3ql.metadata.cycle_metadata: Backing up old metadata...
2015-04-26 08:52:23.863 46932:Metadata-Upload-Thread root.excepthook: Uncaught top-level exception:
Traceback (most recent call last):
  File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
    run_old(*args, **kw)
  File "/usr/lib/s3ql/s3ql/mount.py", line 692, in run
    cycle_metadata(backend)
  File "/usr/lib/s3ql/s3ql/metadata.py", line 121, in cycle_metadata
    cycle_fn("s3ql_metadata_bak_%d" % i, "s3ql_metadata_bak_%d" % (i + 1))
  File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 291, in copy
    self._copy_or_rename(src, dest, rename=False, metadata=metadata)
  File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 325, in _copy_or_rename
    self.backend.copy(src, dest, metadata=meta_raw)
  File "/usr/lib/s3ql/s3ql/backends/s3.py", line 94, in copy
    extra_headers=extra_headers)
  File "/usr/lib/s3ql/s3ql/backends/common.py", line 52, in wrapped
    return method(*a, **kw)
  File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 393, in copy
    resp = self._do_request('PUT', '/%s%s' % (self.prefix, dest), headers=headers)
  File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 435, in _do_request
    query_string=query_string, body=body)
  File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 695, in _send_request
    return read_response()
  File "/usr/lib/s3ql/s3ql/backends/s3c.py", line 657, in read_response
    resp = self.conn.read_response()
  File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 708, in read_response
    return eval_coroutine(self.co_read_response(), self.timeout)
  File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 1396, in eval_coroutine
    if not next(crt).poll(timeout=timeout):
  File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 727, in co_read_response
    raise StateError('Previous response not read completely')
dugong.StateError: Previous response not read completely
2015-04-26 08:56:34.288 46932:MainThread s3ql.mount.unmount: Unmounting file system...

Please advise.

Comments (12)

  1. Maxim Kostrikin reporter
    $ apt-cache policy s3ql python3-dugong
    s3ql:
      Installed: 2.13+dfsg-1~trusty2
      Candidate: 2.13+dfsg-1~trusty2
      Version table:
     *** 2.13+dfsg-1~trusty2 0
            500 http://ppa.launchpad.net/nikratio/s3ql/ubuntu/ trusty/main amd64 Packages
            100 /var/lib/dpkg/status
         1.16-1 0
            500 http://us.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages
    python3-dugong:
      Installed: 3.5+dfsg-1~9~ubuntu14.04.1
      Candidate: 3.5+dfsg-1~9~ubuntu14.04.1
      Version table:
     *** 3.5+dfsg-1~9~ubuntu14.04.1 0
            500 http://ppa.launchpad.net/nikratio/s3ql/ubuntu/ trusty/main amd64 Packages
            100 /var/lib/dpkg/status
    

    s3ql patched agains empty 503 responce from S3.

    Happens only once. I posted the trace to keep it recorded if some one could reproduce it.

  2. Daniel Jagszent

    S3QL 2.23
    Dugong 3.7.1
    Backend Swift (OVH)

    I got this error while deleting ~180,000 data objects (approx. 1.7TB). The error occurred only once after deleting ~300GB worth of objects. After a fsck run and a remount the deletion of the remaining 1.4TB went through without an error. I'm afraid that I cannot reproduce this error, too. 😔

    I used find . -mindepth 1 -maxdepth 1 -type d | xargs s3qlrm to mass-delete all sub-folders of a folder. I executed s3qlstat /mount/point sporadically in another shell to have some kind of progress indicator.

    I re-checked all execution paths of the new _delete_multi method but could not find a path that did not call either readall or discard on self.conn. Have I missed one? I'm not familiar with asyncio and Python's coroutines so reading the Dugong source code is a little harder but as I unterstand it after readall / discard is called on a HTTPConnection object its _in_remaining property should always be None. I'm afraid I am at a loss to debug this error.

    Here is the mount.log:

    2017-08-22 20:36:43.812 11840:MainThread s3ql.mount.determine_threads: Using 2 upload threads.
    2017-08-22 20:36:43.814 11840:MainThread s3ql.mount.main: Autodetected 10202 file descriptors available for cache entries
    2017-08-22 20:36:44.217 11840:MainThread s3ql.backends.swift._detect_features: Detected Swift features for storage.gra1.cloud.ovh.net:443: copy via COPY, Bulk delete 1000 keys at a time, maximum meta value length is 255 bytes
    2017-08-22 20:36:45.676 11840:MainThread s3ql.mount.get_metadata: Using cached metadata.
    2017-08-22 20:36:45.687 11840:MainThread s3ql.mount.get_metadata: Last file system check was more than 1 month ago, running fsck.s3ql is recommended.
    2017-08-22 20:36:45.695 11840:MainThread s3ql.mount.main: Mounting swiftks://auth.cloud.ovh.net/GRA1:iwp-backup/ at /home/jails/backup-sftp/cloud/iwp-backup...
    2017-08-22 20:36:45.702 11845:MainThread s3ql.daemonize.detach_process_context: Daemonizing, new PID is 11846
    2017-08-22 20:49:29.964 11846:Thread-13 root.excepthook: Uncaught top-level exception:
    Traceback (most recent call last):
      File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
        run_old(*args, **kw)
      File "/usr/lib/python3.4/threading.py", line 868, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/s3ql/s3ql/block_cache.py", line 697, in _removal_loop
        backend.delete_multi(['s3ql_data_%d' % i for i in ids])
      File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 251, in delete_multi
        return self.backend.delete_multi(keys, force=force)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 549, in delete_multi
        self._delete_multi(tmp, force=force)
      File "/usr/lib/s3ql/s3ql/backends/common.py", line 108, in wrapped
        return method(*a, **kw)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 474, in _delete_multi
        resp = self._do_request('POST', '/', subres='bulk-delete', body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 240, in _do_request
        resp = self._do_request_inner(method, path, body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 278, in _do_request_inner
        return self.conn.read_response()
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 765, in read_response
        return eval_coroutine(self.co_read_response(), self.timeout)
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 1496, in eval_coroutine
        if not next(crt).poll(timeout=timeout):
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 781, in co_read_response
        raise StateError('Previous response not read completely')
    dugong.StateError: Previous response not read completely
    2017-08-22 20:49:33.488 11846:Thread-14 s3ql.mount.exchook: Unhandled top-level exception during shutdown (will not be re-raised)
    2017-08-22 20:49:33.489 11846:Thread-14 root.excepthook: Uncaught top-level exception:
    Traceback (most recent call last):
      File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
        run_old(*args, **kw)
      File "/usr/lib/python3.4/threading.py", line 868, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/s3ql/s3ql/block_cache.py", line 697, in _removal_loop
        backend.delete_multi(['s3ql_data_%d' % i for i in ids])
      File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 251, in delete_multi
        return self.backend.delete_multi(keys, force=force)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 549, in delete_multi
        self._delete_multi(tmp, force=force)
      File "/usr/lib/s3ql/s3ql/backends/common.py", line 108, in wrapped
        return method(*a, **kw)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 474, in _delete_multi
        resp = self._do_request('POST', '/', subres='bulk-delete', body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 240, in _do_request
        resp = self._do_request_inner(method, path, body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 278, in _do_request_inner
        return self.conn.read_response()
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 765, in read_response
        return eval_coroutine(self.co_read_response(), self.timeout)
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 1496, in eval_coroutine
        if not next(crt).poll(timeout=timeout):
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 781, in co_read_response
        raise StateError('Previous response not read completely')
    dugong.StateError: Previous response not read completely
    2017-08-22 20:50:04.091 11846:Thread-6 s3ql.mount.exchook: Unhandled top-level exception during shutdown (will not be re-raised)
    2017-08-22 20:50:04.091 11846:Thread-6 root.excepthook: Uncaught top-level exception:
    Traceback (most recent call last):
      File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
        run_old(*args, **kw)
      File "/usr/lib/python3.4/threading.py", line 868, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/s3ql/s3ql/block_cache.py", line 697, in _removal_loop
        backend.delete_multi(['s3ql_data_%d' % i for i in ids])
      File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 251, in delete_multi
        return self.backend.delete_multi(keys, force=force)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 549, in delete_multi
        self._delete_multi(tmp, force=force)
      File "/usr/lib/s3ql/s3ql/backends/common.py", line 108, in wrapped
        return method(*a, **kw)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 474, in _delete_multi
        resp = self._do_request('POST', '/', subres='bulk-delete', body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 240, in _do_request
        resp = self._do_request_inner(method, path, body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 278, in _do_request_inner
        return self.conn.read_response()
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 765, in read_response
        return eval_coroutine(self.co_read_response(), self.timeout)
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 1496, in eval_coroutine
        if not next(crt).poll(timeout=timeout):
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 781, in co_read_response
        raise StateError('Previous response not read completely')
    dugong.StateError: Previous response not read completely
    2017-08-22 20:50:37.932 11846:Thread-10 s3ql.mount.exchook: Unhandled top-level exception during shutdown (will not be re-raised)
    2017-08-22 20:50:37.933 11846:Thread-10 root.excepthook: Uncaught top-level exception:
    Traceback (most recent call last):
      File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
        run_old(*args, **kw)
      File "/usr/lib/python3.4/threading.py", line 868, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/s3ql/s3ql/block_cache.py", line 697, in _removal_loop
        backend.delete_multi(['s3ql_data_%d' % i for i in ids])
      File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 251, in delete_multi
        return self.backend.delete_multi(keys, force=force)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 549, in delete_multi
        self._delete_multi(tmp, force=force)
      File "/usr/lib/s3ql/s3ql/backends/common.py", line 108, in wrapped
        return method(*a, **kw)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 474, in _delete_multi
        resp = self._do_request('POST', '/', subres='bulk-delete', body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 240, in _do_request
        resp = self._do_request_inner(method, path, body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 278, in _do_request_inner
        return self.conn.read_response()
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 765, in read_response
        return eval_coroutine(self.co_read_response(), self.timeout)
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 1496, in eval_coroutine
        if not next(crt).poll(timeout=timeout):
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 781, in co_read_response
        raise StateError('Previous response not read completely')
    dugong.StateError: Previous response not read completely
    2017-08-22 20:53:10.385 11846:Thread-5 s3ql.mount.exchook: Unhandled top-level exception during shutdown (will not be re-raised)
    2017-08-22 20:53:10.385 11846:Thread-5 root.excepthook: Uncaught top-level exception:
    Traceback (most recent call last):
      File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
        run_old(*args, **kw)
      File "/usr/lib/python3.4/threading.py", line 868, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/s3ql/s3ql/block_cache.py", line 697, in _removal_loop
        backend.delete_multi(['s3ql_data_%d' % i for i in ids])
      File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 251, in delete_multi
        return self.backend.delete_multi(keys, force=force)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 549, in delete_multi
        self._delete_multi(tmp, force=force)
      File "/usr/lib/s3ql/s3ql/backends/common.py", line 108, in wrapped
        return method(*a, **kw)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 474, in _delete_multi
        resp = self._do_request('POST', '/', subres='bulk-delete', body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 240, in _do_request
        resp = self._do_request_inner(method, path, body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 278, in _do_request_inner
        return self.conn.read_response()
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 765, in read_response
        return eval_coroutine(self.co_read_response(), self.timeout)
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 1496, in eval_coroutine
        if not next(crt).poll(timeout=timeout):
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 781, in co_read_response
        raise StateError('Previous response not read completely')
    dugong.StateError: Previous response not read completely
    2017-08-22 20:54:00.376 11846:Thread-9 s3ql.mount.exchook: Unhandled top-level exception during shutdown (will not be re-raised)
    2017-08-22 20:54:00.376 11846:Thread-9 root.excepthook: Uncaught top-level exception:
    Traceback (most recent call last):
      File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
        run_old(*args, **kw)
      File "/usr/lib/python3.4/threading.py", line 868, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/s3ql/s3ql/block_cache.py", line 697, in _removal_loop
        backend.delete_multi(['s3ql_data_%d' % i for i in ids])
      File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 251, in delete_multi
        return self.backend.delete_multi(keys, force=force)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 549, in delete_multi
        self._delete_multi(tmp, force=force)
      File "/usr/lib/s3ql/s3ql/backends/common.py", line 108, in wrapped
        return method(*a, **kw)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 474, in _delete_multi
        resp = self._do_request('POST', '/', subres='bulk-delete', body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 240, in _do_request
        resp = self._do_request_inner(method, path, body=body, headers=headers)
      File "/usr/lib/s3ql/s3ql/backends/swift.py", line 278, in _do_request_inner
        return self.conn.read_response()
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 765, in read_response
        return eval_coroutine(self.co_read_response(), self.timeout)
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 1496, in eval_coroutine
        if not next(crt).poll(timeout=timeout):
      File "/usr/lib/python3/dist-packages/dugong/__init__.py", line 781, in co_read_response
        raise StateError('Previous response not read completely')
    dugong.StateError: Previous response not read completely
    
  3. Roger Gammans

    I'm getting had this twice is row now - a day or so after starting the mount.

    This is with 2.23 - Is there any additional debugging I can insert?

    Here is the latest tb

    root.excepthook: Uncaught top-level exception:
    Traceback (most recent call last):
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/mount.py", line 64, in run_with_except_hook
        run_old(*args, **kw)
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/mount.py", line 646, in run
        upload_metadata(backend, fh, self.param)
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/metadata.py", line 324, in upload_metadata
        cycle_metadata(backend)
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/metadata.py", line 123, in cycle_metadata
        cycle_fn("s3ql_metadata_bak_%d" % i, "s3ql_metadata_bak_%d" % (i + 1))
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/backends/comprenc.py", line 268, in copy
        self._copy_or_rename(src, dest, rename=False, metadata=metadata)
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/backends/comprenc.py", line 300, in _copy_or_rename
        self.backend.copy(src, dest, metadata=meta_raw)
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/backends/s3.py", line 99, in copy
        extra_headers=extra_headers)
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/backends/common.py", line 108, in wrapped
        return method(*a, **kw)
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/backends/s3c.py", line 431, in copy
        resp = self._do_request('PUT', '/%s%s' % (self.prefix, dest), headers=headers)
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/backends/s3c.py", line 476, in _do_request
        query_string=query_string, body=body)
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/backends/s3c.py", line 739, in _send_request
        return read_response()
      File "/root/s3qlenv/lib/python3.4/site-packages/s3ql/backends/s3c.py", line 701, in read_response
        resp = self.conn.read_response()
      File "/root/s3qlenv/lib/python3.4/site-packages/dugong/__init__.py", line 765, in read_response
        return eval_coroutine(self.co_read_response(), self.timeout)
      File "/root/s3qlenv/lib/python3.4/site-packages/dugong/__init__.py", line 1496, in eval_coroutine
        if not next(crt).poll(timeout=timeout):
      File "/root/s3qlenv/lib/python3.4/site-packages/dugong/__init__.py", line 781, in co_read_response
        raise StateError('Previous response not read completely')
    dugong.StateError: Previous response not read completely
    
  4. Roger Gammans

    This has happened to three times in the last month; and I can confirm (duue to some additional instrumentation) that one of the request in each case is a metadata backup PUT request such as "/<STORE_PATH>/s3ql_metadata_bak_8"

  5. Roger Gammans
    • Dugong 3.7.3
    • S3QL 2.30
    • Python 3.4.2

    This is now happening on fsck (with v2.30) ; so I can no longer start my filesystem at all. The backtrace is identical part from a small change in line numbers from the upload_metadata() call down from the one I posted earlier.

    fsck had reported it was on the "Backing up old metadata... " prior to the exception.

    In case it helps, according to a previous line by metadata size is now is 4.54GB, on a TB order filesystem.

  6. Log in to comment