Issue #20 new

pep381run refetches too much

Georges Racinet
created an issue

Hi there,

I'm a fairly new user of pep381client, running an in-house mirror, so maybe I missed something.

I've noticed that the client seems to fetch all versions of a given distribution again each time there's a new version. Of course this is quite undesirable (for projects with a long history, downloading all versions can take hours, and this stresses PyPI a lot).

I don't know if this happens systematically, but the cause seems to be that PyPI currently does not send ETag headers, while the caching logic in maybe_copy_file() apparently relies on them.

(extract from maybe_copy_file())

        etag = self.storage.etag(path)
        if etag:
            h.putheader("If-none-match", etag)
        h.endheaders()
        r = h.getresponse()
        if r.status == 304:
            # not modified, discard data
            r.read()
            return
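
For illustration only, here is a minimal sketch of how the conditional request could fall back on the Last-Modified value when no ETag is stored (PyPI does send Last-Modified, as the curl session in this report shows). The helper name `conditional_headers` and the idea that the mirror keeps the Last-Modified value locally are assumptions, not part of pep381client, and this only helps if the server honors If-Modified-Since.

```python
def conditional_headers(etag=None, last_modified=None):
    """Build the validator headers for a conditional GET.

    Hypothetical helper, not part of pep381client. `last_modified` is
    assumed to be the Last-Modified string saved from an earlier response.
    """
    headers = {}
    if etag:
        # Same validator the extract above already sends.
        headers["If-None-Match"] = etag
    if last_modified:
        # Fallback validator for servers that send no ETag.
        headers["If-Modified-Since"] = last_modified
    return headers


# With neither validator stored, the request is unconditional and the
# server has no way to answer 304, so the whole file is sent again.
no_validators = conditional_headers()

# With only Last-Modified stored, the request is still conditional.
fallback = conditional_headers(last_modified="Fri, 08 Feb 2013 16:33:37 GMT")
```

The returned dict could then be passed to `putheader()` one item at a time before `endheaders()`, leaving the existing 304 handling unchanged.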

Here's a curl session showing the lack of an ETag header:

$ curl -vO http://pypi.python.org/packages/source/G/GeoBases/GeoBases-4.23.0.zip
* About to connect() to pypi.python.org port 80 (#0)
*   Trying 140.211.10.69...
* Connected to pypi.python.org (140.211.10.69) port 80 (#0)
> GET /packages/source/G/GeoBases/GeoBases-4.23.0.zip HTTP/1.1
> User-Agent: curl/7.26.0
> Host: pypi.python.org
> Accept: */*
> 
* additional stuff not fine transfer.c:1037: 0 0
* HTTP 1.1 or later with persistent connection, pipelining supported
< HTTP/1.1 200 OK
< Server: nginx/1.1.19
< Date: Mon, 22 Apr 2013 15:43:47 GMT
< Content-Type: application/zip
< Content-Length: 10396883
< Last-Modified: Fri, 08 Feb 2013 16:33:37 GMT
< Accept-Ranges: bytes
< 

Also, there's a potential bug if an ETag were ever provided: the etag is read from the local "files" DB before the leading '/' is stripped from the path, but written after it is stripped, so the two operations would use different keys. I couldn't verify whether this actually bites, because my local "files" DB doesn't contain any ETags.
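
The key mismatch described above can be sketched as follows. This is a toy model, not the real storage code: `FakeStorage`, `set_etag` and `strip_leading_slash` are hypothetical names, and the real "files" DB is assumed to behave like a lookup keyed by path.

```python
class FakeStorage:
    """Hypothetical stand-in for the local "files" DB, keyed by path."""

    def __init__(self):
        self._etags = {}

    def etag(self, path):
        # Returns None when nothing is stored under this exact key.
        return self._etags.get(path)

    def set_etag(self, path, etag):
        self._etags[path] = etag


def strip_leading_slash(path):
    """Drop a single leading '/' if there is one."""
    return path[1:] if path.startswith("/") else path


storage = FakeStorage()
url_path = "/packages/source/G/GeoBases/GeoBases-4.23.0.zip"

# Written after stripping the leading '/'.
storage.set_etag(strip_leading_slash(url_path), '"abc123"')

# Read before stripping: the key differs, so the lookup misses.
mismatched = storage.etag(url_path)

# Read with the same normalization as the write: the lookup hits.
matched = storage.etag(strip_leading_slash(url_path))
```

Under these assumptions, normalizing the path once, before both the read and the write, would keep the keys consistent.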

Comments (3)

  1. Georges Racinet reporter

    That branch history is a real mess (I could not really test on a dev box, so I used Bitbucket to move changes easily to the servers), but it works correctly now. I will try to clean that up, then submit
