Issue #21 resolved

CacheFilter implementation

Anonymous created an issue

Re-implement caching:

* use the same technique as CP1 was using
* or investigate what can be done with filters

Reported by rdelon

Comments (17)

  1. Anonymous

    I think that filters can be used for caching, though it means that the normal filter ordering & processing has to be bypassed. In other words: if a cache filter intercepts a request and serves it, then it can't call the other filters in the chain. It also means that the cache filter will usually be the first filter in the chain. This involves either an InternalRedirect, or some other way to abort the request processing so that no other filter has a chance to be called.

    The option for the InternalRedirect is as follows: redirect the cache requests to a special /cache/<cache-obj-id> URL. The cpg.root.cache attribute would become reserved for this purpose, and would not be exposed by "normal" processing (some kind of protection would be useful but not mandatory). The cache object would have a simple filter list (not even the gzip filter is needed, if the cache stores the already gzipped image of the page). Each cached page would have its own ID, which would be used to locate it in the cache.
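
    A minimal sketch of the "abort the chain on a cache hit" idea described above, in plain Python with no CherryPy dependency. The names CacheHit, cache_filter, logging_filter and run_chain are hypothetical and only illustrate the control flow; they are not part of any CherryPy API, and the path is used as the cache ID purely as an assumption.

```python
class CacheHit(Exception):
    """Hypothetical signal raised to abort the rest of the filter chain."""

    def __init__(self, body):
        super().__init__("cache hit")
        self.body = body


def cache_filter(request, cache):
    # On a hit, stop all further processing and carry the stored body out.
    if request["path"] in cache:
        raise CacheHit(cache[request["path"]])


def logging_filter(request, cache):
    # Stands in for any later filter in the chain; it does nothing here.
    pass


def run_chain(request, filters, handler, cache):
    calls = []  # names of the filters that actually ran
    try:
        for f in filters:
            calls.append(f.__name__)
            f(request, cache)
    except CacheHit as hit:
        # Served from the cache: remaining filters and the handler are skipped.
        return hit.body, calls
    body = handler(request)
    cache[request["path"]] = body  # path used as the cache ID (assumption)
    return body, calls
```

    With the cache filter placed first, a second request for the same path returns the stored body and no later filter is called.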

  2. Anonymous

    How would you know which pages you want to have cached (or would this be 'all pages for which the caching filter is active')? And how do you want to cache them? Using a filter as the last filter in the chain, so you can grab the content and header info sent to the client and store it in cpg.root.cache?

  3. Anonymous

    Basic design

    CacheReadFilter: the input part of the cache filter. Should be one of the earliest filters in the chain. It uses

    CacheWriteFilter: the output part of the cache filter. Should be the last filter in the chain, or at least, it should be installed after the last filter that changes the contents of the page.

    CacheStats: class that can be instantiated and plugged into the object tree to allow retrieval of statistics about the cache.
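
    The three pieces of the basic design could fit together roughly as follows. This is only a sketch under assumptions: the shared dict store, the lookup and save method names, and the hits/misses counters are hypothetical and are not taken from the actual implementation.

```python
class CacheStats:
    """Counters that could be exposed through the object tree."""

    def __init__(self):
        self.hits = 0
        self.misses = 0


class CacheReadFilter:
    """Input side: runs early and tries to answer from the store."""

    def __init__(self, store, stats):
        self.store = store
        self.stats = stats

    def lookup(self, key):
        body = self.store.get(key)
        if body is None:
            self.stats.misses += 1
        else:
            self.stats.hits += 1
        return body


class CacheWriteFilter:
    """Output side: runs after the last filter that changes the page."""

    def __init__(self, store):
        self.store = store

    def save(self, key, body):
        # Store the final bytes, so a hit needs no further processing.
        self.store[key] = body
```

    Sharing one store between the two filters is a design choice made here for brevity; it keeps the read and write sides independent of each other.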

    Customization

    CacheReadFilter(key)
    CacheReadFilter(key, delay, maxobjsize, maxsize, maxobjects)

    • key: a callable that takes no arguments, and that returns a string. (It makes no sense to pass any argument; the point of this routine is that the user may want or need to override it and use any arbitrary info from the cpg.request structure to generate the key.)
    • delay: expiration time, in seconds, as a float, just as in time.time()
    • maxobjsize: maximum object size that can be stored on the cache, in bytes.
    • maxsize: maximum size of the cache, in bytes.
    • maxobjects: maximum number of objects that can be stored in the cache.
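
    To make the five parameters concrete, here is a small sketch of a cache that honours them. The class name SimpleCache, the default values, the injected clock argument and the "evict the oldest entry first" policy are all assumptions made for illustration; the ticket does not specify any of them.

```python
import time
from collections import OrderedDict


class SimpleCache:
    def __init__(self, key, delay=600.0, maxobjsize=100000,
                 maxsize=10000000, maxobjects=1000, clock=time.time):
        self.key = key                # callable with no arguments, returns a string
        self.delay = delay            # expiration time in seconds (float)
        self.maxobjsize = maxobjsize  # largest single object, in bytes
        self.maxsize = maxsize        # total cache size, in bytes
        self.maxobjects = maxobjects  # maximum number of stored objects
        self.clock = clock            # injectable for testing (assumption)
        self.entries = OrderedDict()  # key -> (expires_at, body), oldest first
        self.cursize = 0

    def _drop(self, k):
        _, body = self.entries.pop(k)
        self.cursize -= len(body)

    def get(self):
        k = self.key()
        entry = self.entries.get(k)
        if entry is None:
            return None
        expires_at, body = entry
        if self.clock() >= expires_at:
            self._drop(k)  # expired entries are removed lazily
            return None
        return body

    def put(self, body):
        size = len(body)
        if size > self.maxobjsize or size > self.maxsize or self.maxobjects < 1:
            return False  # this object can never fit
        k = self.key()
        if k in self.entries:
            self._drop(k)
        # Evict the oldest entries until the new object fits (assumed policy).
        while self.entries and (len(self.entries) >= self.maxobjects
                                or self.cursize + size > self.maxsize):
            self._drop(next(iter(self.entries)))
        self.entries[k] = (self.clock() + self.delay, body)
        self.cursize += size
        return True
```

    Because key takes no arguments, it reads whatever request state it needs on its own, which is what lets a user swap in a different key routine without changing the cache.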

    Magic objects

    One possible idea is to fine-tune the cache behavior by checking special magic objects in the tree. For example, objects with the attribute cacheable=False would not be cached. This idea has some merits but may also pollute the code a lot. It's not going to be implemented but it's left documented here.
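
    For the record, the magic-attribute check could look something like the sketch below. The helper name is_cacheable is hypothetical, and the rule that a cacheable=False marker on a parent object also covers everything beneath it is an assumption; the idea above does not say how the attribute would propagate.

```python
def is_cacheable(root, path):
    """Walk the object tree along path and honour cacheable=False markers."""
    node = root
    if getattr(node, "cacheable", True) is False:
        return False
    for name in filter(None, path.split("/")):
        node = getattr(node, name, None)
        if node is None:
            # Path does not resolve: nothing opted out, so treat as cacheable.
            return True
        if getattr(node, "cacheable", True) is False:
            return False
    return True


class Admin:
    cacheable = False  # opt this subtree out of caching

    def index(self):
        return "admin"


class Root:
    admin = Admin()

    def index(self):
        return "home"
```

    With these two classes, is_cacheable(Root(), "/index") is true, while any path under /admin is rejected.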

  4. Anonymous

    Initial implementation

    Not quite enough to close the ticket though. Check the CacheFilter wiki page for documentation and usage notes.

  5. Anonymous

    It was reported today on IRC (by vi) that the following code snippet is not working. It seems that the socket object is not being attached to the Tee. It is possible that some change made in trunk stopped it from working.

    from cherrypy import cpg
    from cherrypy.lib.filter.cachefilter import CacheInputFilter, CacheOutputFilter, CacheStats

    class HelloWorld:
        def index(self):
            return """Hello,<br>"""
        index.exposed = True


    cpg.root = HelloWorld()
    # Input filter first, output filter last
    cpg.root._cpFilterList = [CacheInputFilter(), CacheOutputFilter()]
    cpg.root.stats = CacheStats()
    cpg.server.start(configFile='config.conf')
    
  6. Anonymous

    I put some "print" statements in the cachefilter code: one at the end of afterRequestBody, and one just before the "if isinstance(cpg.response.wfile, Tee):" check in both beforeResponse and afterResponse.

    As a cross check, I print the defaultCacheKey() too.

    When I run the above example, I receive the following output:

    2005/02/13 14:14:48 HTTP INFO 127.0.0.1 - GET /index HTTP/1.1
    after request body <cherrypy.lib.filter.cachefilter.Tee instance at 0x4020240c> 
    before response <socket._fileobject object at 0x40204d14> http://localhost:7080/index 
    after response <socket._fileobject object at 0x40204d14> http://localhost:7080/index 
    

    Why is the wfile object overwritten? Because I'm on Gentoo? Because my CP installation is not correct (should be trunk of 13/02/2005)?

  7. Anonymous

    I must add that after fully deleting my /usr/lib/python2.3/site-packages/cherrypy directory and re-installing the trunk version of cherrypy, IT WORKS FINE.

    With a very simple page, ab2 results are:

    - 154#/sec without cache
    - 268#/sec with cache!!!!

    Sorry for the confusion, and thanks for the great caching functionality.

  8. Robert Brewer

    CP 2.1 has a cachefilter which differs substantially from the one discussed in this ticket (complete with tests which currently pass). Please open new tickets for any further issues with the CacheFilter.
