Yeah, I'm concerned with resetting the cache(s) of long-running processes without restarting them.
With a DBM-based cache, if I want to drop the cache it seems I can just delete the directory and then run a script to regenerate the cache files. That doesn't seem to cause too many errors. I'm not sure how to handle memcached, etc.; cycling the cache backend tends to cause errors.
The best way to handle invalidating memcached without dogpile errors seems to be site-stop, memcached off, memcached on, sleep(5), site-start
In theory, it would be possible to have the backend support a site-wide invalidate without too much extra code. Just make the current CacheRegion.invalidate() check whether the backend has a similar method, and call that. Have the backend store (in its actual store) a special key that indicates that anything older than <timestamp> is invalid.
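The delegation described above could be sketched roughly like this. Note this is a hypothetical shape, not dogpile's actual API: `invalidate_region` is an invented backend method, and the in-process fallback mirrors the `_hard_invalidated` attribute mentioned later in this thread.

```python
import time

def region_invalidate(region):
    """Sketch of the proposal: if the backend exposes its own
    region-wide invalidation, delegate to it; otherwise fall back
    to the in-process timestamp.  `invalidate_region` is a
    hypothetical backend method, not part of dogpile's CacheBackend."""
    backend_invalidate = getattr(region.backend, "invalidate_region", None)
    if backend_invalidate is not None:
        # the backend would store a special key meaning "anything
        # older than <timestamp> is invalid", visible to all processes
        backend_invalidate(time.time())
    else:
        region._hard_invalidated = time.time()
```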
I think that would be a reasonable feature add. It would add another lookup (perhaps something that could be done on an interval and is stored in a local var) to verify cache validity.
OK, it seems like you're talking about two things. For "if the backend has a similar method", I guess you mean that if the backend is a dict, we want a dict.clear() type of thing. We have an existing convention for backend-specific features, which is that you call them on the backend directly:
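For illustration, that convention looks something like this. These are minimal stand-in classes, not dogpile's; with dogpile itself you'd reach the backend through the region's `backend` attribute.

```python
class DictBackend:
    """Toy dict-based backend with a backend-specific clear()."""
    def __init__(self):
        self._cache = {}

    def get(self, key):
        return self._cache.get(key)

    def set(self, key, value):
        self._cache[key] = value

    def clear(self):
        # backend-specific feature: not part of the generic region API
        self._cache.clear()


class Region:
    """Toy region exposing only the generic get/set operations."""
    def __init__(self, backend):
        self.backend = backend

    def get(self, key):
        return self.backend.get(key)

    def set(self, key, value):
        self.backend.set(key, value)


region = Region(DictBackend())
region.set("k", "v")
# the convention: call the backend-specific method on the backend directly
region.backend.clear()
assert region.get("k") is None
```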
Now, if the feature is instead "have the backend store (in its actual store) a special key that indicates that anything older than <timestamp> is invalid", that's not specific to a backend; it could be done agnostically within CacheRegion. What I don't like about it is that it's slow: it adds an extra cache hit to all operations. If we make it optional, we're cluttering up CacheRegion with ever more conditionals to suit use cases that are extremely rare (I'd never need a feature like this). I'd like to first explore how CacheRegion could allow extensibility in ways like this without cluttering it up; then "augment all cache operations with an explicit invalidation-key check" can be an extension feature in a separate module.
I was actually thinking of the same mechanism as the current CacheRegion invalidate.
With regards to something like dict.clear(), I think it is useful to pass that on as a utility for cache invalidation on that backend, but I see it as a one-off, not as a globally applicable mechanism (based upon how the backends work).
But that being said, I agree: you don't want the overhead of having to do that lookup every time. The mechanism to load that specific "invalidate" information would need to be smarter than "check if invalidate is set, load, then check cache". I'm not yet sure how I would approach this in a universally acceptable way.
Allowing elegant extension use is never a bad idea (in my opinion).
You either have to check that invalidate key every time, or you can "box" it by having a function that looks at the current time and, on a per-region basis, only checks the "invalidate" key every N seconds. So a very active cache region would not be doing this second hit more than once every N seconds. A not-very-active region would be doing the hit for a majority of accesses, but it's not active, so that's not a big deal.
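The "boxed" check above can be sketched in isolation, which also makes it easy to test against a mock backend. The class and its names are illustrative, not dogpile API; `backend_get` stands in for the extra cache hit that fetches the stored invalidation timestamp.

```python
import time

class BoxedInvalidationCheck:
    """Consult the backend's region-wide invalidation timestamp at
    most once every `interval` seconds; in between, reuse the value
    cached in a local variable.  (Sketch of the idea above.)"""

    def __init__(self, backend_get, interval=5.0):
        self.backend_get = backend_get  # callable returning the stored timestamp
        self.interval = interval
        self._last_checked = 0.0
        self._invalidated_at = 0.0

    def invalidated_at(self):
        now = time.time()
        if now - self._last_checked >= self.interval:
            # the "second hit": refresh from the backend at most every N secs
            self._invalidated_at = self.backend_get() or 0.0
            self._last_checked = now
        return self._invalidated_at

    def is_valid(self, created_at):
        # a cached value is valid only if created after the invalidation time
        return created_at > self.invalidated_at()
```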
It's definitely logic I'd want to have "somewhere else", and nicely tested in isolation against a mock backend.
I'm not sure if this is the best place to ask, but I'm using async_runner to repopulate my cache (memory backend) in the background, and found that calling region.invalidate() forces the next call to do a synchronous/blocking repopulate. I've hunted around but can't find a good way to invalidate the whole region in a way that will continue to allow serving stale data while repopulating via async_runners. Is this possible with the current implementation?
That's a great point, as invalidate() was written to just force a regen immediately. I've broken it out into "hard" and "soft" options in 138d3d7fa9b9ff97a01b2b74c5cac48, where you can see that a "soft" invalidation works by faking the creation time to be "now - expiration time", rather than raising a NeedRegen or returning a hard "0" value for the creation time. I haven't tested this in an integration context (e.g. with multiple threads), so please let me know if this flag solves the issue for you.
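The hard/soft distinction can be modeled in a few lines. This is a sketch of the mechanism as described above, not dogpile's actual internals; attribute names are chosen to mirror the `_hard_invalidated`/`_soft_invalidated` attributes mentioned later in this thread.

```python
import time

class RegionInvalidation:
    """Sketch of hard vs. soft region invalidation."""

    def __init__(self, expiration_time):
        self.expiration_time = expiration_time
        self._hard_invalidated = None
        self._soft_invalidated = None

    def invalidate(self, hard=True):
        if hard:
            self._hard_invalidated = time.time()
        else:
            self._soft_invalidated = time.time()

    def effective_created_at(self, created_at):
        # hard: the value reads as missing -> forces a blocking regen
        if self._hard_invalidated and created_at < self._hard_invalidated:
            return None
        # soft: fake the creation time to one full expiration period ago,
        # so the value reads as *stale* (servable while a background
        # regeneration runs) rather than missing
        if self._soft_invalidated and created_at < self._soft_invalidated:
            return self._soft_invalidated - self.expiration_time
        return created_at
```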
That worked beautifully and saves me a bunch of work. Thanks again for the quick fix.
For anyone who later finds this, the use-case I'm using it for:
- cache a ton of occasionally changing game metadata from the DB in memory (per app process), so many operations require 0 DB queries
- when someone updates the data via the admin tool, signal app processes to invalidate the cache region (currently done by polling a 'last_update' in the DB, also async and cached for N secs; later to be via pub/sub)
- allow serving of stale content while querying the DB in the background to refresh the cache, so no requests get hit with the query lag
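The polling step can be sketched as a small change-detector, kept separate from the scheduling so it's testable; all names here are illustrative. In the real setup, `on_change` would be something like `functools.partial(region.invalidate, hard=False)` for the soft invalidation discussed above, and `poll_once` would run on a timer or background thread.

```python
def make_invalidation_poller(get_last_update, on_change):
    """Return a poll_once() callable that compares the DB's
    'last_update' value against the last one seen, and fires
    on_change() (e.g. a soft region invalidate) when it changes."""
    state = {"seen": None}

    def poll_once():
        last = get_last_update()
        changed = state["seen"] is not None and last != state["seen"]
        if changed:
            on_change()
        state["seen"] = last
        return changed

    return poll_once
```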
When running multiple forked processes, you have to invalidate in every process because it doesn't actually delete or invalidate the keys from the backend. Would it be possible to delete an entire region from the backend, and if so could a flag or separate method be added to accomplish this?
@zoomorph it sounds like you're going back around to the beginning of the ticket here. Backends like memcached or redis don't have a keys() function that we could use to "delete the entire region"; hence we do it with invalidation timestamps instead. Those are currently local to the specific Python process that sets them up, but the notion here is: hey, let's get that invalidation time from the server instead. Great! But how do we do that without doubling our cache accesses, and how do we do it without messing up the dogpile internals too much? One answer right now is that each app queries the datastore periodically, such as with a background thread, for a single "invalidation" timestamp, and applies it as needed using region.invalidate(). So this can be rolled entirely on the outside, though that doesn't mean we can't add some helpers, or at least examples in the recipes section, that talk about this.
@zoomorph The way I handle this is with redis pub/sub. Each process has a redis sub on a cache.purge channel. To purge, publish cache.purge <region name> and each process listens for that message and calls region.invalidate() locally.
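The dispatch side of that pattern can be sketched separately from the redis connection, which keeps it testable. The message dicts below mirror redis-py's pub/sub message shape; in the real setup, a background thread would loop over `pubsub.listen()` on the `cache.purge` channel and pass each message in. Names beyond those mentioned above are illustrative.

```python
def handle_purge_message(message, regions):
    """Dispatch a 'cache.purge <region name>' pub/sub message to the
    matching local region's invalidate().  `regions` maps region
    name -> invalidate callable (e.g. region.invalidate)."""
    if message.get("type") != "message":
        # ignore subscribe/unsubscribe confirmations etc.
        return False
    name = message["data"]
    if isinstance(name, bytes):
        name = name.decode("utf-8")
    invalidate = regions.get(name)
    if invalidate is None:
        return False
    invalidate()
    return True
```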
A while back I thought about handling this with a custom ProxyBackend:
Create an 'invalidation' ProxyBackend whose calls to 'get' first check for an invalidation timestamp, then compare the cached value's creation time against it. That timestamp would probably only be hit periodically, and cached into a local value.
The tricky part, though, is that this proxy backend would have to hit a different region:
- it should never expire (or at least should expire 1.x longer than the 'invalidated' backend)
- requests can't use this ProxyBackend, or a loop would form
I think the logic would be something like:
value = Region1.get("Value1")
_invalidated = Region2.get("Invalidated-Region1")
if not _invalidated or not _invalidated.not_timely:
    # value is still considered valid; otherwise, regenerate
I ended up not implementing this, because it was easier to construct the app not to have to deal with stuff like this.
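A minimal model of that proxy idea, using plain dicts as the two stores rather than dogpile's ProxyBackend class (whose real API differs); the `"Invalidated-Region1"` key name follows the pseudocode above, and `NO_VALUE` is a stand-in sentinel for a cache miss.

```python
import time

NO_VALUE = object()  # stand-in for a cache-miss sentinel

class InvalidationProxy:
    """Wrap a backend's get() so any value created before a
    region-wide invalidation timestamp reads as a miss.  The
    timestamp lives in a *separate* store, avoiding the loop
    mentioned above.  Sketch only, not dogpile's ProxyBackend."""

    def __init__(self, proxied, invalidation_store, region_name):
        self.proxied = proxied                        # key -> (created_at, value)
        self.invalidation_store = invalidation_store  # "Invalidated-<region>" -> ts
        self.region_name = region_name

    def set(self, key, value):
        self.proxied[key] = (time.time(), value)

    def get(self, key):
        entry = self.proxied.get(key)
        if entry is None:
            return NO_VALUE
        created_at, value = entry
        invalidated = self.invalidation_store.get(
            "Invalidated-" + self.region_name, 0)
        if created_at <= invalidated:
            # created before the region-wide invalidation: treat as a miss
            return NO_VALUE
        return value
```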
The only time /we/ would necessarily need to refresh an entire region or 'unknown' keys occurs on an app deployment; in those cases, we plan for a downtime longer than a cache expiry. There's also a backup in place: use key_mangler to version the key name.
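The key-versioning backup looks roughly like this; the version string and key format are illustrative. With dogpile, the function would be passed as the `key_mangler` when configuring the region, so bumping the version on deployment effectively starts an empty region while the old keys simply expire unused.

```python
CACHE_VERSION = "v2"  # bumped on each deployment that changes cached shapes

def version_key_mangler(key):
    """Prefix every cache key with the deploy-time version string."""
    return CACHE_VERSION + ":" + key
```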
Let's revisit this and allow for passing in an override to the _hard_invalidate and _soft_invalidate that can work on the backends. The default can be to stay within the region only, but we just hit this exact issue within Keystone (OpenStack), and we're willing to take the overhead hit of asking the backend for the "expiration time" each time for the benefit of not hitting SQL. 2x memcached hits will still be better than inconsistent results.
For clarity, the idea is that the region-wide .invalidate would make some calls on an Abstract Base Class (or similar) instead of just setting the values on the region itself. This allows a developer to override but the default can remain local to the in-process.
@Morgan Fainberg you mean you want an extension to use a second "get" from the backend to "get" an invalidation token, right? The idiomatic approach within region.py is that callable objects can be passed in; right now, for example, you can pass a "should_cache_fn" to get_or_create. It seems like we'd add the ability for a "should_invalidate_fn" or similar.
@Michael Bayer Correct, something like that. The only concern I have is that it also needs to hook into the region.invalidate to be as transparent to the developer as possible.
The way I had to do it (temporarily, until we have something in dogpile) was to patch ._hard_invalidated and ._soft_invalidated with a property that did the work via a setter/deleter. So as long as we can hook the .invalidate method into whatever should_invalidate_fn does, we should be good.
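The property patch described above can be sketched like this. `Region` here is a stand-in for the class being patched (in the real case, dogpile's CacheRegion and its `_hard_invalidated` attribute), and `store` is any shared dict-like store standing in for memcached.

```python
class Region:
    """Stand-in for the class whose attribute gets patched."""
    _hard_invalidated = None

def install_shared_invalidation(cls, store, key="region:invalidated"):
    """Replace the in-process _hard_invalidated attribute with a
    property that reads/writes a shared store instead, so every
    process sees the same invalidation timestamp.  (Sketch of the
    monkey-patch described above.)"""
    def fget(self):
        return store.get(key)

    def fset(self, value):
        store[key] = value

    def fdel(self):
        store.pop(key, None)

    cls._hard_invalidated = property(fget, fset, fdel)
```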
What I am very much hoping not to do is introduce a plethora of ad-hoc "abstract" classes all over the place as a means of arbitrary extension, because then you have a mess, and it's also inconsistent with the many current region hooks that are sent as arbitrary callables.
Where we do have an "abstract" class as an extension point is CacheBackend. If we made "invalidate" a hook that consulted the backend, we have the ProxyBackend, which allows you to inject "middleware" of sorts in between the region and the actual backend. I wonder if this kind of thing could happen there.
FWIW this touches on an earlier attempt at a PR I had, and allowing the logic of the cache validator to be configurable.
As a quick refresher, the current system validates the cache by managing a dict payload that includes a dogpile API version and timestamp. If that functionality were pluggable, the validity could be based on other factors.
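A sketch of what a pluggable validity check over that metadata payload might look like. The field names (`"v"`, `"ct"`) are illustrative, not dogpile's actual payload keys; the point is only that extra validator callables could veto validity based on factors beyond version and age.

```python
import time

API_VERSION = 1  # illustrative stand-in for the dogpile API version marker

def make_payload(value):
    """Wrap a value with the metadata the thread describes:
    an API-version marker plus a creation timestamp."""
    return {"v": API_VERSION, "ct": time.time(), "value": value}

def is_valid(payload, expiration_time, validators=()):
    """Built-in checks (version match, not past expiration), plus
    pluggable validators that can veto based on other factors."""
    if payload["v"] != API_VERSION:
        return False
    if time.time() - payload["ct"] > expiration_time:
        return False
    return all(fn(payload) for fn in validators)
```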
@Michael Bayer Totally fair. I really would also prefer not to use abstract classes if we can get there without them. I would be happy to have invalidate do the same thing the mutex does: let the CacheBackend (or a proxy) easily cover the needs of region-wide invalidation.
Defaults can stay the same as today, but it becomes extensible.