Issue #178 new

storage.exists() + S3Boto: Storages makes a ton of requests and downloads large amounts of XML

Brent O'Connor
created an issue

When using storage.exists() with S3Boto, I ran into a big issue with easy-thumbnails. While saving several different thumbnails for just one file, the memory used by my Python process grew to 2.75 GB. I tracked this down to storage.exists() making a very large number of requests and downloading large amounts of data. This happens because I'm using an S3 bucket with over 300K objects.

I submitted a patch to easy-thumbnails here: https://github.com/epicserve/easy-thumbnails/commit/e389235ca53462e87d335ace0cca8558f52a8a03

Is there a way to optimize Storages so that storage.exists() runs faster and doesn't transfer such large amounts of data?

Comments (5)

  1. Ian Lewis

    If your files don't change, you can set AWS_PRELOAD_METADATA=True in your settings to keep a cache of the file names, so that storages can check whether a file exists without asking S3 each time.

    Currently the cache doesn't update, so it's not a good fit for buckets where you are constantly adding, updating, or deleting files. :-/

  2. Ian Lewis

    Yeah, I think that until we update storages to keep a local cache of the files and file names, it's not going to get any better.

    Are you updating files via Django/storages or via something else?
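
For reference, the setting mentioned in the first comment might be enabled roughly as follows. This is only a sketch of a settings file: the bucket name is a placeholder, and the backend path assumes the S3Boto backend that ships with django-storages. As noted above, the preloaded cache does not update, so this only suits buckets whose contents rarely change.

```python
# settings.py (sketch; adjust to your project)

# Use the S3Boto backend from django-storages for file storage.
DEFAULT_FILE_STORAGE = "storages.backends.s3boto.S3BotoStorage"

# Placeholder bucket name, not taken from the issue.
AWS_STORAGE_BUCKET_NAME = "my-bucket"

# Keep a cache of the bucket's file names so that exists() can be
# answered from memory. The cache is not refreshed afterwards.
AWS_PRELOAD_METADATA = True
```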

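
The "local cache of file names" idea from the second comment could look something like the sketch below. This is not how storages is implemented; the class names `ExistsCachingStorage` and `FakeRemoteBackend` are hypothetical, and the fake backend only stands in for S3 so that the example runs without network access. The wrapper remembers names it has saved or already found, and forgets them on delete, so repeated exists() checks for the same file do not hit the backend again.

```python
class ExistsCachingStorage:
    """Hypothetical wrapper that remembers which names are known to exist."""

    def __init__(self, backend):
        self.backend = backend
        self._known = set()  # names confirmed to exist

    def exists(self, name):
        # Answer from the local cache when possible.
        if name in self._known:
            return True
        # Otherwise ask the backend and remember positive answers only,
        # since a missing file may be created later by another process.
        found = self.backend.exists(name)
        if found:
            self._known.add(name)
        return found

    def save(self, name, content):
        saved_name = self.backend.save(name, content)
        self._known.add(saved_name)
        return saved_name

    def delete(self, name):
        self.backend.delete(name)
        self._known.discard(name)


class FakeRemoteBackend:
    """In-memory stand-in for a remote store; counts exists() calls."""

    def __init__(self):
        self.files = {}
        self.exists_calls = 0

    def exists(self, name):
        self.exists_calls += 1
        return name in self.files

    def save(self, name, content):
        self.files[name] = content
        return name

    def delete(self, name):
        self.files.pop(name, None)
```

The cache only stays correct if every change goes through the wrapper. If files are deleted by something other than Django/storages, a cached name can go stale, which is the same limitation raised in the comments above.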