Issue #178 new

storage.exists() + S3Boto: Storages makes a ton of requests and downloads large amounts of XML

Brent O'Connor created an issue

When using storage.exists() with S3Boto, I ran into a big issue with easy-thumbnails. When saving several different thumbnails for just one file, the memory used by my Python process would grow to 2.75 GB. I tracked this down to storage.exists() making tons of requests and downloading large amounts of data. This happens because I'm using an S3 bucket with over 300K objects.

I submitted a patch to easy-thumbnails here: https://github.com/epicserve/easy-thumbnails/commit/e389235ca53462e87d335ace0cca8558f52a8a03

Is there a way to optimize Storages so that storage.exists() works faster and doesn't download large amounts of data?

Comments (5)

  1. Ian Lewis

    If your files don't change, then you can set AWS_PRELOAD_METADATA=True in your settings to keep a cache of the file names, so that storages can check whether a file exists without going back to S3.

    Currently the cache doesn't update, so it's not good for buckets where you are constantly adding/updating/deleting files. :-/

  2. Ian Lewis

    Yah, I think until we update storages to keep a local cache of the files and file names it's not going to get any better.

    Are you updating files via Django/storages or via something else?
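
The local cache idea mentioned above could look roughly like the sketch below. This is not code from storages: ExistsCachingStorage and CountingBackend are hypothetical names made up for illustration, and the wrapper assumes all saves and deletes go through the same process, so the cache goes stale if files are changed by something else.

```python
class ExistsCachingStorage:
    """Hypothetical wrapper that remembers exists() answers per file name."""

    def __init__(self, backend):
        # backend: any object exposing exists(name), save(name, content), delete(name)
        self.backend = backend
        self._known = {}  # file name -> bool

    def exists(self, name):
        # Only ask the backend the first time a name is seen.
        if name not in self._known:
            self._known[name] = self.backend.exists(name)
        return self._known[name]

    def save(self, name, content):
        saved_name = self.backend.save(name, content)
        # A file we just wrote is known to exist.
        self._known[saved_name] = True
        return saved_name

    def delete(self, name):
        self.backend.delete(name)
        # A file we just deleted is known not to exist.
        self._known[name] = False


class CountingBackend:
    """In-memory stand-in for a remote bucket that counts exists() calls."""

    def __init__(self):
        self.files = {}
        self.exists_calls = 0

    def exists(self, name):
        self.exists_calls += 1
        return name in self.files

    def save(self, name, content):
        self.files[name] = content
        return name

    def delete(self, name):
        self.files.pop(name, None)
```

With this wrapper, repeated exists() checks for the same thumbnail name reach the backend at most once, and names saved through the wrapper never trigger a lookup at all. Whether that is safe depends on the answer to the question above: it only holds if files are updated through this one code path.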
