Issues

Issue #167 new

collectstatic is not overwriting changed images

Lucas Sampaio
created an issue

Everytime I replace an image with another with the same name, collectstatic does not replace it on my S3, no matter how I configure my Django settings. Here's my current related attributes:

AWS_S3_FILE_OVERWRITE = True
AWS_QUERYSTRING_AUTH = False
AWS_PRELOAD_METADATA = True

INSTALLED_APPS += (
   'storages',
)

DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
STATICFILES_STORAGE = 'storages.backends.s3boto.S3BotoStorage'

AWS_STORAGE_BUCKET_NAME = 'my_statics'
STATIC_URL = "http://%s.s3.amazonaws.com/" % AWS_STORAGE_BUCKET_NAME

AWS_ACCESS_KEY_ID = 'FOO'
AWS_SECRET_ACCESS_KEY = 'BAR'

DAY = 86400  # 24h
AWS_HEADERS = {
   'Cache-Control': 'max-age=%d' % (7 * DAY), 
}

Comments (10)

  1. Ian Lewis

    I think this is actually not the best solution as there are a number of timezones that are in play. i.e. The system timezone, the TIME_ZONE setting, the timezone as returned from S3.

    The PR uses the TIME_ZONE setting which can be different from the system timezone which is what the FileSystemStorage uses. The django docs explicitly state that it the system timezone and TIME_ZONE setting could be different (they do need to be the same on Windows though).

    I sent a message to the django-users mailing list to see if there is any consensus on which timezone to use. I have a feeling that the FileSystemStorage is going to be a de facto standard and that the system timezone is actually what we should be using.

    https://groups.google.com/forum/#!topic/django-users/pEIKxsHYN74

  2. gentro

    Okay here's another wrinkle. I assume many people run collectstatic after pushing to a production environment with Git. If there are collaborators on a project in different time zones, the modified time on the post-merge files will reflect the system time on the computer of whoever made the latest commit. As the Git wiki says, "Git sets the current time as the timestamp on every file it modifies". So even if you convert the S3 UTC time to system time on the production machine, that may still not tell you whether the source is newer than the target because you don't know what time zone the source file was created in.

    It's only an issue if the collaborators are in different time zones. I don't know how often that happens on a Django project, but it's not a weird use case.

  3. gentro

    Actually, I was thinking more about this and I take back what I said. File metadata will always have the modified time in epoch seconds, so it'll always be accurate even if there are different versions getting passed around and merged in a vcs.

    That said, the system time based pull request Bob Spryn has now seems like the right solution. It'd be interesting to hear if the mailing list users have an opinion on where the TIME_ZONE setting should be used, but in this particular case turning the S3 timestamp to system time is the correct fix given that FileSystemStorage converts the source file's modified time to system time using datetime.fromtimestamp().

    Hopefully we can get this merged soon. Having to set --clear on collectstatic to get files to upload makes the process unreasonably long.

  4. jaylett

    I'm currently working on a documentation patch for Django (checked with a core dev) that will clarify the Storage API behaviour as returning naive datetime objects based on the "system" (os.environ['TZ']) timezone. (Ideally we'll also come up with a better API for the future with a good BC story.)

  5. Log in to comment