Issue #8701 resolved

Zip snapshots have wrong timestamps

John Peacock
created an issue

It was brought to my attention that downloading a zip snapshot results in a zipfile containing timestamps in the future for all files, regardless of when the original commit was made. All of the timestamps appear to be UTC plus 20 minutes; so for example if I generate a zip archive at 9:14 EST, the timestamps are 13:34.

It is bad enough that there isn't a way to create the zip file with the original commit times, or to at least allow the user to set a timezone (or intuit one via the browser itself). However, that 20 minute offset shows a truly terrifying lack of system administration best practices. It makes the zip snapshot completely useless for anything containing a Makefile, for example. At the very least, there should be a prominent warning.

Comments (4)

  1. Erik van Zijst staff

    Hey John, the offsets you're seeing have nothing to do with our system time.

    Let me attempt to shed some light on how Git creates zip files and file timestamps. (I'm assuming you're seeing this on a Git repo; if you're using Mercurial instead, let me know.)

    First off, git itself does not keep track of file timestamps. So when you ask it to export a specific commit as a zip, it does the next best thing: it uses the commit's own timestamp for all files that it will export.

    The commit timestamp is created by the user and reflects the user's system's time at the time git commit was typed. This means that the accuracy is entirely in the hands of the user. If someone's clock is fast, then those commits will appear to be in the future. The server that this repo gets pushed to has no effect or control over this.

    When git commit stores the timestamp, it converts the user's local time to UTC and stores that in the commit object, together with the timezone offset.
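
To make the storage format concrete: a raw commit object records the time as seconds since the Unix epoch followed by a UTC offset such as -0500. The sketch below parses such a pair. The sample value and the helper name parse_git_time are made up for illustration and are not part of git.

```python
from datetime import datetime, timedelta, timezone

def parse_git_time(raw: str) -> datetime:
    """Parse '<epoch seconds> <+/-HHMM>' as it appears on a commit's author line."""
    seconds, offset = raw.split()
    sign = -1 if offset[0] == "-" else 1
    hours, minutes = int(offset[1:3]), int(offset[3:5])
    tz = timezone(sign * timedelta(hours=hours, minutes=minutes))
    # The epoch value identifies the instant; the offset only affects display.
    return datetime.fromtimestamp(int(seconds), tz)

stamp = parse_git_time("1384352040 -0500")  # hypothetical sample value
print(stamp.isoformat())                           # author's wall-clock time
print(stamp.astimezone(timezone.utc).isoformat())  # same instant in UTC
```

Both printed values describe the same instant; only the offset used for display differs.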

    Now on to git archive --format=zip that we run on the server.

    When git writes a file entry to the zip file, it stores the (UTC) timestamp of the commit. It ignores the timezone offset, as the ZIP file format does not support localized timestamps.

    Unfortunately, the ZIP specification does not specify how file timestamps are supposed to be interpreted (localized vs UTC). This makes it problematic to unzip archives that were originally created in a different timezone, as the receiver will just have to guess what timezone the timestamps are in.

    What this means is that if you unzip an archive with a tool that interprets the timestamps to be in local time and you are West of Greenwich, the files will get extracted with timestamps that lie in the future. FWIW, this is the behavior I get when I extract an archive with OSX's Finder.
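
The ambiguity can be reproduced with Python's zipfile module, which exposes the stored timestamp as a plain tuple with no timezone attached. This is only a sketch: the file name, the commit time, and the fixed UTC-5 zone are assumptions chosen for illustration, and it does not reproduce what git archive itself writes.

```python
import io
import zipfile
from datetime import datetime, timedelta, timezone

# Hypothetical commit time, expressed in UTC.
commit_utc = datetime(2013, 11, 13, 14, 14, 0, tzinfo=timezone.utc)

# Write one entry whose timestamp is the UTC wall-clock time of the commit.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    info = zipfile.ZipInfo("Makefile", date_time=commit_utc.timetuple()[:6])
    zf.writestr(info, "all:\n")

# Read it back: the archive yields six numbers and no timezone.
with zipfile.ZipFile(buf) as zf:
    stored = zf.getinfo("Makefile").date_time
print(stored)

# Two readers, two guesses about what those numbers mean.
est = timezone(timedelta(hours=-5))            # assumed reader timezone
as_utc = datetime(*stored, tzinfo=timezone.utc)
as_local = datetime(*stored, tzinfo=est)
print(as_local - as_utc)  # how far the "local time" guess lands in the future
```

A reader that treats the tuple as local time in a zone west of Greenwich ends up with a later instant than the commit, which is the "future timestamp" effect described above.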

    Funnily enough, when extracting this file manually with /usr/bin/unzip, the timestamps correctly get interpreted as UTC and converted into local time (PST in my case).

    Now on to your 20 minute offset.

    Without knowing which revision on which repo you were exporting, I can't verify anything here. However, what you should be seeing is that if you open the specific commit and look at its timestamp, its minute and seconds parts should correspond with what you see on your filesystem. This value is fixed: it is the timestamp of the commit (except for the fact that the timezone offset may have gotten messed up as per the above discussion).

    If you download the same archive 5 minutes later, the unzipped files should still have the same local timestamps (and so your 20 minute difference has now become a 15 minute difference). The unzipped files from both archives should have identical timestamps.
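
The arithmetic behind that check, as a small sketch. All times are hypothetical; the point is only that the extracted timestamp stays put while the gap to the download time shrinks.

```python
from datetime import datetime, timedelta

# Hypothetical values; only the differences matter.
extracted_mtime = datetime(2013, 11, 13, 13, 34)  # fixed: taken from the commit
first_download = datetime(2013, 11, 13, 13, 14)
second_download = first_download + timedelta(minutes=5)

for downloaded_at in (first_download, second_download):
    gap = extracted_mtime - downloaded_at
    print(f"downloaded {downloaded_at:%H:%M}, "
          f"files stamped {extracted_mtime:%H:%M}, gap {gap}")
```

If the gap were a constant clock error on the server, it would stay at 20 minutes for both downloads instead of shrinking.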

    So:

    • Git cannot preserve individual file timestamps
    • Git uses the timestamp of the commit you export
    • Commit timestamps are as accurate as the developer's system clock
    • ZIP has no notion of timezones, nor does it mandate UTC (MS-DOS time format)
    • Git writes ZIP file timestamps as UTC (though clients have no way of knowing this)
    • Some zip tools assume UTC and some don't, so zipping and unzipping a file with different tools can mess up the timestamps
    • Neither Bitbucket nor any other git hosting service has any influence on the timestamps in git's zip files

    If you care about these things then I strongly encourage you to stay away from the ZIP format.
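
The comment does not name an alternative, so treat the following as an assumption: tar is one format that avoids the ambiguity, because it stores each entry's modification time as seconds since the Unix epoch rather than as a wall-clock date. A minimal sketch with Python's tarfile module, using a hypothetical file name and commit time:

```python
import io
import tarfile

commit_epoch = 1384352040          # hypothetical commit time, epoch seconds
payload = b"all:\n"

# Write one entry whose mtime is the commit time.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tf:
    info = tarfile.TarInfo("Makefile")
    info.size = len(payload)
    info.mtime = commit_epoch
    tf.addfile(info, io.BytesIO(payload))

# Read it back: the value is an absolute instant, so no timezone guess is needed.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tf:
    restored = tf.getmember("Makefile").mtime
print(restored)
```

Every reader recovers the same instant regardless of its local timezone, since the stored value is not a local date and time.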

    N.B.
    The ZIP format allows for custom extensions and, over the years, many custom fields were added to embed all kinds of additional data. Various applications have added data to improve ZIP's weak MS-DOS timestamps. However, none of that ever entered into the official standard and so, again, it cannot be relied on.

    If you want, I'm happy to have a look at your specific case if you provide me with your repo details and the download URL you used. If this is information you'd rather keep confidential, please email me at support@bitbucket.org and refer to this issue.

  2. Brian Nguyen staff

    Hi John,

    Since we haven't heard from you in a while, I'm closing this issue as answered. If you have any further questions, feel free to reopen this.

    Cheers, Brian
