provide downloadable tarball of ET

Issue #1229 closed
Roland Haas created an issue

I created a mock-up download page (with one paragraph changed) that offers a tarball of the current release version of the ET.

http://www.tapir.caltech.edu/~rhaas/ET/

To provide the tarball we would need some way to store data on the web server that is not archived via svn (or live with ~200MB commits). Alternatively, I am happy to have the tarball live in my webspace and keep the link, although a more "permanent" storage site might be more useful.

The tarball would need to be updated whenever the release branch(es) change.

Comments (16)

  1. Frank Löffler

    We could cut the size down by not including repository metadata. This would reduce the size of the final extracted data from about 724MB to about half of that. Of course, this would also have the disadvantage that users would not be able to 'update' easily. However, with the current setup users might run into trouble because their clients might not be compatible with the repository client data format. I usually don't expect RCS metadata when I download a tarball.

    Using another compression format also helps. Your uncompressed tarball is 589MB (the difference from the 724MB mentioned above seems to be filesystem overhead, which is understandable with that many small files). The gzipped tarball is 271MB. xz would get this down to 187MB. Combining this with excluding the RCS data should bring the compressed tarball to less than 100MB.
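
The two savings discussed above (dropping repository metadata and switching compressors) could be tried with a short script. The following is a minimal Python sketch, not the procedure actually used for the release; the `VCS_DIRS` list and the function names `skip_vcs` and `make_tarball` are assumptions made for illustration.

```python
import os
import tarfile

# Directory names assumed to hold version-control metadata.
VCS_DIRS = {".svn", ".git", ".hg", "CVS"}


def skip_vcs(tarinfo):
    """Drop any archive member that lives inside a VCS metadata directory."""
    parts = tarinfo.name.split("/")
    return None if VCS_DIRS.intersection(parts) else tarinfo


def make_tarball(source_dir, out_path, mode):
    """Archive source_dir into out_path and return the archive size in bytes.

    mode is a tarfile write mode such as 'w:gz', 'w:bz2' or 'w:xz'.
    """
    top = os.path.basename(source_dir.rstrip(os.sep))
    with tarfile.open(out_path, mode) as tar:
        tar.add(source_dir, arcname=top, filter=skip_vcs)
    return os.path.getsize(out_path)
```

Calling `make_tarball` once per mode on the same checkout gives sizes that can be compared directly, with and without the filter.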

  2. Erik Schnetter

    Note: It is considered bad style to "update" a tarball. Instead, one would choose a new name for every tarball, e.g. by adding .1, .2, .3 to the version number.

  3. Roland Haas reporter

    I am fine with removing the metadata, mostly because of incompatible version control software (svn comes to mind, which bites me occasionally). I'd rather not use xz or other less widely used compression algorithms. If possible I'd like something that can be extracted with stock system software on Linux and OSX (I don't think we need to worry about Windows, given that we don't even test the ET on Windows machines). A tarball without metadata (and just gzip) is about 114MB. I expect most of our users who report bugs to have used the GetComponents script; however, we will have to expect bug reports for "the tarball that was downloaded 3 months ago". GetComponents does not seem to have an option to output (for each module) which version was checked out, does it?

    I appended the release name and date to the tarball name. I prefer dates over appended numbers, since numbers give us no hint as to what the tarball contains. Ideally I'd like to have a way of specifying a tag name or something similar that would allow me to get the precise source code versions that are in the tarball.

  4. Erik Schnetter

    Yes, we definitely want to tag each version that makes it into a tarball. That is, after updating the release branch, we would tag the new version (which may differ by several commits), and then roll a new tarball from that. We can use the date as tag (why not?), but tag name and release tarball name should be the same.

  5. Ian Hinder

    We have to have a tag representing what ends up in the tarball, as Erik said. As such, we know exactly what is there. Since dates are longer than version numbers, and also potentially cause problems if we want to have more than one version in a given day (e.g. we tag, release, realise we messed up, and then release again a few hours later), I vote for version numbers. That is also the usual way that software is distributed, not with dates. What sorts of names are we talking about for the tarballs? einsteintoolkit-2012-11-Oersted-1.tar.gz, for example?
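
To keep the tag name and the tarball name identical, as requested above, both could be derived from a single base string. This is only a sketch: the naming pattern follows the example `einsteintoolkit-2012-11-Oersted-1.tar.gz` floated in this comment, and the helper names `release_name` and `tarball_name` are hypothetical.

```python
def release_name(year, month, codename, revision):
    """Build the base name shared by the VCS tag and the tarball."""
    return f"einsteintoolkit-{year:04d}-{month:02d}-{codename}-{revision}"


def tarball_name(base, ext="tar.gz"):
    """Append the archive extension to a release base name."""
    return f"{base}.{ext}"


base = release_name(2012, 11, "Oersted", 1)
print(base)                # name to use for the tag
print(tarball_name(base))  # name to use for the tarball
```

A second release on the same day would only bump `revision`, which is the case that date-based names handle poorly.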

  6. Frank Löffler

    I am not so sure about dismissing xz so fast. I agree that some systems might not have it installed (queenbee for instance), but then some systems don't have all of svn/git installed and we usually live with that quite fine. Why? Because the system we care most about for the tarball is the development system of a user: a laptop or workstation they have control over. I am pretty sure xz is nowadays available on all Linux systems as a native package, and (sorry to say this so directly) a user who doesn't know how to install new software on their own system should probably catch up on that before trying the ET. Odds are other things are missing as well, like a C++ or a Fortran compiler.

    On the other hand I agree - it's one more step where a new user might stumble.

  7. Roland Haas reporter

    Frank's argument about multiple releases per day is unfortunately a strong one against my suggestion of using dates as tags, I think. We do have (or used to have) release tags in the repositories. If we want to use the same tag in release tarballs as in the VCSs, then we have to re-tag every module when we update the release, even if that module did not change (think e.g. we update a simfactory optionlist and then have to re-tag Kranc).

    The alternative would be to include a fingerprint file in the tarball, e.g. the subversion revision number and git/hg hashes for the included versions.
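
A fingerprint file of this kind might look like one line per module. The sketch below is an illustration under stated assumptions, not an existing GetComponents feature: the file layout, the function names `git_head` and `format_fingerprint`, and the example revision values are all made up. Only the git case is shown; svn and hg checkouts would need an analogous query.

```python
import subprocess


def git_head(repo_dir):
    """Return the commit hash checked out in repo_dir (needs git on PATH)."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def format_fingerprint(revisions):
    """Render {module: (vcs, revision)} as sorted 'module vcs revision' lines."""
    lines = [
        f"{module} {vcs} {rev}"
        for module, (vcs, rev) in sorted(revisions.items())
    ]
    return "\n".join(lines) + "\n"


# Placeholder revisions, for illustration only.
example = {"Kranc": ("git", "abc123"), "Cactus": ("svn", "4567")}
print(format_fingerprint(example), end="")
```

Shipping such a file would let a bug report against "the tarball that was downloaded 3 months ago" be mapped back to exact revisions without re-tagging unchanged modules.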

  8. Frank Löffler

    Replying to [comment:9 rhaas]:

    all that is really needed is a C/C++ and Fortran compiler

    Right. However, those are usually not installed on a standard system. Installing them requires the same skills as installing some other native package, or even fewer because there are probably not multiple versions of xz around. And 'same skills' doesn't even mean knowing how to use dselect or aptitude anymore nowadays. :)

  9. Frank Löffler

    Right now, the numbers are:

    312M    ET_2013_05.tar
    126M    ET_2013_05.tar.bz2
    131M    ET_2013_05.tar.gz
    104M    ET_2013_05.tar.xz
    
  10. Frank Löffler

    From this it is clear that bzip2 compression doesn't add enough benefit to outweigh the disadvantage of being a less common format. The other two might be worth providing. 104MB vs. 131MB is quite a difference when downloading stuff.
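
The comparison can be made explicit as a percentage saving relative to gzip. The sizes below are the ones listed in the previous comment; the helper name `saving_vs_gzip` is just for this illustration.

```python
# Sizes in MB as reported for ET_2013_05 in the comment above.
sizes_mb = {"tar": 312, "tar.bz2": 126, "tar.gz": 131, "tar.xz": 104}


def saving_vs_gzip(ext):
    """Percentage by which the given format is smaller than the .tar.gz."""
    gz = sizes_mb["tar.gz"]
    return 100 * (gz - sizes_mb[ext]) / gz


for ext in ("tar.bz2", "tar.xz"):
    print(f"{ext}: {saving_vs_gzip(ext):.1f}% smaller than tar.gz")
```

With these numbers bzip2 saves roughly 4% over gzip while xz saves roughly 21%.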
