s3vcp

A simple S3 file/directory synchronizer with two distinguishing abilities: it uses multiple threads, and it only copies files whose md5 hash has changed.

Installation

Just use easy_install or pip if possible:

$ sudo easy_install s3vcp

Otherwise, the main module, which resides in s3vcp.s3vcp, can be run out of the box. That means you can copy it into a bin folder on your PATH like so:

$ wget -O /usr/local/bin/s3vcp http://bitbucket.org/madssj/s3vcp/raw/tip/s3vcp/s3vcp.py

But again, if you can, use easy_install, pip or python setup.py install.

Usage

Usage: s3vcp.py [options] <mode upload|download> <bucket> <local-source> [local-source ...]

Options:
  -h, --help            show this help message and exit
  -p PREFIX, --remote-prefix=PREFIX
                        prefix all the new keys with PREFIX
  -t NUM, --num-threads=NUM
                        use NUM worker threads
  -c PATH, --copy-from=PATH
                        copy the keys on S3 from the location PATH before
                        uploading, the files md5 sum will be used to determine
                        if there are any changes between the local and
                        remote files
  -e, --add-expires     add far future expires headers to all the files
                        uploaded

Examples

To copy some folders into a bucket called bucket.example.com on Amazon S3:

$ s3vcp upload bucket.example.com /path/to/folder/to/copy /other/folder
copying ...
$ 

To copy a bucket called bucket.example.com into a local folder:

$ s3vcp download bucket.example.com /path/to/my/local/folder
downloading ...
$ 

But what I really designed this tool for is creating versioned file sets for Amazon CloudFront, and speeding up the upload of a new version as much as possible.

I achieved this by first copying an old version to the new version prefix, and then checking every key's md5 hash against the local files, thus uploading only what differs from the last version.
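
The comparison step can be illustrated with a small standalone sketch. This is not s3vcp's own code: `files_to_upload` and the `remote_md5s` mapping (key name to hex md5 of the keys already sitting under the new prefix) are hypothetical names, and the assumption that a key is the prefix plus the path relative to the uploaded folder is mine.

```python
import hashlib
from pathlib import Path


def md5_hex(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_upload(local_root, remote_md5s, prefix):
    """List (local_path, key) pairs whose content differs from the remote copy.

    remote_md5s is a hypothetical mapping of key name -> hex md5 for the keys
    already under `prefix`, for example right after copying the old version.
    """
    local_root = Path(local_root)
    changed = []
    for path in sorted(local_root.rglob("*")):
        if not path.is_file():
            continue
        # Assumption: key = prefix + "/" + path relative to the uploaded folder.
        key = f"{prefix}/{path.relative_to(local_root).as_posix()}"
        # Missing keys and keys with a different hash both need uploading.
        if remote_md5s.get(key) != md5_hex(path):
            changed.append((path, key))
    return changed
```

Files whose hash already matches the copied key are skipped, so only new or modified files are uploaded.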

Creating an initial version called v1:

$ s3vcp upload bucket.example.com -p v1 /path/to/folder

This uploads every file from /path/to/folder into a new public key prefixed with v1. Every new key will then be available on your CloudFront distribution via http://cloudfront.example.com/v1/path/to/new/key.ext.

The next time you need to upload a version, use the -c option to copy every file from an old version first.

So to create a new version called v2, copying the keys from v1 before uploading any local files, the following would be used:

$ s3vcp upload bucket.example.com -p v2 -c v1 /path/to/folder

Advanced usage

The inner workings of s3vcp can also be used directly, for example from a fabfile.

Public method signatures:

s3_copy(s3_bucket, source, target, num_workers)

s3_upload(s3_bucket, lpath, rpath, num_workers, add_expires=False)

s3_download(s3_bucket, lpath, num_workers, remote_path=None, matcher=None)

If you're going to use this, you should be able to figure out how from the above; otherwise, the source is very reader friendly.
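
As a starting point, a fabfile task built on the signatures above might look like the following sketch. `release_version` is a hypothetical name. It assumes that `source`, `target` and `rpath` are key prefixes (mirroring -c and -p), and it passes `s3_bucket` through untouched because the signatures do not say whether it is a bucket name or a connection object; check the source before relying on either.

```python
def release_version(s3_bucket, folder, new_version, old_version=None,
                    num_workers=4, add_expires=False):
    """Publish `folder` under the `new_version` prefix, reusing old keys."""
    # Imported lazily so the fabfile still loads where s3vcp is not installed.
    from s3vcp.s3vcp import s3_copy, s3_upload

    # Fabric 1.x passes command-line task arguments as strings.
    num_workers = int(num_workers)

    if old_version is not None:
        # Assumption: source and target are key prefixes, as with -c and -p.
        s3_copy(s3_bucket, old_version, new_version, num_workers)

    # Assumption: rpath is the remote prefix the files are uploaded under.
    s3_upload(s3_bucket, folder, new_version, num_workers,
              add_expires=add_expires)
```

With Fabric 1.x this could be invoked as `fab release_version:bucket.example.com,/path/to/folder,v2,old_version=v1`.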