rtar.py Manual

Author: Marc 'BlackJack' Rintsch
Contact: marc@rintsch.de
Date: 2005-08-26
Version: 0.3
Copyright: This document has been placed in the public domain.

1   Name

rtar.py -- an archiver.

2   Synopsis

rtar.py [-h|--help|--version]
rtar.py [options] file(s)

3   Description

The program creates compressed tar archives from files and directories. Unlike the original tar, it first builds a list of file names and sorts it in a way that should give a better compression ratio.

It also makes the common task of archiving one directory a bit easier by providing a short option that infers the file name of the archive from the name of the directory and the compression algorithm. See Examples for details.

3.1   Why sort the names?

Way back in the old DOS days the RAR archiver had, and still has, three advantages over ZIP archives when it comes to compression ratio:

  1. RAR uses a slower but better compression algorithm than the standard deflate algorithm of ZIP archives,
  2. it creates solid archives instead of compressing each file separately, to benefit from redundancy between files,
  3. and files are grouped by file name extensions in order to have files with similar contents close to each other and benefit from 2. even more.

With tar archives, point 1 holds if bzip2 compression is used, and point 2 always holds because the tar archive is created first and then compressed as a whole.
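
The benefit of solid archives (point 2) can be seen in a small experiment. The following sketch is not part of rtar.py. The sample data and the names files, separate_size and solid_size are made up for illustration, zlib merely stands in for the compressor, and a current Python 3 is assumed.

```python
import zlib

# Hypothetical sample data: ten small "files" with very similar contents.
files = [
    (b"def handler_%d(request):\n    return render(request)\n" % i) * 20
    for i in range(10)
]

# ZIP style: every file is compressed on its own.
separate_size = sum(len(zlib.compress(data)) for data in files)

# Solid style (tar first, then compress): one stream over all files.
solid_size = len(zlib.compress(b"".join(files)))

print("separate:", separate_size, "solid:", solid_size)
```

The solid variant comes out smaller here because the compressor can refer back to similar data in earlier files instead of starting from scratch for each one.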

But the standard tar programs do not group files by file name extension. That is what rtar.py does.
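
A minimal sketch of such a grouping, assuming a current Python 3. This is not the actual code of rtar.py. The function name extension_sort_key, the sample names, and the choice to order by lower-cased extension first and full path second are assumptions made for illustration.

```python
import os


def extension_sort_key(path):
    """Sort key: file name extension first, then the full path."""
    extension = os.path.splitext(path)[1].lower()
    return (extension, path)


# Hypothetical file list as it might come from walking a directory tree.
names = [
    "src/main.c",
    "doc/readme.txt",
    "src/util.h",
    "src/util.c",
    "doc/notes.txt",
]

sorted_names = sorted(names, key=extension_sort_key)
for name in sorted_names:
    print(name)
```

With this key the two .c files end up next to each other, followed by the .h file and then the two .txt files, regardless of the directory they live in.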

4   Requirements

The script requires Python 2.4 or higher.

5   Commandline Options

--version show program's version number and exit
-h, --help show this help message and exit
-o FILENAME, -f FILENAME, --file=FILENAME
 write archive to this file instead of STDOUT.
-a, --auto-name
 infer archive file name from the first given directory name. This only works if there is just one directory name given as argument. The archive is named: <directory_name>.tar[.<algorithm>]
--list just dump the sorted file names to STDOUT -- don't create archive.
--compression=ALGORITHM
 select compression algorithm from none, gzip or bzip2. [bzip2]
-z, --gzip use gzip compression.
-j, --bzip2 use bzip2 compression. [default]
-b BLOCKS, --blocking-factor=BLOCKS
 BLOCKS x 512 bytes per record. [20]
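
The effect of the blocking factor on the size of the uncompressed archive can be worked out with a little arithmetic. The sketch below assumes the usual tar convention: an archive ends with two blocks of zero bytes and is then padded with further zero blocks up to a full record. Whether rtar.py follows this exactly is an assumption, and the function name padded_archive_size is hypothetical.

```python
BLOCK_SIZE = 512  # size of one tar block in bytes


def padded_archive_size(payload_size, blocking_factor=20):
    """Return the size of an uncompressed tar archive in bytes.

    payload_size is the space taken by headers and file data and is
    expected to be a multiple of BLOCK_SIZE already.
    """
    record_size = blocking_factor * BLOCK_SIZE
    # End-of-archive marker: two blocks of zero bytes.
    size = payload_size + 2 * BLOCK_SIZE
    # Pad with zero bytes up to the next full record.
    remainder = size % record_size
    if remainder:
        size += record_size - remainder
    return size


# One small file: 512 byte header plus 512 bytes of padded data.
print(padded_archive_size(1024))     # 10240
print(padded_archive_size(1024, 1))  # 2048
```

With the default of 20 the small archive grows to one full record of 10240 bytes, while a blocking factor of 1 leaves only the end-of-archive marker.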

6   Examples

Compress the contents of directories and all their subdirectories:

rtar.py foo/ > foo.tar.bz2
rtar.py -o foo_and_bar.tar.bz2 foo/ bar/

Create the archives foo.tar.bz2 and bar.tar.gz with the auto naming option:

rtar.py -a foo/
rtar.py --auto-name --gzip bar/
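
The naming rule behind the auto naming option can be sketched as follows, assuming a current Python 3. This is an illustration and not the code of rtar.py. The function name infer_archive_name and the SUFFIXES table are hypothetical, and using only the last path component of the directory is an assumption.

```python
import os

# Assumed mapping from algorithm name to file name suffix.
SUFFIXES = {"none": "", "gzip": ".gz", "bzip2": ".bz2"}


def infer_archive_name(directory, algorithm="bzip2"):
    """Build <directory_name>.tar[.<algorithm>] from a directory name."""
    # normpath strips the trailing slash so that "foo/" yields "foo".
    base = os.path.basename(os.path.normpath(directory))
    return base + ".tar" + SUFFIXES[algorithm]


print(infer_archive_name("foo/"))          # foo.tar.bz2
print(infer_archive_name("bar/", "gzip"))  # bar.tar.gz
print(infer_archive_name("baz", "none"))   # baz.tar
```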

7   History

0.3.0: 2005-08-26

Added -b/--blocking-factor option. Setting it to 1 prevents some blocks full of zero bytes from being appended to the archive. This may save a few bytes, but those blocks generally compress very effectively anyway.

The program no longer crashes if it comes across files that can't be read. A warning is printed instead.

Directories given on the command line are now archived too. Before this fix only the contents of a directory were archived, but not the top level directory name itself.

0.2a : 2005-01-23

Fixed a really stupid bug that made it impossible to create archives by redirecting the output into a file.

While compressing, each file name is written to stderr, prefixed with the percentage of files already processed.

The user can select the compression algorithm (none, gzip or bzip2) and the auto naming feature (-a) was implemented.

0.1a : 2005-01-22

Initial release. Can be used to create archives but has a severe bug: it silently ignores problems while creating the file list.

8   ToDo

  • Sort extensions by extension list instead of alphabetically.
  • Group backup files (*{~,.bck,.bak}) with their "master" files and sort just by name without extension.
  • Exclude list/patterns.
  • Option to add a prefix to every file.
  • Change attributes like uid/gid, uname/gname.
  • Color output with ANSI escape sequences if stderr is a tty.

9   Bugs

  • Silently ignores problems while creating the file list.
