.. _storages:

How to use the storage backends
===============================

By default, *Collectors* uses a simple Python :class:`list` for each series. You
can use other storage backends to handle very large amounts of data or to get a
simple *MS Excel* export. You can also add your own storage classes very easily.

All storage classes can be found in submodules of :mod:``
(e.g. :class:``) but you can also import
:class:`` and
:class:`` directly from

You must pass an instance of the storage as the keyword argument ``backend`` to a
new Collector. Each storage instance should only be used with one Collector
instance. ::

    from collectors import Collector
    from import MyStorage
    c = Collector(..., backend=MyStorage())


`PyTables <>`_ is not bundled with this package.
Installation instructions follow:

**Mac OS X (10.6 Snow Leopard)**

You should not use the precompiled version of *HDF5* because it’s linked against
*szip*, which is not bundled with *HDF5* and available under a license you might
not want. So you need to compile it yourself:

1. Download the source from
2. Build and install (*PyTables* will auto detect it if you install it under 

.. sourcecode:: bash

    $ ./configure --prefix=/usr/local
    $ make
    $ sudo make install

3. Finally install *PyTables*

.. sourcecode:: bash

    $ sudo pip install tables

**Ubuntu (9.10 Karmic Koala)**

Ubuntu’s package for PyTables is somehow broken, so you need to build your own.
If *gcc* is already installed, you just need to add the development files for
python and HDF5 before you can build and install PyTables from `PyPI

.. sourcecode:: bash

    $ sudo aptitude install python-dev libhdf5-serial-dev
    $ sudo pip install tables


Download the installer from `here <>`_
and execute it. Further information can be found in the `PyTables manual


    >>> import tables
    >>> from collectors import Collector, get, storage
    >>> class Spam(object):
    ...     a = 1
    ...     b = 2
    >>> spam = Spam()
    >>> h5file = tables.openFile('example.h5', mode='w')
    >>> collector = Collector(get(spam, 'a', 'b'),
    ...         backend=storage.PyTables(h5file, 'spamgroup', ('int', 'int'))
    ... )
    >>> for values in zip(range(10), reversed(range(10))):
    ...     spam.a, spam.b = values
    ...     collector()
    >>> print,
    [0 1 2 3 4 5 6 7 8 9] [9 8 7 6 5 4 3 2 1 0]
    >>> print,
    4.5 9
    >>> h5file.close()

The :class:`` storage stores the results for multiple Collector instances in one file. For each Collector, a new group is created, and each observed variable gets its own array within that group, so the group name must be unique among all collectors that use the same HDF5 file.

You also need to specify the data types of the observed variables. They must be passed as a list of strings (e.g. ``'int'``, ``'float'`` or ``'string'``; see `here <>`_ for more details).

The *series* are now `EArrays <>`_ instead of simple lists. The ``read`` function returns the complete series for that variable as a NumPy array.
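
The layout described above (one group per Collector, one extendable array per observed variable) can be sketched with plain *PyTables* calls. This is only an illustration, not the code *Collectors* runs internally: the helper ``make_series`` is hypothetical, the sketch assumes the PyTables 3.x method names (``open_file``, ``create_group``, ``create_earray``), and it assumes the type strings map to atoms via ``tables.Atom.from_kind``.

```python
import os
import tempfile

import tables  # assumes PyTables 3.x method names


def make_series(h5file, group_name, names, kinds):
    """Hypothetical helper: create one group with one EArray per variable."""
    # create_group raises tables.NodeError if the group name is already taken.
    group = h5file.create_group('/', group_name)
    return [
        # shape=(0,) makes the first dimension the extendable one.
        h5file.create_earray(group, name,
                             atom=tables.Atom.from_kind(kind), shape=(0,))
        for name, kind in zip(names, kinds)
    ]


path = os.path.join(tempfile.mkdtemp(), 'sketch.h5')
h5file = tables.open_file(path, mode='w')

a, b = make_series(h5file, 'spamgroup', ('a', 'b'), ('int', 'int'))
for x, y in zip(range(10), reversed(range(10))):
    a.append([x])  # each append extends the series by one value
    b.append([y])

a_values = a.read()  # the complete series as a NumPy array
b_values = b.read()

# A second collector reusing the same group name in the same file fails.
try:
    make_series(h5file, 'spamgroup', ('a',), ('int',))
    duplicate_rejected = False
except tables.NodeError:
    duplicate_rejected = True

h5file.close()
```

Because ``read`` returns NumPy arrays, the usual array methods such as ``a_values.mean()`` or ``b_values.max()`` are available on the result. The rejected second call shows why each Collector needs its own group name when several of them share one HDF5 file.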