EasyXf: format-agnostic tabular dataset library

EasyXf is a format-agnostic tabular dataset library, written in Python.
+ It is based on the excellent Tablib dataset library from Kenneth Reitz.
+ The styling layer is written by Guido Draheim borrowing ideas from xlwt.

Output formats supported: | Plain Data Transfer Format | Document Formats with Styling Options |--------------------------------------------|------------------------------------------------- | CSV (Sets) .. comma-seperated values | XLS (Sets + Books) .. Excel 9X/2003 BIFF | SCSV (Sets) .. semicolon-seperated values | XLSX (Sets + Books) .. Excel 2007 XML/ZIP | TSV (Sets) .. tab-separated similar to CSV | ODS (Sets + Books) .. OpenOffice XML/ZIP | HTML (Sets + Books) .. plain html4 table | XHTML (Sets + Books) .. xhtml5 with full styling | JSON (Sets + Books) .. list or object | MARKDOWN (Sets + Books) .. only bold and dataformat
| YAML (Sets + Books) .. associative list | .

See also the full easyxf-tablib doxygen API documentation and tutorial


easyxf.Dataset() A Dataset is a table of tabular data. It may or may not have a header row. They can be build and manipulated as raw Python datatypes (Lists of tuples|dictionaries). Datasets can be imported from JSON, YAML, and CSV; they can be exported to XLSX, XLS, ODS, JSON, YAML, CSV, TSV, and HTML.

easyxf.Databook() A Databook is a set of Datasets. The most common form of a Databook is an Excel file with multiple spreadsheets. Databooks can be imported from JSON and YAML; they can be exported to XLSX, XLS, ODS, JSON, and YAML.


The main data type is a row which is just a plain list in Python. You can assemble a list of rows anywhere in your program. When ready you convert them to a Dataset with a description of the headers:

data = [
    ('John', 'Adams'),
    ('George', 'Washington')

data = easyxf.Dataset(*data, headers=('first_name', 'last_name'))

From there on you can add more rows by handing over tuples/lists via "append":

>>> data.append(('Henry', 'Ford'))

It is possible to extend with dataset with a new column as well:

>>> data.append_col((90, 67, 83), header='age')

Integers are used to select rows, including slices:

>>> print data[:2]
[('John', 'Adams', 90), ('George', 'Washington', 67)]

Strings are used to select columns that will be returned as a list as well:

>>> print data['first_name']
['John', 'George', 'Henry']

The indexing is not only limited for selection, you can also delete rows:

>>> del data[1]


There are many export/import options available which are plugged in from a formats/* list. Each Dataset and Databook will have a property member that can be used to read/write the binary data for the data format and it will be encoded/decoded correctly.

For example, if you have Dataset that you want to transfer as a json data block, just say ".json" (the getter property).

>>> print data.json
    "last_name": "Adams",
    "age": 90,
    "first_name": "John"
    "last_name": "Ford",
    "age": 83,
    "first_name": "Henry"

Converting to CSV (or SCVS, TSV) is about the same. It depends on whether you have been providing headers= so that a header row appears or not.

>>> print data.csv

It works the same for XLS,XLSX,ODT but you will not usually want to "print" it to the termanal - instead open a file in binary mode (important!) and write the converted data to the file:

>>> open('people.xls', 'wb').write(data.xls)

Likewise you can read most of the document format and transfer data formats back to a dataset by simply assigning to it (the setter property). So it does work to do this:

    >>> data = Dataset()
    data.xls = open('people.xls', 'rb').read()

It's that easy.

There are some gotchas with reading however, so for example the CSV reader assumes that a header line is present, and many readers will loose styling information. Adding styling is otherwise just as easy - the styling hints live in a list of strings parallel to the row data, so you can do this:

>>> import easyxf, datetime
data = easyxf.Dataset(headers=('first name', 'last name', 'born', 'books'))
data.append(("James", "Tobin", datetime.datetime(1919,3,5), 7))
data.rows[-1].style = ["font: bold true",]
data.set("born", format = "MM/DD/YYYY")
>>> print data.markdown
first name  |last name|born      |books
**James**   |Tobin    |03/05/1919|    7


You can fork the project for either tablib or easyxf repository with the changes coming back as pull requests or proper rpm-style patches.