exportshapes is a tool for extracting shape libraries from DjVu documents
into a MySQL database. 

It is made available by the Formal Linguistics Department
of the University of Warsaw. It has been implemented by Piotr Sikora
to facilitate the development of tools for manipulation of DjVu shape
libraries (see https://bitbucket.org/piotr_sikora/moredjvushapetools).

The work is supported by the Ministry of Science and Higher Education's 
grant no. N N519 384036 (cf. https://bitbucket.org/jsbien/ndt). 


exportshapes shares all the requirements of DjVuLibre (see the INSTALL file), plus:

	* Boost C++ library


Currently it needs to be compiled as part of a fork of DjVuLibre.
To compile it, type the following in the main folder:

make depend
cd tools
make exportshapes


To use exportshapes, you need to provide it with a database name, host name, username, password and
a filename to process. This will extract all shapes from the given document.

The user account used by exportshapes needs the CREATE, DROP, INSERT and SELECT privileges on the given database.

./exportshapes [-i | -c | -o | -a | -l] [-f <page #>] [-t <page #>] [-a <document_address>] -u <username> -p <password> -h <host> -d <database> <file.djvu>

Options -u, -p, -h, -d supply required parameters.

Option -i injects the required tables into the given database.

Option -c creates a database of the given name and then injects required tables as the previous option does.

Option -o does the same as -c, but also drops the database first, if it exists already. Good for testing.

Option -f: its argument specifies a page number from which the processing should start.

Option -t: its argument specifies the page number at which the processing should stop (inclusive).

Option -a: its argument specifies a remote address associated with the document 
	   (to allow remote users of the database to open the document via e.g. http).

Option -l: only process data linking document pages with inherited dictionaries. Previous versions
		   of this tool didn't do so; running exportshapes with this option fixes this shortcoming
		   while remaining backwards compatible with an already filled database.

E.g. "exportshapes -f 5 -t 10 [...]" would export shapes from pages 5, 6, 7, 8, 9 and 10.
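
For scripted runs, the command line described above can be assembled programmatically.
The following is a minimal Python sketch, not part of exportshapes itself: the helper name
build_exportshapes_args and its parameter names are hypothetical, and it only uses the
options documented in this file (-i, -c, -o, -l, -f, -t, -a, -u, -p, -h, -d).

```python
import shlex


def build_exportshapes_args(database, host, username, password, djvu_file,
                            mode=None, first_page=None, last_page=None,
                            address=None, binary="./exportshapes"):
    """Return the argument list for one exportshapes run (hypothetical helper)."""
    # The mode options are alternatives, so at most one of them is accepted here.
    if mode not in (None, "-i", "-c", "-o", "-l"):
        raise ValueError("mode must be one of -i, -c, -o, -l or None")
    args = [binary]
    if mode is not None:
        args.append(mode)
    if first_page is not None:
        args += ["-f", str(first_page)]   # page to start from
    if last_page is not None:
        args += ["-t", str(last_page)]    # page to stop at, as in the example above
    if address is not None:
        args += ["-a", address]           # remote address associated with the document
    # Required connection parameters, followed by the file to process.
    args += ["-u", username, "-p", password, "-h", host, "-d", database, djvu_file]
    return args


# Placeholder values only; none of these names come from a real setup.
example = build_exportshapes_args("shapes_db", "localhost", "djvu_user", "secret",
                                  "book.djvu", mode="-c", first_page=5, last_page=10)
print(shlex.join(example))
```

The resulting list can be passed to subprocess.run(example, check=True). Passing a list
rather than a single string avoids shell quoting problems with passwords and file names.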


TABLE shapes:
id INT not null auto_increment primary key,
original_id INT not null, -- index in the page dictionary the shape comes from
parent_id INT not null, -- references other shapes
width INT,
height INT,
dictionary_id INT not null, -- references table of dictionaries
bbox_top INT, bbox_left INT, bbox_right INT, bbox_bottom INT -- bounding box

TABLE blits: -- shapes' locations in document
id INT not null auto_increment primary key,
document_id INT not null, -- references table of documents
page_number INT not null,
shape_id INT not null, -- references table of shapes
b_left SMALLINT UNSIGNED not null,
b_bottom SMALLINT UNSIGNED not null

TABLE documents: -- documents stored in the database
id INT not null auto_increment primary key,
document varchar(60) not null,
document_address varchar(100) not null

TABLE dictionaries: -- dictionaries stored in the database
id INT not null auto_increment primary key,
dictionary_name varchar(60) not null,
page_number INT not null, -- which page this dictionary belongs to; inherited dictionaries have -1 in this field
document_id INT not null -- references table of documents

TABLE pages: -- linking inherited dictionaries with pages that use them
document_id INT not null, -- which document this entry refers to
inh_dict_id INT not null, -- which inherited dictionary this entry refers to
page_number INT not null -- which page uses that dictionary
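
To illustrate how the tables relate, the sketch below counts how often each shape is placed
on a given page by joining blits with shapes. It is only an illustration under stated
assumptions: it uses Python's built-in sqlite3 module with an in-memory database instead of
MySQL (so auto_increment becomes INTEGER PRIMARY KEY), the rows are made-up sample data, the
value -1 for parent_id is a placeholder, and the function name shape_usage is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite approximation of the shapes and blits tables listed above.
conn.executescript("""
CREATE TABLE shapes (
    id INTEGER PRIMARY KEY,
    original_id INT NOT NULL,
    parent_id INT NOT NULL,
    width INT,
    height INT,
    dictionary_id INT NOT NULL,
    bbox_top INT, bbox_left INT, bbox_right INT, bbox_bottom INT
);
CREATE TABLE blits (
    id INTEGER PRIMARY KEY,
    document_id INT NOT NULL,
    page_number INT NOT NULL,
    shape_id INT NOT NULL,
    b_left INT NOT NULL,
    b_bottom INT NOT NULL
);
""")

# Made-up sample data: two shapes, four placements in document 1.
conn.executemany(
    "INSERT INTO shapes VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [(1, 0, -1, 10, 12, 1, 0, 0, 10, 12),
     (2, 1, -1, 8, 8, 1, 0, 0, 8, 8)],
)
conn.executemany(
    "INSERT INTO blits (document_id, page_number, shape_id, b_left, b_bottom) "
    "VALUES (?, ?, ?, ?, ?)",
    [(1, 5, 1, 100, 200), (1, 5, 1, 150, 200), (1, 5, 2, 300, 400),
     (1, 6, 1, 100, 200)],
)


def shape_usage(conn, document_id, page_number):
    """Return (shape id, width, height, number of blits) rows for one page."""
    cursor = conn.execute(
        """
        SELECT s.id, s.width, s.height, COUNT(b.id) AS uses
        FROM blits b
        JOIN shapes s ON s.id = b.shape_id
        WHERE b.document_id = ? AND b.page_number = ?
        GROUP BY s.id, s.width, s.height
        ORDER BY uses DESC, s.id
        """,
        (document_id, page_number),
    )
    return cursor.fetchall()


for row in shape_usage(conn, 1, 5):
    print(row)
```

The SELECT statement itself uses only standard SQL, so it should carry over to the MySQL
database filled by exportshapes with a different placeholder style in the client library.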