The code expects the following toolkits to be installed:
- OpenCV (http://opencv.org), version 2 (although version 3 might also work)
The code depends on a few C-programs in the toolbox, created by Arnold Meijster. These only need to be compiled in place, which can be done by entering the
toolkit folder and running
```bash
cd toolkit
make
```

# Usage

The program is shipped with a trained recognizer, which is stored in `state.p`. So training is not necessary!

## Recognizing a text in a word file

To determine all characters and text for each word in a specific word file and accompanying page image, use:

```bash
./recognizer.py image.jpg word.xml output.xml
```
That's it! It should also be able to handle other image formats, such as pgm.
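
To process several pages in one go, the command can be wrapped in a small script. The following is only a sketch and not part of the repository: the helper names `build_command` and `recognize_pages` are hypothetical, and it assumes that each page image `NAME.jpg` has its word file next to it as `NAME.xml`, which may not match your data layout.

```python
import subprocess
from pathlib import Path


def build_command(image, words, output, script="./recognizer.py"):
    """Return the argument list for one run: image, word file, output file."""
    return [script, str(image), str(words), str(output)]


def recognize_pages(image_dir, output_dir, dry_run=False):
    """Run the recognizer on every .jpg in image_dir that has a matching .xml.

    Assumed (hypothetical) layout: page.jpg and page.xml sit side by side.
    Returns the list of commands that were (or, with dry_run, would be) run.
    """
    output_dir = Path(output_dir)
    commands = []
    for image in sorted(Path(image_dir).glob("*.jpg")):
        words = image.with_suffix(".xml")
        if not words.exists():
            continue  # skip images that have no word file
        commands.append(build_command(image, words, output_dir / words.name))
    if not dry_run:
        output_dir.mkdir(parents=True, exist_ok=True)
        for command in commands:
            # check=True stops at the first page the recognizer fails on
            subprocess.run(command, check=True)
    return commands
```

Calling `recognize_pages("pages", "results", dry_run=True)` only returns the commands, which is a cheap way to check the file pairing before starting a long run.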
## Training on a new dataset
To train the recognizer on a new dataset, or test it on a part of the dataset, a few steps are necessary.
The paths to the dataset are coded in `dataset.py`. Here, the function `default_dataset()` specifies which datasets to load. The default implementation uses the datasets included in this repository.
When you run `recognizer.py`, it will by default only test on 20% of the dataset. To force the recognizer to retrain itself, remove the `state.p` file. Then, running the recognizer without arguments will cause it to retrain and test itself again.
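
If you want to keep the shipped recognizer around while retraining, the state file can be moved aside instead of deleted. The sketch below is an assumption about a convenient workflow, not something the repository provides: the function name `set_aside_state` and the `.bak` suffix are hypothetical.

```python
from pathlib import Path


def set_aside_state(state_file="state.p", backup_suffix=".bak"):
    """Rename the stored recognizer state so the next run has to retrain.

    Returns the path of the backup, or None if there was no state file.
    An existing backup with the same name is overwritten.
    """
    path = Path(state_file)
    if not path.exists():
        return None
    backup = path.with_name(path.name + backup_suffix)
    path.replace(backup)  # afterwards state.p no longer exists
    return backup
```

After calling `set_aside_state()` in the repository folder, running `./recognizer.py` without arguments should retrain as described above. Renaming the backup to `state.p` again restores the original recognizer.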
The repository contains a number of tools:
- `test_average_recognizer.py` is a test case for using the same window size for every instance of the same character.
- `test_bounding_box.py` is a test case for determining the bounding box of a character by using connected components.
- `test_mask.py` is a test case which handles binarization using Otsu and opening, which is then used to mask the image partially.
- `test_ngrams.py` tests whether certain n-grams occur more often than others, and whether it would be worthwhile to train on n-grams in addition to single characters.
- `test_s_vs_f.py` compares the occurrences of f and s in the dataset.
- `test_widths.py` plots the distribution of widths per character class.
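
The kind of count that `test_ngrams.py` is described as performing can be illustrated with a few lines of standard-library Python. This is a rough sketch, not the repository's code: the function name `count_ngrams` and the sample words are made up for illustration, and in practice the words would come from the dataset.

```python
from collections import Counter


def count_ngrams(words, n):
    """Count every run of n consecutive characters within each word."""
    counts = Counter()
    for word in words:
        # A word shorter than n yields an empty range and contributes nothing.
        for start in range(len(word) - n + 1):
            counts[word[start:start + n]] += 1
    return counts


# Hypothetical sample words, only for illustration.
sample_words = ["the", "then", "other"]
bigrams = count_ngrams(sample_words, 2)
print(bigrams.most_common(3))
```

If a handful of n-grams dominate the counts, that would be an argument for training on them as units; a flat distribution would suggest single characters are enough.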
There are also a few interactive tools, implemented as micro web services using Flask (http://flask.pocoo.org).
- `webservice.py` is a tool to explore the dataset.
- `webservice_s_vs_f.py` is a very basic program to convert certain s-shaped f's to actual f's.
- `webservicerecognitiontest.py` allows you to inspect the files that can be dumped by the recognizer, which encode not only the final solution, but also all other candidates, and how they were classified. To create files that can be read by this service, set