Bagira

This is a simple gem that runs Ghostscript and Tesseract OCR from the command line to perform OCR on images and PDF documents. It currently supports JPG, PNG, and PDF files.

Installation

Before using the gem, make sure that both Ghostscript and Tesseract OCR are installed.

On Ubuntu:

sudo apt-get install libgs-dev
sudo apt-get install tesseract-ocr

On Mac (Homebrew):

brew install gs
brew install tesseract
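
Because the gem shells out to both tools, it can help to confirm they are reachable on your PATH before going further. The following is a minimal sketch, not part of the gem: the helper names find_executable and missing_tools are hypothetical, and it assumes the executables are named gs and tesseract, which is the usual case on Linux and macOS.

```ruby
# Return the full path of an executable found in the given PATH string, or nil.
# (Hypothetical helper, not provided by the bagira gem.)
def find_executable(name, path_env = ENV.fetch("PATH", ""))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Return the names of the tools that could not be found.
# Assumes the binaries are called "gs" and "tesseract".
def missing_tools(tools = %w[gs tesseract], path_env = ENV.fetch("PATH", ""))
  tools.reject { |tool| find_executable(tool, path_env) }
end

missing = missing_tools
if missing.empty?
  puts "Ghostscript and Tesseract were found on PATH"
else
  warn "Missing command-line tools: #{missing.join(', ')}"
end
```

If either tool is reported missing, install it with the commands above and run the check again.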

Add this line to your application's Gemfile:

gem 'bagira'

And then execute:

$ bundle

Or install it yourself as:

$ gem install bagira

Usage

bagira = Bagira.new(path_to_your_file)
output = bagira.perform_ocr
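
A small wrapper can catch obvious input problems before the external tools are invoked. The sketch below is an illustration only: run_ocr and its supported: parameter are hypothetical names, the extension list mirrors the formats named in the introduction, and it assumes the file type can be judged from the extension. It relies only on the Bagira.new(path) and perform_ocr calls shown above, and it makes no assumption about what perform_ocr returns.

```ruby
# Validate the input file, then delegate to the given OCR class.
# `ocr_class` must respond to .new(path) and #perform_ocr, the interface
# shown in the Usage snippet. (run_ocr is a hypothetical helper.)
def run_ocr(path, ocr_class, supported: %w[.jpg .png .pdf])
  raise ArgumentError, "File not found: #{path}" unless File.file?(path)

  # Compare extensions case-insensitively, so "scan.PDF" is accepted.
  ext = File.extname(path).downcase
  unless supported.include?(ext)
    raise ArgumentError, "Unsupported file type: #{ext.inspect}"
  end

  ocr_class.new(path).perform_ocr
end

# With the gem installed, usage would look like:
#   require "bagira"
#   output = run_ocr("scan.pdf", Bagira)
```

Passing the class as an argument keeps the helper independent of the gem, so it can be exercised in tests with a stand-in class.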

Development

RSpec is included with basic tests; see the spec/ folder for details.

Contributing

Bug reports and pull requests are welcome on Bitbucket at https://bitbucket.org/juanmvallejo/bagira/issues

License

The gem is available as open source under the terms of the MIT License.
