GCIDE to SQL

Generate HTML-useful entries from the Collaborative International Dictionary of English (GCIDE) project — including Webster's Revised Unabridged Dictionary (1913 + 1828).

This bash script takes the raw ASCII GCIDE files, updates the non-standard encoded characters to UTF-8, modifies some tags, and then outputs the entries into SQL (PostgreSQL).
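The three steps described above might look roughly like the sketch below. It is not taken from build.sh: the entity list is a small illustrative sample, the `<hw>` to `<strong>` mapping is a guess at what "modifies some tags" could mean, and the table and column names (`entries`, `word`, `definition`) are hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real build.sh may work differently.

normalize_markup() {
  # Step 1: swap a few GCIDE-style entities for UTF-8 characters (sample list).
  # Step 2: rewrite a tag into an HTML-friendly one (hypothetical mapping).
  sed -e 's|<eacute/|é|g' \
      -e 's|<ae/|æ|g' \
      -e 's|<hw>|<strong>|g' \
      -e 's|</hw>|</strong>|g'
}

sql_escape() {
  # Double single quotes so the text is safe inside a SQL string literal.
  sed "s/'/''/g"
}

make_insert() {
  # Step 3: emit one INSERT statement. Table and column names are assumptions.
  local word definition
  word=$(printf '%s' "$1" | sql_escape)
  definition=$(printf '%s' "$2" | normalize_markup | sql_escape)
  printf "INSERT INTO entries (word, definition) VALUES ('%s', '%s');\n" \
    "$word" "$definition"
}
```

Calling `make_insert "cafe" "A caf<eacute/ isn't a bar"` would print a single INSERT statement with the entity converted and the apostrophe doubled.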

Build

First clone the GCIDE repo into the project folder:

git clone git://git.savannah.gnu.org/gcide.git

Then run the build file:

./build.sh

Once the build finishes, import the generated SQL into your Postgres database:

psql -h localhost -U postgres avuncular < CIDE.A-Z.sql

If you get an error about file encoding, run iconv -f utf-8 -t utf-8 -c CIDE.A-Z.sql > CIDE.A-Z.utf8.sql to drop any bytes that are not valid UTF-8, then import CIDE.A-Z.utf8.sql instead.
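To find out whether the file needs cleaning before you import it, a check along these lines may help. This is a sketch, not part of the project; the function name `is_valid_utf8` is made up here, and it relies on iconv exiting with a non-zero status when it meets an invalid byte sequence.

```shell
#!/usr/bin/env bash
# Sketch: report whether a file is valid UTF-8 without modifying it.

is_valid_utf8() {
  # Converting UTF-8 to UTF-8 without -c fails on the first invalid sequence.
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example usage:
#   if is_valid_utf8 CIDE.A-Z.sql; then
#     echo "encoding looks fine"
#   else
#     echo "invalid bytes found, run the iconv -c command first"
#   fi
```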

There is also a script that runs all the definitions through espeak to generate a pronunciation for each. It produces a second SQL file, espeak.sql, which can be run on the database created by the build script. Run it only after CIDE.A-Z.sql has been imported, because espeak.sql runs UPDATE commands based on the definition ids.

./espeak.sh
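As a rough picture of the kind of statement espeak.sql might contain, the sketch below builds one UPDATE per definition id. The table and column names (`entries`, `pronunciation`, `id`) are hypothetical and not taken from espeak.sh; the commented usage line assumes espeak is installed and uses its `-q` (no audio) and `--ipa` (print IPA phonemes) options.

```shell
#!/usr/bin/env bash
# Sketch only; espeak.sh itself may produce different SQL.

make_update() {
  # $1 = definition id, $2 = pronunciation string.
  local id=$1 pron
  # Double single quotes so the value is safe inside a SQL string literal.
  pron=$(printf '%s' "$2" | sed "s/'/''/g")
  printf "UPDATE entries SET pronunciation = '%s' WHERE id = %d;\n" "$pron" "$id"
}

# Example usage (requires espeak):
#   make_update 1 "$(espeak -q --ipa "dictionary")"
```

Because each statement targets an existing id, this also shows why the rows must already be in the database before the file is run.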

License

As with the GCIDE project, this project is licensed under the GNU General Public License as published by the Free Software Foundation; you can redistribute it and/or modify it under the terms of the GPL.

Credits

The lookup file is partially based on a table from WebsterParser.