
Wiktionary Idiom Classifier & Detector

Grace Muzny and Luke Zettlemoyer. Automatic Idiom Identification in Wiktionary. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2013.

The Classifier

The Detector


The project includes an Ant build file. From the WiktionaryIdioms directory, run ant or ant dist to build the distribution jar.

To build the runnable jars for classifier.experiments.RunClassifierExperimentFromFiles and detector.experiments.RunDetectorExperimentsFromFiles, run ant runnables. The jars are written to your WiktionaryIdioms/dist/ directory.
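The build steps above can be sketched as a short shell session. This is a minimal sketch, not part of the project: it assumes the Ant file uses Ant's default name, build.xml, and that the commands are run from the WiktionaryIdioms directory. The variable names are illustrative only.

```shell
# Run this from the WiktionaryIdioms directory.
# Assumption: the Ant file uses Ant's default name, build.xml.
BUILD_FILE="build.xml"
DIST_DIR="dist"

if command -v ant >/dev/null 2>&1 && [ -f "$BUILD_FILE" ]; then
  ant dist        # build the distribution jar (plain "ant" is listed as equivalent)
  ant runnables   # build the runnable experiment jars
  ls "$DIST_DIR"  # the runnable jars are written here
else
  echo "ant or $BUILD_FILE not found; run this from the WiktionaryIdioms directory"
fi
```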

Alternatively, the jar files are all available in the Downloads section.


Most of the classes with main methods come with descriptions of all the parameters they need to be passed. Here is a description of two key classes and how to run them.

All main classes will run with 4g of memory allocated (-Xmx4g), and most require at least this much.
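As a rough illustration of passing the memory setting, the sketch below launches one of the runnable jars with -Xmx4g. The jar file name is hypothetical (check dist/ for the actual name after running ant runnables), and the parameters each class expects are not listed here, so the script simply forwards whatever arguments it is given.

```shell
# Heap size named above: 4g is enough for every main class.
HEAP_FLAG="-Xmx4g"
# Hypothetical jar file name; check dist/ for the real one.
JAR="dist/RunClassifierExperimentFromFiles.jar"

if [ -f "$JAR" ]; then
  # Forward this script's arguments as the parameters the class expects.
  java "$HEAP_FLAG" -jar "$JAR" "$@"
else
  echo "No jar at $JAR; build it with 'ant runnables' or get it from Downloads"
fi
```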




To work on the project in Eclipse, download the WiktionaryIdioms project and import it.
