Wiktionary Idiom Classifier & Detector

Grace Muzny and Luke Zettlemoyer. Automatic Idiom Identification in Wiktionary. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2013.

The Classifier

The Detector

Building

The project includes an Ant build file. From the WiktionaryIdioms directory, run ant or ant dist to build the distribution jar.

To build the runnable jars for classifier.experiments.RunClassifierExperimentFromFiles and detector.experiments.RunDetectorExperimentsFromFiles, run ant runnables. The jars are written to your WiktionaryIdioms/dist/ directory.

Alternatively, the jar files are all available in the Downloads section.

Running

Most of the classes with main methods come with descriptions of all the parameters they need to be passed. Here is a description of two key classes and how to run them.

All main classes will run with 4g of heap memory allocated (-Xmx4g), and most require at least this much.
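As a rough illustration of the memory setting above, the sketch below assembles a java command line that passes -Xmx4g to a runnable jar. The class LaunchSketch, its command method, and the jar path dist/example.jar are hypothetical names used only for this example; substitute the jar that ant runnables actually produces in your dist/ directory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of WiktionaryIdioms: builds the argument
// list for launching a runnable jar with a 4g heap.
class LaunchSketch {

    // Returns the tokens of: java -Xmx4g -jar <jarPath> <programArgs...>
    static List<String> command(String jarPath, String... programArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xmx4g");   // heap size recommended by this README
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.addAll(Arrays.asList(programArgs));
        return cmd;
    }

    public static void main(String[] args) {
        // "dist/example.jar" is a placeholder, not a real artifact name.
        List<String> cmd = command("dist/example.jar", args);
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list could be handed to a java.lang.ProcessBuilder to start the process, or the printed line can simply be copied into a terminal.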

RunClassifierExperimentFromFiles

This experiment takes two arguments from the command line:

  • <type> : The type of classifier experiment to run. Choose from "basic", "grid", "compare", or "comparegroups".

  • <config_path> : The path to config/classifierconfig.xml. Different fields are read from the config file depending on the type of experiment you are running.
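The two arguments above could be checked before an experiment starts, along the lines of the following sketch. This is not the project's actual argument handling: the class ClassifierArgsSketch and its validate method are hypothetical, and the only details taken from this README are the four experiment types and the fact that the second argument is a path to a config file.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of WiktionaryIdioms: validates the
// <type> and <config_path> arguments described in this README.
class ClassifierArgsSketch {

    // The four experiment types listed above.
    static final List<String> TYPES =
            Arrays.asList("basic", "grid", "compare", "comparegroups");

    // Returns null when the arguments look valid, otherwise an error message.
    static String validate(String[] args) {
        if (args.length != 2) {
            return "usage: <type> <config_path>";
        }
        if (!TYPES.contains(args[0])) {
            return "unknown type '" + args[0] + "', expected one of " + TYPES;
        }
        Path config = Paths.get(args[1]);
        if (!Files.isRegularFile(config)) {
            return "config file not found: " + config;
        }
        return null;
    }

    public static void main(String[] args) {
        String error = validate(args);
        System.out.println(error == null ? "arguments look valid" : error);
    }
}
```

Checking the type against a fixed list up front gives a clear message for a mistyped experiment name instead of a failure later in the run.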

Data

Eclipse

To work on the project in Eclipse, download the WiktionaryIdioms project and import it.
