What you need
- 2 GB RAM; 4 GB of disk space
- Java SE 7 or newer (the full JDK, not just the JRE)
- NetBeans 7.0 or Apache Ant
- The TnT tagger
1. Download and install the tools
Note: The tools can run on Linux and Mac; however, most testing and development was done on MS Windows. The following instructions assume you are using Windows.
- Get and install Java and Ant.
One way to do this is to install the Java SE JDK + NetBeans bundle (http://www.oracle.com/technetwork/java/javase/downloads/index.html).
- Make Apache Ant accessible by adding the java\ant\bin directory of your NetBeans installation to the PATH variable:
- Press Win+Break
- On Windows 7: click the Advanced system settings link on the left side.
On Windows XP: click the Advanced tab.
- Click the Environment variables button
- In User variables, find the variable PATH and click Edit.
If the variable does not exist, click New and enter PATH as the variable name.
- Add the path NetBeans dir\java\ant\bin (usually c:\Program Files\NetBeans 7.0\java\ant\bin) at the end of the Variable value field, using a semicolon as a separator.
Note: Ant can also be downloaded separately (http://ant.apache.org/manual/install.html); in that case, add Ant's bin directory to the path instead.
- Unzip the morph system into a directory whose path contains no spaces.
- Obtain the TnT tagger (http://www.coli.uni-saarland.de/~thorsten/tnt/; free but a license is required).
Place it in the morph system's bin directory (if you keep it elsewhere, set the tnt-para.bin and tnt.bin variables appropriately).
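Before moving on, it can help to confirm that the tools are reachable through the path. The following is a minimal sketch for a POSIX shell (Linux, Mac, or a Unix-style shell on Windows); check_tool is a hypothetical helper name, not part of the morph system. In a Windows Command Prompt, typing java -version, javac -version and ant -version in a newly opened window gives the same information, since changes to PATH only apply to windows opened after the change.

```shell
# check_tool is a hypothetical helper: it reports whether a command
# can be found through PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found"
  fi
}

# Tools required by the steps above (javac is part of the JDK, not the JRE).
for tool in java javac ant; do
  check_tool "$tool"
done
```

If javac is reported as not found while java is found, only the JRE is likely on the path.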
2. Add language projects
Download the language resources separately and follow the instructions that come with them.
In general:
- Edit the custom.properties file that comes with them so that it points to the morph system and, if needed, to additional resources (corpora) required by the language distribution.
- Run ant with the sys target to compile the resources; the output goes into the build/sys directory.
- Run the dev target to analyze/tag the development corpus; the results are in the build/dev directory. Similarly, the test target processes the test corpus and the any target processes a custom corpus; their results are in the build/test and build/any directories.
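The targets above can be run one after another. The following is a minimal sketch for a POSIX shell, assuming it is started from the language project directory (the one holding custom.properties) and that sys should be built before the corpora are processed. run_targets and the ANT variable are hypothetical names introduced here for illustration.

```shell
# ANT names the Ant launcher; override it (for example ANT="echo ant")
# to print the commands instead of running them.
ANT="${ANT:-ant}"

# run_targets is a hypothetical helper: it runs the given Ant targets
# in order and stops at the first one that fails.
run_targets() {
  for target in "$@"; do
    $ANT "$target" || return 1
  done
}

# Example: compile the resources, then process the development and test corpora.
# run_targets sys dev test
```

In a Windows Command Prompt the same effect is obtained by typing ant sys, then ant dev, then ant test. Ant also accepts several targets on one command line and runs them in the order given.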