About Parsimonious Vole
Systemic Functional Linguistics (SFL) provides a semiotic perspective on language. The text analysis it describes can be of critical value in real-world applications. However, parsing with SFL grammars is a computationally intensive task, and parsers for this level of description have so far not been able to operate on unrestricted input. This software implements a graph-based method that automatically generates simplified SFL mood and transitivity parses of English sentences from Stanford Dependency parses and a database (PTDB) that provides transitivity categories for each verb.
Citing the Parser
The main technical ideas behind this parser appear in the papers below. Feel free to cite one or more of them, depending on what you are using.
- Costetchi, E. (2013). A method to generate simplified Systemic Functional Parses from Dependency Parses. DepLing 2013, 68.
- Costetchi, E. (2013). Semantic role labelling as SFL transitivity analysis. ESSLLI Student Session 2013 Preproceedings, 29.
- Costetchi, E. (2013). Towards a Discourse Model for Knowledge Elicitation. In RANLP (pp. 38-44).
Parsimonious Vole is system independent and should run anywhere Python runs. However, it has only been tested on Linux, and the installation instructions below are written for Linux.
- Install some prerequisites
- Get Stanford Parser
- Add the STANFORD_PARSER_HOME environment variable (on Linux, add it to .bashrc)
- Install the parser
sudo apt-get update; sudo apt-get -y install openjdk-8-jdk git unzip python-pip;
Get Stanford Parser
Download Stanford Parser and unzip it into a preferred folder.
I recommend using Stanford Parser 3.4, the version used during development, because the latest version has not yet been tested. To download and unzip it into the ~/third-party folder, run the commands below.
mkdir ~/third-party
cd ~/third-party
wget http://nlp.stanford.edu/software/stanford-parser-full-2014-06-16.zip
unzip stanford-parser-full-2014-06-16.zip
To get the latest version (3.7) execute this code:
mkdir ~/third-party
cd ~/third-party
wget http://nlp.stanford.edu/software/stanford-parser-full-2016-10-31.zip
unzip stanford-parser-full-2016-10-31.zip
The environment variable STANFORD_PARSER_HOME
Because Parsimonious Vole depends on the Stanford Parser, the parser has to be available somewhere on the system, and the path to it MUST be exposed as an environment variable.
Every time you run Parsimonious Vole, it looks for STANFORD_PARSER_HOME.
I recommend adding the environment variable permanently to .bashrc. To do so, open ~/.bashrc in a text editor.
Scroll to the end of the file and add the following lines (changing the path to match your installation):
export STANFORD_PARSER_HOME=~/third-party/stanford-parser-full-2014-06-16
export STANFORD_CORENLP=~/third-party/stanford-parser-full-2014-06-16
export STANFORD_MODELS=~/third-party/stanford-parser-full-2014-06-16
Get the parser source code
git clone https://bitbucket.org/lps/parsimonious-vole
Install the parser
To install the parser, go to the source folder and run pip install from source.
cd parsimonious-vole
pip install .
This will install the Parsimonious Vole Python module (for programmatic usage) and create the "vole" command line tool.
Download the "punkt" model using the NLTK download UI
Execute the following command. It will open a GUI from NLTK that allows selection and download of various corpora, models, grammars and other NLTK goodies. Go to the "Models" tab and mark punkt for download.
python -c "import nltk; nltk.download()";
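On a machine without a display the download GUI may not be usable. The following is a minimal sketch of a non-interactive alternative, assuming NLTK is already installed and that the model is listed under the identifier `punkt`. The helper name `download_punkt` is made up for illustration and is not part of Parsimonious Vole.

```python
PUNKT_MODEL = "punkt"  # NLTK identifier of the sentence tokenizer model


def download_punkt():
    """Download the punkt model without opening the NLTK GUI.

    Returns whatever nltk.download reports (True on success).
    """
    # Imported here so that this file can be loaded even when NLTK
    # is not installed yet.
    import nltk

    return nltk.download(PUNKT_MODEL)
```

Calling `download_punkt()` once after installation should have the same effect as marking the model in the GUI.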
Test the installation
Check if the environment variable is set, for example by printing it with echo $STANFORD_PARSER_HOME.
This is expected to print the path to the folder with the Stanford Parser.
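The environment variable check can also be scripted. The sketch below is one possible way to do it, not code taken from Parsimonious Vole; the function name `check_parser_home` is hypothetical. It only verifies that the variable is set and points to an existing folder, and it does not look inside that folder.

```python
import os


def check_parser_home(env=None, name="STANFORD_PARSER_HOME"):
    """Return the folder named by the environment variable.

    Raises RuntimeError if the variable is unset or empty, or if it
    does not point to an existing directory.
    """
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError("{} is not set".format(name))
    path = os.path.expanduser(value)  # resolve a leading ~
    if not os.path.isdir(path):
        raise RuntimeError(
            "{} points to a missing folder: {}".format(name, path)
        )
    return path
```

Passing a plain dictionary as `env` makes the function easy to try out without touching the real environment.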
Check if the Python module is installed.
python -c "import magic.guess";
Expect no output. If the module is not installed, an error is thrown.
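If the import check needs to run inside a larger setup script, it may be more convenient to get a yes/no answer than an exception. A minimal sketch, with `module_available` as an illustrative helper name:

```python
import importlib


def module_available(name):
    """Return True if the named module can be imported, else False."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True
```

For the check above this would be called as `module_available("magic.guess")`.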
Check if the command line works.
echo "Someone went outside but I don't exactly know who." > input.txt
vole parse input.txt
The first command creates a text file containing a single sentence. This file is then passed as the input parameter to the "parse" command of the vole command line tool. Expect some debug information to be printed, a file input.txt.stp to be created containing the dependency parse generated by the Stanford Parser, and a folder input.txt.html to be created containing the HTML representation of the SFL parse.
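Whether the expected outputs were produced can be verified with a few lines of Python. This sketch assumes only the naming described above (the input file name with .stp and .html appended); the function names and dictionary keys are made up for illustration.

```python
import os


def expected_outputs(input_path):
    """Return the paths the parse command is expected to create."""
    return {
        "dependency_parse": input_path + ".stp",  # Stanford dependency parse
        "html_folder": input_path + ".html",      # HTML view of the SFL parse
    }


def missing_outputs(input_path):
    """Return the keys of expected outputs that do not exist on disk."""
    outputs = expected_outputs(input_path)
    missing = []
    if not os.path.isfile(outputs["dependency_parse"]):
        missing.append("dependency_parse")
    if not os.path.isdir(outputs["html_folder"]):
        missing.append("html_folder")
    return missing
```

After running the parse command on input.txt, `missing_outputs("input.txt")` returning an empty list suggests that both outputs were created.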
Command line interface
When Parsimonious Vole is installed, it makes the "vole" command line tool available. Currently it offers only one command: parse.
It takes a mandatory input file and optionally, depending on the format, an output file.
For the HTML format no output file is expected. If one is given, it is ignored, because the location of the parse result is derived from the input file.
For the "JSON", "ADJ_LIST", "YAML" and "PICKLE" formats, the output file is mandatory.
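The format rules can be summarised in a small validation function. This is only a sketch of the rules as stated here, not the tool's own argument handling. The names `check_parse_arguments`, `FORMATS_REQUIRING_OUTPUT` and `FORMATS_IGNORING_OUTPUT` are hypothetical, and how formats are actually spelled on the command line should be taken from the help output.

```python
FORMATS_REQUIRING_OUTPUT = {"JSON", "ADJ_LIST", "YAML", "PICKLE"}
FORMATS_IGNORING_OUTPUT = {"HTML"}


def check_parse_arguments(fmt, output_file=None):
    """Validate a format/output pair.

    Returns the output file that would be used, or None when the
    format ignores it. Raises ValueError for an unknown format or a
    missing mandatory output file.
    """
    fmt = fmt.upper()
    if fmt in FORMATS_IGNORING_OUTPUT:
        return None  # result location is derived from the input file
    if fmt in FORMATS_REQUIRING_OUTPUT:
        if not output_file:
            raise ValueError(
                "format {} requires an output file".format(fmt)
            )
        return output_file
    raise ValueError("unknown format: {}".format(fmt))
```

A wrapper script could call this before invoking the command line tool, so that a missing output file is reported early.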
For usage information, try:
vole --help
vole parse --help