About Parsimonious Vole

Systemic Functional Linguistics (SFL) provides a semiotic perspective on language. The text analysis described in SFL can be of critical value in real-world applications. However, parsing with SFL grammars is a computationally intensive task, and parsers for this level of description have so far not been able to operate on unrestricted input. This software implements a graph-based method to automatically generate simplified SFL mood and transitivity parses of English sentences from Stanford Dependency parses and a database (PTDB) providing transitivity categories for each verb.

Citing the Parser

The main technical ideas behind this parser appear in the papers below. Feel free to cite one or more of them, depending on what you are using.

  • Costetchi, E. (2013). A method to generate simplified Systemic Functional Parses from Dependency Parses. DepLing 2013, 68.
  • Costetchi, E. (2013). Semantic role labelling as SFL transitivity analysis. ESSLLI Student Session 2013 Preproceedings, 29.
  • Costetchi, E. (2013). Towards a Discourse Model for Knowledge Elicitation. In RANLP (pp. 38-44).

Installation

Parsimonious Vole is system independent and should run anywhere Python runs. However, it has been tested on Linux, and the installation instructions below are written for the Linux platform.

  • Install some prerequisites
  • Get Stanford Parser
  • Add the STANFORD_PARSER_HOME environment variable (on Linux, add it to .bashrc)
  • Install the parser
  • Test

Prerequisites

sudo apt-get update;
sudo apt-get -y install openjdk-8-jdk git unzip python-pip;

Get Stanford Parser

Download Stanford Parser and unzip it into a preferred folder.

I recommend using the development version of Stanford Parser 3.4, because the latest version has not yet been tested. To download and unzip it into the ~/third-party folder, run the code below.

mkdir -p ~/third-party
cd ~/third-party
wget http://nlp.stanford.edu/software/stanford-parser-full-2014-06-16.zip
unzip stanford-parser-full-2014-06-16.zip

To get the latest version (3.7) execute this code:

mkdir -p ~/third-party
cd ~/third-party
wget http://nlp.stanford.edu/software/stanford-parser-full-2016-10-31.zip
unzip stanford-parser-full-2016-10-31.zip

The environment variable STANFORD_PARSER_HOME

Because Parsimonious Vole depends on the Stanford Parser, the parser has to be available somewhere on the system, and the path to it MUST be exposed as an environment variable.

Every time you run the parser, it looks for STANFORD_PARSER_HOME.

I recommend adding the environment variable permanently to .bashrc. To do so, open the file:

gedit ~/.bashrc

Scroll to the end of the file and add the following lines (change the path to match the folder where you unzipped Stanford Parser):

export STANFORD_PARSER_HOME=~/third-party/stanford-parser-full-2014-06-16
export STANFORD_CORENLP=~/third-party/stanford-parser-full-2014-06-16
export STANFORD_MODELS=~/third-party/stanford-parser-full-2014-06-16
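
If editing .bashrc is not convenient (for example inside a script), the same three variables can be set for a single process instead. The Python sketch below is only an illustration: the helper name stanford_env is hypothetical and not part of Parsimonious Vole, and it assumes, as in the export lines above, that all three variables point to the same unzipped folder.

```python
import os


def stanford_env(parser_dir, base_env=None):
    """Return a copy of the environment with the Stanford variables set.

    Hypothetical helper, not part of Parsimonious Vole. All three
    variables point to the same folder, as in the export lines above.
    """
    env = dict(os.environ if base_env is None else base_env)
    path = os.path.abspath(os.path.expanduser(parser_dir))
    for name in ("STANFORD_PARSER_HOME", "STANFORD_CORENLP", "STANFORD_MODELS"):
        env[name] = path
    return env


# The result can be passed as env=... to subprocess.check_call
# when launching a tool that reads these variables.
```

The function returns a copy, so the environment of the calling process is left unchanged.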

Get the parser source code

git clone https://bitbucket.org/lps/parsimonious-vole

Install the parser

To install the parser, go to the source folder and run pip install from source.

cd parsimonious-vole
pip install .

This will install the Parsimonious Vole Python module (for programmatic usage) and create the "vole" command line tool.

Download the "punkt" model using the NLTK download UI

Execute the following command. It opens the NLTK downloader GUI, which lets you select and download various corpora, models, grammars and other NLTK goodies. Browse to the "Models" tab and mark punkt for download.

python -c "import nltk; nltk.download()";

Test the installation

Check if the environment variable is set.

echo ${STANFORD_PARSER_HOME}

This is expected to print the path to the folder with the Stanford Parser.
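
A slightly stricter check than echo can be sketched in Python. The function name check_stanford_home is hypothetical, and the test for .jar files rests on the assumption that the unzipped Stanford Parser folder contains jar files at its top level; drop that test if your layout differs.

```python
import os


def check_stanford_home(environ=os.environ):
    """Return the Stanford Parser folder, or raise if it looks unusable.

    Hypothetical helper. The .jar check is an assumption about the
    layout of the unzipped Stanford Parser archive.
    """
    home = environ.get("STANFORD_PARSER_HOME")
    if not home:
        raise RuntimeError("STANFORD_PARSER_HOME is not set")
    home = os.path.expanduser(home)
    if not os.path.isdir(home):
        raise RuntimeError("STANFORD_PARSER_HOME is not a folder: " + home)
    if not any(name.endswith(".jar") for name in os.listdir(home)):
        raise RuntimeError("no .jar files found in " + home)
    return home
```

Calling check_stanford_home() with no arguments inspects the environment of the current process.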

Check if the Python module is installed.

python -c "import magic.guess";

Expect no output. If the module is not installed, Python raises an ImportError.

Check if the command line works.

echo "Someone went outside but I don't exactly know who." > input.txt
vole parse input.txt

The first command creates a text file containing a single sentence. This file is then passed as the input parameter to the "parse" command of the vole command line tool. Expect some debug information to be printed, a file input.txt.stp to be created containing the dependency parse generated by the Stanford Parser, and a folder input.txt.html to be created containing the HTML representation of the SFL parse.
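
The same smoke test can be scripted. The sketch below assumes only what is described above: that "vole parse" is on the PATH and that it creates a .stp file and a .html folder next to the input. The function names are hypothetical and not part of Parsimonious Vole.

```python
import os
import subprocess


def expected_outputs(input_path):
    """Paths that 'vole parse' is described as creating next to the input."""
    return input_path + ".stp", input_path + ".html"


def run_vole_parse(input_path):
    """Run 'vole parse <input_path>' and report which outputs now exist.

    Hypothetical wrapper; it uses only the command shown above and
    raises subprocess.CalledProcessError if vole exits with an error.
    """
    subprocess.check_call(["vole", "parse", input_path])
    stp_file, html_dir = expected_outputs(input_path)
    return {
        stp_file: os.path.isfile(stp_file),
        html_dir: os.path.isdir(html_dir),
    }
```

For the example above, run_vole_parse("input.txt") returns a dictionary keyed by input.txt.stp and input.txt.html; both values should be True after a successful run.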

Command line interface

When Parsimonious Vole is installed, it makes the "vole" command line tool available. Currently it offers only one command: parse.

vole

It takes a mandatory input file and optionally, depending on the format, an output file.

For the HTML format no output file is expected; if one is given it is ignored, because the location of the result is derived from the input file.

For the "JSON", "ADJ_LIST", "YAML" and "PICKLE" formats, an output file is mandatory.
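
The rule above can be written down as a small check, for instance in a wrapper script that validates its arguments before calling vole. This is a sketch only: the function name is hypothetical, and how the format is actually passed on the command line is not shown here (see the help output of the tool for that).

```python
# Formats named in the text above that require an output file.
FORMATS_NEEDING_OUTPUT = {"JSON", "ADJ_LIST", "YAML", "PICKLE"}


def output_file_required(fmt):
    """Return True if the format needs an output file, False for HTML.

    Hypothetical helper; raises ValueError for a format that is not
    mentioned in the text above.
    """
    fmt = fmt.upper()
    if fmt == "HTML":
        return False
    if fmt in FORMATS_NEEDING_OUTPUT:
        return True
    raise ValueError("unknown format: " + fmt)
```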

For usage information try

vole --help
vole parse --help

Old Wiki Page

https://bitbucket.org/lps/parsimonious-vole/wiki/Home