INLG 2017 paper

This is the documentation for the experiments described in Koller & Engonopoulos, "Integrated sentence generation with charts", to appear at INLG 2017.

Introduction

The paper describes an algorithm for integrated referring expression generation and surface realization of full sentences, based on Semantically Interpreted Grammars (SIGs). A SIG is a special case of an IRTG (Interpreted Regular Tree Grammar) with three interpretations:

  • relational (I_R in the paper)
  • semantic (I_N)
  • string (I_S)

We treat the task of generating a sentence for a given semantics as IRTG parsing with such a grammar. The input of the problem is:

  1. the SIG grammar in the form of a Template IRTG
  2. the model of the world in the form of a list of semantic relations. Each n-ary relation is represented as a list of n-tuples of individuals; a tuple appears in the list if and only if it is a member of the relation.
  3. the target set of referents. The returned derivation tree will be constrained to refer only to these individuals, i.e. its I_R interpretation will be exactly this set.
  4. the target set of semantic atoms. The returned derivation tree will be constrained to express at least these semantic atoms, i.e. its I_N interpretation will be a superset of this set.
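
The two constraints in items 3 and 4 can be illustrated with a minimal Python sketch. It is not part of Alto and does not reflect Alto's file formats or internal data structures: the dictionary-based model, the `button` relation, and the helper names `atoms_hold` and `satisfies_targets` are all assumptions made for this illustration.

```python
# Hypothetical toy model: each relation name maps to the set of tuples in it.
# The "button" relation is invented for this example.
model = {
    "button": {("b5",)},
    "push": {("e5", "b5")},
}


def atoms_hold(atoms, model):
    """Return True if every (relation, args) atom is a tuple of that relation in the model."""
    return all(args in model.get(rel, set()) for rel, args in atoms)


def satisfies_targets(referents, expressed, target_referents, target_atoms):
    """Check the two output constraints on a candidate derivation.

    referents: individuals the derivation refers to (its I_R value)
    expressed: semantic atoms the derivation expresses (its I_N value)
    """
    exact_reference = set(referents) == set(target_referents)   # I_R is exactly the target set
    covers_semantics = set(expressed) >= set(target_atoms)      # I_N is a superset of the target
    return exact_reference and covers_semantics


target_atoms = {("push", ("e5", "b5"))}
candidate_atoms = {("push", ("e5", "b5")), ("button", ("b5",))}

print(atoms_hold(candidate_atoms, model))
print(satisfies_targets({"e5"}, candidate_atoms, {"e5"}, target_atoms))
```

A candidate that expresses extra atoms (here `button(b5)`) is still acceptable, whereas one that refers to any individual outside the target set is not.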

Note that events are treated like any other semantic individual: a derivation tree can "refer" to an event just as it can "refer" to, e.g., a physical object.

Using Alto for integrated sentence generation

The class that implements the algorithm and heuristics described in the paper is de.up.ling.irtg.script.SurfaceRealizer.

To generate a sentence, download and build Alto and then run:

java -cp ./target/alto-2.2-SNAPSHOT-jar-with-dependencies.jar de/up/ling/irtg/script/SurfaceRealizer <grammar_file> <model_file> --ref <target_referents> --sem <target_semantics>

The output will be the string interpretation of the shortest derivation tree found. Adding the argument --print-derivations will also print the derivation tree itself.

You can start experimenting with the example grammar and models that we used for computing runtimes.

Note that Alto is written in Java, so it is necessary to warm up the Java VM and average over several runs in order to get reliable runtime measurements. To do this, add the --warmup and --avg arguments when running the surface realizer.

For example, to run paper_model_2.txt with 100 warm-up runs and averaging over 100 runs for the semantics push(e5, b5) and target referent e5, you can run:

java -cp ./target/alto-2.2-SNAPSHOT-jar-with-dependencies.jar de/up/ling/irtg/script/SurfaceRealizer ../crisp-charts-17/source/scrisp.tirtg ../crisp-charts-17/benchmarks/paper_model_2.txt --warmup 100 --avg 100 --ref '{e5}' --sem 'push(e5,b5)' --print-derivations
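
The general measurement pattern behind warm-up and averaging can be sketched in a few lines of Python. This is only an illustration of the idea, not Alto's code; how Alto implements --warmup and --avg internally is not described here, and the function name `average_runtime` and its parameters are hypothetical.

```python
import time
from statistics import mean


def average_runtime(task, warmup, runs):
    """Call `task` `warmup` times without timing it, then return the
    mean wall-clock time in seconds over `runs` timed calls."""
    for _ in range(warmup):
        task()  # untimed: gives the runtime a chance to reach a steady state
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return mean(samples)


# Stand-in workload; in the experiments the timed task would be one generation run.
avg = average_runtime(lambda: sum(range(1000)), warmup=100, runs=100)
print(f"mean runtime: {avg:.6f} s")
```

Discarding the first calls matters on the JVM because early runs include class loading and just-in-time compilation, which would otherwise inflate the reported average.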

CRISP

To use the CRISP planning-based sentence generator for comparison, you can download the code from bitbucket.org/tclup/crisp-nlg.

You can download the example grammar and models, in the format used by CRISP, which we used for computing the runtimes presented in the paper.

To generate a sentence using CRISP, download and build the code under crisp-nlg/trunk/Code/crisp and then run:

java -jar ./target/crisp-1.1.5-SNAPSHOT-jar-with-dependencies.jar ../../../../crisp-charts-17/source/metric-scrisp-grammar.xml ../../../../crisp-charts-17/benchmarks/paper_model_2.xml

If you have any further questions, please do not hesitate to write an e-mail to nikos AT coli.uni-saarland.de.
