
This parser is essentially the Berkeley parser 1.7 (http://nlp.cs.berkeley.edu/software.shtml) with a few small adaptations. If you use this code, please cite:

"Parser Adaptation for Social Media by Integrating Normalization." Rob van der Goot and Gertjan van Noord in ACL 2017

"Improved Inference for Unlexicalized Parsing." Slav Petrov and Dan Klein in HLT-NAACL 2007

I enabled the parsing of word graphs in the following format:

0 new 1 0.931861
0 new 1 0.041028
0 new 1 0.027111
1 pix 2 0.994940
1 pic 2 0.002540
1 photos 2 0.002520
2 comming 3 0.782531
2 coming 3 0.210903
2 coming 3 0.006566
3 tomorroe 4 0.904690
3 tomorrow 4 0.079266
3 tomorrow 4 0.016044
.

Support for this format lets the parser work together with MoNoise (https://bitbucket.org/robvanderg/monoise). If your goal is to parse tweets, download MoNoise.
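
To make the word graph format above easier to inspect, here is a minimal Python sketch that reads it. It is not part of the parser, and it rests on assumptions inferred only from the example: each line is taken to hold a start position, a candidate word, an end position and a probability, and a line containing only "." is taken to end the graph. The names Edge, read_word_graph and best_candidates are hypothetical helpers made up for this illustration.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    start: int   # assumed: position the edge leaves from
    word: str    # candidate word on the edge
    end: int     # assumed: position the edge arrives at
    prob: float  # assumed: probability of this candidate


def read_word_graph(text: str) -> List[Edge]:
    """Read one word graph; stop at a line that holds only "." (assumed terminator)."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == ".":
            break
        start, word, end, prob = line.split()
        edges.append(Edge(int(start), word, int(end), float(prob)))
    return edges


def best_candidates(edges: List[Edge]) -> List[str]:
    """Return the highest-probability word for each start position, in order."""
    best = {}
    for edge in edges:
        if edge.start not in best or edge.prob > best[edge.start].prob:
            best[edge.start] = edge
    return [best[pos].word for pos in sorted(best)]


example = """\
0 new 1 0.931861
0 new 1 0.041028
0 new 1 0.027111
1 pix 2 0.994940
1 pic 2 0.002540
1 photos 2 0.002520
2 comming 3 0.782531
2 coming 3 0.210903
2 coming 3 0.006566
3 tomorroe 4 0.904690
3 tomorrow 4 0.079266
3 tomorrow 4 0.016044
.
"""

edges = read_word_graph(example)
print(len(edges))              # number of edges read
print(best_candidates(edges))  # top candidate per position
```

The sketch only picks the top candidate per position. The parser itself takes the whole graph as input, so lower-ranked candidates such as "coming" and "tomorrow" remain available to it.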

The weights for the normalization can be tuned with the "-latticeWeight" option.

Other small additions:

  • outputChartSize: prints the chart size after pruning (at the last parse level).
  • server: runs the jar as a server application that communicates through sockets, as supported by MoNoise. Specify the port as an argument on the command line.