add raw_output option to return CoreNLP's XML as a dictionary without converting the format
Andrew Yates
Branch: andrewyates/corenlp-python:master
Branch: torotoki/corenlp-python:master
Merged
Merged in andrewyates/corenlp-python (pull request #1)
This patch adds a boolean raw_output keyword argument to batch_parse. When it is true, CoreNLP's XML output is returned as a dictionary as-is, without being converted to corenlp-python's own dictionary format.
This is useful because corenlp-python's dictionary format does not preserve all of the information in CoreNLP's XML output. For example, I am interested in the 'collapsed-dependencies' with token indices included. The current dictionary format omits token indices from the dependency relations, and it outputs 'basic-dependencies' instead of 'collapsed-dependencies'.
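To illustrate the use case, here is a minimal sketch of pulling the collapsed dependencies and their token indices out of one sentence of the raw dictionary. It assumes the dictionary follows the usual XML-to-dict conventions (attributes keyed with a leading '@', element text under '#text', a single child given as a dict rather than a list) and that each sentence carries a 'collapsed-dependencies' element of 'dep' entries. The sample data and the helper names as_list and collapsed_dependencies are hypothetical, written by hand for this example rather than taken from real CoreNLP output or from this patch.

```python
def as_list(node):
    """Normalize a child node: None -> [], a single dict -> [dict], a list -> itself."""
    if node is None:
        return []
    return node if isinstance(node, list) else [node]


def collapsed_dependencies(sentence):
    """Return (relation, governor, governor_idx, dependent, dependent_idx) tuples.

    `sentence` is assumed to be one sentence entry of the raw dictionary.
    """
    deps = (sentence.get("collapsed-dependencies") or {}).get("dep")
    relations = []
    for dep in as_list(deps):
        governor = dep["governor"]
        dependent = dep["dependent"]
        relations.append((
            dep["@type"],
            governor["#text"], int(governor["@idx"]),
            dependent["#text"], int(dependent["@idx"]),
        ))
    return relations


# Hand-written stand-in for one sentence ("Rex barks loudly"); the shape is an assumption.
sample_sentence = {
    "collapsed-dependencies": {
        "dep": [
            {"@type": "nsubj",
             "governor": {"@idx": "2", "#text": "barks"},
             "dependent": {"@idx": "1", "#text": "Rex"}},
            {"@type": "advmod",
             "governor": {"@idx": "2", "#text": "barks"},
             "dependent": {"@idx": "3", "#text": "loudly"}},
        ]
    }
}

relations = collapsed_dependencies(sample_sentence)
for relation in relations:
    print(relation)
```

The as_list helper matters because a sentence with exactly one dependency would otherwise yield a dict where the loop expects a list.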