Hiroyoshi Komatsu / corenlp-python
Issue #8 (new)

Error: 'ascii' codec can't encode character

Yogesh Rajashekharaiah
created an issue

    yogeshr@yogeshr-VirtualBox:~$ python --version
    Python 2.7.5+

Step 1) I started the JSON-RPC server. It starts successfully; details below:

    yogeshr@yogeshr-VirtualBox:~/corenlp-python$ ./corenlp/corenlp.py -S /home/yogeshr/stanford-corenlp-full-2013-11-12/
    java -Xmx2g -cp /home/yogeshr/stanford-corenlp-full-2013-11-12//stanford-corenlp-3.3.0.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//stanford-corenlp-3.3.0-models.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//xom.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//joda-time.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//jollyday.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//ejml-0.23.jar edu.stanford.nlp.pipeline.StanfordCoreNLP -props /home/yogeshr/corenlp-python/corenlp/default.properties
    Loading Models: 5/5
    Serving on http://127.0.0.1:8080

Step 2) I scraped a few lines from http://en.wikipedia.org/wiki/Enterprise_resource_planning and saved them as a text file (attached: s.txt, 1.4 KB).

Step 3) The RPC client script (named gettoks.py) is:

    #!/usr/bin/python

    import sys
    import os
    import jsonrpclib
    import pprint
    from simplejson import loads

    # Expect exactly one argument: the text file to parse.
    if len(sys.argv) != 2:
        print("usage: %s filename" % __file__)
        sys.exit(1)
    else:
        fl = sys.argv[1]

    # Send the whole file to the corenlp-python JSON-RPC server.
    server = jsonrpclib.Server("http://localhost:8080")
    lines = open(fl, 'r').read()
    print(len(lines))
    result = loads(server.parse(lines))

    pprint.pprint(result)

Step 4) I launched the RPC client script as below:

    yogeshr@yogeshr-VirtualBox:~$ ./gettoks.py s.txt

I then see the error below on the RPC server, and the server restarts:

    'ascii' codec can't encode character u'\u2013' in position 1272: ordinal not in range(128)
    java -Xmx2g -cp /home/yogeshr/stanford-corenlp-full-2013-11-12//stanford-corenlp-3.3.0.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//stanford-corenlp-3.3.0-models.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//xom.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//joda-time.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//jollyday.jar:/home/yogeshr/stanford-corenlp-full-2013-11-12//ejml-0.23.jar edu.stanford.nlp.pipeline.StanfordCoreNLP -props /home/yogeshr/corenlp-python/corenlp/default.properties
    Loading Models: 5/5
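For context, the message is Python's standard UnicodeEncodeError text, and u'\u2013' is an en dash, which is common in Wikipedia prose. The sketch below reproduces the same class of error without the server and lists the offending characters in a string. It assumes (not confirmed from the corenlp-python source) that the server implicitly encodes the received text as ASCII somewhere on its way to the Java process. The helper name `find_non_ascii` and the sample string are hypothetical.

```python
# -*- coding: utf-8 -*-
# Minimal sketch: reproduce the encode error and locate non-ASCII characters.
# Assumption: the server performs an implicit ASCII encode of the input text.


def find_non_ascii(text):
    """Return (index, character) pairs for every non-ASCII character in text."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


# Hypothetical sample containing an en dash (U+2013), like the scraped text.
sample = u"ERP 1990\u20132000"

try:
    sample.encode("ascii")        # ASCII cannot represent U+2013
    error = None
except UnicodeEncodeError as exc:
    error = exc                   # exc.start is the index of the bad character

print(error)
print(find_non_ascii(sample))

# Encoding the same text as UTF-8 succeeds.
utf8_bytes = sample.encode("utf-8")
print(len(utf8_bytes))
```

Running `find_non_ascii` on the decoded contents of s.txt should show whether the character at index 1272 is the en dash named in the error.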

On the RPC client, I see no error.

Side note: If I use a short inline text as below, I do see the result:

    #!/usr/bin/python

    import jsonrpclib
    import pprint
    from simplejson import loads

    server = jsonrpclib.Server("http://localhost:8080")
    lines = "Hello my world who are you who are you. Where does the knowledge lie. Are we sure we are good to go"
    print(len(lines))
    result = loads(server.parse(lines))

    pprint.pprint(result)

First few lines from the result:

    yogeshr@yogeshr-VirtualBox:~$ ./stoks.py
    {u'coref': [[[[u'who', 0, 4, 4, 5], [u'my', 0, 1, 1, 2]]],
                [[[u'you', 0, 6, 6, 7], [u'we sure we are good to go', 2, 1, 1, 8]]]],
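Since the ASCII-only inline text parses fine, one possible client-side workaround (not a fix for the server) is to reduce the file contents to ASCII before calling `server.parse`. The following is only a sketch under that assumption: the helper `to_ascii` and the `PUNCT_MAP` table are hypothetical, not part of corenlp-python, and the transformation is lossy because any character without an ASCII equivalent is dropped.

```python
# -*- coding: utf-8 -*-
# Sketch of a lossy client-side workaround: reduce text to ASCII before sending.
import unicodedata

# Hypothetical mapping of common typographic punctuation to ASCII stand-ins.
PUNCT_MAP = {
    u"\u2013": u"-",    # en dash
    u"\u2014": u"-",    # em dash
    u"\u2018": u"'",    # left single quotation mark
    u"\u2019": u"'",    # right single quotation mark
    u"\u201c": u'"',    # left double quotation mark
    u"\u201d": u'"',    # right double quotation mark
    u"\u2026": u"...",  # horizontal ellipsis
}


def to_ascii(text):
    """Return an ASCII-only approximation of a Unicode string."""
    # Replace known punctuation first so it is not silently dropped below.
    for src, dst in PUNCT_MAP.items():
        text = text.replace(src, dst)
    # Decompose accented letters into base letter plus combining mark.
    text = unicodedata.normalize("NFKD", text)
    # Drop whatever still cannot be represented in ASCII.
    return text.encode("ascii", "ignore").decode("ascii")


cleaned = to_ascii(u"ERP 1990\u20132000, caf\u00e9")
print(cleaned)
```

In gettoks.py this would mean decoding the file contents to Unicode first (for example as UTF-8, assuming that is how s.txt was saved) and passing `to_ascii(...)` of the result to `server.parse`.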
