Commits

yanchuan sim committed d38a79e

ignore errors for to_ascii

  • Parent commits 6833957

Files changed (1)

File ycutils/tokenize.py

   :param text: text to convert to ASCII.
   :returns: ASCII text."""
 
-  return unicodedata.normalize('NFKC', unicode(text)).encode('ascii', 'ignore')
+  return unicodedata.normalize('NFKC', unicode(text, errors='ignore')).encode('ascii', 'ignore')
 #end def
 
 def words(text, strip_unicode=False, normalize=__DEFAULT_NORMALIZE__, tag_list=__DEFAULT_TAG_LIST__, filter_stopwords=False, not_punctuations='', return_tags=False, process_token=None):
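For readers following the change to to_ascii: in Python 2, passing errors='ignore' to unicode() makes byte strings decode with the default codec while silently dropping bytes that cannot be decoded, instead of raising UnicodeDecodeError. One caveat is that unicode() raises TypeError when given a value that is already a unicode object together with an errors argument, so the patched line presumably expects byte-string input. The sketch below shows the same decode, normalize, encode pipeline in Python 3. The name to_ascii_sketch, the UTF-8 default, and the type checks are assumptions made for illustration and are not part of ycutils.

```python
import unicodedata


def to_ascii_sketch(text, encoding='utf-8'):
    """Best-effort ASCII conversion.

    Hypothetical helper for illustration; not the repository's to_ascii.
    The UTF-8 default encoding is an assumption.
    """
    if isinstance(text, bytes):
        # Drop undecodable bytes instead of raising UnicodeDecodeError.
        text = text.decode(encoding, errors='ignore')
    elif not isinstance(text, str):
        # Fall back to the object's string form (numbers, etc.).
        text = str(text)

    # NFKC folds compatibility characters (ligatures, full-width forms)
    # into their plain equivalents where one exists.
    normalized = unicodedata.normalize('NFKC', text)

    # Drop every character that still has no ASCII representation.
    return normalized.encode('ascii', 'ignore').decode('ascii')
```

Handling text that is already decoded as a separate branch avoids the TypeError mentioned above, at the cost of guessing the encoding for byte input.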