WordCountStep needs to be simplified

Issue #693 new
YvesS created an issue

The current WordCountStep is rather slow, and it is the counting part that is slow (not the annotations). We should try to speed it up, possibly at the cost of a slightly less accurate count.

Important: Any change should be made through an option or a new step, so that we do not change the count output of existing applications.

Comments (1)

  1. Jim Hargrave (OLD)

    We already have an existing step called "SimpleWordCountStep" that does most of what we want. This step uses only ICU4J word counts.

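For illustration, a count based only on word-boundary analysis might look like the sketch below. This is not the SimpleWordCountStep source: the class and method names are hypothetical, and it uses the JDK's `java.text.BreakIterator` as a stand-in so that it runs without extra dependencies. ICU4J's `com.ibm.icu.text.BreakIterator` offers a similar `getWordInstance` API, and its `getRuleStatus()` could replace the letter-or-digit check used here.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical sketch, not the actual SimpleWordCountStep implementation.
final class SimpleWordCountSketch {

    // Counts the boundary-delimited segments that contain at least one
    // letter or digit, so spaces and punctuation are not counted as words.
    static int countWords(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);
        int count = 0;
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            if (containsLetterOrDigit(text, start, end)) {
                count++;
            }
        }
        return count;
    }

    // Scans the segment by code point so supplementary characters are handled.
    private static boolean containsLetterOrDigit(String text, int start, int end) {
        for (int i = start; i < end; ) {
            int cp = text.codePointAt(i);
            if (Character.isLetterOrDigit(cp)) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }
}
```

Calling `SimpleWordCountSketch.countWords(text, Locale.ENGLISH)` returns the number of such segments. A single pass like this trades any language-specific counting rules for speed, which matches the "slightly less accurate" trade-off described in the issue.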