Paice-Husk stemmer documentation

Issue #4 resolved
John Sampson
created an issue

The documentation includes the statement "The Paice-Husk algorithm allows custom stemming rule sets, so the paicehusk module also includes a PaiceHuskStemmer class you can instantiate with custom rules." Are there instructions on how to do this? Where would I find them?

Comments (1)

  1. Matt Chaput repo owner

    The Paice-Husk algorithm's stemming rules are configured using a DSL. You can see the default rules in stemming.paicehusk.defaultrules.

    I don't know if I ever understood the DSL, but if I did, I forget now ;) -- I just copied the default rules and translated the algorithm to Python. But if you come up with your own rules, you can load them into a stemming.paicehusk.PaiceHuskStemmer object and use its stem() method:

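    from stemming.paicehusk import PaiceHuskStemmer

    myrules = "..."  # your custom rule set, written in the rule DSL
    mystemmer = PaiceHuskStemmer(myrules)
    stemmed_word = mystemmer.stem(word)  # word is the string you want to stem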
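For anyone trying to write their own rules, here is a rough sketch of how a single rule line is usually read. It assumes the conventional Paice/Husk (Lancaster) notation: the ending spelled backwards, an optional "*" meaning "only if the word has not been changed yet", one digit giving the number of characters to remove, optional letters to append, then ">" to keep applying rules or "." to stop. This thread does not confirm that the stemming package uses exactly this syntax, so check it against stemming.paicehusk.defaultrules. The names Rule, parse_rule and apply_rule are hypothetical and are not part of the stemming package.

```python
import re
from typing import NamedTuple, Optional

# ASSUMPTION: conventional Paice/Husk rule notation, e.g. "ai*2." or "city3s.".
# Groups: reversed ending, optional "*", one digit, letters to append, ">" or ".".
RULE_RE = re.compile(r"^([a-z]+)(\*?)(\d)([a-z]*)([>.])$")


class Rule(NamedTuple):
    ending: str        # ending to match, in normal reading order
    intact_only: bool  # True if the rule applies only to an unmodified word
    remove: int        # number of trailing characters to delete
    append: str        # letters to add after deleting
    keep_going: bool   # True for ">", False for "."


def parse_rule(text: str) -> Optional[Rule]:
    """Parse one rule string; return None if it does not fit the assumed shape."""
    match = RULE_RE.match(text.strip())
    if match is None:
        return None
    reversed_ending, star, count, append, cont = match.groups()
    return Rule(reversed_ending[::-1], star == "*", int(count), append, cont == ">")


def apply_rule(rule: Rule, word: str, intact: bool = True) -> Optional[str]:
    """Apply a rule once; return the new form, or None if the rule does not fire.

    Simplified: the full algorithm also checks that the resulting stem is
    acceptable (for example, long enough), which this sketch leaves out.
    """
    if rule.intact_only and not intact:
        return None
    if not word.endswith(rule.ending):
        return None
    return word[: len(word) - rule.remove] + rule.append


rule = parse_rule("city3s.")              # reads as: "-ytic" becomes "-ys", then stop
stemmed = apply_rule(rule, "analytic")    # "analys"
```

Under the same assumption, a custom rule set would be a string of such rules, which you would then pass to PaiceHuskStemmer as in the comment above.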