Issue #4 resolved

Paice-Husk stemmer documentation

John Sampson
created an issue

At https://pypi.python.org/pypi/stemmer/1.0 there is the statement "The Paice-Husk algorithm allows custom stemming rule sets, so the paicehusk module also includes a PaiceHuskStemmer class you can instantiate with custom rules." Are there instructions on how to do this? Where would I find them?

Comments (1)

  1. Matt Chaput repo owner

    The Paice-Husk algorithm's stemming rules are configured using a DSL. You can see the default rules in stemming.paicehusk.defaultrules: https://bitbucket.org/mchaput/stemming/src/default/stemming/paicehusk.py?at=default#cl-119

    I don't know if I ever understood the DSL, but if I did, I forget now ;) -- I just copied the default rules and translated the algorithm to Python. But if you come up with your own rules, you can load them into a stemming.paicehusk.PaiceHuskStemmer object and use its stem() method:

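    from stemming.paicehusk import PaiceHuskStemmer

    myrules = "..."  # placeholder: your own rule set, written in the same DSL as defaultrules
    mystemmer = PaiceHuskStemmer(myrules)
    stemmed_word = mystemmer.stem(word)  # word is the string you want to stem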
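
For anyone trying to write their own rules, here is a rough sketch of how a single rule is usually read in the commonly described Paice/Husk (Lancaster) notation: the ending spelled backwards, an optional `*` meaning "only if the word has not been stemmed yet", the number of characters to remove, an optional string to append, and then `.` (stop) or `>` (keep stemming). It is an assumption that `stemming.paicehusk` expects exactly this notation, so check it against `defaultrules` before relying on it. `parse_rule` and `apply_rule` are hypothetical helpers written for illustration and are not part of the package; the real algorithm also loops over the rules and checks that the remaining stem is acceptable, which this sketch leaves out.

```python
import re

# Assumed rule layout: reversed ending, optional "*", one digit (characters
# to remove), optional letters to append, then "." (stop) or ">" (continue).
RULE_RE = re.compile(
    r"^(?P<ending>[a-z]+)(?P<intact>\*?)(?P<num>\d)(?P<append>[a-z]*)(?P<cont>[.>])$"
)


def parse_rule(text):
    """Parse one rule string such as "sei3y>" into a dict (illustrative only)."""
    match = RULE_RE.match(text)
    if match is None:
        raise ValueError("not a recognisable rule: %r" % text)
    return {
        "ending": match.group("ending")[::-1],   # stored reversed in the rule
        "intact_only": bool(match.group("intact")),
        "remove": int(match.group("num")),
        "append": match.group("append"),
        "continue": match.group("cont") == ">",
    }


def apply_rule(word, rule, is_intact=True):
    """Apply a single parsed rule once; return the word unchanged if it does not fit."""
    if rule["intact_only"] and not is_intact:
        return word
    if not word.endswith(rule["ending"]):
        return word
    # Slice by length so that removing 0 characters leaves the word whole.
    stem = word[:len(word) - rule["remove"]]
    return stem + rule["append"]


if __name__ == "__main__":
    rule = parse_rule("sei3y>")          # "ies": remove 3, append "y", continue
    print(apply_rule("ladies", rule))
```

Under that reading, `sei3y>` matches words ending in "ies", strips three characters, and appends "y"; `gni3>` would strip a trailing "ing".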
    