Broken stopwords for Russian language

Issue #344 resolved
Владислав created an issue

I tested whoosh==2.5.1 and revision 54c0271 with Python 2.7.5. A simple script:

from whoosh.lang import stopwords_for_language
stopwords = stopwords_for_language('ru')

stopwords will contain a frozenset of unreadable (garbled) words. See the screenshot:

[Screenshot: whoosh unreachable stopwords]

Because of this, LanguageAnalyzer doesn't work.

Comments (6)

  1. Владислав reporter

    The function whoosh.compat.u breaks strings:

    In [12]: print u('строка')
    ÑÑÑока
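
A minimal sketch of what appears to be happening, written for Python 3 so it runs on its own without whoosh. It assumes that u() receives the UTF-8 bytes of the string and decodes them with the "unicode_escape" codec, which turns every byte into a separate Latin-1 character. That assumption is not confirmed in this report, and the helper names garble and repair are hypothetical, for illustration only.

```python
# Sketch only: reproduces the garbling without importing whoosh.
# Assumption: the text arrives as UTF-8 bytes and is decoded with the
# "unicode_escape" codec, which maps each byte to one Latin-1 character.


def garble(text):
    """Hypothetical helper: encode as UTF-8, then mis-decode byte by byte."""
    return text.encode("utf-8").decode("unicode_escape")


def repair(garbled):
    """Undo the mis-decoding: recover the raw bytes, then decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


original = "строка"
broken = garble(original)

print(len(original), len(broken))  # 6 characters become 12
print(repair(broken) == original)  # True
```

If this is the cause, every Cyrillic stopword ends up twice as long and can never match a token from real text, which would explain why the stopword set looks unreadable.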