SpanishStemmer raises IndexError: string index out of range

Issue #493 new
Stephane Boisson created an issue

Some words cause SpanishStemmer.stem() to raise an IndexError.

I am not sure whether the issue is in the original algorithm or in the Python implementation.

Example reproducing the issue using Whoosh 2.7.4:

# -*- coding: utf-8 -*-
from whoosh.lang.snowball.spanish import SpanishStemmer

stemmer = SpanishStemmer()
print(stemmer.stem(u"B\xe8gue"))

Results:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/sboisson/Documents/venv/lib/python2.7/site-packages/whoosh/lang/snowball/spanish.py", line 239, in stem
    if len(word) >= 2 and word[-2:] == "gu" and rv[-1] == "u":
IndexError: string index out of range
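
Looking at the failing line, the slice word[-2:] cannot raise, so the IndexError has to come from rv[-1], which means rv is an empty string at that point. The following is a minimal standalone sketch of that failure mode and one possible guard. It does not use Whoosh; the function names and the sample values for word and rv are hypothetical and chosen only for illustration, not taken from the stemmer's actual state.

```python
def ends_with_gu_unsafe(word, rv):
    # Same shape as the condition shown in the traceback:
    # rv[-1] raises IndexError when rv is empty.
    return len(word) >= 2 and word[-2:] == "gu" and rv[-1] == "u"


def ends_with_gu_safe(word, rv):
    # str.endswith returns False on an empty string instead of raising.
    return word.endswith("gu") and rv.endswith("u")


# Illustrative values only (assumption): a word ending in "gu" with an empty rv.
word, rv = "begu", ""

try:
    ends_with_gu_unsafe(word, rv)
    unsafe_raised = False
except IndexError:
    unsafe_raised = True

print(unsafe_raised)
print(ends_with_gu_safe(word, rv))
```

Whether replacing the index with endswith (or rv[-1:]) is the right fix for the stemmer itself depends on what the algorithm intends when rv is empty, which I have not checked.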
