Issue #293 resolved

Error in dawg.py when using TEXT(analyzer=StemmingAnalyzer(), spelling=True)

argonaut
created an issue

When a TEXT() field is used with the StemmingAnalyzer and spelling=True, an error is raised in dawg.GraphWriter.insert() during writer.commit(). The stemming function is not a factor, and substituting PyStemmingFilter for StemmingFilter makes no difference either.

The snippet below reproduces the problem:

- line1, line2, line3 together: raises Exception("Inserted %r before starting a field" % key)
- line2 by itself: raises Exception("Inserted %r before starting a field" % key)
- line2 followed by line3: DOES NOT raise any error

from whoosh.index import create_in
from whoosh.fields import Schema, TEXT
from whoosh.analysis import StemmingAnalyzer

# TEXT field with stemming and spelling enabled: the combination that triggers the error
schema = Schema(content=TEXT(analyzer=StemmingAnalyzer(), spelling=True))
ix = create_in("./", schema, "test")
writer = ix.writer()
writer.add_document(content=u"IPFSTD1 IPFSTD_kdwq134 Kaminski-all Study00:00:00")  # line1
writer.add_document(content=u"IPFSTD1 IPFSTD_kdwq134 Kaminski-all Study")  # line2
writer.add_document(content=u"This is the first document we've added!")  # line3
writer.commit()  # the exception is raised here
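
To check the three combinations described above independently, something like the following sketch may help. It runs each combination against a fresh temporary directory so that one case cannot affect the next. The names DOC1, DOC2, DOC3, CASES, index_with_whoosh and run_cases are hypothetical helpers made up for this illustration; only the Whoosh calls already used in the snippet above (plus ix.close()) are assumed. Whether a given case fails may depend on the installed Whoosh version.

```python
import tempfile

# The three documents from the report (line1, line2, line3).
DOC1 = u"IPFSTD1 IPFSTD_kdwq134 Kaminski-all Study00:00:00"
DOC2 = u"IPFSTD1 IPFSTD_kdwq134 Kaminski-all Study"
DOC3 = u"This is the first document we've added!"

# Hypothetical labels for the combinations described in the report.
CASES = {
    "line1,line2,line3": [DOC1, DOC2, DOC3],
    "line2 only": [DOC2],
    "line2,line3": [DOC2, DOC3],
}


def index_with_whoosh(docs):
    """Index docs into a fresh temporary directory, using the schema from the report."""
    # Imported here so the harness itself can be loaded without Whoosh installed.
    from whoosh.index import create_in
    from whoosh.fields import Schema, TEXT
    from whoosh.analysis import StemmingAnalyzer

    schema = Schema(content=TEXT(analyzer=StemmingAnalyzer(), spelling=True))
    with tempfile.TemporaryDirectory() as tmpdir:
        ix = create_in(tmpdir, schema, "test")
        writer = ix.writer()
        for text in docs:
            writer.add_document(content=text)
        writer.commit()
        ix.close()


def run_cases(index_fn=index_with_whoosh, cases=CASES):
    """Run every case; map each label to None (no error) or a short error description."""
    results = {}
    for name, docs in cases.items():
        try:
            index_fn(docs)
            results[name] = None
        except Exception as exc:
            results[name] = "%s: %s" % (type(exc).__name__, exc)
    return results
```

Calling run_cases() and printing the returned dictionary shows, per combination, either None or the exception text, which makes it easier to compare the outcome against the list above.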
