AndMaybeMatcher IndexError in quality
This appears to be a regression from commit 9859aa6f4bb2. The original inherited quality from AdditiveBiMatcher checks first that a and b are active before checking the ids:
def quality(self): q = 0.0 if self.a.is_active(): q += self.a.quality() if self.b.is_active(): q += self.b.quality() return q
The updated AndMaybeMatcher quality function does not check for a and b active, and may result in an IndexError as seen further below:
def quality(self): if self.a.id() == self.b.id(): return self.a.quality() + self.b.quality() else: return self.a.quality()
Also related; the AndMaybeMatcher is_active() function only returns the value of a.is_active(). The IntersectionMatcher returns a.is_active() and b.is_active(), should AndMaybeMatcher be the same?
There are a lot of locations where Matcher objects id() is called and therefor a number of possibilities for this same IndexError to crop up. I don't really understand how the internals work, but based on this problem, it seems like the id() call either needs to be safer, or some more is_active() calls need to be added. For AndMaybeMatcher in id(), quality(), weight() and score() for example.
IndexError Traceback (most recent call last)
/Users/jasl8r/Development/sequoia/sequoia/<ipython console> in <module>()
/Users/jasl8r/Development/sequoia/sequoia/sequoia/data.py in match_media(self) 136 query += u' ANDMAYBE year:%d' % show.get('year') 137 --> 138 for result in searcher.search(qparser.parse(query), limit = 1): 139 matches[series_id] = result['id'] 140
/Library/Python/2.6/site-packages/Whoosh-1.7.7-py2.6.egg/whoosh/searching.pyc in search(self, q, limit, sortedby, reverse, groupedby, optimize, scored, filter, collector) 572 collector.reverse = reverse 573 --> 574 return collector.search(self, q, filter=filter) 575 576
/Library/Python/2.6/site-packages/Whoosh-1.7.7-py2.6.egg/whoosh/searching.pyc in search(self, searcher, q, filter) 682 break 683 else: --> 684 self.add_searcher(searcher, q) 685 686 if self.timer:
/Library/Python/2.6/site-packages/Whoosh-1.7.7-py2.6.egg/whoosh/searching.pyc in add_searcher(self, searcher, q) 708 """ 709 --> 710 return self.add_matches(searcher, q.matcher(searcher)) 711 712 def score(self, searcher, matcher):
/Library/Python/2.6/site-packages/Whoosh-1.7.7-py2.6.egg/whoosh/searching.pyc in add_matches(self, searcher, matcher) 750 return self.add_all_matches(searcher, matcher) 751 else: --> 752 return self.add_top_matches(searcher, matcher) 753 754 def add_top_matches(self, searcher, matcher):
/Library/Python/2.6/site-packages/Whoosh-1.7.7-py2.6.egg/whoosh/searching.pyc in add_top_matches(self, searcher, matcher) 766 greedy = self.greedy 767 --> 768 for id, quality in self.pull_matches(matcher, usequality): 769 if timelimited and not greedy and self.timesup: 770 raise TimeLimit
/Library/Python/2.6/site-packages/Whoosh-1.7.7-py2.6.egg/whoosh/searching.pyc in pull_matches(self, matcher, usequality) 881 # posting has higher quality than the minimum before yielding it.
882 if usequality:
--> 883 postingquality = matcher.quality() 884 if postingquality > self.minquality: 885 yield (id, postingquality)
/Library/Python/2.6/site-packages/Whoosh-1.7.7-py2.6.egg/whoosh/matching.pyc in quality(self) 1299 1300 def quality(self): -> 1301 if self.a.id() == self.b.id(): 1302 return self.a.quality() + self.b.quality() 1303 else:
/Library/Python/2.6/site-packages/Whoosh-1.7.7-py2.6.egg/whoosh/filedb/filepostings.pyc in id(self) 150 151 def id(self): --> 152 return self.block.ids[self.i] 153 154 def items_as(self, astype):
IndexError: array index out of range