Commits

Author Commit Message Labels Comments Date
Matt Chaput
Changed default blocksize to 16K.
Matt Chaput
Added "us" to the stop list (to go with "you").
Matt Chaput
Fixed major bug in TableReader._seek_postings where it was confusing parts of the posting info structure. TableReader.__getitem__ is now TableReader.get() so it can be patched based on TableReader.haspostings. Changed TermReader.weights() to use new Format.read_weight which may be more efficient when all you want is the weight.
Matt Chaput
TableWriter.add_row() now requires a value.
Matt Chaput
Docstring and argument name cleanup. Added results.extend, results.increase, and results.increase_and_extend methods.
Matt Chaput
Added guards against using searcher.doc_field_length() for a field that does not store field lengths. Update MultiFieldSorter to work with "missingfirst" and added a docstring.
Matt Chaput
Docstring cleanup.
Matt Chaput
Removed unused import.
Matt Chaput
Added experimental BoostTextFilter. Added StopFilter to StemmingAnalyzer.
Matt Chaput
Reimplemented TableWriter/TableReader as single classes that can have optional postings. Fixed bugs in DocReader/MultiDocReader related to new implementation of field length storage.
Matt Chaput
Added ability to specify where documents should be sorted (beginning or end) in a sorter when the document doesn't contain the sort field. Minor cleanups.
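The idea in this commit, placing documents that lack the sort field at either the beginning or the end, can be sketched roughly as follows. This is only an illustration under assumed names: `sort_by_field` and the dict-based documents are hypothetical and are not Whoosh's sorter API. Only the `missingfirst` name comes from the log.

```python
def sort_by_field(docs, field, missingfirst=False):
    """Sort dict-like docs by `field` (hypothetical helper, not Whoosh code).

    Documents without the field are grouped together and placed at the
    beginning (missingfirst=True) or at the end (missingfirst=False).
    """
    def key(doc):
        missing = field not in doc
        if missingfirst:
            group = 0 if missing else 1
        else:
            group = 1 if missing else 0
        # The group flag is compared first, so a missing value (None) is
        # never ordered against a real field value.
        return (group, doc.get(field))

    return sorted(docs, key=key)


docs = [{"id": 1, "title": "b"}, {"id": 2}, {"id": 3, "title": "a"}]
ids_missing_last = [d["id"] for d in sort_by_field(docs, "title")]
ids_missing_first = [d["id"] for d in sort_by_field(docs, "title", missingfirst=True)]
```

Because `sorted` is stable, documents that lack the field keep their original relative order within their group.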
Matt Chaput
Changed implementation of field length storage. Changed term_count() to frequency(). Fixed scoring implementations. Fixed Searcher iteration.
Matt Chaput
Added convenience __setitem__ method that calls either set() or clear().
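A minimal sketch of what such a convenience method might look like. The log does not say which class this commit touched, so the `BitSet` class below and its byte-array storage are assumptions made purely for illustration.

```python
class BitSet:
    """Hypothetical fixed-size bit set, used only to illustrate the pattern."""

    def __init__(self, size):
        self.size = size
        self._bits = bytearray((size + 7) // 8)

    def set(self, index):
        # Turn the bit on.
        self._bits[index >> 3] |= 1 << (index & 7)

    def clear(self, index):
        # Turn the bit off; mask with 0xFF to stay within a byte.
        self._bits[index >> 3] &= ~(1 << (index & 7)) & 0xFF

    def __getitem__(self, index):
        return bool(self._bits[index >> 3] & (1 << (index & 7)))

    def __setitem__(self, index, value):
        # Convenience: dispatch to set() or clear() based on truthiness.
        if value:
            self.set(index)
        else:
            self.clear(index)


bits = BitSet(16)
bits[3] = True    # same as bits.set(3)
bits[9] = 1
bits[3] = False   # same as bits.clear(3)
```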
Matt Chaput
Simplified how And and Or work. Changed Phrase to work with either per-posting positions or a position vector.
Matt Chaput
Changed fields.TEXT to use per-posting positions instead of a vector.
Matt Chaput
Fixed bugs in multi-run merging.
Matt Chaput
Minor formatting, changed default block size.
Matt Chaput
Fixed directory deletion.
Matt Chaput
Fixed docstring. Changed Index.searcher() to pass keyword arguments to the Searcher constructor. Added Index.unlock() before cleaning old files when create = True.
Matt Chaput
Clarified docstring, removed obsolete attribute.
Matt Chaput
First version of a setup.py script for Whoosh.
Matt Chaput
Docstring cleanup.
Matt Chaput
highlight.py: cleaned up some dumb decisions. qparser.py: clarified comment.
Matt Chaput
"Protected" methods need to be locked with an RLock, not a Lock.
Matt Chaput
Returned to a Lucene-like highlighting system in highlight.py. Removed code from passages.py.
Matt Chaput
Small changes for simplification and consistency.
Matt Chaput
Changed from_ back to iter_from.
Matt Chaput
Changed "checkclosed" decorator to "protected" and added thread synchronization to it.
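A decorator of this kind could look roughly like the sketch below. It is not the Whoosh implementation: the attribute names `_lock` and `is_closed` and the `Reader` class are assumptions for illustration. The sketch uses a reentrant lock because one protected method may call another on the same object, and a plain `threading.Lock` would deadlock in that case.

```python
import threading
from functools import wraps


def protected(method):
    """Hypothetical decorator: hold the object's lock during the call and
    refuse to run if the object has been closed."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            if self.is_closed:
                raise ValueError("operation on a closed object")
            return method(self, *args, **kwargs)
    return wrapper


class Reader:
    def __init__(self, items):
        # RLock lets the same thread re-acquire the lock in nested calls.
        self._lock = threading.RLock()
        self.is_closed = False
        self._items = list(items)

    @protected
    def all_items(self):
        return list(self._items)

    @protected
    def first(self):
        # Calls another protected method while already holding the lock.
        return self.all_items()[0]

    def close(self):
        with self._lock:
            self.is_closed = True
```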
Matt Chaput
Changed the way Multi*Reader objects are instantiated, to make it easier to use hierarchical Readers.
Matt Chaput
analysis.py:
- Consolidated CommaSeparatedAnalyzer and SpaceSeparatedAnalyzer into KeywordAnalyzer.
- Default is now for StopFilter to remove stopped words from the token stream.
classify.py:
- Updated code to work with Results object implementation of important_terms().
fields.py:
- unstopped() wrapper removes stopped words from token stream before indexing.
index.py:
- Minor fixes.
passages.py:
- Rewrite to revert back to my original conception of highlighting, backtracking from using the Minion code. Not finished yet.
tables.py:
- Fixed copy_data -- should have had separate arguments for the key on the incoming table and the key on the outgoing table.
writing.py:
- Fixed logic on when to call _merge_segments.
- Minor style cleanup.
- Changed implementation of sorting in scoring.py and searching.py. Scoring/sorting now happens in the Searcher instead of the Results.
- @checkclosed decorator on methods checks whether the parent object has been closed before proceeding.
- Minor docstring formatting cleanup.