Commits

Author, Message
Matt Chaput
Removed unused import.
Matt Chaput
Added experimental BoostTextFilter. Added StopFilter to StemmingAnalyzer.
Matt Chaput
Reimplemented TableWriter/TableReader as single classes that can have optional postings. Fixed bugs in DocReader/MultiDocReader related to new implementation of field length storage.
Matt Chaput
Added ability to specify where documents should be sorted (beginning or end) in a sorter when the document doesn't contain the sort field. Minor cleanups.
Matt Chaput
Changed implementation of field length storage. Changed term_count() to frequency(). Fixed scoring implementations. Fixed Searcher iteration.
Matt Chaput
Added convenience __setitem__ method that calls either set() or clear().
Matt Chaput
Simplified how And and Or work. Changed Phrase to work with either per-posting positions or a position vector.
Matt Chaput
Changed fields.TEXT to use per-posting positions instead of a vector.
Matt Chaput
Fixed bugs in multi-run merging.
Matt Chaput
Minor formatting, changed default block size.
Matt Chaput
Fixed directory deletion.
Matt Chaput
Fixed docstring. Changed Index.searcher() to pass keyword arguments to the Searcher constructor. Added Index.unlock() before cleaning old files when create = True.
Matt Chaput
Clarified docstring, removed obsolete attribute.
Matt Chaput
First version of a setup.py script for Whoosh.
Matt Chaput
Docstring cleanup.
Matt Chaput
highlight.py: cleaned up some dumb decisions. qparser.py: clarified comment.
Matt Chaput
"Protected" methods need to be locked with an RLock, not a Lock.
Matt Chaput
Returned to a Lucene-like highlighting system in highlight.py. Removed code from passages.py.
Matt Chaput
Small changes for simplification and consistency.
Matt Chaput
Changed from_ back to iter_from.
Matt Chaput
Changed "checkclosed" decorator to "protected" and added thread synchronization to it.
Matt Chaput
Changed the way Multi*Reader objects are instantiated, to make it easier to use hierarchical Readers.
Matt Chaput
analysis.py:
- Consolidated CommaSeparatedAnalyzer and SpaceSeparatedAnalyzer into KeywordAnalyzer.
- Default is now for StopFilter to remove stopped words from the token stream.
classify.py:
- Updated code to work with Results object implementation of important_terms().
fields.py:
- unstopped() wrapper removes stopped words from token stream before indexing.
index.py:
- Minor fixes.
passages.py:
- Rewrite to revert back to my original conception of highlighting, backtracking from using the Minion code. Not finished yet.
tables.py:
- Fixed copy_data -- should have had separate arguments for the key on the incoming table and the key on the outgoing table.
writing.py:
- Fixed logic on when to call _merge_segments.
- Minor style cleanup.
- Changed implementation of sorting in scoring.py and searching.py. Scoring/sorting now happens in the Searcher instead of the Results.
- @checkclosed decorator on methods checks whether the parent object has been closed before proceeding.
- Minor docstring formatting cleanup.
Matt Chaput
Minor formatting.
Matt Chaput
Removed unused import.
Matt Chaput
Changed vector_as to use Format.interpreter(). Minor docstring formatting.
Matt Chaput
Removed unused import.
Matt Chaput
Added experimental CharacterBoosts format, cleaned up PositionBoosts. Added Format.interpreter() method to return data_to_X method. Minor docstring formatting.
Matt Chaput
Minor formatting change.
Matt Chaput
Minor docstring fixes. Moved some __x__ methods. Removed obsolete code.
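
Several commits above concern a decorator that checks whether an object has been closed ("checkclosed", later renamed "protected" with thread synchronization added) and the note that such methods need an RLock rather than a Lock. The following is a minimal sketch of that idea, not the project's actual code: the names ClosedError, Reader, is_closed, and _lock are hypothetical and chosen only for illustration. It shows why a re-entrant lock matters when one protected method calls another.

```python
import threading
from functools import wraps


class ClosedError(Exception):
    """Hypothetical exception raised when a closed object is used."""


def protected(method):
    """Run `method` under the instance's lock after checking it is open."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            if self.is_closed:
                raise ClosedError("operation on a closed object")
            return method(self, *args, **kwargs)
    return wrapper


class Reader:
    """Hypothetical stand-in for a reader object (an assumption, not a real class)."""

    def __init__(self, items):
        # RLock is re-entrant: the thread that holds it may acquire it again.
        self._lock = threading.RLock()
        self.is_closed = False
        self._items = list(items)

    @protected
    def count(self):
        return len(self._items)

    @protected
    def summary(self):
        # Calls another protected method while already holding the lock.
        # With a plain threading.Lock this nested acquire would block forever.
        return "%d items" % self.count()

    def close(self):
        with self._lock:
            self.is_closed = True
```

In this sketch, calling summary() acquires the lock twice on the same thread, which only succeeds because the lock is re-entrant. After close(), any protected method raises ClosedError.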