Matt Chaput committed babfe9e

Added recipes for word length filtering and case-sensitive searches.


Files changed (1)


     stored_fields = searcher.stored_fields(docnum)
+Eliminate words shorter/longer than N
+Use a :class:`~whoosh.analysis.StopFilter` and the ``minsize`` and ``maxsize``
+keyword arguments. If you only want to filter by size and not remove common
+words, set ``stoplist`` to ``None``::
+    sf = analysis.StopFilter(stoplist=None, minsize=2, maxsize=40)
+Allow optional case-sensitive searches
+A quick and easy way to do this is to index both the original and lowercased
+versions of each word. If the user searches for an all-lowercase word, it acts
+as a case-insensitive search, but if they search for a word with any uppercase
+characters, it acts as a case-sensitive search::
+    class CaseSensitivizer(analysis.Filter):
+        def __call__(self, tokens):
+            for t in tokens:
+                yield t
+                if t.mode == "index":
+                    low = t.text.lower()
+                    if low != t.text:
+                        t.text = low
+                        yield t
+    ana = analysis.RegexTokenizer() | CaseSensitivizer()
+    [t.text for t in ana("The new SuperTurbo 5000", mode="index")]
+    # ["The", "the", "new", "SuperTurbo", "superturbo", "5000"]
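The recipe shows the indexing side only. The lookup behavior it describes can be modeled with a plain dictionary, as in the sketch below. This is a simplified stand-in, not Whoosh's index or search code; `index_terms`, `build_index`, `search` and the sample documents are hypothetical names and data chosen for illustration.

```python
from collections import defaultdict


def index_terms(word):
    """Yield the original form and, if it differs, the lowercased form."""
    yield word
    low = word.lower()
    if low != word:
        yield low


def build_index(docs):
    """Map each indexed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            for term in index_terms(word):
                index[term].add(doc_id)
    return index


def search(index, term):
    # The query term is looked up exactly as typed, with no lowercasing.
    return sorted(index.get(term, set()))


docs = {1: "The new SuperTurbo 5000", 2: "a superturbo clone"}
index = build_index(docs)

print(search(index, "superturbo"))  # all-lowercase query
print(search(index, "SuperTurbo"))  # query with uppercase characters
```

In this model the all-lowercase query finds both documents, because every word was also indexed in lowercase, while the mixed-case query finds only the document that contains that exact spelling.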