This relates to using the latest (greatest?) pickle protocol. I did a bit of research; per the Python documentation, the main advantages of protocols 3 and 4 over protocol 2 are:
Protocol version 3 was added in Python 3.0. It has explicit support for bytes objects and cannot be unpickled by Python 2.x. This is the default protocol, and the recommended protocol when compatibility with other Python 3 versions is required. Protocol version 4 was added in Python 3.4. It adds support for very large objects, pickling more kinds of objects, and some data format optimizations. Refer to PEP 3154 for information about improvements brought by protocol 4.
I submit that compatibility matters more for search than support for very large objects (unlikely to matter given search constraints) or for more kinds of objects (whoosh probably will not hit that). Also, since whoosh doesn't even support non-unicode input (if I'm not mistaken), protocol 3's explicit bytes support has zero benefit.
Attached is a patch (also submitted to the mailing list before I realized it belonged here) that changes all occurrences of the protocol to 2, and fixes two dump calls in whoosh3.py that were using the default protocol of 3.
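For anyone who wants to see the idea without opening the patch, here is a minimal sketch of pinning the protocol explicitly rather than relying on the interpreter default. The names `PICKLE_PROTOCOL` and `dump_compat` are made up for illustration and are not taken from the whoosh source or from the patch itself.

```python
import pickle

# Hypothetical constant name (an assumption, not from whoosh): protocol 2 is
# the highest protocol that Python 2.x can also read.
PICKLE_PROTOCOL = 2


def dump_compat(obj):
    """Serialize obj with an explicit protocol instead of the default."""
    return pickle.dumps(obj, PICKLE_PROTOCOL)


payload = dump_compat({"field": u"text", "count": 3})

# Pickles written with protocol 2 or higher start with the PROTO opcode (0x80)
# followed by one byte holding the protocol number.
assert payload[:2] == b"\x80\x02"

# The data round-trips unchanged.
assert pickle.loads(payload) == {"field": u"text", "count": 3}
```

The same applies to `pickle.dump(obj, file, protocol)`: any call that omits the protocol argument silently picks up whatever the running interpreter defaults to.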