Issue #265 resolved

Accent folding is using 'n' as a replacement for 'ñ'

Antonio Barcia created an issue


In Spanish, 'ñ' and 'n' are different letters; 'ñ' is not an accented version of 'n'. But in support.charset.accent_map, 'Ñ' and 'ñ' are folded into 'n'.

This can lead to very confusing search results when using Whoosh in a Spanish-language context.
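To illustrate the kind of collision described above, here is a minimal sketch using only the Python standard library. The `FOLD_TABLE` mapping and the `fold` helper are hypothetical stand-ins written for this example; they are not Whoosh's actual `accent_map`, which covers far more characters.

```python
# Hypothetical, hand-written fold table for illustration only.
# It treats 'ñ' like an accented 'n', as the issue describes.
FOLD_TABLE = str.maketrans({
    "á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u", "ü": "u",
    "ñ": "n",
})


def fold(text: str) -> str:
    """Lowercase the text and replace each mapped character with its base letter."""
    return text.lower().translate(FOLD_TABLE)


# "peña" and "pena" are different Spanish words, but they fold to the same term,
# so a search for one would also return documents containing the other.
print(fold("peña") == fold("pena"))
```

Dropping the `"ñ": "n"` entry from such a table would keep the two words distinct while still folding the accented vowels.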

Comments (2)

  1. Matt Chaput repo owner

    Accent folding is for when you're in a multi-language or English-dominant environment and, for example, when someone types "jalapeno" you want it to match documents containing "jalapeño" (and vice-versa). You usually don't want to use accent folding if you have Spanish-speakers searching Spanish content.

    But if you do need it for some reason, you should think about how you want the search to work, and write a smarter analyzer to accomplish that. For example, you could write an analyzer so that if a user types "jalapeno" it will match "jalapeno" or "jalapeño", but if the user types "jalapeño", it will only match "jalapeño". If you want to do something like that, let me know and I can help :)
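The asymmetric matching described above could be sketched roughly as follows. This is plain Python built on `unicodedata`, not a Whoosh analyzer; the names `strip_accents` and `matches` are hypothetical, and stripping combining marks after NFD decomposition is used here as a stand-in for whatever folding table a real analyzer would apply.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after NFD decomposition, then recompose to NFC."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def matches(query_term: str, doc_term: str) -> bool:
    """Unaccented queries match folded document terms; accented queries match exactly."""
    q = unicodedata.normalize("NFC", query_term.lower())
    d = unicodedata.normalize("NFC", doc_term.lower())
    if q == strip_accents(q):
        # The user typed no accents: be lenient and compare against the folded term.
        return q == strip_accents(d)
    # The user typed an accent on purpose: require the exact accented form.
    return q == d


print(matches("jalapeno", "jalapeño"))
print(matches("jalapeño", "jalapeno"))
```

With this rule, "jalapeno" finds both spellings, while "jalapeño" finds only documents that actually contain "jalapeño". A real index would presumably need both the original and the folded form of each term available at search time.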
