Commit Graph

7 Commits

Author SHA1 Message Date
TuringVerified
2eeec46040 Remove unnecessary assert statement, Fix regex_replace docstring and simplify regex_replace 2025-04-01 18:54:30 +05:30
TuringVerified
6d5a4a20c5 Update documentation, optimise regex_replace, add tests 2025-04-01 18:54:30 +05:30
TuringVerified
4665ea3e77 Add generic preprocessor 2025-04-01 18:54:30 +05:30
Sarah Hoffmann
55c3176957 strip normalisation results of normal and special spaces 2025-02-19 14:40:35 +01:00
Sarah Hoffmann
efc09a5cfc add japanese phrase preprocessing
Code adapted from GSOC code by @miku.
2025-01-09 09:24:10 +01:00
Sarah Hoffmann
fbb6edfdaf add documentation for new query preprocessing 2024-12-13 16:53:08 +01:00
Sarah Hoffmann
2b87c016db generalize normalization step for search query
It is now possible to configure functions for changing the query
input before it is analysed by the tokenizer.

Code is a cleaned-up version of the implementation by @miku.
2024-12-13 14:31:08 +01:00