Add Japanese phrase preprocessing

Code adapted from GSoC code by @miku.
Sarah Hoffmann
2025-01-08 19:43:25 +01:00
parent 86ad9efa8a
commit efc09a5cfc
3 changed files with 96 additions and 0 deletions

@@ -1,4 +1,5 @@
 query-preprocessing:
+    - step: split_japanese_phrases
     - step: normalize
 normalization:
     - ":: lower ()"
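
Note: the hunk above places a `split_japanese_phrases` step in front of `normalize` in the query preprocessing list. The following is a minimal Python sketch of what such an ordered pipeline might do, not the code added by this commit. The function names, the regular expression, and the use of `str.lower()` as a stand-in for the `:: lower ()` rule are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical pattern (an assumption, not the commit's actual rule):
# a prefecture ending in 都/道/府/県, a municipality ending in 市/区/町/村,
# and whatever remains of the phrase.
_JA_ADDRESS = re.compile(r'^(.{2,3}?[都道府県])(.+?[市区町村])(.+)$')


def split_japanese_phrase(phrase: str) -> List[str]:
    """Split one phrase into up to three parts; leave it whole if no match."""
    match = _JA_ADDRESS.match(phrase)
    if match is None:
        return [phrase]
    return list(match.groups())


def preprocess(phrases: List[str]) -> List[str]:
    """Run the steps in the configured order: split first, then normalize."""
    split: List[str] = []
    for phrase in phrases:
        split.extend(split_japanese_phrase(phrase))
    # Stand-in for the ":: lower ()" normalization rule (assumption).
    return [part.lower() for part in split]


if __name__ == "__main__":
    print(preprocess(["東京都千代田区丸の内1", "Berlin"]))
```

In this sketch the split runs before normalization, mirroring the order of the two steps in the configuration. Phrases that do not look like a Japanese address pass through the split step unchanged.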