icu tokenizer: move transliteration rules into separate file

The tokenizer configuration has become difficult to handle
due to the additional manual transliteration rules. Allow
a separate rule file whose content is passed to the ICU
library as is.
Sarah Hoffmann
2021-05-26 20:50:34 +02:00
parent de4fac33dc
commit 6ba00e6aee
4 changed files with 4958 additions and 4951 deletions

File diff suppressed because it is too large