add warning about experimental nature of ICU tokenizer

Sarah Hoffmann
2021-07-04 10:44:58 +02:00
parent 62d5984b1b
commit affe1300d9

@@ -9,11 +9,11 @@ different configuration options. This sections describes the tokenizers and how
they can be configured.
!!! important
The use of a tokenizer is tied to a database installation. You need to choose
and configure the tokenizer before starting the initial import. Once the import
is done, you cannot switch to another tokenizer anymore. Reconfiguring the
chosen tokenizer is very limited as well. See the comments in each tokenizer
section.
## Legacy tokenizer
@@ -44,6 +44,10 @@ normalization functions are hard-coded.
## ICU tokenizer
!!! danger
This tokenizer is currently in active development and still subject
to backwards-incompatible changes.
The ICU tokenizer uses the [ICU library](http://site.icu-project.org/) to
normalize names and queries. It also offers configurable decomposition and
abbreviation handling.