forked from hans/Nominatim
make token analyzers configurable modules
Adds a mandatory 'analyzer' section to the token-analysis entries, which defines which analyzer to use. Currently there is exactly one analyzer, 'generic', which implements the former ICUNameProcessor.
@@ -72,7 +72,8 @@ def analyzer(tokenizer_factory, test_config, monkeypatch,
         cfgstr = {'normalization': list(norm),
                   'sanitizers': sanitizers,
                   'transliteration': list(trans),
-                  'token-analysis': [{'variants': [{'words': list(variants)}]}]}
+                  'token-analysis': [{'analyzer': 'generic',
+                                      'variants': [{'words': list(variants)}]}]}
         (test_config.project_dir / 'icu_tokenizer.yaml').write_text(yaml.dump(cfgstr))
         tok.loader = ICURuleLoader(test_config)
 
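The commit message makes 'analyzer' a mandatory key in every token-analysis entry. As a rough illustration only, the sketch below shows how such a requirement could be checked on the parsed configuration. The names `KNOWN_ANALYZERS` and `check_token_analysis` are hypothetical and are not taken from the Nominatim code base; the only analyzer name assumed is 'generic', as stated in the commit message.

```python
# Hypothetical sketch: these names are illustrative, not Nominatim's API.
KNOWN_ANALYZERS = {'generic'}  # the commit message names only 'generic'


def check_token_analysis(rules):
    """Return the analyzer name of each entry in a 'token-analysis' list.

    Raises ValueError when an entry lacks the mandatory 'analyzer' key
    or names an analyzer that is not in KNOWN_ANALYZERS.
    """
    names = []
    for pos, entry in enumerate(rules):
        name = entry.get('analyzer')
        if name is None:
            raise ValueError(f"token-analysis entry {pos}: missing 'analyzer'")
        if name not in KNOWN_ANALYZERS:
            raise ValueError(f"token-analysis entry {pos}: unknown analyzer '{name}'")
        names.append(name)
    return names


# Entry shape used by the test before this commit (no 'analyzer' key).
old_style = [{'variants': [{'words': ['street => st']}]}]

# Entry shape used by the test after this commit.
new_style = [{'analyzer': 'generic',
              'variants': [{'words': ['street => st']}]}]

print(check_token_analysis(new_style))
```

Under these assumptions, passing `old_style` to the same function raises a `ValueError`, which mirrors why the test fixture in the diff had to gain the 'analyzer' key.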