mirror of https://github.com/osm-search/Nominatim.git (synced 2026-02-16 15:47:58 +00:00)
Merge pull request #3675 from TuringVerified/generic-preprocessors
Add generic preprocessors
@@ -67,7 +67,13 @@ Here is an example configuration file:
 ``` yaml
 query-preprocessing:
-    - normalize
+    - step: split_japanese_phrases
+    - step: regex_replace
+      replacements:
+        - pattern: https?://[^\s]*  # Filter URLs starting with http or https
+          replace: ''
+    - step: normalize
+
 normalization:
     - ":: lower ()"
     - "ß > 'ss'" # German szet is unambiguously equal to double ss
@@ -88,8 +94,8 @@ token-analysis:
 replacements: ['ä', 'ae']
 ```

-The configuration file contains four sections:
-`normalization`, `transliteration`, `sanitizers` and `token-analysis`.
+The configuration file contains five sections:
+`query-preprocessing`, `normalization`, `transliteration`, `sanitizers` and `token-analysis`.

 #### Query preprocessing
@@ -106,6 +112,19 @@ The following is a list of preprocessors that are shipped with Nominatim.
         heading_level: 6
         docstring_section_style: spacy

+##### regex-replace
+
+::: nominatim_api.query_preprocessing.regex_replace
+    options:
+        members: False
+        heading_level: 6
+        docstring_section_style: spacy
+        description:
+            This option runs any given regex pattern on the input and replaces values accordingly
+        replacements:
+            - pattern: regex pattern
+              replace: string to replace with
+

 #### Normalization and Transliteration
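The effect of the `regex_replace` step from the example configuration can be approximated in a few lines of Python. This is a minimal sketch, not Nominatim's implementation: the function name `apply_replacements` and the list-of-dicts configuration are hypothetical, and it assumes that each pattern is applied in order with semantics like Python's `re.sub`.

```python
import re

# Hypothetical stand-in for the `replacements` list in the YAML example.
REPLACEMENTS = [
    {"pattern": r"https?://[^\s]*", "replace": ""},
]


def apply_replacements(query, replacements):
    """Apply each pattern in order and return the rewritten query."""
    for rule in replacements:
        query = re.sub(rule["pattern"], rule["replace"], query)
    return query


# The URL is removed; the whitespace on either side of it is left in place.
cleaned = apply_replacements(
    "Berlin https://example.com/map Hauptstrasse", REPLACEMENTS
)
print(repr(cleaned))
```

In this sketch the surrounding whitespace survives the replacement, which suggests why the example configuration places the `normalize` step after `regex_replace`; whether Nominatim's normalization collapses such whitespace is not shown in this diff.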