mirror of
https://github.com/osm-search/Nominatim.git
synced 2026-02-25 18:48:15 +00:00
complete documentation for new clean-housenumbers sanitizer
@@ -181,6 +181,13 @@ The following is a list of sanitizers that are shipped with Nominatim.
     rendering:
         heading_level: 6
 
+##### clean-housenumbers
+
+::: nominatim.tokenizer.sanitizers.clean_housenumbers
+    selection:
+        members: False
+    rendering:
+        heading_level: 6
 
 
 #### Token Analysis
@@ -5,7 +5,11 @@
 # Copyright (C) 2022 by the Nominatim developer community.
 # For a full list of authors see the git log.
 """
-Sanitizer that cleans and normalizes house numbers.
+Sanitizer that preprocesses address tags for house numbers. The sanitizer
+allows to
+
+* define which tags are to be considered house numbers (see 'filter-kind')
+* split house number lists into individual numbers (see 'delimiters')
 
 Arguments:
     delimiters: Define the set of characters to be used for
@@ -28,6 +28,10 @@ sanitizers:
     - step: split-name-list
     - step: strip-brace-terms
     - step: clean-housenumbers
+      filter-kind:
+        - housenumber
+        - conscriptionnumber
+        - streetnumber
     - step: tag-analyzer-by-language
       filter-kind: [".*name.*"]
       whitelist: [bg,ca,cs,da,de,el,en,es,et,eu,fi,fr,gl,hu,it,ja,mg,ms,nl,no,pl,pt,ro,ru,sk,sl,sv,tr,uk,vi]
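For readers unfamiliar with the two options named in the new docstring, the following is a minimal sketch of the behaviour it describes: select address tags by kind ('filter-kind') and split list values into individual numbers ('delimiters'). It is not the code from this commit. The `AddressTag` class, the `clean_housenumbers` function signature, the exact-match comparison on the kind, and the `",;"` delimiter set used in the example are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Sequence


@dataclass(frozen=True)
class AddressTag:
    """Hypothetical stand-in for one address tag of a place."""
    kind: str   # tag kind, e.g. "housenumber" or "street"
    name: str   # raw tag value


def clean_housenumbers(tags: Iterable[AddressTag],
                       filter_kind: Sequence[str],
                       delimiters: str) -> List[AddressTag]:
    """Split the values of tags whose kind is listed in filter_kind.

    Tags of any other kind are passed through unchanged.
    """
    # Build a character class from the delimiter characters.
    split_re = re.compile("[" + re.escape(delimiters) + "]")

    result: List[AddressTag] = []
    for tag in tags:
        if tag.kind not in filter_kind:
            # Assumption: kinds are compared by exact match.
            result.append(tag)
            continue
        for part in split_re.split(tag.name):
            part = part.strip()
            if part:  # drop empty pieces such as in "3,,5"
                result.append(AddressTag(tag.kind, part))
    return result


if __name__ == "__main__":
    tags = [
        AddressTag("housenumber", "3, 5;7"),
        AddressTag("street", "Main St, North"),
        AddressTag("conscriptionnumber", "12"),
    ]
    kinds = ("housenumber", "conscriptionnumber", "streetnumber")
    for tag in clean_housenumbers(tags, kinds, ",;"):
        print(tag.kind, repr(tag.name))
```

With the three kinds from the configuration hunk above, only the `housenumber` and `conscriptionnumber` tags are processed. The `street` value keeps its comma because its kind is not in the filter list.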