Sarah Hoffmann
856925d19b
remove analyze() from PlaceInfo class
...
The function creates circular dependencies.
2022-07-07 12:06:58 +02:00
Sarah Hoffmann
cbbcbb1fd7
move country_info into data submodule
2022-07-06 11:08:36 +02:00
Sarah Hoffmann
bce93d60bd
move PlaceInfo into data submodule
...
This data structure is shared between indexer and tokenizer.
2022-07-06 10:54:47 +02:00
Sarah Hoffmann
612d34930b
handle postcodes properly on word table updates
...
update_postcodes_from_db() needs to do the full postcode treatment
in order to derive the correct word table entries.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
5be320368c
add documentation for postcode customization
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
7f2ad4ac7e
fix linting issue
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
0f00f4968c
fix up BDD tests for postcode changes
...
Includes smaller code fixes found by the tests.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
37b2c6a830
port legacy tokenizer to new postcode handling
...
Also documents the changes to the SQL functions of the tokenizer.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
e86db3001f
fix postcode pattern for Mozambique
...
Optional groups are not implemented yet.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
67dfa38e60
fix linting problems
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
2eca9fc8af
cache postcode normalization
2022-06-23 23:42:31 +02:00
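Illustrative note on caching normalization: a minimal sketch of how repeated normalization of the same postcode string could be memoized with the standard library. The function name `normalize_postcode` and its normalization rule (uppercase, collapse whitespace) are assumptions made for this example and are not taken from the commit.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def normalize_postcode(raw: str) -> str:
    """Hypothetical normalizer: uppercase and collapse runs of whitespace.

    The lru_cache decorator stores results per input string, so the same
    raw postcode is only processed once while it stays in the cache.
    """
    return ' '.join(raw.upper().split())
```

Calling `normalize_postcode.cache_info()` shows hit and miss counts, which helps to check whether the cache is actually being used.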
Sarah Hoffmann
b5e5efc131
only add well-formatted postcodes to location table
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
80ea13437d
move postcode matcher into a separate file
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
bf86b45178
move postcode centroid computation to Python
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
4885fdf0f9
add class for online centroid computation
2022-06-23 23:42:31 +02:00
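Illustrative note on online centroid computation: one way such a class might look is a running sum that accepts points one at a time and can report the mean at any moment. The class name `RunningCentroid` and its methods are hypothetical; the commit does not show the actual interface.

```python
from typing import Tuple


class RunningCentroid:
    """Hypothetical accumulator computing the mean of points incrementally."""

    def __init__(self) -> None:
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.count = 0

    def add(self, x: float, y: float) -> None:
        # Only the sums and the count are kept, not the points themselves.
        self.sum_x += x
        self.sum_y += y
        self.count += 1

    def centroid(self) -> Tuple[float, float]:
        if self.count == 0:
            raise ValueError("no points added yet")
        return (self.sum_x / self.count, self.sum_y / self.count)
```

Memory use stays constant regardless of how many points are added, which is the usual reason for preferring an online computation.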
Sarah Hoffmann
b7704833e4
icu: switch postcodes to using the pre-formatted one
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
ca7b46511d
introduce and use analyzer for postcodes
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
18864afa8a
postcodes: introduce a default pattern for countries without postcodes
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
5ba75df507
postcode: generate a generic form
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
9172696324
postcodes: add support for optional spaces
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
baee6f3de0
postcodes: strip leading country codes
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
90d4d339db
initial postcode cleaner for simple patterns
...
Moves postcodes that are either in countries without a postcode
system or don't correspond to the local pattern for postcodes into
a field for a normal address part. Makes them searchable but not as
a special address. This has two consequences: they are no longer a
skippable part of the address and the postcodes cannot be searched
on their own.
2022-06-23 23:42:31 +02:00
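Illustrative note on the cleaner described above: a rough sketch of the decision it makes, assuming a simple pattern syntax in which `d` stands for a digit and `l` for a letter. The pattern syntax, the function names and the returned labels are assumptions for this example only.

```python
import re
from typing import Optional, Tuple


def compile_pattern(pattern: str) -> "re.Pattern[str]":
    """Translate the assumed placeholder syntax into a regular expression."""
    return re.compile(pattern.replace('d', '[0-9]').replace('l', '[A-Z]'))


def classify(postcode: str, pattern: Optional[str]) -> Tuple[str, str]:
    """Decide whether a value is kept as a postcode or demoted.

    pattern is None for countries without a postcode system. A value that
    does not fully match the country's pattern is returned as a normal
    address part: still searchable, but not treated as a postcode.
    """
    normalized = postcode.strip().upper()
    if pattern is not None and compile_pattern(pattern).fullmatch(normalized):
        return ('postcode', normalized)
    return ('address_part', normalized)
```

For example, `classify('12345', 'ddddd')` keeps the value as a postcode, while `classify('AB 12', 'ddddd')` and `classify('12345', None)` both demote it to an address part.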
Sarah Hoffmann
8080625747
remove postcodes from countries that don't have them
...
The postcodes will only be removed as a 'computed postcode'; they
are still searchable for the given object.
2022-06-23 23:42:31 +02:00
Luflosi
3ea87169ac
Fix typo
2022-06-20 20:41:00 +02:00
Sarah Hoffmann
cbb4749996
change indexing order for interpolations
...
Interpolations are now indexed after rank 30 objects. The housenumber
nodes no longer need information from the interpolations while the
interpolations can make use of precomputed postcodes.
2022-06-02 15:16:46 +02:00
Sarah Hoffmann
218c56f9a6
use getattr() instead of __getattr__
...
Makes the linter happy.
2022-06-01 21:26:13 +02:00
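Illustrative note on `getattr()`: a small sketch of the difference between calling the `__getattr__` hook directly and using the built-in. The `Settings` class and its keys are hypothetical and only serve to make the example runnable.

```python
class Settings:
    """Hypothetical object that resolves unknown attributes from a dict."""

    def __init__(self) -> None:
        self._values = {'DATABASE_DSN': 'pgsql:dbname=nominatim'}

    def __getattr__(self, name: str) -> str:
        # Only called when normal attribute lookup fails.
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None


cfg = Settings()

# Preferred: the built-in goes through the normal lookup machinery and
# supports a default value for missing attributes.
dsn = getattr(cfg, 'DATABASE_DSN')
missing = getattr(cfg, 'NO_SUCH_KEY', None)
```

Calling `cfg.__getattr__('DATABASE_DSN')` would return the same value here, but it bypasses regular attribute lookup and is flagged by pylint as an unnecessary dunder call.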
Sarah Hoffmann
12a3d51bcc
Merge pull request #2731 from lonvia/cleanup-special-phrases
...
Minor code reorganisation around special phrase parsing
2022-05-31 17:13:56 +02:00
Sarah Hoffmann
1821f68ca0
exclude addr:inclusion from search
2022-05-31 14:19:19 +02:00
Sarah Hoffmann
46689df668
custom comparison for SpecialPhrase
...
Duplicate elimination only works when a custom hash/equal function
is implemented that is based on the members.
2022-05-30 16:30:41 +02:00
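Illustrative note on member-based comparison: a minimal sketch of why set-based duplicate removal needs `__eq__` and `__hash__` defined over the members. The class name `Phrase` and its fields are assumptions for this example, not the actual SpecialPhrase definition.

```python
class Phrase:
    """Hypothetical phrase object compared by its member values."""

    def __init__(self, label: str, cls: str, typ: str, operator: str) -> None:
        self.label = label
        self.cls = cls
        self.typ = typ
        self.operator = operator

    def _key(self) -> tuple:
        return (self.label, self.cls, self.typ, self.operator)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Phrase):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self) -> int:
        # Must agree with __eq__: equal objects produce equal hashes.
        return hash(self._key())


unique = {Phrase('Bar', 'amenity', 'bar', '-'),
          Phrase('Bar', 'amenity', 'bar', '-')}
```

Without the two methods, Python falls back to identity comparison and the set would keep both objects.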
Sarah Hoffmann
e828d0d3f7
move quoting hack to wiki loader
...
The bad quotes around the type for special phrases
specifically occur in the Wiki pages, so they should be
removed by the loader and not in the generic SpecialPhrase
object.
2022-05-30 14:40:33 +02:00
Sarah Hoffmann
cce0e5ea38
convert special phrase loaders to generators
...
Generators simplify the code quite a bit compared to the previous
Iterator approach.
2022-05-30 14:12:46 +02:00
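Illustrative note on generators: a sketch of how a loader can be written as a generator function instead of a class implementing `__iter__` and `__next__`. The function name `load_phrases` and the row layout are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple


def load_phrases(rows: Iterable[Tuple[str, str, str]]) -> Iterator[Dict[str, str]]:
    """Yield one phrase per input row, skipping rows without a label.

    The generator keeps its position implicitly, so no explicit state
    variables or StopIteration handling are needed.
    """
    for label, cls, typ in rows:
        if not label:
            continue
        yield {'label': label, 'class': cls, 'type': typ}


phrases = list(load_phrases([('Bar', 'amenity', 'bar'),
                             ('', 'amenity', 'pub')]))
```

Callers can consume the result lazily in a `for` loop; wrapping it in `list()` as above is only needed when all phrases are required at once.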
Sarah Hoffmann
042e314589
remove the language parameter in the SPWikiLoader
...
Languages must always be configured through config or environment.
Also use monkeypatched environment in tests.
2022-05-30 10:26:20 +02:00
Sarah Hoffmann
61d813bfef
add get_str_list() for config
...
Converts a config value written as a comma-separated list into
a Python list of strings.
2022-05-29 13:53:50 +02:00
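Illustrative note on the conversion: a standalone sketch of splitting a comma-separated value into a list of strings. The function name `split_str_list`, the stripping of whitespace and the `None` result for empty input are assumptions for this example; the actual method's signature and behaviour may differ.

```python
from typing import List, Optional


def split_str_list(raw: Optional[str]) -> Optional[List[str]]:
    """Hypothetical helper: turn 'a, b ,c' into ['a', 'b', 'c'].

    Returns None when the value is unset or empty so that callers can
    distinguish "not configured" from an explicit list.
    """
    if not raw:
        return None
    return [part.strip() for part in raw.split(',')]
```

A value such as `"en, de ,fr"` would come back as a three-element list with the surrounding spaces removed.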
Sarah Hoffmann
dc6c4bf22e
add offline import mode
...
In offline mode no attempts are made to download data from the internet.
At the moment that only concerns the computation of the database date,
which contacts the main API to get the date.
2022-05-11 15:03:02 +02:00
Sarah Hoffmann
3ba975466c
fix spacing
...
Some versions of pylint are oddly picky.
2022-05-11 10:36:09 +02:00
Sarah Hoffmann
d14a585cc9
pylint: disable no-self-use check
...
This checker encourages bad behaviour (namely changing the static
status of a function during inheritance) and will be made optional
in upcoming versions of pylint.
2022-05-11 10:25:00 +02:00
Sarah Hoffmann
7f7a7df3a2
solve assorted issues with newer pylint versions
...
Includes more use of 'with', adding encodings to open statements
and a couple of issues with parameter renaming.
2022-05-11 10:22:14 +02:00
Sarah Hoffmann
5d5f40a82f
use context management when processing Tiger data
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
ae6b029543
remove redundant 'u' prefixes for unicode strings
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
bb2bd76f91
pylint: avoid explicit use of format() function
...
Use psycopg2 SQL formatters for SQL and formatted string literals
everywhere else.
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
4e1e166c6a
add a function to return a formatted version
...
Replaces the various repeated format strings throughout the code.
2022-05-11 09:01:24 +02:00
Sarah Hoffmann
7e70e5f503
always state encoding when opening files in text mode
...
Also applies to Path.write_text().
2022-05-10 15:36:29 +02:00
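Illustrative note on explicit encodings: a short sketch showing the encoding passed both to `open()` and to `Path.write_text()`. The file name and contents are made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'example.txt'

    # Path.write_text() takes the same encoding argument as open().
    target.write_text('Straße\n', encoding='utf-8')

    # Without encoding=..., Python falls back to a platform-dependent
    # default, which may not be UTF-8 on every system.
    with open(target, 'r', encoding='utf-8') as fd:
        content = fd.read()
```

Stating the encoding makes the result independent of the locale the program happens to run under.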
Sarah Hoffmann
ed6fda6968
Merge pull request #2702 from lonvia/move-country-names-into-includes
...
Clean up country name settings
2022-05-10 09:21:16 +02:00
Marc Tobias
821dabb138
add git commit hash to --version output
2022-05-09 23:56:13 +02:00
Sarah Hoffmann
9d468f6da0
support arbitrary prefixes in country name list
...
This means we can now get rid of the last special cases for names.
2022-05-09 11:55:26 +02:00
Marc Tobias
0de83c4a51
fix typos of name Nominatim
2022-05-05 01:04:47 +02:00
Marc Tobias
a79ab41782
new nominatim --version CLI argument
2022-05-04 01:33:25 +02:00
Sarah Hoffmann
3d58254462
skip wikipedia table test on reverse-only installations
...
Wikipedia importances are not imported on reverse-only imports.
2022-04-29 14:12:55 +02:00
Sarah Hoffmann
8bcdba1a14
add check for wikipedia importance data
...
Adds a new check level WARNING because missing wikipedia importances
are not necessarily an error. If the database is run for reverse
requests only, then it is fine to go without them.
2022-04-29 12:14:53 +02:00
Sarah Hoffmann
4f59644cc2
add tests for new data invalidation functions
2022-04-14 14:52:13 +02:00