Commit Graph

540 Commits

Author SHA1 Message Date
Sarah Hoffmann
e37cfc64d2 add type annotations to ICU tokenizer helper modules 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
d35e3c25b6 add type annotations for token analysis
No annotations for ICU types yet.
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
62eedbb8f6 add type hints for sanitizers 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
5617bffe2f add type annotations for indexer 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
8adab2c6ca add typing information for postcode formatter 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
d0c44431d0 add typing information for place_info and country_info 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
282a61ce51 add typing information for utils submodule 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
7a1d22ff15 type annotations for non-blocking DB connection 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
0dff71a410 add type annotations for SQL preprocessor 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
26f30bff28 add type annotation to DB utils
As a cursor is needed as type, make this a public type.
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e6775e713c add typing information to DB properties 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
69f9122bef add typing annotations for DB status module
Requires TypedDict which is only available from Python 3.8. Require
therefore typing_extensions to make the functions available for
earlier Python versions.
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
845c43137a add type annotations to freeze functions 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
aaf2b6032e fix uses of config.get_path() to expect None 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
c4928c646d define type for enivronment dictionaries 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
f12fe54d2b restrict return type more 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
fc03c0266a add type annotations to exec_utils 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
681aad7e0d avoid issues with Python < 3.9 and linting 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
f22fa992f7 move complex typing annotations to extra file 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
992e6f72cf type annotations for DB utils 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e6ee3c772c type annotations for DB connection 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
95ed95c616 add type annotations to config module 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
bf36f33e79 add type annotations for version.py 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
9b636fdc10 mypy: minimal annotations to enable a clean run 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
4b12d52ef5 convert admin --analyse-indexing to new indexing method
A proper run of indexing requires the place information from the
analyzer. Add the pre-processing of place data, so the right
information is handed into the update function.
2022-07-07 16:20:08 +02:00
Sarah Hoffmann
856925d19b remove analyze() from PlaceInfo class
The function creates circular dependencies.
2022-07-07 12:06:58 +02:00
Sarah Hoffmann
cbbcbb1fd7 move country_info into data submodule 2022-07-06 11:08:36 +02:00
Sarah Hoffmann
bce93d60bd move PlaceInfo into data submodule
This data structure is shared between indexer and tokenizer.
2022-07-06 10:54:47 +02:00
Sarah Hoffmann
612d34930b handle postcodes properly on word table updates
update_postcodes_from_db() needs to do the full postcode treatment
in order to derive the correct word table entries.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
5be320368c add documentation for postcode customization 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
7f2ad4ac7e fix linting issue 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
0f00f4968c fix up BDD tests for postcode changes
Includes smaller code fixes found by the tests.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
37b2c6a830 port legacy tokenizer to new postcode handling
Also documents the changes to the SQL functions of the tokenizer.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
e86db3001f fix postcode pattern for Mozambique
Optional groups are not implemented yet.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
67dfa38e60 fix liniting problems 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
2eca9fc8af cache postcode normalization 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
b5e5efc131 only add well-formatted postcodes to location table 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
80ea13437d move postcode matcher in a separate file 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
bf86b45178 move postcode centroid computation to Python 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
4885fdf0f9 add class for online centroid computation 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
b7704833e4 icu: switch postcodes to using the pre-formatted one 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
ca7b46511d introduce and use analyzer for postcodes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
18864afa8a postcodes: introduce a default pattern for countries without postcodes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
5ba75df507 postcode: generate a generic form 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
9172696324 postcodes: add support for optional spaces 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
baee6f3de0 postcodes: strip leading country codes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
90d4d339db initial postcode cleaner for simple patterns
Moves postcodes that are either in countries without a postcode
system or don't correspond to the local pattern for postcodes into
a field for a normal address part. Makes them searchable but not as
a special address. This has two consequences: they are no longer a
skippable part of the address and the postcodes cannot be searched
on their own.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
8080625747 remove postcodes from countries that don't have them
The postcodes will only be removed as a 'computed postcode' they
are still searchable for the given object.
2022-06-23 23:42:31 +02:00
Luflosi
3ea87169ac Fix typo 2022-06-20 20:41:00 +02:00
Sarah Hoffmann
cbb4749996 change indexing order for interpolations
Interpolations are now indexed after rank 30 objects. The housenumber
nodes no longer need information from the interpolations while the
interpolations can make use of precomputed postcodes.
2022-06-02 15:16:46 +02:00