Commit Graph

100 Commits

Author SHA1 Message Date
Sarah Hoffmann
83054af46f remove typing_extensions requirement
The typing_extensions package is only necessary now when running mypy.
It won't be used at runtime anymore.
2022-07-18 09:55:58 +02:00
Sarah Hoffmann
a849f3c9ec add type annotations for command line functions 2022-07-18 09:55:54 +02:00
Sarah Hoffmann
aaf2b6032e fix uses of config.get_path() to expect None 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
4b12d52ef5 convert admin --analyse-indexing to new indexing method
A proper run of indexing requires the place information from the
analyzer. Add the pre-processing of place data, so the right
information is handed into the update function.
2022-07-07 16:20:08 +02:00
Sarah Hoffmann
cbbcbb1fd7 move country_info into data submodule 2022-07-06 11:08:36 +02:00
Sarah Hoffmann
61d813bfef add get_str_list() for config
Converts a config value written as a comma-sparated list into
a Python list of strings.
2022-05-29 13:53:50 +02:00
Sarah Hoffmann
dc6c4bf22e add offline import mode
In offline mode no attempts are made to download data from the internet.
At the moment that only concerns the computation of the database date.
It contacts the main API to get the date.
2022-05-11 15:03:02 +02:00
Sarah Hoffmann
bb2bd76f91 pylint: avoid explicit use of format() function
Use psycopg2 SQL formatters for SQL and formatted string literals
everywhere else.
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
4e1e166c6a add a function to return a formatted version
Replaces the various repeated format strings throughout the code.
2022-05-11 09:01:24 +02:00
Sarah Hoffmann
4f59644cc2 add tests for new data invalidation functions 2022-04-14 14:52:13 +02:00
Sarah Hoffmann
c3f1d34b71 add new commands for forced invalidation before indexing 2022-04-14 11:05:43 +02:00
Sarah Hoffmann
4c66c35ed6 reinit the tokenizer directory on website refresh
This means the project directory is usable again, once refresh --website
was run.
2022-03-20 17:49:22 +01:00
Sarah Hoffmann
c170d323d9 add tests for cleaning housenumbers 2022-01-20 23:47:20 +01:00
Sarah Hoffmann
344a2bfc1a add new command for cleaning word tokens
Just pulls outdated housenumbers for the moment.
2022-01-20 20:05:15 +01:00
Sarah Hoffmann
c3788d765e add consistent SPDX copyright headers 2022-01-03 16:23:58 +01:00
Sarah Hoffmann
54d35ddfe9 split cli tests by subcommand and extend coverage 2021-12-02 23:45:48 +01:00
Sarah Hoffmann
10e979e841 only instantiate indexer once for replication
Also makes sure that indexer object exists everywhere were needed.

See #2518.
2021-11-19 14:48:58 +01:00
Sarah Hoffmann
345c812e43 better error reporting when API script does not exist
Check if the API script exists on the expected location before
running php-cli. This way we can add a useful hint about the
project directory.

Fixes #2513.
2021-11-10 11:58:20 +01:00
Sarah Hoffmann
2c4b798f9b further refactor setup to keep function small 2021-10-26 12:00:13 +02:00
Sarah Hoffmann
9934421442 make word count computation part of the import
Accurate word counts are now essential when using
the ICU tokenizer and don't hurt for the legacy one.

Adds about an hour import time.
2021-10-26 12:00:13 +02:00
Sarah Hoffmann
1098ab732f allow relative paths for flatnode file 2021-10-22 17:32:51 +02:00
Sarah Hoffmann
0ae8d7ac08 have ADDRESS_LEVEL_CONFIG use load_sub_configuration
This means that relative paths now are looked up in the
project directory.
2021-10-22 16:36:52 +02:00
Sarah Hoffmann
c77df2d1eb replace NOMINATIM_PHRASE_CONFIG with command line option 2021-10-22 14:41:14 +02:00
Sarah Hoffmann
c1fa70639b add new replication mode catch-up
This mode gets updates until the server reports no new diffs
anymore.

Also adds additional indexing, when the main indexing step left
a couple of objects to process. This happens only when the
next update is expected to be more than 40min away.
2021-10-20 22:05:15 +02:00
Sarah Hoffmann
12643c5986 run Tiger import with parallel threads per default 2021-10-19 15:00:26 +02:00
Sarah Hoffmann
e8e2502e2f make word recount a tokenizer-specific function 2021-10-19 11:21:16 +02:00
Sarah Hoffmann
47417d1871 update and extend man page
Provide extended descriptions for most subcommands.
2021-10-18 09:03:07 +02:00
Sarah Hoffmann
8e1d4818ac use yaml config loader for country info 2021-09-04 00:22:55 +02:00
Sarah Hoffmann
7e7dd769fd remove language and partition from name import 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
79da96b369 read partition and languages from config file 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
78fcabade8 move country name generation to country_info module 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
284645f505 move generation of country tables in own module 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
75a5c7013f split up large setup function 2021-08-15 12:24:13 +02:00
Sarah Hoffmann
87dedde5d6 allow multiple files for the import command
The files are forwarded to osm2pgsql which is now able to merge
them correctly.
2021-08-14 21:42:21 +02:00
Sarah Hoffmann
e42349c963 replace add-data function with native Python code 2021-07-26 10:41:37 +02:00
Sarah Hoffmann
878835e4bd move add-data subcommand into a separate file 2021-07-25 18:14:12 +02:00
Sarah Hoffmann
cf98cff2a1 more formatting fixes
Found by flake8.
2021-07-12 17:45:42 +02:00
AntoJvlt
3676310efe Improved performance of the postcodes query and some code cleaning 2021-06-12 15:46:08 +02:00
AntoJvlt
a4733eed90 Use place instead of placex to compute postcodes 2021-06-09 09:31:32 +02:00
Sarah Hoffmann
72625dc72a call freeze after running and non-updateable import
Some of the tables will have already been removed but
the tables for indexing are still there and should be
dropped.
2021-06-02 11:08:48 +02:00
Sarah Hoffmann
cc2f152d70 commit changes to replication log table
Fixes #2350.
2021-05-26 11:47:08 +02:00
Sarah Hoffmann
a0e85cc17c only initialise tokenizer for refresh functions where needed
Fixes #2347.
2021-05-25 19:16:22 +02:00
AntoJvlt
3206bf59df Resolve conflicts 2021-05-17 13:52:35 +02:00
AntoJvlt
8b8dfc46eb Added --no-replace command for special phrases importation and added corresponding tests 2021-05-17 13:25:06 +02:00
AntoJvlt
06aab389ed Code cleaning and SPLoader deleted 2021-05-16 16:59:12 +02:00
Darkshredder
e5ffc59cd5 feat: Added reverse-only-search validation 2021-05-14 02:36:21 +05:30
Sarah Hoffmann
bf864b2c54 index postcodes after refreshing 2021-05-13 14:15:42 +02:00
Sarah Hoffmann
a4aba23a83 move filling of postcode table to python
The Python code now takes care of reading postcodes from placex,
enhancing them with potentially existing external postcodes and
updating location_postcodes accordingly. The initial setup and
updates use exactly the same function.

External postcode handling has been generalized. External postcodes
for any country are now accepted. The format of the external postcode
file has changed. We now expect CSV, potentially gzipped. The
postcodes are no longer saved in the database.
2021-05-13 14:15:42 +02:00
AntoJvlt
9d83da830f Introduction of SPCsvLoader to load special phrases from a csv file 2021-05-10 23:26:39 +02:00
AntoJvlt
00959fac57 Refactoring loading of external special phrases and importation process by introducing SPLoader and SPWikiLoader 2021-05-10 21:49:31 +02:00