Commit Graph

438 Commits

Author SHA1 Message Date
Sarah Hoffmann
497e27bb9a move partial token into a separate field in the query struct
There is exactly one token to be expected and the token is usually
present.
2025-04-11 08:57:34 +02:00
Sarah Hoffmann
97d9e3c548 allow updating postcodes without a project directory
Postcodes will then be updated without looking for external postcodes.
2025-04-09 20:04:01 +02:00
Sarah Hoffmann
2ce2d031fa Merge pull request #3702 from lonvia/remove-tokenizer-dir
Remove automatic setup of tokenizer directory

So far the tokenizer factory would create a directory for private data for the tokenizer and then hand in the directory location to the tokenizer.

ICU tokenizer doesn't need any extra data anymore, so it doesn't make sense to create a directory which then remains empty. If a tokenizer needs such a directory in the future, it needs to create it on its own and make sure to handle the situation correctly where no project directory is used at all.
2025-04-03 09:04:48 +02:00
Sarah Hoffmann
186f562dd7 remove automatic setup of tokenizer directory
ICU tokenizer doesn't need any extra data anymore, so it doesn't
make sense to create a directory which then remains empty. If a
tokenizer needs such a directory in the future, it needs to create
it on its own and make sure to handle the situation correctly where
no project directory is used at all.
2025-04-02 20:20:04 +02:00
Sarah Hoffmann
c5bbeb626f Merge pull request #3700 from lonvia/ignore-inherited-addresses
Ignore POIs with inherited addresses for the address layer
2025-04-02 12:00:45 +02:00
Sarah Hoffmann
3bc77629c8 ignore POIs with inherited addresses for the address layer
We know that there is a building which describes the address as a
polygon and is therefore more suitable.
2025-04-02 10:30:45 +02:00
Sarah Hoffmann
6cf1287c4e Merge pull request #3686 from astridx/output_names
Output names as setting
2025-04-01 20:16:15 +02:00
TuringVerified
2eeec46040 Remove unnecessary assert statement, Fix regex_replace docstring and simplify regex_replace 2025-04-01 18:54:30 +05:30
TuringVerified
6d5a4a20c5 Update documentation, optimise regex_replace, add tests 2025-04-01 18:54:30 +05:30
astridx
12ad95067d output names as setting 2025-03-31 16:55:05 +02:00
Sarah Hoffmann
be4ba370ef adapt tests to extended results 2025-03-31 14:52:50 +02:00
Sarah Hoffmann
35baf77b18 make query upper-case when parsing postcodes
The postcode patterns expect upper-case letters.
2025-03-21 09:44:15 +01:00
Sarah Hoffmann
f5755a7a82 remove code for setting osm2pgsql via config.lib_dir
With the internal osm2pgsql gone, configuration of the binary location
via settings is the only option left that makes sense.
2025-03-11 09:04:05 +01:00
Sarah Hoffmann
4cc788f69e enable flake for Python tests 2025-03-09 15:33:24 +01:00
Sarah Hoffmann
6b0d58d9fd restrict postcode parsing in typed phrases
Postcodes can only appear in postcode-type phrases and must then
cover the full phrase
2025-03-05 10:09:33 +01:00
Sarah Hoffmann
afb89f9c7a add unit tests for postcode parser 2025-03-04 16:25:00 +01:00
Sarah Hoffmann
a574b98e4a remove postcode computation for word table during import 2025-03-04 08:57:59 +01:00
Sarah Hoffmann
6759edfb5d make word generation from query a class method 2025-03-04 08:57:37 +01:00
Sarah Hoffmann
e362a965e1 search: merge QueryPart array with QueryNodes
The basic information on terms is pretty much always used together
with the node inforamtion. Merging them together saves some
allocation while making lookup easier at the same time.
2025-03-04 08:57:37 +01:00
Sarah Hoffmann
13db4c9731 replace datrie library with a more simple pure-Python class 2025-02-24 10:24:21 +01:00
Sarah Hoffmann
49bd18b048 replace PhraseType enum with simple int constants 2025-02-21 16:44:12 +01:00
Sarah Hoffmann
31412e0674 replace TokenType enum with simple char constants 2025-02-21 10:23:41 +01:00
Sarah Hoffmann
4577669213 replace BreakType enum with simple char constants 2025-02-21 09:57:48 +01:00
Sarah Hoffmann
95e2d8c846 adapt tests to changed wikimedia importance test table 2025-01-14 14:19:17 +01:00
Sarah Hoffmann
efc09a5cfc add japanese phrase preprocessing
Code adapted from GSOC code by @miku.
2025-01-09 09:24:10 +01:00
Sarah Hoffmann
4760e8341b move lua scripts into a separate directory 2024-12-16 10:26:55 +01:00
Sarah Hoffmann
2b87c016db generalize normalization step for search query
It is now possible to configure functions for changing the query
input before it is analysed by the tokenizer.

Code is a cleaned-up version of the implementation by @miku.
2024-12-13 14:31:08 +01:00
Sarah Hoffmann
5160a1d577 get bbox of postcode areas into results 2024-09-30 08:58:40 +02:00
Sarah Hoffmann
90e207a497 drop automatic migration from versions <4.3 2024-09-27 12:07:48 +02:00
Sarah Hoffmann
a690605a96 remove support for unindexed tokens
This was a special feature of the legacy tokenizer who would not
index very frequent tokens.
2024-09-22 10:39:10 +02:00
Sarah Hoffmann
b54ff7d766 remove all references to a module path
No longer used now that legacy tokenizer is gone.
2024-09-21 17:39:01 +02:00
Sarah Hoffmann
b87d6226fb remove legacy tokenizer and direct tests 2024-09-21 11:38:08 +02:00
Sarah Hoffmann
7717bbf59d remove remaining references to php code 2024-09-15 15:33:59 +02:00
Sarah Hoffmann
6bc044d9c7 remove website setup
The website directory was for PHP scripts only and is no longer
needed.
2024-09-15 11:58:55 +02:00
Sarah Hoffmann
882fb16881 restrict use of os.environ in Configuration
Only use the OS environment, when the environ parameter is set
to None. Currently it would use the OS env on an empty dict.
2024-09-01 16:17:30 +02:00
Sarah Hoffmann
7f11de0db9 allow None and str for project_dir in NominatimAPI init 2024-08-22 22:49:12 +02:00
Sarah Hoffmann
c2594aca40 make NominatimAPI[Async] a context manager
If close() isn't properly called, it can lead to odd error messages
about uncaught exceptions.
2024-08-19 11:31:38 +02:00
Sarah Hoffmann
8e8f7a641b use custom result formatters in CLI commands 2024-08-16 19:30:57 +02:00
Sarah Hoffmann
5a61d3d5f6 configurable error formatting and content type in result formatter 2024-08-14 12:00:08 +02:00
Sarah Hoffmann
0c25e80be0 make formatting module non-static 2024-08-13 22:39:43 +02:00
Sarah Hoffmann
d22ca186e4 remove v1-specific functions from ASGIAdaptor 2024-08-13 19:38:14 +02:00
Sarah Hoffmann
6527b7cdcd fail if osm2pgsql is not recent enough 2024-08-09 19:25:15 +02:00
marc tobias
f0390cfe85 add-data: warn and exit if database is frozen 2024-08-05 16:14:19 +02:00
Sarah Hoffmann
9659afbade port code to psycopg3 2024-07-29 08:50:19 +02:00
Sarah Hoffmann
3742fa2929 make DB helper functions free functions
Also changes the drop function so that it can drop multiple tables
at once.
2024-07-29 08:49:30 +02:00
Sarah Hoffmann
4da4cbfe27 reduce from 3 to 2 packages 2024-06-28 09:13:22 +02:00
Sarah Hoffmann
44d5148e5f fix merge issues 2024-06-26 11:52:47 +02:00
Sarah Hoffmann
2bab0ca060 port unit tests to new python package layout 2024-06-26 11:52:47 +02:00
Sarah Hoffmann
5b02cd22b9 add tests for new importance CSV import 2024-05-16 15:23:54 +02:00
Sarah Hoffmann
60b03d506f add CSV format for importance import 2024-05-16 15:23:54 +02:00